Insider Brief
- IBM is integrating speech-to-text and text-to-speech capabilities from Deepgram into its watsonx Orchestrate platform, making Deepgram its first dedicated voice partner.
- The collaboration embeds real-time transcription, multilingual support and natural-sounding speech synthesis into IBM’s generative AI workflows to support conversational agents, automated customer care and voice-driven enterprise applications.
- The move reflects rising enterprise demand for scalable, low-latency voice interfaces capable of operating in complex, real-world audio environments across regulated industries.
IBM is adding enterprise voice capabilities to its generative AI stack through a new collaboration with Deepgram.
IBM said it will embed Deepgram’s speech-to-text and text-to-speech technology into its watsonx Orchestrate platform, making Deepgram IBM’s first dedicated voice partner. The integration is designed to support enterprise-grade transcription, real-time captioning and voice-driven workflows as organizations expand their conversational AI deployments.
“Voice is rapidly becoming the default interface between humans and technology, and enterprise deployments require a real-time platform that is accurate, low latency, and reliable at scale,” Deepgram CEO and Co-Founder Scott Stephenson said in a statement.
According to IBM and Deepgram, the combined offering will address common challenges in real-world audio environments, such as background noise, diverse accents and multilingual requirements. The system supports a wide range of languages and dialects, including multiple Arabic and Indian variants, and offers custom tuning, low-latency transcription and natural-sounding synthesized speech.
What Does the Collaboration Mean for Customers?
“This collaboration aims to help enterprise organizations accelerate their AI initiatives and reinforces IBM’s open ecosystem, bringing choice and cutting-edge voice technology to partners and customers,” said Nick Holda, IBM’s President of AI Technology Partnerships.
The companies said the integration will enable use cases such as:
- Automated customer support and call analysis
- Real-time captioning and meeting transcription
- Voice-enabled data entry in healthcare and financial services
- Conversational digital agents powered by natural speech
IBM said embedding Deepgram’s APIs into watsonx Orchestrate’s Agent Builder will allow enterprises to build voice-enabled workflows on real-time infrastructure designed for scale. For Deepgram, the partnership expands distribution through IBM’s enterprise customer base and hybrid cloud ecosystem.