ElevenLabs Introduces Scribe, a Standalone AI Speech-to-Text Model

ElevenLabs, the AI startup recently valued at $3.3 billion after securing a $180 million funding round, has launched Scribe, its first standalone speech-to-text model. Known for its advancements in audio generation, the company is now expanding into speech recognition, positioning itself as a competitor to Gladia, Speechmatics, AssemblyAI, Deepgram, and OpenAI’s Whisper.

Mati Staniszewski, CEO of ElevenLabs, stated that Scribe was developed to improve speech detection accuracy across multiple languages. He explained that while many consider speech-to-text a solved problem, there are still significant gaps in performance for numerous languages. He emphasized that ElevenLabs’ in-house data annotation and rapid feedback processes enable the company to build more precise models.

Scribe supports over 99 languages at launch, with 25 categorized under the highest accuracy tier, which the company defines as a word error rate below 5%. English leads with a claimed 97% accuracy rate; other languages in the top tier include French, German, Hindi, Indonesian, Japanese, Kannada, Malayalam, Polish, Portuguese, Spanish, and Vietnamese. According to ElevenLabs, the model outperformed Google Gemini 2.0 Flash and Whisper Large V3 on the FLEURS and Common Voice benchmarks.
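For readers unfamiliar with the metric, word error rate (WER) is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below is a generic illustration of that definition, not ElevenLabs' evaluation code; the function name and the simple lowercase-and-split tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Illustrative only: tokenization here is a plain lowercase whitespace
    split, which is an assumption. Real benchmarks normalize text further.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein dynamic program).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on a mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

If accuracy is read as one minus WER, which the article does not confirm, a 97% accuracy figure would correspond to roughly three errors per 100 reference words.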

Designed initially as part of ElevenLabs’ conversational AI platform, Scribe is now available as a standalone product, featuring smart speaker diarization, word-level timestamps for precise subtitles, and automatic tagging of sound events such as audience laughter. Users can transcribe video content directly within the company’s AI studio, facilitating subtitle and caption generation. Scribe is currently limited to pre-recorded audio; a low-latency real-time version is set to launch soon, enabling use in meetings and live voice note-taking.
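To illustrate why word-level timestamps matter for subtitles, the sketch below groups timed words into cues in the widely used SRT subtitle format. The input shape (dictionaries with "text", "start", and "end" keys, times in seconds) is hypothetical and is not taken from Scribe's actual response format; the fixed words-per-cue rule is likewise a simplifying assumption.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[dict], max_words: int = 7) -> str:
    """Group timed words into numbered SRT cues of at most max_words each.

    Each word is a dict with hypothetical keys "text", "start", "end".
    A cue starts when its first word starts and ends when its last word ends.
    """
    cues = []
    for index in range(0, len(words), max_words):
        chunk = words[index:index + max_words]
        start = format_timestamp(chunk[0]["start"])
        end = format_timestamp(chunk[-1]["end"])
        text = " ".join(word["text"] for word in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [
        {"text": "Hello", "start": 0.0, "end": 0.4},
        {"text": "world", "start": 0.5, "end": 1.0},
    ]
    print(words_to_srt(sample, max_words=1))
```

A production captioning tool would also break cues at sentence boundaries, pauses, or speaker changes, which per-word timing and diarization make possible.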

ElevenLabs has priced Scribe at $0.40 per hour of transcribed audio, offering a competitive rate against existing market solutions. While some rivals provide lower-cost alternatives, the company aims to differentiate itself through superior accuracy and feature integration.
