Cohere Transcribe: New Open-Source ASR Challenging Leaders

Cohere, the Canadian AI company, has launched an open-source Automatic Speech Recognition (ASR) model named Transcribe. Cohere claims the model significantly outperforms existing solutions on the Hugging Face Open ASR Leaderboard. According to the company, Transcribe achieves an average word error rate of 5.42%, ahead of OpenAI's Whisper Large v3 and ElevenLabs' Scribe v2. If the results hold up in practice, they suggest a potential shift in the competitive landscape for speech recognition technology.
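For readers unfamiliar with the metric, word error rate is the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that calculation in Python; the function name and the simple lowercase-and-split normalization are assumptions made for illustration, and benchmark suites typically apply more thorough text normalization and average over several datasets.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length (illustrative sketch)."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"WER: {wer:.2%}")
```

On this scale, a reported 5.42% corresponds to roughly five or six word errors per hundred reference words.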

With 2 billion parameters, Transcribe also reportedly offers superior throughput compared to other models of a similar size. This means Transcribe can process audio data more rapidly than many of its counterparts, an important factor for real-time applications. The model currently supports 14 languages, including English, German, French, and Japanese.
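One common way to express ASR throughput is the inverse real-time factor: seconds of audio transcribed per second of compute. The sketch below illustrates that ratio only; the article does not say which throughput measure Cohere used, and the function name and the figures in the example are hypothetical.

```python
def inverse_real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio handled per second of processing (higher is faster)."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    # Hypothetical figures: one hour of audio transcribed in 36 seconds.
    speedup = inverse_real_time_factor(audio_seconds=3600, processing_seconds=36)
    print(f"{speedup:.0f}x faster than real time")
```

A value above 1 means the model transcribes faster than the audio plays, which is the minimum requirement for live captioning or voice interfaces.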

Cohere has released Transcribe under the Apache 2.0 license, making it available on Hugging Face. Access is also provided via Cohere's API and its Model Vault. Furthermore, Transcribe is integrated into Cohere's North AI agent platform. This multi-faceted release strategy aims to broaden the adoption of their new ASR technology.

For business leaders and entrepreneurs, the practical implication is a lower barrier to entry: a high-performing ASR model under an open-source license makes it easier to implement voice-enabled interfaces and transcription services. If your organization has not yet explored voice technologies, Transcribe presents a compelling option. Its combination of performance and open accessibility offers opportunities to optimize business operations and improve customer experience.

Source: the-decoder.com
