Speech-to-Text (STT)

Continuous, multilingual Speech-to-Text technology that turns sentences and
conversations into written text using machine learning models.


What is Speech-to-Text?

Speech-to-Text is a voice technology, based on deep learning language models, that is used to transform audio signals into transcribed text.

The output is determined statistically: given the context it identifies, the model favors the most frequent sentence structures and word occurrences.
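
As a toy illustration of this statistical selection, the sketch below scores candidate transcriptions with a hand-written bigram table and keeps the most probable one. The table, the candidate sentences, and the function names are invented for the example; a production engine relies on trained acoustic and language models rather than a lookup table.

```python
import math

# Hypothetical bigram log-probabilities; a real model learns these from data.
BIGRAM_LOGP = {
    ("<s>", "recognize"): math.log(0.4),
    ("recognize", "speech"): math.log(0.5),
    ("<s>", "wreck"): math.log(0.05),
    ("wreck", "a"): math.log(0.3),
    ("a", "nice"): math.log(0.2),
    ("nice", "beach"): math.log(0.1),
}
UNSEEN_LOGP = math.log(1e-4)  # back-off score for bigrams missing from the table


def sentence_logp(sentence):
    """Sum the bigram log-probabilities of a sentence, starting from <s>."""
    words = ["<s>"] + sentence.lower().split()
    return sum(BIGRAM_LOGP.get(pair, UNSEEN_LOGP) for pair in zip(words, words[1:]))


def most_likely(candidates):
    """Return the candidate transcription with the highest log-probability."""
    return max(candidates, key=sentence_logp)


# Two acoustically similar candidates for the same audio.
candidates = ["recognize speech", "wreck a nice beach"]
best = most_likely(candidates)
print(best)
```

Both candidates sound alike, so the word statistics alone decide which one is returned.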

Some use cases made real with Speech-to-Text

Speech-to-Text is the foundation of the speech recognition systems and voice assistants we use today. This technology is designed to be paired with other solutions to produce innovative voice use cases.

Voice Transcription

Automatic transcription of discussions and meetings, processed as voice dictation with dedicated speech recognition models.

Voice Commands

Turning voice into text to be interpreted by NLP/NLU engines to identify the user’s intent for voice commands.

Voice Messaging Systems

Fast and automatic voice transcription used for messaging applications on devices.
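
To make the Voice Commands case above more concrete, here is a minimal sketch of what happens after transcription: the text is matched against intents. A simple keyword matcher stands in for a real NLP/NLU engine, and the intent names, keyword sets, and functions are all hypothetical.

```python
import re

# Hypothetical intents and the keywords each one requires.
INTENT_KEYWORDS = {
    "lights_on": {"turn", "on", "light"},
    "lights_off": {"turn", "off", "light"},
    "play_music": {"play", "music"},
}


def tokenize(transcript):
    """Lowercase the transcript and crudely strip a plural 's' from longer words."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return {w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words}


def detect_intent(transcript):
    """Return the first intent whose keywords all appear in the transcript, else None."""
    tokens = tokenize(transcript)
    for intent, keywords in INTENT_KEYWORDS.items():
        if keywords <= tokens:
            return intent
    return None


print(detect_intent("Turn on the lights"))
print(detect_intent("What time is it"))
```

A real NLU engine would also handle synonyms, word order, and parameters (which light, which song); the sketch only shows where the transcribed text enters the pipeline.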

A technology available inside the Voice Development Kit

1. Set up your Speech-to-Text engine

Start by selecting the language you need for your project, then determine the right configuration (the confidence threshold, for instance).

2. Upload audio or record it on the VDK

Plug in your microphone and start speaking, or upload your audio files, to transcribe their content into text and benchmark which solution works best.

3. Analyze results, optimize and integrate

The Speech-to-Text transcription is provided as several hypotheses (the number of which can be configured) to help you optimize the confidence threshold for further integration.
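
The role of the confidence threshold and the number of hypotheses in the steps above can be sketched as follows. This is not the Voice Development Kit API: the `Hypothesis` class, the `select_hypotheses` function, and the sample scores are assumptions made for illustration, with confidence assumed to lie between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    confidence: float  # assumed to lie in [0, 1]


def select_hypotheses(hypotheses, threshold, max_results):
    """Keep at most max_results hypotheses, best first, that reach the threshold."""
    ranked = sorted(hypotheses, key=lambda h: h.confidence, reverse=True)
    return [h for h in ranked[:max_results] if h.confidence >= threshold]


# Made-up engine output for one utterance.
n_best = [
    Hypothesis("turn of the light", 0.42),
    Hypothesis("turn on the light", 0.91),
    Hypothesis("turn on the lights", 0.78),
]

accepted = select_hypotheses(n_best, threshold=0.6, max_results=2)
for h in accepted:
    print(f"{h.confidence:.2f}  {h.text}")
```

Raising the threshold rejects more uncertain transcriptions at the cost of discarding some correct ones, which is why it is worth benchmarking on your own audio before integration.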

Language coverage

Our voice technologies are available in more than 40 languages to voice-enable your products and services wherever you need them to be deployed in the world. Here are some of them.