What is Speech-to-Text?
Speech-to-Text is a voice technology, based on deep learning language models, that is used to transform audio signals into transcribed text.
The transcription is determined statistically: the engine favors the sentence structures and words that occur most frequently in the context it has identified.
Some use cases made real with Speech-to-Text
Speech-to-Text is the foundation of speech recognition and of the voice assistants we use today. The technology is designed to be paired with other solutions to build innovative voice use cases.
Automatic voice transcription of discussions and meetings to be processed as voice dictation with specific speech recognition models.
Turning voice into text to be interpreted by NLP/NLU engines to identify the user’s intent for voice commands.
Voice Messaging Systems
Fast and automatic voice transcription used for messaging applications on devices.
A technology available inside the Voice Development Kit
1. Set up your Speech-to-Text engine
Start by selecting the language you need for your project and determining the right configuration (the confidence threshold, for instance).
2. Upload audio or record it on the VDK
Plug in your microphone and start speaking, or upload your audio files, to transcribe their content into text and benchmark which solution works best.
3. Analyze results, optimize and integrate
Speech-to-Text transcription is returned as several hypotheses (the number of which can be configured) to help you tune the confidence threshold before integration.
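To illustrate how several hypotheses and a confidence threshold can work together, here is a minimal Python sketch. It is not the VDK API: the Hypothesis class, the select_transcript function, the 0-to-1 confidence scale and the sample data are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One candidate transcription (hypothetical structure, not a VDK type)."""
    text: str
    confidence: float  # assumed to lie between 0.0 and 1.0


def select_transcript(hypotheses, threshold):
    """Return the most confident hypothesis, or None if none reaches the threshold."""
    if not hypotheses:
        return None
    best = max(hypotheses, key=lambda h: h.confidence)
    return best if best.confidence >= threshold else None


# Made-up example data, not real engine output.
n_best = [
    Hypothesis("turn on the light", 0.82),
    Hypothesis("turn on the lights", 0.11),
    Hypothesis("turn off the light", 0.04),
]

# Try two thresholds to see how the choice affects acceptance.
for threshold in (0.5, 0.9):
    result = select_transcript(n_best, threshold)
    print(threshold, result.text if result else "rejected")
```

A lower threshold accepts more transcriptions but lets more errors through, while a higher one rejects uncertain results so the application can ask the user to repeat. Comparing the hypotheses at several thresholds is one way to pick a value before integration.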
Our voice technologies are available in more than 40 languages to voice-enable your products and services wherever you need them to be deployed in the world. Here are some of them.