What is Automatic Speech Recognition (ASR)?
Automatic Speech Recognition is a technology used to create reliable voice commands. Closely related to speech-to-text, it goes a step further by understanding specific intents, thanks to a grammar-based design.
Grammar creation lets you build a corpus of complex, field-specific words and expressions that can be recognized with high accuracy.
Phonetic design is the logical extension of grammar creation: it tailors the way words are expected to be recognized, which improves accuracy on complex vocabulary such as names.
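To make the grammar-based idea concrete, here is a minimal sketch of how a grammar can tie a recognized sentence to an intent. It is only an illustration under stated assumptions: the `GRAMMAR` dictionary, the intent names and the `parse_command` helper are hypothetical, and this is not the grammar file format of the Voice Development Kit.

```python
import re

# Hypothetical toy grammar: each intent is a pattern with a named slot.
# This is NOT the Voice Development Kit grammar format, only an illustration.
GRAMMAR = {
    "start_machine": r"(start|switch on) (the )?(?P<machine>conveyor|press|pump)",
    "stop_machine": r"(stop|switch off) (the )?(?P<machine>conveyor|press|pump)",
}


def parse_command(utterance):
    """Return (intent, slots) if the utterance fits the grammar, else None."""
    # Normalize case and whitespace before matching.
    text = " ".join(utterance.lower().split())
    for intent, pattern in GRAMMAR.items():
        match = re.fullmatch(pattern, text)
        if match:
            return intent, match.groupdict()
    # Anything outside the grammar is rejected instead of being guessed.
    return None
```

Because only sentences described by the grammar are accepted, an out-of-scope phrase such as "play some music" returns `None` rather than a wrong command.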
Some use cases made real with Automatic Speech Recognition
Creating intuitive and innovative voice commands is the foundation of voice interactions. The possibilities of Automatic Speech Recognition are endless when it comes to reinventing processes and user experiences with value and ease of use.
Complex Names Understanding
Create a corpus of specific names and complex words to be recognized by the Automatic Speech Recognition engine.
Accurate Digits Recognition
Grammar creation lets you tailor the way digits are recognized by the ASR engine in order to process order references, phone numbers or licence plates, for instance.
Reliable Voice Commands
ASR enables voice commands for action-related tasks (supply chain, industry, maintenance, report…) on which users can truly rely.
A technology available inside the Voice Development Kit
1. Create your grammar file
Start by using the Grammar Editor widget in the Voice Development Kit to create your own corpus of vocabulary and voice commands.
2. Test and optimize with phonetics
Try your voice commands directly in the widget, choosing the right language and adjusting the recognition with phonetic modifications.
3. Export your solution and integrate it
Once the results meet your expectations, you can export your file and integrate it into your device.
Our voice technologies are available in more than 40 languages to voice-enable your products and services wherever in the world you need to deploy them.
Benefits of our Automatic Speech Recognition technology
Multilingual Speech Recognition
Our ASR works with 36 different languages, covering the most commonly spoken ones and their dialects, so that you can scale your solution worldwide.
The grammar-based design allows you to define specific words, business jargon and technical vocabularies, while specializing speech recognition for pre-defined use cases.
Low Footprint Embedded Solution
Depending on your hardware’s specifications, the integration of our Automatic Speech Recognition can be done seamlessly thanks to lightweight grammar-based models.
Privacy by Design
Being embedded, our solution is private by design: voice data is not transferred when using the Automatic Speech Recognition engine.
Start building your voice solution now!
Get access to the Voice Development Kit to begin the creation of your enterprise-grade voice solutions.
Please note that only businesses and organisations are able to use our technology; individual use is not yet allowed.
Thank you for your understanding.
Powered by the Voice Development Kit
Standard ports and Tools
- Android (version 6.0 API 23)
- Linux: x86_64, armv7hf, armv8
- Windows: x86_64
Functionality code size
- Basic command & control (C&C) application: 3.2MB
- Full functionality, largest acoustic models: 9.5MB
Components and relative data size per language
- Acoustic models, per language
  - Gen 4 compact: 900kB
  - Gen 5: approx. 4MB
  - Gen 6: approx. 6MB
- GLIC – mono-lingual – General purpose transcriptions: 300-7300kB
- GLC – multi-lingual – Music collection compilation: 700-3000kB
Components and relative data size per language and total RAM usage
- Digit Recognition: 4kB / 1.25MB
- Basic C&C application, 100/10,000 commands: 10-500kB / 1.3-1.8MB
- Telephony (voice-activated dialing) with grammars + SLMs, including NLU, 1350 contacts: 0.52MB / 12.6MB
- 1-shot voice destination entry POI & addresses (UDE) all USA, FST based, including NLU: 300MB / 56MB
- Embedded dictation: 100MB / 100MB
Our ASR solution is an embedded technology designed to be integrated into devices. These devices therefore need to meet specific criteria so that the technology runs properly and can serve your use case.
Frequently Asked Questions on Automatic Speech Recognition
A few things to know…
Automatic Speech Recognition can be tricky since it is a complex technology. We cover some of the recurring topics here to give you insights.
Can ASR understand spelled letters and numbers?
Our ASR can indeed identify individual letters and numbers when they are spelled out, for instance in a licence plate or a customer reference.
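As an illustration of what an application might do with spelled input, the sketch below joins a sequence of recognized tokens into a reference string. The token format, the `DIGITS` table and the `join_spelled` helper are assumptions made for this example; the actual output of the ASR engine may be structured differently.

```python
# Hypothetical mapping from spoken digit words to characters.
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def join_spelled(tokens):
    """Turn spelled tokens such as ["A", "B", "one"] into "AB1"."""
    chars = []
    for token in tokens:
        word = token.lower()
        if word in DIGITS:
            # Spoken digit word, e.g. "seven" becomes "7".
            chars.append(DIGITS[word])
        elif len(word) == 1 and word.isalnum():
            # Single letter or digit character, kept in upper case.
            chars.append(word.upper())
        else:
            raise ValueError(f"unexpected token: {token!r}")
    return "".join(chars)
```

For example, the tokens `["A", "B", "one", "two", "three", "c", "d"]` would be joined into the licence plate string `AB123CD`.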
Is Automatic Speech Recognition able to recognize specific vocabulary?
Our ASR’s design is able to understand very specific vocabulary thanks to the creation of specialized grammars.
What are the technical specifications for integrating ASR?
ASR specifications are essential for its integration. To get access to this information, please contact us.
Can Automatic Speech Recognition work in noisy environments?
Automatic Speech Recognition can work in very noisy environments (e.g. in factories) if the microphone is suited to the noise conditions.
What type of microphone is best suited for voice recording?
ASR-friendly audio hardware exists. The best way to find a suitable microphone is to contact us so that different alternatives can be tested.
What is the average error rate of ASR technology?
The WER (Word Error Rate) of our ASR depends on the complexity of the grammar and the quality of the hardware. A 0% error rate is something we can achieve.
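For readers unfamiliar with the metric, WER is commonly computed as the number of word substitutions, deletions and insertions needed to turn the recognized sentence into the reference sentence, divided by the number of words in the reference. The sketch below shows that standard calculation; the `word_error_rate` function name is ours and it is not part of any product API.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (word-level Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

A perfectly recognized command gives a WER of 0, while recognizing "stop conveyor" for the reference "stop the conveyor" gives one deletion out of three words, so roughly 0.33.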