Automatic Speech Recognition

Create complex vocabulary voice commands with our embedded grammar-based and Private by Design Automatic Speech Recognition technology.


What is Automatic Speech Recognition (ASR)?

Automatic Speech Recognition is a technology used to create reliable voice commands. Closely related to speech-to-text, it goes a step further by understanding specific intents thanks to its grammar-based design.

Complex Vocabulary

Grammar creation lets you build a corpus of complex, field-specific words and expressions that can be recognized with high accuracy.
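To make the grammar-based idea concrete, here is a minimal Python sketch. It assumes a toy pattern syntax with <slot> placeholders; the pattern format, intent names and vocabulary are all hypothetical and are not the Voice Development Kit's actual grammar format. Plain text stands in for recognized speech.

```python
import re

# Hypothetical toy grammar: one pattern per intent, with <slot> placeholders.
# Illustration only; this is not the Voice Development Kit's grammar format.
GRAMMAR = {
    "pick_item": "pick <quantity> <item> from aisle <aisle>",
    "report_fault": "report fault on <machine>",
}

# Closed vocabulary allowed for each slot (assumed field-specific terms).
SLOT_VALUES = {
    "quantity": ["one", "two", "three"],
    "item": ["torque wrench", "hex bolt"],
    "aisle": ["a", "b", "c"],
    "machine": ["conveyor", "press brake"],
}


def compile_rule(pattern):
    """Turn a slot pattern into a regular expression with named groups."""
    def slot_to_group(match):
        name = match.group(1)
        alternatives = "|".join(re.escape(v) for v in SLOT_VALUES[name])
        return f"(?P<{name}>{alternatives})"
    return re.compile(re.sub(r"<(\w+)>", slot_to_group, pattern))


RULES = {intent: compile_rule(p) for intent, p in GRAMMAR.items()}


def parse(utterance):
    """Return (intent, slots) for an in-grammar utterance, else None."""
    text = " ".join(utterance.lower().split())
    for intent, rule in RULES.items():
        m = rule.fullmatch(text)
        if m:
            return intent, m.groupdict()
    return None


print(parse("Pick two hex bolt from aisle B"))
print(parse("open the door"))
```

Because only utterances covered by the grammar are accepted, anything outside it is rejected rather than guessed, which is the trade-off a grammar-based design makes in exchange for accuracy on a known vocabulary.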

Phonetic Personalization

Phonetic design is the logical extension of grammar creation: it lets you tailor how words are recognized, which improves accuracy on complex vocabulary such as names.

Some use cases made real with Automatic Speech Recognition

Creating intuitive and innovative voice commands is the foundation of voice interactions. Automatic Speech Recognition offers endless possibilities to reinvent processes and user experiences with added value and ease of use.

Complex Names Understanding

Create a corpus of specific names and complex words to be recognized by the Automatic Speech Recognition engine.

Accurate Digits Recognition

Grammar creation lets you tailor the way digits are recognized by the ASR engine, for instance to process order references, phone numbers or licence plates.
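As an illustration of a digit grammar, the Python sketch below converts a sequence of spoken digit words into a digit string and rejects anything outside the vocabulary. The English word list, the "double"/"triple" shortcuts and the function name are assumptions made for this example, not part of the product.

```python
# Assumed English digit vocabulary, including the spoken variant "oh" for zero.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}

# Spoken shortcuts such as "double two" for "22" (assumed, for illustration).
REPEATS = {"double": 2, "triple": 3}


def digits_from_words(utterance, expected_length=None):
    """Convert spoken digit words to a digit string, or None if out of grammar."""
    digits = []
    repeat = 1
    for word in utterance.lower().split():
        if word in REPEATS:
            repeat = REPEATS[word]
        elif word in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[word] * repeat)
            repeat = 1
        else:
            return None  # word is not part of the digit grammar
    result = "".join(digits)
    if not result or repeat != 1:
        return None  # empty input, or a dangling "double"/"triple"
    if expected_length is not None and len(result) != expected_length:
        return None  # e.g. an order reference with a fixed number of digits
    return result


print(digits_from_words("four double two oh nine"))
```

Constraining the expected length is one simple way to tailor the grammar to a given reference format.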

Reliable Voice Commands

ASR lets you use voice commands that users can truly rely on for action-related tasks (supply chain, industry, maintenance, reporting…).

A technology available inside the Voice Development Kit

1. Create your grammar file

Start by using the Grammar Editor widget in the Voice Development Kit to create your own corpus of vocabulary and voice commands.

2. Test and optimize with phonetics

Try your voice commands directly in the widget, choosing the right language and adjusting recognition with phonetic modifications.

3. Export your solution and integrate it

Once the results meet your expectations, export your file and integrate it into your device.

Language coverage

Our voice technologies are available in more than 40 languages to voice-enable your products and services wherever you need them to be deployed in the world. Here are some of them.

English (US)

English (UK)

Benefits of our Automatic Speech Recognition technology

Multilingual Speech Recognition

Our ASR supports 36 languages, covering the most commonly spoken ones and their dialects, so you can scale your solution worldwide.

Grammar-based Technology

This design lets you define specific words, business jargon and technical vocabulary, while specializing speech recognition for predefined use cases.

Low Footprint Embedded Solution

Depending on your hardware’s specifications, the integration of our Automatic Speech Recognition can be done seamlessly thanks to lightweight grammar-based models.

Privacy by Design

Being embedded, our solution is private by design: voice data is not transferred when using the Automatic Speech Recognition engine.

Start building your voice solution now!

Get access to the Voice Development Kit to begin the creation of your enterprise-grade voice solutions.

Please note that only businesses and organisations are able to use our technology; individual use is not yet allowed.

Thank you for your understanding.

Your project has never been so close to its solution!

Browsing through our projects and technologies might have given you some insight into what you can achieve by working with us. We can help you further to reach your goals.

Standard ports and Tools

  • Android (version 6.0 API 23)
  • Linux: x86_64, armv7hf, armv8
  • Windows: x86_64


Functionality code size

  • Basic command & control (C&C) application: 3.2MB
  • Full functionality, largest acoustic models: 9.5MB

Components and relative data size per language

  • Acoustic models, per language
    • Gen 4 compact: 900kB
    • Gen 5: approx. 4MB
    • Gen 6: approx. 6MB
  • GLIC – mono-lingual – General purpose transcriptions: 300-7300kB
  • GLC – multi-lingual – Music collection compilation: 700-3000kB


Components and relative data size per language and total RAM usage

  • Digit Recognition: 4kB / 1.25MB
  • Basic C&C application 100/10,000 commands: 10-500kB / 1.3-1.8MB
  • Telephony (voice-activated dialing) with grammars + SLMs, including NLU. 1350 contacts: 0.52MB / 12.6MB
  • 1-shot voice destination entry POI & addresses (UDE) all USA, FST based, including NLU: 300MB / 56MB
  • Embedded dictation: 100MB / 100MB
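The figures above can be used for a quick feasibility check. The Python sketch below reads each entry as "data size / total RAM usage" (an interpretation based on the heading), takes the upper bound where a range is given, and assumes 1MB = 1000kB. The function name and the example budget are hypothetical, and the check leaves out code size and acoustic models, which are listed separately.

```python
# (data size in MB, total RAM in MB), transcribed from the list above.
# Upper bounds are used where the source gives a range; 1MB = 1000kB is assumed.
USE_CASES = {
    "Digit recognition": (0.004, 1.25),
    "Basic C&C (up to 10,000 commands)": (0.5, 1.8),
    "Telephony, 1350 contacts": (0.52, 12.6),
    "One-shot destination entry (all USA)": (300, 56),
    "Embedded dictation": (100, 100),
}


def fitting_use_cases(storage_mb, ram_mb):
    """List the use cases whose data size and RAM usage fit the given budget."""
    return [
        name
        for name, (data_mb, needed_ram_mb) in USE_CASES.items()
        if data_mb <= storage_mb and needed_ram_mb <= ram_mb
    ]


# Hypothetical device budget: 64MB of free storage and 16MB of free RAM.
for name in fitting_use_cases(storage_mb=64, ram_mb=16):
    print(name)
```

This is only a rough first filter; actual requirements depend on the target hardware and the chosen models.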


Tech Requirements

Our ASR solution is an embedded technology that is made to be integrated into devices. To do so, these products need to meet specific criteria to handle the technology and make it work properly to perform your use case.

Frequently Asked Questions on Automatic Speech Recognition

A few things to know…

Automatic Speech Recognition can be tricky since it is a complex technology. We cover some of the recurring questions about it to give you insights.

Can ASR understand spelled letters and numbers?

Our ASR can indeed identify individual letters and numbers when they are spelled out, for instance a licence plate or a customer reference.

Is Automatic Speech Recognition able to recognize specific vocabulary?

Our ASR can understand very specific vocabulary thanks to the creation of specialized grammars.

What are the technical specifications for integrating ASR?

ASR specifications are essential for its integration. To get access to this information, please contact us.

Can Automatic Speech Recognition work in noisy environments?

Automatic Speech Recognition can work in very noisy environments (e.g. in factories) if the microphone is adapted to the noise conditions.

What type of microphone is best suited for voice recording?

ASR-friendly audio hardware exists. The best way to find a suitable microphone is to contact us in order to test different alternatives.

What is the average error rate of ASR technology?

The WER (Word Error Rate) of our ASR depends on grammar complexity and hardware quality. With a well-scoped grammar and suitable hardware, a 0% error rate is achievable.
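For reference, WER is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The Python sketch below implements that standard definition with a word-level edit distance; the function name and sample sentences are illustrative and are not taken from our tooling.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of words in the reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("call john smith", "call john smith"))
print(word_error_rate("order reference four two", "order four two"))
```

Note that WER can exceed 100% when the hypothesis contains many inserted words, since the denominator is the reference length only.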

More technologies to discover…

Audio Front End

Speech signal evaluation and filters to improve audio quality for voice-related solutions


Automatic generation of multilingual natural voices that runs offline on the edge

Wake-up Word

Easy-to-use tool to generate multilingual wake-up words to voice-activate any device

Voice Biometry

Authenticate or identify users with offline text (in)dependent voice biometrics models