Build a custom voice command in 5 easy steps

Written by Nathan Janezcko

Nathan Janezcko is Technical Project Manager at Vivoka. He handles product development as well as customer projects with the R&D and technical teams.

1) Start by creating your new project with VDK

Choosing the type of project you want to build

This tutorial starts with the creation of the project. In the project wizard, create a custom application, which will allow you to use the Grammar Editor.

Setting the voice project information and languages

When creating the project, you will be asked for three pieces of information: a name, a directory, and the languages you need.

More than 50 languages are available with the Voice Development Kit. You can even create multilingual voice assistants.

Define the technologies to be developed and integrated (ASR in order to customize voice command grammar)

In this step, you choose among the four available technologies: Wake up Word (WUW), Automatic Speech Recognition (ASR), Natural Language Understanding (NLU), and Text to Speech (TTS).

For this tutorial, we need ASR in order to work on the grammar.

2) Discovering Speech Recognition Grammar Edition Plugin

We now move on to editing the grammar for your voice command. This takes five steps:

Click on the file to open the Grammar Editor
Select the language of your grammar (mainly relevant if you work on a multilingual grammar)
Write your customized grammar here
Save and Compile to use your grammar
Click on “Test” and try out your grammar with your microphone or with pre-recorded audio files

Some useful elements for writing a grammar:

Every line should end with a semicolon ;

<x> -> Rule name

[x] -> Optional value

| -> “or”

!function(X, Y)

Useful functions:

!tag(X, Y); Tags the words (Y) with the label (X), which makes the result much easier to interpret
!repeat(X, Y, Z); Repeats the words (X) at least Y times and up to Z times
!pronounce "X" PRONAS "Y"; Replaces the pronunciation of a word (X) with the pronunciation of another word (Y). You can append | PRONAS "Z" to add further pronunciations.
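To build some intuition for these operators, here is a small Python sketch that models a grammar fragment as the set of sentences it accepts. It is only an analogy and not how the VDK compiles or evaluates grammars; the helper names `seq`, `optional`, `either` and `repeat` are invented for this illustration, and `repeat` needs a finite upper bound, so it cannot express `*`.

```python
from itertools import product


def seq(*parts):
    """Sequence: pick one alternative from each part, in order."""
    return {" ".join(w for w in combo if w) for combo in product(*parts)}


def optional(part):
    """[x]: the fragment may be said or skipped."""
    return part | {""}


def either(*parts):
    """x | y: any one of the alternatives."""
    return set().union(*parts)


def repeat(part, low, high):
    """!repeat(x, low, high): x said between low and high times (finite bound only)."""
    out = set()
    for n in range(low, high + 1):
        out |= seq(*([part] * n))
    return out


# Toy equivalent of: I want a [pizza] (margherita | calzone)
sentences = seq({"I want a"}, optional({"pizza"}), either({"margherita"}, {"calzone"}))
print(sorted(sentences))
```

Enumerating sentences like this only works for tiny grammars, but it makes it easy to see how quickly optional parts and alternatives multiply the number of accepted phrases.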

3) Speech Recognition Grammar Edition: How-to

<main>: I want a pizza;

Now your assistant is able to detect an order for a pizza.

Let’s make this assistant smarter.

<main>: I want a [pizza] !tag(PIZZA_TYPE, <pizza>);

<pizza>: margherita | prosciutto e funghi | capricciosa | vegetariana | calzone;

And now my assistant allows me to select a type of pizza from a predefined list.
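To see what the tag buys us, the rule above can be mimicked with a Python regular expression in which a named group plays the role of !tag. This is a rough stand-in, not the VDK's recognition engine: the `parse` function and the idea of receiving the recognized sentence as plain text are assumptions made for the example.

```python
import re

PIZZAS = ["margherita", "prosciutto e funghi", "capricciosa", "vegetariana", "calzone"]

# Rough regex counterpart of: <main>: I want a [pizza] !tag(PIZZA_TYPE, <pizza>);
PATTERN = re.compile(
    r"I want a (?:pizza )?(?P<PIZZA_TYPE>" + "|".join(map(re.escape, PIZZAS)) + r")",
    re.IGNORECASE,
)


def parse(sentence):
    """Return the tagged values as a dict, or None if the sentence is not covered."""
    match = PATTERN.fullmatch(sentence.strip())
    return match.groupdict() if match else None


print(parse("I want a pizza calzone"))
print(parse("I want a burger"))
```

The application only has to read the `PIZZA_TYPE` entry instead of searching the whole sentence for a pizza name.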

To improve the detection of the pizza name whether it is pronounced the American way or the Italian way, we add the !pronounce function:

!pronounce "capricciosa" PRONAS "caprikiosa" | PRONAS "caprichioza";

This allows us to declare additional expected pronunciations for a word.

We can improve our grammar even further to allow different action verbs and the possibility of ordering several pizzas:

<main>: <verb> (a | <number>) [pizza] <pizza>;

<verb>: I (want | would like);

<number>: !tag(NUMBER, 1 | 2 | 3);

<pizza>: [pizza] !tag(PIZZA_TYPE, margherita | prosciutto e funghi | capricciosa | vegetariana | calzone);

We also moved the !tag function into the <pizza> rule to keep the <main> rule readable.

Let’s implement the !repeat function so our assistant can take the order for a whole group of people at once:

<main>: <verb> !repeat((a | <number>) <pizza> [and [<verb>]], 1, *);

The !repeat function allows the repetition of a segment of our sentence.

As you can see, we added [and [<verb>]] at the end of the repeated segment. This part is entirely optional but allows a more natural interaction with the assistant.

Examples of valid sentences:

I want a margherita
I would like a pizza capricciosa and 2 vegetariana
I want a margherita, 2 capricciosa and I would like a calzone
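As a sanity check, the example sentences can be run against a regular expression that approximates the !repeat rule. This is a hand-written approximation for illustration and not something the VDK produces; the `accepts` helper and the normalization step (lowercasing, dropping commas and periods) are assumptions.

```python
import re

VERB = r"i (?:want|would like)"
PIZZA = r"(?:pizza )?(?:margherita|prosciutto e funghi|capricciosa|vegetariana|calzone)"
ITEM = rf"(?:a|1|2|3) {PIZZA}"          # (a | <number>) <pizza>
LINK = rf"(?: and(?: {VERB})?)?"        # [and [<verb>]]
ORDER = re.compile(rf"{VERB}(?: {ITEM}{LINK})+")  # <verb> !repeat(..., 1, *)


def accepts(sentence):
    """True if the normalized sentence is covered by the approximated grammar."""
    normalized = re.sub(r"[,.]", "", sentence.lower())
    normalized = " ".join(normalized.split())
    return ORDER.fullmatch(normalized) is not None


for text in [
    "I want a margherita",
    "I would like a pizza capricciosa and 2 vegetariana",
    "I want a margherita, 2 capricciosa and I would like a calzone",
    "I want pizza",
]:
    print(text, "->", accepts(text))
```

The last sentence is rejected because it has neither "a" nor a number before the pizza, which is exactly what the grammar requires.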

With all that food, some drinks would be appreciated:

<main>: <verb> !repeat((a | <number>) (<pizza> | <drinks>) [and [<verb>]], 1, *);

<drinks>: [!tag(DRINK_FORMAT, glass | bottle) of] !tag(DRINK_TYPE, water | coca | wine | beer);

And finally, let’s add another action to make our assistant more complete: requesting the time remaining before the order is ready.

To do so, I create a new rule <time_left> and move the content of <main> to a new rule <order>.

The finished and customized voice command grammar

#BNF+EM V2.1;

!grammar ASRENG-US;

!start <main>;

!pronounce "capricciosa" PRONAS "caprikiosa" | PRONAS "caprichioza";

<main>: <order> | <time_left>;

<time_left>: How (much | many) time (left | remaining) [for (our | my) order];

<order>: <verb> !repeat((a | <number>) (<pizza> | <drinks>) [(and | with) [<verb>]], 1, *);

<verb>: I (want | would like);

<number>: !tag(NUMBER, 1 | 2 | 3);

<pizza>: [pizza] !tag(PIZZA_TYPE, margherita | prosciutto e funghi | capricciosa | vegetariana | calzone);

<drinks>: [!tag(DRINK_FORMAT, glass | bottle) of] !tag(DRINK_TYPE, water | coca | wine | beer);
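Before compiling a grammar of this size, a quick check for the two most common slips (a forgotten semicolon, or a rule that is referenced but never defined) can save a round trip. The sketch below is a hypothetical helper, not part of the VDK, and it assumes one statement per line as in the grammar above.

```python
import re


def lint(grammar):
    """Return a list of problems found in a grammar written one statement per line."""
    problems = []
    for line in grammar.splitlines():
        line = line.strip()
        if line and not line.endswith(";"):
            problems.append(f"missing semicolon: {line}")
    # A rule is defined where its name starts a line and is followed by a colon.
    defined = set(re.findall(r"^\s*<(\w+)>\s*:", grammar, flags=re.MULTILINE))
    used = set(re.findall(r"<(\w+)>", grammar))
    for name in sorted(used - defined):
        problems.append(f"undefined rule: <{name}>")
    return problems


sample = "<main>: <verb> <pizza>;\n<verb>: I want\n"
for problem in lint(sample):
    print(problem)
```

The Grammar Editor's own Save and Compile step remains the real validation; this only catches obvious mistakes earlier.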

Now let’s make an order:

“I want a capricciosa and a bottle of water.”

Since we added tags, the result is really easy to interpret.
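As an illustration of that interpretation step, the sketch below turns a sequence of tags into order lines. The shape of the input is an assumption: this article does not show the actual VDK result format, so the list of (tag, value) pairs in spoken order, the `OrderLine` class, and `build_order` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrderLine:
    item: str
    quantity: int = 1
    drink_format: Optional[str] = None


def build_order(tags):
    """Group (tag, value) pairs, given in spoken order, into order lines."""
    lines = []
    quantity, drink_format = 1, None
    for name, value in tags:
        if name == "NUMBER":
            quantity = int(value)
        elif name == "DRINK_FORMAT":
            drink_format = value
        elif name in ("PIZZA_TYPE", "DRINK_TYPE"):
            # An item tag closes the current line; reset for the next one.
            lines.append(OrderLine(value, quantity, drink_format))
            quantity, drink_format = 1, None
    return lines


# Assumed tags for: "I want a capricciosa and a bottle of water."
tags = [("PIZZA_TYPE", "capricciosa"), ("DRINK_FORMAT", "bottle"), ("DRINK_TYPE", "water")]
for line in build_order(tags):
    print(line)
```

Note that "a" is not tagged in the grammar, only 1, 2 and 3 are, so the sketch falls back to a quantity of 1 when no NUMBER tag precedes an item.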

This was a short example showing one way to approach the design of a grammar. We recommend working iteratively, as it usually lets you cover a large number of use cases very easily.

Thank you for reading this article. We hope it will help you find cool ideas for your projects and have fun building grammars!
