How to Build a Speech Recognition Application

by Contributor

Building speech recognition into your application can simplify text entry and let users control the program without a keyboard or mouse. Writing a speech recognition engine yourself is very difficult, but integrating an existing engine into your program is far more manageable, especially if you have programming experience.

Prepare Speech Recognition Software

Bundle your software with a speech recognition program, such as Dragon NaturallySpeaking or IBM ViaVoice. If you distribute your application commercially, give the user the option to buy the speech recognition program along with it. As part of your application's installation process, prompt the user to install the speech recognition program too.

Configure the speech recognition software. Your application can take full advantage of speech recognition only if the speech recognition program is configured correctly. In practice, this means the microphone and language settings must match the user's hardware and spoken language.
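
As a rough illustration of the configuration step, your application could check the settings it depends on before enabling speech input. The sketch below is only an assumption about how such a check might look: `SpeechSettings`, `SUPPORTED_LANGUAGES`, `check_settings` and the level thresholds are hypothetical names and values, not part of any real speech recognition product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpeechSettings:
    """Hypothetical settings record; a real engine exposes its own configuration."""
    microphone: str      # name of the selected input device, "" if none is selected
    language: str        # language tag, for example "en-US"
    input_level: float   # measured microphone level, from 0.0 to 1.0


# Illustrative list only; the real set depends on the engine you integrate.
SUPPORTED_LANGUAGES = {"en-US", "en-GB", "fr-FR", "de-DE"}


def check_settings(settings: SpeechSettings) -> List[str]:
    """Return a list of human-readable problems; an empty list means no problems."""
    problems = []
    if not settings.microphone:
        problems.append("No microphone selected.")
    if settings.language not in SUPPORTED_LANGUAGES:
        problems.append("Unsupported language: " + settings.language)
    # Assumed usable range; tune these thresholds for the actual hardware.
    if not 0.2 <= settings.input_level <= 0.9:
        problems.append("Microphone level is outside the usable range.")
    return problems
```

If `check_settings` returns any problems, the application can show them to the user and point to the speech recognition program's own setup screens rather than trying to fix the settings itself.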

Train the speech recognition program. Depending on the nature of your application, training may have to be done outside of it. In that case, the user can rely on the training programs and screens that most speech recognition programs include, or train the program by dictating into a word processor.

Integrate Text Entry

Build a text or rich-text control into your application. Many speech recognition programs will work with any other programs that have text-entry options. If all you require is text entry, your application probably won't need any modifications to work with a speech recognition program.

Include extra space in the text-entry control. Since speech recognition programs can recognize speech at a rate faster than many people can type, it may be necessary to increase the size of your text-entry controls. Allow enough space for text to be entered and reviewed in real time.

Interact via an API

Use an application programming interface (API) to interact with the speech recognition software. Many speech recognition programs include an API for other applications to use. Working through the API gives your application access to the full set of speech recognition features and lets the user control the whole application by voice.
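
One way to keep your application independent of any particular vendor is to hide the engine behind a small interface of your own. The sketch below assumes an engine that reports each recognized phrase through a callback. `SpeechEngine`, `ScriptedEngine`, `start_listening` and `stop_listening` are hypothetical names chosen for illustration and do not correspond to any real product's API.

```python
from abc import ABC, abstractmethod
from typing import Callable, Iterable


class SpeechEngine(ABC):
    """Hypothetical interface your application codes against."""

    @abstractmethod
    def start_listening(self, on_result: Callable[[str], None]) -> None:
        """Begin recognition and pass each recognized phrase to on_result."""

    @abstractmethod
    def stop_listening(self) -> None:
        """Stop delivering recognized phrases."""


class ScriptedEngine(SpeechEngine):
    """Stand-in engine that replays fixed phrases, useful for testing without a microphone."""

    def __init__(self, phrases: Iterable[str]) -> None:
        self._phrases = list(phrases)
        self._active = False

    def start_listening(self, on_result: Callable[[str], None]) -> None:
        self._active = True
        for phrase in self._phrases:
            if not self._active:
                break  # stop_listening was called from inside the callback
            on_result(phrase)

    def stop_listening(self) -> None:
        self._active = False
```

A wrapper around the vendor's actual API would implement the same two methods, so the rest of the application never needs to know which engine is installed.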

Integrate the API with your application. This can include creating more than one "mode" of speech control, such as a dictation mode and a command mode. Create command words, such as "save file" or "create new file." When entering text, users should also be able to make corrections and apply rich-text formatting, such as boldface, italics, underlining and other font changes, without touching the keyboard.
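
The idea of modes and command words can be sketched as a small dispatcher that receives recognized phrases as plain strings. This is a minimal sketch under stated assumptions: `SpeechController`, the phrases "command mode" and "dictation mode", and the logged actions are all hypothetical, and a real integration would call your application's own save and new-file routines.

```python
from typing import Callable, Dict, List


class SpeechController:
    """Routes recognized phrases either to commands or to the text buffer."""

    def __init__(self) -> None:
        self.mode = "dictation"        # the other mode is "command"
        self.text: List[str] = []      # phrases dictated so far
        self.log: List[str] = []       # record of commands that ran
        # Command words mapped to actions; the actions here only log.
        self.commands: Dict[str, Callable[[], None]] = {
            "save file": self._save_file,
            "create new file": self._new_file,
        }

    def _save_file(self) -> None:
        self.log.append("save")

    def _new_file(self) -> None:
        self.text.clear()
        self.log.append("new")

    def handle(self, phrase: str) -> None:
        """Process one recognized phrase according to the current mode."""
        spoken = phrase.strip().lower()
        # Mode switches are recognized in either mode.
        if spoken == "command mode":
            self.mode = "command"
        elif spoken == "dictation mode":
            self.mode = "dictation"
        elif self.mode == "command":
            action = self.commands.get(spoken)
            if action is not None:
                action()
            else:
                self.log.append("unrecognized: " + spoken)
        else:
            # In dictation mode every other phrase is treated as text.
            self.text.append(phrase.strip())
```

Keeping dictation and commands in separate modes means that a user who dictates the words "save file" in a document does not trigger a save by accident.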

Tip

  • Contact the company that makes the speech recognition software and ask whether there is an API available. This isn't a typical add-on with speech recognition software, but if you tell the customer-support tech that you are trying to build a speech recognition application, you will most likely be able to secure the API.
