How to Make Subtitles Automatically Using Speech to Text
By James T Wood
Voice recognition software typically does a poor job of converting speech to text when a video contains multiple voices. However, if the video has only one clear voice, you may be able to use voice recognition software to create captions for it automatically. Reducing sound interference gives you the best chance of a good speech-to-text conversion, although you still need to proofread the subtitles at the end of the process to catch any egregious errors.
Plug one end of a 3.5 mm audio cable into the headphone jack on your computer, then plug the other end into the microphone jack on the same computer. This feeds the speaker output directly to the microphone input, so the speech-to-text application does not pick up ambient noise.
Launch your video player and open the video file you want to transcribe. Cue up the video to the point where the speaking starts. If you need to hear the video during this part, unplug the audio cable from the headphone jack on your computer.
Launch your speech-to-text application and a text editing program. Start the speech recognition engine. Plug the audio cable back in if you unplugged it in the previous step.
Start the video, then click in the text editing program's window so that the voice recognition software inserts the text from the video into the file.
Stop the voice recognition software when the video is done playing.
Proofread the text to ensure that the speech-to-text conversion was accurate. Correct any mistakes, then save the text file and close the text editor.
Close the video playback program, launch your video editing software and use its caption tool to open the text file you created. Fine-tune the subtitle placement within the video file so that the words printed on the screen match the words being spoken in the audio track.
Save the video file with the subtitles.
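If your caption tool can import a timed subtitle file instead of plain text, the timing you worked out while fine-tuning can be written down in SubRip (.srt) form. The article does not name a subtitle format, so SubRip is an assumption here, and the function names and the cue list are hypothetical. This is a minimal sketch in Python, not the method any particular editor uses.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, caption) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        # Each SRT entry is: a counter, a time range, the caption text.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Entries are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical cues: the times are the ones you settle on by ear.
    example = [(0.0, 2.5, "Hello"), (2.5, 4.0, "World")]
    print(to_srt(example))
```

The start and end times still come from you watching the video. The sketch only saves you from typing the timestamp format by hand.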
- If you're uploading the file to YouTube, skip the editing step. Instead, upload the text file directly to YouTube and align it with the video.
- YouTube has a speech-to-text engine that you can use to automatically create captions for uploaded videos. Proofreading is still necessary, because the engine can make errors.
- Windows has a bundled speech-to-text engine that you can access by clicking "Start," typing "Speech Recognition" and pressing "Enter."
- Speech-to-text accuracy improves with training, so if you've trained your computer to recognize your voice, it will transcribe videos of you speaking better than videos of someone else.
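Dictated text usually arrives as one long run of words, which is awkward to proofread and to place on screen. Before importing or uploading the transcript, it may help to break it into short, caption-sized lines. The sketch below uses Python's standard textwrap module. The function name is hypothetical, and the 42-character default is an assumed limit rather than a requirement of any tool mentioned here, so adjust it to suit your video.

```python
import textwrap


def split_into_captions(transcript: str, max_chars: int = 42) -> list[str]:
    """Break a proofread transcript into lines of at most max_chars."""
    # Collapse newlines and repeated spaces so wrapping is uniform.
    normalized = " ".join(transcript.split())
    # Wrap at spaces; hyphenated words such as "speech-to-text" stay whole
    # unless a single word is longer than the limit.
    return textwrap.wrap(normalized, width=max_chars, break_on_hyphens=False)


if __name__ == "__main__":
    sample = (
        "Reducing the sound interference gives you the best chance "
        "for getting good speech-to-text conversion."
    )
    for line in split_into_captions(sample):
        print(line)
```

Each line it returns can then be given a start and end time in your caption tool.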
James T Wood is a teacher, blogger and author. Since 2009 he has published two books and numerous articles, both online and in print. His work experience has spanned the computer world, from sales and support to training and repair. He is also an accomplished public speaker and PowerPoint presenter.