Is there any danger of losing my clients to voice recognition technology?
Will transcriptionists be replaced by computers doing the same work?
Can a software program really detect all the nuances of natural speech and produce a quality transcript?
These can be daunting questions to those whose livelihood depends on podcasters, YouTubers, corporations, law enforcement, etc. NOT using computer-aided transcription.
No Quick Fixes
Who doesn’t like a quick fix and instant results? But is voice recognition a true solution?
Voice recognition technology is problematic for a number of reasons. And you don’t have to take my word for it. A quick search through a forum like KnowBrainer will convince anyone that voice recognition software is by no means the be-all and end-all for converting speech to text.
Excellence takes time. If it’s worth transcribing at all, it’s worth transcribing (and then proofreading!) well.
When You Need the Real Thing
A real, live transcriptionist should be used when:
- The audio includes multiple voices
- Accuracy is a priority
- The transcript requires special formatting (e.g., client preferences on spacing and tabbing, parentheticals for witnesses leaving the courtroom)
- The transcript requires redaction (e.g., Social Security numbers)
- The speaker mumbles, stutters, or has a strong accent
- The speaker continues his/her thoughts in long run-on sentences (And we know that never happens 🙂 )
- You don’t want to spend the money for the software
- You’re not willing or able to dedicate the time to continuously train the software
- There are interruptions (e.g., production talk, cross talk, coughing, dropping a lapel mike)
- Names, places, or technical terms need to be researched
I recently transcribed a difficult but very interesting interview of an elderly man who had helped design the Hubble telescope. No way could a computer program have produced an acceptable transcript in that case.
Yes, there are times when voice recognition software could be useful. First, let’s look at a fictitious client.
For instance, a medical researcher has a huge backlog of podcasts and just wants to get them all transcribed and posted to his site quickly for SEO purposes. He can try that, but he might get a few readers landing on his site looking for some “jeans” instead of “genes” (or a myriad of other homophones that technology can miss).
A second example would be a transcriptionist using the software. She is recovering from carpal tunnel surgery and needs to keep working, so she dictates the audio with her own voice as she listens to the audio file, and then just types as needed during proofreading.
And remember, these are exceptions, and they both involve only one speaker. You also must take into account:
- the cost of the software
- the value of your time in training the software to recognize your voice
- a real compromise on accuracy (Proofreading/editing will still be necessary.)
I can see how this might work for something like transcribing voice mails or notes just for internal use: documents that don’t have to be perfect. But if those files contain errors that are a headache to decipher later on, has the client really saved money?
If you’re using the software as a transcriptionist, unless you have spent a long time training the software, I would argue that it would be easier to type it fresh than to go back and correct a lot of errors.
So don’t change career paths just yet. As long as you consistently provide your clients with excellent work, there is very little danger of their choosing a software program over you.
And if you have any doubt that there is still a demand for manual transcription, please read Janet Shaughnessy’s post, Is There a Demand for General Transcription? (Holy Cow — YES!) at the TranscribeAnywhere blog.
Lastly, if voice recognition technology were equivalent to a live human being, wouldn’t we all be using that instead of enduring the agony of manual transcription? 🙂