Will AI eventually replace human interpreters?
Professional conference interpreters who have been in the market long enough are probably asked – or ask themselves – the question above more than any other. I know I do, along with a few variations of it:
Could the NLP and Deep Learning technologies that have so rapidly changed the written translation landscape be applied to simultaneous interpreting? Will a machine eventually put us, human interpreters, out of a job? Is this the end of human-powered conference interpreting?
My answer is always the same: given time, nothing is impossible, and the pace of technology proves naysayers wrong every day. More: with the advent of remote simultaneous interpretation (a.k.a. RSI), a massive amount of data is accumulating around oral speech segments and their equivalents in multiple other languages. This is already fueling research and experiments that will almost certainly upend our assumptions about what is possible and compress the timeline we tend to associate with innovation.
So far, the few attempts at automating spoken translation have involved a multi-step, linear approach that goes more or less like this: speech to text > text to text > text to speech. AI interpreting using this technology is already possible today. In fact, it is available in your pocket as we speak. Don’t believe me? Here’s a simple experiment that will change your mind:
Pull out your smartphone and select any random language pair on Google Translate. Now click on the mic icon and say a simple sentence in your source language. Watch as the application transcribes your words and gives you the equivalent in the target language. Click on the volume icon and you will hear that phrase read back to you in the other language. Meet your first AI interpreter.
With a few simple lines of code, the steps above can easily be automated to give the impression that it is going from speech to speech. But it is still all smoke and mirrors.
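The chaining described above can be sketched in a few lines of Python. The three stage functions below are placeholders standing in for real services (a speech-recognition API, a machine-translation API, and a text-to-speech API); the toy dictionary is purely illustrative. The point is the architecture: the whole "AI interpreter" is nothing more than three calls wired in sequence.

```python
def speech_to_text(audio: bytes) -> str:
    # Placeholder: a real system would call a speech-recognition service here.
    return audio.decode("utf-8")

def translate_text(text: str, source: str, target: str) -> str:
    # Placeholder: a toy word-for-word lookup standing in for machine translation.
    toy_dictionary = {("en", "fr"): {"hello": "bonjour", "world": "monde"}}
    table = toy_dictionary.get((source, target), {})
    return " ".join(table.get(word, word) for word in text.split())

def text_to_speech(text: str) -> bytes:
    # Placeholder: a real system would call a speech-synthesis service here.
    return text.encode("utf-8")

def interpret(audio: bytes, source: str, target: str) -> bytes:
    # The entire "interpreter" is three stages chained linearly:
    # speech -> text -> text -> speech.
    text = speech_to_text(audio)
    translated = translate_text(text, source, target)
    return text_to_speech(translated)

print(interpret(b"hello world", "en", "fr"))  # b'bonjour monde'
```

Note what the sketch makes obvious: each stage waits for the previous one to finish, so the pipeline can only start translating once a segment is fully transcribed. That is exactly the limitation the next paragraphs describe.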
This is not how real interpreters work. And this is why we will continue to rely on the talented women and men who have the ability to listen, reformulate, and render complex segments of meaning in a different language, a specialized skill that AI cannot yet rival, for a number of reasons.
The figures vary, but a great deal of what is meant by a speaker in a live speech will come from body language and other subtle cues such as intonation, with alliteration, euphemism, and irony often playing a role in communicating what a speaker intends to say. A trained human interpreter will read the audience and gather info from the thinnest slices of meaning in less time than it takes to blink an eye.
Real-time language interpretation involves more than just replacing words or looking for equivalent boxes on a shelf. Rather, it is a constant exercise in decision-making with additional layers of fact-checking and anticipation in an ever-moving flow. The interpreter can’t afford to wait for the sentence to be complete and then translate. She must render ideas as they are being shaped while accommodating half-baked sentences, speech artifacts, and occasional backtracking. Try that, Google!
Granted, machines learn fast. And they are attentive observers of our every behavior. The day will likely come when AI will push some of us out of the booth. But don’t hold your breath. And rather than dread it, we should welcome AI and put it to work as an aid to interpreters.
That’s exactly what Claudio Fantinuoli, KUDO’s Head of Innovation and a top researcher in interpreting, is doing. “AI-based tools are helping many professionals – in all kinds of fields – do a better job. In interpreting, where excellence in communication is paramount, the community should now shift the focus from the fascinating question of machines replicating the task of interpreters to a more pragmatic reflection: how can machines help interpreters excel in their job?” says Claudio.
In the next few weeks, and in the months to come, interpreters working on the KUDO platform will be introduced to a wide range of AI features in our new Interpreter Assist suite. Here’s what you can expect to see:
- NOW: AI-driven glossary-building and term extraction based on the preparation materials and select webpages. Think time savings!
- FUTURE: A popup window with terminological suggestions based on your glossaries and even your own word choices in a meeting. Think virtual boothmate!
- FUTURE: Instant unit conversions so you don’t have to do the math. Think peace of mind!
- FUTURE: Detailed post-meeting stats on aspects such as speech speed, repetitive expressions, filler words, voice pitch. Think interpreter coaching!
And that is just the beginning.
If AI is to eventually do what we do, it will take time, patience and very forgiving settings. Until such time, let us make sure AI works for us and not as us. Remember: it is not the end of interpreting, just the end of interpreting as we know it.