Progress is the ultimate goal; this maxim has gone relatively unchallenged for decades. No matter your industry, there’s always a drive to do things better. The divisive part is how this progress is made. With Artificial Intelligence (AI) becoming more and more capable of performing tasks once done by humans (think self-driving cars, factory work, etc.), the future of work is uncertain for some industries. But how good is AI when it comes to interpretation for real-time communication? Will AI ever be able to do the job? How does it hold up against professional human language interpreters? In 2011, inventor, author, and visionary Ray Kurzweil predicted that computers wouldn’t be able to tackle spoken language translation (or interpretation) as well as humans until 2029. Are we still on track?
For this piece, we called on four experts in language interpretation, engineering, and AI to provide insight into the future of AI and language interpretation.
Let’s start with the basics.
What is AI?
AI, or Artificial Intelligence, is a relatively new development. Coined in 1955 by computer scientist John McCarthy, the term AI, at its essence, refers to a computer simulation of human intelligence (forbes.com). Today, AI is often built into computers and machines to perform human tasks, with the hope of giving these devices the capacity to make calculated decisions and solve problems.
What is language interpretation?
As long as there has been language, there’s been a need for someone to mediate communication between speakers of two different tongues. Language interpretation is precisely that: the spoken or signed rendering of a message from one language into another. It’s sometimes referred to as one of humanity’s oldest occupations, with the earliest written accounts of language interpreters dating back to 3000 BCE. Over time, language interpretation evolved from a service exclusive to kings and queens to one regularly used within governmental and intergovernmental organizations. Live simultaneous interpretation is a type of interpretation where interpreters translate what a speaker is saying in real time, with almost no delay. More recent advancements in tech from companies like KUDO have taken this exclusive service and made it accessible to anyone, anywhere.
Is AI the best option for live interpretation?
When it comes to AI performing live interpretation, the question on everyone’s mind is: will it be good enough? According to KUDO VP of Communications and all-around interpretation savant Barry Slaughter Olsen, the answer depends on the level of accuracy you’re looking for and how you define good enough:
“Good enough depends on the end-user. Is it a teenager needing to follow a video game? Accuracy may not be the most important. Is it a parliamentary meeting? Then accuracy is essential. Regarding accuracy, AI will not touch the bar compared to human interpretation.”
What are some of the most prominent language challenges that AI can’t handle?
AI would be excellent if interpretation were just about swapping a word in one language for a word in another. But communication is about more than words; it’s about connection. When you think of what an AI voice sounds like, a polite, slightly monotone Alexa or Siri voice may come to mind, which is entirely understandable and pleasant enough. But ask yourself one question: how long would you be willing to listen to an artificial voice? 20 minutes? 1 hour? How long until you lose interest, or the monotonous voice lulls you to sleep? Perhaps an even more critical question: can you afford to lose people’s attention?
Currently, everything but the language itself is a challenge for AI, and for now, nothing compares to human interpreters at conveying everything that lies beyond the words. However, according to KUDO’s Head of Innovation and AI, Claudio Fantinuoli, AI is catching up: “AI is getting better, the voices are getting better, sometimes you can hardly tell the difference. But because it’s AI, and it lacks the ability to understand, there is a strangeness when factoring in humor, emotion, nuance, intonation, irony, sarcasm, accents, and other human nuances.”
Will AI ever efficiently do the job of a human interpreter? If so, how long until it does?
As for how long it will take before AI and human interpreters are comparable, it’s hard to say, but it’s coming. For Parham Akhavan, KUDO cofounder, former CTO, and Advisor, the road to precise and reliable AI interpretation is winding: “Progress happens in stages. It is coming; it’s inevitable. It’s naïve to say that it won’t. The thing about computers and AI is that things are not linear.” One of the biggest hurdles (besides AI’s lack of understanding) is computing power, as real-time translation takes a huge amount of processing. As it turns out, the road to a fully comprehensive and dynamic AI for interpretation is expensive.
“Video needs to be there for this progress to happen,” says Ewandro, former Chief Interpreter in the UN system and KUDO Chief Language Officer. “So much of our communication happens through body language. Some things are only comprehensive contextually via video.” Think about the range of emotions that can be read from just a facial expression. Pair that with humor or wit; without a human interpreter, it’s the perfect recipe for misunderstanding.
Until AI is “good enough,” Barry believes our perceptions of AI-driven and human-driven language services will also continue to develop: “If I get on a meeting about a sensitive topic, and I see that interpretation is being handled using AI, what am I going to think? Is there going to be a value judgment based on the type of interpretation chosen?”
Will clients accept AI interpretation?
For a moment, think about visiting a website in a foreign language that has information you want to access. To translate it, you copy and paste the text into Google Translate. Depending on how badly you want that information, and despite the grammatical errors and sometimes nonsensical equivalents, that translation still gives you a pretty good idea of what is being communicated. This is the degree to which Ewandro predicts AI interpretation will be accepted: “It depends on their level of usage and how much tolerance there is for mistakes.”
For now, there is still no comparison to human interpreters when it comes to simultaneous language interpretation; there is just too much nuance in how we communicate as humans. AI may eventually be able to interpret language accurately, accounting for the subtleties alluded to above. But we are just not there yet.
Are we still on pace for Kurzweil’s vision of AI delivering human-level interpretation in the next seven years? It’s hard to say. Until then, Ewandro’s sentiments mirror Kurzweil’s in that we shouldn’t be resisting new tech (including AI) but instead using these advancements to do more: “Let’s use AI to augment interpretation with tools like virtual boothmates and terminology assurance. I see a future where interpreters and AI work together.”