KUDO AI Undergoes Major Upgrade with the Integration of Large Language Models & Generative AI

(Tuesday, 15 August 2023. New York City, NY) – Following the granting of KUDO’s Patent for Systems and Methods for Automatic Speech Translation, the team has rolled out further enhancements to the quality of its AI Speech Translator through the integration of generative AI and large language models (LLMs). This represents a significant upgrade in KUDO’s quality-first approach to AI language technology.

Traditionally, machine translation has been limited by its intrinsically shallow approach to the text, resulting in direct and contextually unaware translations. The result was often a translation “detached” from the real communicative event, with sentences that followed the original word by word and therefore sounded less natural, or even made no sense, in the target language. Furthermore, direct translations tend to be longer than the original, which is detrimental in real-time simultaneous translation, where time constraints are paramount. By leveraging the power of generative AI and LLMs, we aim to overcome these issues and provide more compact, fluent and contextually aware translations.

Large language models, such as GPT-4 or BLOOM, are trained on extensive text corpora, allowing them to manipulate and generate text in a very natural way. By utilizing an LLM, KUDO AI can now delve a step further into the deeper semantic context of the conversation. This approach enables a more contextually accurate translation, achieving a level of fluency and naturalness previously unattainable.

To achieve this goal, our R&D and engineering team invested considerable effort in integrating an LLM into our speech translation pipeline, with the primary goal of making KUDO AI produce a more fluent and coherent translation while maintaining a high level of accuracy. On top of this, the LLM allows us to render the incoming speech in a more compact and concise way, a feature that is very important in spoken language translation.

Beatrice Turano, Cognitive Scientist and Language Engineer at KUDO, describes her work as “an ever-going exploration of models, parameters, and output quality. Our objective was, and is, to make sure that the listener receives a translation that is natural, easy to follow and easy to understand. Therefore, we used the LLM to work on naturalness and understandability”.

The first release of KUDO AI with LLM aims specifically at tackling the fluency and naturalness issue of speech translation. Our internal evaluation, conducted on a representative corpus of speeches, demonstrated a significant improvement in overall translation quality. We measured an increase in naturalness and fluency from English into many target languages. For example, compared with translation without the LLM, quality increased by 24% for Spanish and by 27% for Italian.

Let’s see some examples to understand better how KUDO AI works:

Example 1
Original: Generation after generation after generation lives trapped in the same vicious cycle, fed by prejudice and inaction.
MT-Translation: Generazione dopo generazione dopo generazione vive intrappolata nello stesso circolo vizioso, alimentato dal pregiudizio e dall’inazione. (Generation after generation after generation lives trapped in the same vicious circle, fed by prejudice and inaction.)
LLM-Translation: Il pregiudizio e l’inazione intrappolano le generazioni in un ciclo senza fine. (Prejudice and inaction trap generations in a cycle without end.)

The sentence produced with the LLM has been subtly shortened to meet the time constraints of simultaneous translation, and it conveys the repetition found in the original speech without actually reproducing it. This highlights the LLM’s deft capability to refine the meaning within the sentence, streamlining its structure to enhance comprehension for listeners.

Example 2
Original: Little like little, like little video clip taster for anything that you’ve done.
MT-Translation: Un piccolo, piccolo, piccolo video clip taster per qualsiasi cosa abbiate fatto. (Little like little, like little video clip taster for anything that you’ve done.)
LLM-Translation: Un piccolo video di esempio di qualsiasi cosa abbiate fatto. (A tiny video clip sample of anything you’ve done.)

The original text is replete with spoken language disfluencies, such as repetitions, and includes colloquial expressions. The MT-Translation grapples with both of these aspects, while the LLM-Translation succeeds in delivering a semantically accurate and fluid translation.

“The integration of a LLM into a Speech Translation System marks a paradigm change for the domain”, says Claudio Fantinuoli, CTO and KUDO AI Translator designer. “This is not only an incredible achievement from a theoretical perspective – because it demonstrates how generative AI can improve speech translation – but it is also remarkable for an engineering perspective, since we are doing it in real-time”.

In the first phase, the LLM feature will be released to selected clients. This will allow our team to monitor and evaluate the performance of the new KUDO AI in real-life use, safeguarding its operation in the first weeks after launch.

While we roll out this first version of KUDO AI with LLM, the R&D team is continuing to innovate by leveraging the ability of the LLM to consider the wider communicative context to improve translation quality. This makes it possible to translate a speech in real time while providing the engine with what has been said up to that moment, as well as contextual information about the specific meeting. This further improves quality (an average 37% improvement across several languages compared with the version without LLM), making the overall experience even more natural.
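For readers who want a concrete picture, the idea of feeding the engine with prior speech and meeting information can be sketched as follows. This is an illustration only, not KUDO's implementation: the class name `ContextualPromptBuilder`, its fields, the prompt wording and the history size are all assumptions made for this example, and no real LLM service is called.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ContextualPromptBuilder:
    """Hypothetical helper: builds an LLM prompt from recent speech and meeting info."""

    target_language: str
    meeting_info: str = ""   # assumed free-text description of the meeting
    max_history: int = 3     # assumed number of earlier segments to keep
    history: deque = field(default_factory=deque)

    def build(self, segment: str) -> str:
        """Return a prompt for the new segment, then store it as context."""
        lines = [
            f"Translate the new segment into {self.target_language}.",
            "Keep the translation concise and natural.",
        ]
        if self.meeting_info:
            lines.append(f"Meeting context: {self.meeting_info}")
        if self.history:
            lines.append("Previously said:")
            lines.extend(f"- {earlier}" for earlier in self.history)
        lines.append(f"New segment: {segment}")

        # Remember this segment and drop the oldest ones beyond the limit.
        self.history.append(segment)
        while len(self.history) > self.max_history:
            self.history.popleft()
        return "\n".join(lines)


if __name__ == "__main__":
    builder = ContextualPromptBuilder("Italian", meeting_info="Annual health summit")
    builder.build("Good morning, everyone.")
    print(builder.build("Let us begin with the first topic."))
```

In this sketch, each prompt carries a short rolling window of earlier segments rather than the whole speech, which is one simple way to keep the input small when translating in real time.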

In summary, the integration of a large language model into our Speech Translator represents a major stride in overcoming the limitations of linear, direct translation. This technical innovation facilitates a more contextually accurate and natural translation, promising an enhanced user experience.

About KUDO

KUDO is the world leader in providing real-time multilingual solutions that enable people to communicate effortlessly in any language, on any platform. Its network of 12,000 professional language interpreters, combined with its ground-breaking Speech Translator, empowers organizations of all sizes to collaborate more efficiently, with greater inclusivity, and on an international scale. KUDO Inc. is a New York-based technology start-up founded and managed by language and conferencing industry insiders seeking to create a world in which everyone has the power to understand and be understood in their own language. More info at


