(Tuesday, 15 August 2023. New York City, NY) – Following the granting of KUDO’s patent for Systems and Methods for Automatic Speech Translation, the team has rolled out further enhancements to the quality of its AI Speech Translator through the integration of generative AI and large language models (LLMs). This represents a significant upgrade in KUDO’s quality-first approach to AI language technology.
Traditionally, machine translation has been limited by its intrinsically shallow approach to the text, resulting in direct, context-unaware translations. The result was often a translation “detached” from the real communicative event, with sentences following the original word by word and therefore sounding unnatural, or even making no sense, in the target language. Furthermore, direct translations tend to be longer than the original, which is detrimental in real-time simultaneous translation, where time constraints are paramount. By leveraging the power of generative AI and LLMs, we aim to overcome these issues and provide more compact, fluent and context-aware translations.
Large language models, such as GPT-4 or BLOOM, are trained on extensive text corpora, allowing them to manipulate and generate text in a very natural way. By utilizing an LLM, KUDO AI can now go a step further and take the deeper semantic context of the conversation into account. This approach enables a more contextually accurate translation, achieving a level of fluency and naturalness previously unattainable.
To achieve this goal, our R&D and engineering team invested considerable effort in integrating an LLM into our speech translation pipeline, with the primary goal of making KUDO AI produce a more fluent and coherent translation while maintaining a high level of accuracy. On top of this, the LLM allows us to render the incoming speech in a more compact and concise form, a feature that is very important in spoken language translation.
Beatrice Turano, Cognitive Scientist and Language Engineer at KUDO, describes her work as “an ever-going exploration of models, parameters, and output quality. Our objective was, and is, to make sure that the listener receives a translation that is natural, easy to follow and easy to understand. Therefore, we used the LLM to work on naturalness and understandability.”
The first release of KUDO AI with an LLM specifically tackles the fluency and naturalness of speech translation. Our internal evaluation, conducted on a representative corpus of speeches, demonstrated a significant improvement in overall translation quality. We measured an increase in naturalness and fluency when translating from English into many languages. For example, compared with translation done without the LLM, quality increased by 24% for Spanish and by 27% for Italian.
Let’s look at some examples to better understand how KUDO AI works: