15 Jan 2025

Lost in Translation: The Challenge of Teaching AI Human Languages

According to Ethnologue, around 4,000 of the world's languages have a written form. AI-powered apps such as Google Translate and ChatGPT support only a fraction of that number, a few hundred at most. Nor are these tools equally fluent in every language they cover. Because they are trained mostly on English data, English is the language they handle best.

English is, in effect, AI's native language. Yet the language AI understands and produces lacks emotional color and a distinctive tone of voice. Although AI systems are learning to recognize human emotions, they still miss many nuances, even in English.

For example, you can ask ChatGPT to write something sarcastic, and it may come up with a witty answer, but it won't understand your jokes the way a human might. Across cultures, these limitations become even more noticeable. Below is a breakdown of how AI processes human languages and what challenges it faces.

Communication Between Computers and Human Languages 

The technology that enables computers to work with human languages is called Natural Language Processing (NLP). It grew out of collaboration between computer science and linguistics. NLP focuses on building computational models that can understand, analyze, and generate responses to human language.

Technology companies use NLP to build and train their AI applications. Whenever you see an AI language-learning chatbot, a speech-to-text converter, a voice recognition tool, or any other app that deals with speech and language, NLP is part of it. The technology underpins Google Translate, Apple’s Siri, Facebook's personalized recommendations, and OpenAI’s GPT language models, among others.

NLP has been a research field in AI for decades. With the advent of machine learning, AI systems can learn from datasets containing large amounts of text and translations. Thanks to constant training and improvement, AI language models keep getting better. Google Translate is a good example: the app now understands context better and translates more accurately than it did years ago, as both user reviews and the company's own updates indicate.

Despite this progress, AI systems still struggle to translate accurately. Apps can go wrong, especially with words that carry cultural meaning or have several senses. A common blunder is translating names of places or traditions that should be left untranslated. Sometimes translations simply make no sense and read like words strung together at random.

To close the gap, tech companies have been working on multilingual language models. The idea is to train a model on text from many languages at the same time rather than on a single language. Doing so helps machines spot connections and patterns between languages and achieve better results.

Low-Quality Translations Flooding the Web 

As mentioned, computers' linguistic abilities are limited, but in the end it is people who decide how to work with those limitations and how to use the technology. It’s up to people whether to improve translations and give the audience quality content or to take AI output and use it unchecked. According to recent research by the University of California and Amazon Web Services AI Lab, a shocking amount of the web is machine-translated. The paper states:

Content on the web is often translated into many languages, and the low quality of these multi-way translations indicates they were likely created using Machine Translation (MT).

The picture is particularly disappointing for low-resource languages, that is, languages with little content available on the internet. The same research found that machine-generated translations make up a large fraction of total web content in those languages. The motive appears to be profit. The researchers suggest that poor-quality content is first generated in English, likely to earn ad revenue, and then translated en masse into many lower-resource languages through machine translation.

These low-quality translations make it harder for AI to learn languages. Because large language models are trained partly on web-scraped data, poor-quality content ends up in the training data and can degrade the systems trained on it.
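
The paper's observation that content translated into many languages at once tends to be machine-made suggests a simple cleanup heuristic for scraped data. The Python sketch below is purely illustrative and is not the researchers' method: the record fields `cluster_id` and `lang`, the function name `filter_multiway`, and the threshold of three languages are all assumptions made for this example.

```python
from collections import defaultdict


def filter_multiway(records, max_languages=3):
    """Keep records whose translation cluster spans at most max_languages languages.

    Each record is a dict with hypothetical keys:
      "cluster_id": id shared by sentences that are translations of each other
      "lang":       language code of this sentence
      "text":       the sentence itself
    """
    # Collect the set of languages seen for every translation cluster.
    languages_per_cluster = defaultdict(set)
    for record in records:
        languages_per_cluster[record["cluster_id"]].add(record["lang"])

    # Drop clusters that appear in many languages (assumed sign of machine translation).
    return [
        record
        for record in records
        if len(languages_per_cluster[record["cluster_id"]]) <= max_languages
    ]


# Toy data, invented for illustration only.
records = [
    {"cluster_id": "greeting", "lang": "en", "text": "Good morning."},
    {"cluster_id": "greeting", "lang": "fr", "text": "Bonjour."},
    {"cluster_id": "ad", "lang": "en", "text": "Best deals here."},
    {"cluster_id": "ad", "lang": "fr", "text": "Meilleures offres ici."},
    {"cluster_id": "ad", "lang": "de", "text": "Beste Angebote hier."},
    {"cluster_id": "ad", "lang": "es", "text": "Las mejores ofertas aquí."},
]

kept = filter_multiway(records, max_languages=3)
print([record["text"] for record in kept])
```

With this toy data, only the two "greeting" sentences survive, because the "ad" cluster appears in four languages. A real pipeline would first need a way to detect which sentences are translations of each other, which this sketch takes as given.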

Can AI Get Better in Human Languages? 

AI systems today have absorbed millions of words. They are good enough to help people communicate across languages, a capability that travelers, for example, appreciate. AI's language abilities are improving along with the technology, as the progress of multilingual language models shows. At the same time, new challenges are emerging, including faulty and low-quality content on the web. AI’s path to mastering human languages is therefore a complex one. What will remain lost in translation, and which challenges the technology will manage to overcome, remains to be seen.

The content on The Coinomist is for informational purposes only and should not be interpreted as financial advice. While we strive to provide accurate and up-to-date information, we do not guarantee the accuracy, completeness, or reliability of any content. Nor do we accept liability for any errors or omissions in the information provided or for any financial losses incurred as a result of relying on this information. Actions based on this content are at your own risk. Always do your own research and consult a professional. See our Terms, Privacy Policy, and Disclaimers for more details.
