Baynard Media

Large language models not fit for real-world use, scientists warn — even slight changes cause their world models to collapse

By Editor | November 16, 2024 | 5 min read

Generative artificial intelligence (AI) systems may be able to produce some eye-opening results, but new research shows they don’t have a coherent understanding of the world and its real rules.

In a new study published to the arXiv preprint database, scientists with MIT, Harvard and Cornell found that large language models (LLMs), such as GPT-4 or Anthropic’s Claude 3 Opus, fail to produce underlying models that accurately represent the real world.

When tasked with providing turn-by-turn driving directions in New York City, for example, LLMs delivered them with near-100% accuracy. But when the scientists extracted the underlying maps the models had used, those maps were full of non-existent streets and routes.

The researchers found that when unexpected changes were added to a directive (such as detours and closed streets), the accuracy of the directions the LLMs gave plummeted. In some cases, it resulted in total failure. This raises concerns that AI systems deployed in a real-world situation, say in a driverless car, could malfunction when presented with dynamic environments or tasks.

“One hope is that, because LLMs can accomplish all these amazing things in language, maybe we could use these same tools in other parts of science, as well. But the question of whether LLMs are learning coherent world models is very important if we want to use these techniques to make new discoveries,” said senior author Ashesh Rambachan, assistant professor of economics and a principal investigator in the MIT Laboratory for Information and Decision Systems (LIDS), in a statement.

Tricky transformers

Generative AI is built on the ability of LLMs to learn from vast amounts of data using vast numbers of parameters in parallel. To do this, they rely on transformer models, the underlying neural networks that process data and enable the self-learning aspect of LLMs. This process creates a so-called “world model,” which a trained LLM can then use to infer answers and produce outputs in response to queries and tasks.

One such theoretical use of world models would be taking data from taxi trips across a city to generate a map without needing to painstakingly plot every route, as is required by current navigation tools. But if that map isn’t accurate, deviations made to a route would cause AI-based navigation to underperform or fail.
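As a rough illustration of the taxi-trip idea, the sketch below rebuilds a street graph explicitly from a handful of trips. This is not how a transformer stores its world model internally, and the trips, intersection names and the `build_map` helper are all invented for the example.

```python
from collections import defaultdict

def build_map(trips):
    """Infer a directed street graph from trips.

    Each trip is a list of intersections in the order they were visited.
    Every consecutive pair is recorded as a drivable street segment.
    """
    graph = defaultdict(set)
    for trip in trips:
        for here, nxt in zip(trip, trip[1:]):
            graph[here].add(nxt)
    # Sort neighbors so the result is stable and easy to compare.
    return {node: sorted(neighbors) for node, neighbors in graph.items()}

# Hypothetical trip logs between made-up intersections A, B, C and D.
trips = [["A", "B", "D"], ["A", "C", "D"], ["C", "D"]]
street_map = build_map(trips)
print(street_map)
```

A map recovered this way only contains streets that some trip actually used, which is why the variety of the trip data matters.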

To assess the accuracy and coherence of transformer LLMs when it comes to understanding real-world rules and environments, the researchers tested them using a class of problems called deterministic finite automata (DFAs). These are problems made up of a sequence of states, such as positions in a game or intersections along a route to a destination, together with rules for moving between them. In this case, the researchers used DFAs drawn from the board game Othello and navigation through the streets of New York.
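To make the idea of a DFA concrete, here is a minimal sketch in which the states are intersections and the inputs are driving moves. The four-intersection layout and the function names are assumptions made for illustration. They are not taken from the study.

```python
# Toy DFA: (current intersection, move) -> next intersection.
# The street layout is invented for illustration.
TRANSITIONS = {
    ("A", "east"): "B",
    ("A", "south"): "C",
    ("B", "south"): "D",
    ("C", "east"): "D",
}

def run_dfa(start, moves, transitions=TRANSITIONS):
    """Follow a list of moves from start.

    Returns the final intersection, or None if any move is not allowed
    from the intersection where it is attempted.
    """
    state = start
    for move in moves:
        key = (state, move)
        if key not in transitions:
            return None
        state = transitions[key]
    return state

def valid_moves(state, transitions=TRANSITIONS):
    """List the moves that are legal from a given intersection."""
    return sorted(move for (s, move) in transitions if s == state)

print(run_dfa("A", ["east", "south"]))
print(valid_moves("A"))
```

Two different move sequences can end at the same intersection here, which is the property the researchers' metrics rely on.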

To test the transformers with DFAs, the researchers looked at two metrics. The first was “sequence distinction,” which checks whether a transformer recognizes that two different states really are different: two different Othello boards, for example, or one map of a city with road closures and another without. The second was “sequence compression,” which checks whether a transformer recognizes that two identical states (say, two Othello boards that are exactly the same) allow the same sequence of possible next steps. Here, a sequence is an ordered list of data points used to generate outputs. A transformer with a coherent world model should pass both tests.
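The two metrics above can be sketched on a toy street grid. This is a deliberately simplified version that compares only the set of next moves a model allows, whereas the study's actual tests are more involved. The grid, both stand-in "models" and the check functions are hypothetical.

```python
# Toy street grid, invented for illustration.
TRANSITIONS = {
    ("A", "east"): "B",
    ("A", "south"): "C",
    ("B", "south"): "D",
    ("C", "east"): "D",
    ("D", "west"): "C",
}

def true_state(moves, start="A"):
    """Intersection actually reached after following the moves."""
    state = start
    for move in moves:
        state = TRANSITIONS[(state, move)]
    return state

def coherent_model(moves):
    """Stand-in for a model that tracks the true intersection."""
    state = true_state(moves)
    return frozenset(m for (s, m) in TRANSITIONS if s == state)

# Stand-in for a model that guesses from the last move alone.
LAST_MOVE_GUESS = {"east": {"south"}, "south": {"east"}, "west": {"east"}}

def shortcut_model(moves):
    if not moves:
        return frozenset({"east", "south"})
    return frozenset(LAST_MOVE_GUESS[moves[-1]])

def same_state_check(model, seq_a, seq_b):
    """Both sequences reach the same state, so the allowed next moves should match."""
    assert true_state(seq_a) == true_state(seq_b)
    return model(seq_a) == model(seq_b)

def different_state_check(model, seq_a, seq_b):
    """The sequences reach different states, so the model should tell them apart."""
    assert true_state(seq_a) != true_state(seq_b)
    return model(seq_a) != model(seq_b)

same = (["east", "south"], ["south", "east"])   # both end at D
diff = (["east"], ["south", "east"])            # end at B and at D
for model in (coherent_model, shortcut_model):
    print(model.__name__,
          same_state_check(model, *same),
          different_state_check(model, *diff))
```

The shortcut model fails both checks because it never tracks where it is, only what the last move was.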

Relying on LLMs is risky business

Two common classes of LLMs were tested on these metrics. One was trained on data generated from randomly produced sequences, while the other was trained on data generated by following strategic processes.

Transformers trained on random data formed a more accurate world model, the scientists found. This was possibly due to the LLM seeing a wider variety of possible steps. Lead author Keyon Vafa, a researcher at Harvard, explained in a statement: “In Othello, if you see two random computers playing rather than championship players, in theory you’d see the full set of possible moves, even the bad moves championship players wouldn’t make.” By seeing more of the possible moves, even if they’re bad, the LLMs were theoretically better prepared to adapt to random changes.

However, despite generating valid Othello moves and accurate directions, only one transformer generated a coherent world model for Othello, and neither type produced an accurate map of New York. When the researchers introduced things like detours, all the navigation models used by the LLMs failed.

“I was surprised by how quickly the performance deteriorated as soon as we added a detour. If we close just 1 percent of the possible streets, accuracy immediately plummets from nearly 100 percent to just 67 percent,” added Vafa.
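One way to picture why a flawed internal map only shows up under detours is the sketch below. It is an illustrative assumption, not the study's method: a breadth-first search plans routes on a hypothetical real map and on a hypothetical flawed map that contains two non-existent streets, and each route is then checked against the real streets.

```python
from collections import deque

def shortest_route(graph, start, goal, closed=frozenset()):
    """Breadth-first search that skips closed streets.

    Returns a list of intersections from start to goal, or None.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if (node, nxt) in closed or nxt in seen:
                continue
            seen.add(nxt)
            queue.append(path + [nxt])
    return None

def route_is_drivable(route, real_graph, closed=frozenset()):
    """True if every step uses a real street that is currently open."""
    return all(
        b in real_graph.get(a, []) and (a, b) not in closed
        for a, b in zip(route, route[1:])
    )

# Hypothetical real streets.
REAL = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
# Hypothetical flawed map: misses A -> C and invents B -> X and X -> D.
FLAWED = {"A": ["B"], "B": ["D", "X"], "X": ["D"]}

closure = frozenset({("B", "D")})
for name, graph in (("real", REAL), ("flawed", FLAWED)):
    normal = shortest_route(graph, "A", "D")
    detour = shortest_route(graph, "A", "D", closure)
    print(name, normal, route_is_drivable(normal, REAL),
          detour, route_is_drivable(detour, REAL, closure))
```

With every street open, both maps produce the same valid route. Once one street closes, the flawed map sends the driver down a street that does not exist.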

This shows that different approaches to the use of LLMs are needed to produce accurate world models, the researchers said. What these approaches could be isn’t clear, but it does highlight the fragility of transformer LLMs when faced with dynamic environments.

“Often, we see these models do impressive things and think they must have understood something about the world,” concluded Rambachan. “I hope we can convince people that this is a question to think very carefully about, and we don’t have to rely on our own intuitions to answer it.”
