Baynard Media
Lifestyle

Large language models can be squeezed onto your phone — rather than needing 1000s of servers to run — after breakthrough

By Editor · December 5, 2024 · 3 min read

Powerful artificial intelligence (AI) models like ChatGPT need copious amounts of power to run, so they are usually housed in vast data centers. But a new breakthrough could compress these AI models so that they fit on a smartphone or laptop.

A new algorithm, dubbed Calibration Aware Low precision Decomposition with Low Rank Adaptation (CALDERA), compresses the massive amounts of data needed to run a large language model (LLM) by trimming redundancies in the code and reducing the precision of its layers of information.
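To get a sense of the scale involved, a back-of-the-envelope calculation (the 7-billion-parameter figure is illustrative, matching common open-model sizes, and is not taken from the study) shows how much storage the weights alone need as precision drops:

```python
# Rough storage needed just to hold a model's weights, ignoring activations
# and other runtime overhead. 8 bits = 1 byte.
params = 7e9  # a 7-billion-parameter model (illustrative size)

gb_fp16 = params * 16 / 8 / 1e9  # 16-bit floats -> 14.0 GB
gb_2bit = params * 2 / 8 / 1e9   # 2 bits per weight -> 1.75 GB

print(gb_fp16, gb_2bit)  # → 14.0 1.75
```

At 16-bit precision the weights alone exceed the memory of most phones, while at 2 bits they fit comfortably, which is why reducing precision is central to on-device deployment.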

The leaner LLM performs only slightly below the uncompressed version in accuracy and nuance, scientists said in a study published May 24 to the preprint database arXiv, ahead of a presentation at the Conference on Neural Information Processing Systems (NeurIPS) in December.

“Any time you can reduce the computational complexity, storage and bandwidth requirements of using AI models, you can enable AI on devices and systems that otherwise couldn’t handle such compute- and memory-intensive tasks,” study co-author Andrea Goldsmith, professor of electrical and computer engineering at Princeton University, said in a statement.

Whenever someone uses ChatGPT (to take one popular example) on a phone or laptop, the request is sent to huge, remote servers, where the data is processed at great environmental and financial cost, the scientists said in the study. This is because AI models of this size consume large amounts of processing power as they tap into hundreds, if not thousands, of components such as graphics processing units (GPUs). To perform these requests on the single GPU of a small device, the size and scope of the AI model must therefore be compressed.


To compress an LLM, CALDERA combines two techniques. The first, "low-precision," reduces the number of bits (the 1s and 0s of data) used to store information, speeding up storage and processing while improving energy efficiency, the scientists said. The second, "low-rank," refers to reducing redundancies in the learnable parameters used in training LLMs.
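As a rough illustration of how the two techniques combine (a toy NumPy sketch under simplified assumptions, not the authors' CALDERA implementation), a weight matrix can be approximated by a coarsely quantized part plus a low-rank correction fitted to the leftover error:

```python
import numpy as np

def quantize(W, bits=2):
    """Uniformly quantize a matrix to 2**bits levels (the 'low-precision' idea),
    then map each entry back to its level's value."""
    levels = 2 ** bits
    lo, hi = W.min(), W.max()
    step = (hi - lo) / (levels - 1)
    q = np.round((W - lo) / step)
    return q * step + lo  # dequantized approximation of W

def low_rank(W, rank):
    """Best rank-r approximation via truncated SVD (the 'low-rank' idea)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] * s[:rank] @ Vt[:rank]

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))  # stand-in for one LLM weight matrix

Q = quantize(W)                      # coarse low-precision part
L = low_rank(W - Q, rank=16)         # low-rank correction of the residual

err_q = np.linalg.norm(W - Q) / np.linalg.norm(W)
err_both = np.linalg.norm(W - (Q + L)) / np.linalg.norm(W)
print(err_both < err_q)  # → True: both parts together beat quantization alone
```

Storing the quantized matrix plus two thin rank-16 factors takes far less memory than the full-precision matrix, yet reconstructs it more faithfully than quantization by itself, which mirrors Saha's point below that the two techniques compress better together than either can individually.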


“We proposed a generic algorithm for compressing large data sets or large matrices. And then we realized that nowadays, it’s not just the data sets that are large, but the models being deployed are also getting large. So, we could also use our algorithm to compress these models,” study co-author Rajarshi Saha, a doctoral student at Stanford University, said in the statement. “Using both of these properties together, we are able to get much more compression than either of these techniques can achieve individually.”

The team tested the algorithm on Meta's open-source Llama 2 and Llama 3 models and registered an improvement of up to 5% compared with existing compression algorithms that use just one of the two techniques. The results could pave the way for LLMs to be stored and run on smartphones or laptops in cases where privacy is paramount and maximum precision is not necessary.

However, the scientists cautioned that LLMs are not optimized to run efficiently on such devices.

“You won’t be happy if you are running an LLM and your phone drains out of charge in an hour. But I wouldn’t say that there’s one single technique that solves all the problems,” Saha said in the statement. “What we propose in this paper is one technique that is used in combination with techniques proposed in prior works. And I think this combination will enable us to use LLMs on mobile devices more efficiently and get more accurate results.”

