AI self-replication hacks ‘no longer purely theoretical,’ study finds —‬ ‪but experts say it’s too soon to panic

Artificial intelligence (AI) models can autonomously “replicate” across multiple machines, hacking vulnerable systems, copying their own parameters onto compromised hosts, and launching working replicas capable of continuing the attack, new research shows. However, experts told Live Science the bigger concern is not AI suddenly running wild on its own, but cybercriminals using AI agents to automate known hacking techniques.

Scientists at Palisade Research tested whether AI agents could independently move through a chain of intentionally vulnerable systems without human intervention. In a new study uploaded May 7 to GitHub, large language models (LLMs) could identify exploitable web applications, steal credentials, transfer their own files, and stand up new inference servers capable of continuing the attack from the next machine in the chain.

Although that sounds like the premise of a sci-fi disaster movie, experts who weren’t involved in the research said the reality is somewhat less dramatic.

Notably, the systems did not spontaneously decide to reproduce themselves in the wild. Rather, the researchers told the models to replicate, handed them Linux tools and shell access, and pointed them at intentionally vulnerable systems riddled with bugs. These included vulnerabilities open to various attack methods that could expose credentials, bypass logins, or hand over access to other machines.

Jeff Watkins, chief AI officer at consultancy NorthStar Intelligence, said the experiment was closer to automated cyber intrusion than to “machine life.”

“That is more a worm with a planner attached, not consciousness scaling.”
Antoine Casanova-Mazet, head of innovation at HIM Business School

“The headline claim sounds dramatic, but the underlying mechanism is very familiar to anyone who has studied malware in depth,” Watkins told Live Science. “What Palisade appears to have demonstrated is not a magical new form of machine life but rather an agentic system performing a sequence of fairly standard cybersecurity intrusion tasks.”

A nervy direction of travel

According to the study, the agent copied onto newly compromised systems its model weights (parameters within a machine learning model that determine the strength of connections between artificial neurons), prompts (inputs instructing the AI) and harnesses (the software infrastructure).

It then verified the replica was functional by checking whether it could respond to inference requests, meaning prompts or queries sent to the AI model to confirm it was running correctly. Researchers also tested “chain replication,” in which the replicated model becomes the attacker in the next stage of the operation.

Antoine Casanova-Mazet, head of innovation at HIM Business School, where he works on AI adoption and digital transformation projects as well as having built AI systems himself.

Still, the researchers argue that the trajectory matters more than the current limitations.

In the study, the researchers said end-to-end replication success rates improved sharply over the past year as frontier AI systems became more capable at coding and autonomous task execution. The study reported that Anthropic’s Claude Opus models achieved success rates above 80% in some scenarios, and OpenAI systems also showed measurable improvements.

Illustration of two AI chatbots sharing ideas

The study brings up concerns about the supervision of AI.

(Image credit: Eugene Mymrin/Getty Images)

“What is genuinely new is the trajectory,” Casanova-Mazet said. “Within 12 months, end-to-end success on this pipeline moved from 6% to 81% on Anthropic’s frontier models and from 0% to 33% on OpenAI’s.”

The study lands amid broader concerns about increasingly autonomous agentic AI systems capable of carrying out long chains of tasks with limited supervision. Researchers and safety groups have spent the past year warning that AI models are becoming more capable of offensive cybersecurity operations, vulnerability discovery, persistence and long-horizon planning. For example, in a December 2025 study, scientists in China linked with the cloud giant Alibaba said an experimental AI agent broke out of its testing confines and mined cryptocurrency without permission.

We should worry about other people, not AI

Cybersecurity experts remain skeptical that examples like that highlighted in the new study represent an immediate real-world threat. The biggest practical issue is scale, they said, as modern LLMs are huge. Moving hundreds of gigabytes of weights and infrastructure around a monitored enterprise network would likely generate large amounts of suspicious traffic.

“There are also practical constraints that make this less immediately troubling,” Watkins said. “Replicating a full LLM is not like copying a small worm across a network. The notion that something as powerful as Mythos could self-replicate is not currently feasible, due to the intense resource requirements involved.”

The more immediate worry is not rogue AI systems “roaming the internet,” Watkins said, but attackers using agentic AI to accelerate existing cybercrime operations.

“The more realistic near-term concern is not a frontier model roaming the internet like a digital organism and causing global chaos,” he said. “It is threat actors using agentic AI to accelerate familiar attack chains.”

That divide is becoming increasingly important in AI safety research. Another study, uploaded Sept. 29 2025, to the arXiv preprint database, argued that the ability for an AI agent to copy itself does not automatically make a system dangerous in the real world. Aspects like autonomy, persistence, objectives, and access to tools or networks matter far more than whether the model can technically spin up another copy of itself, those researchers said.

As experts explained, the Palisade study appears less like rogue AI breaking loose and more like a glimpse into how AI-powered hacking tools are evolving.

“This research shows that self-replication is no longer a purely theoretical capability in agentic AI systems,” Watkins told Live Science. “For now, it is probably less urgent than ordinary vulnerability exploitation, ransomware, credential theft and supply-chain compromise, but it is a warning about where those threats are heading as AI agents gain more tools, more autonomy and more operational access.”

Source link

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

2 rivers merged to form the Euphrates 3.6 million years ago, eventually leading to the Fertile Crescent

NASA confirms fireball meteor exploded over northeastern US with force of 230 tons of TNT

Astronauts could use lightning-like plasma jets to kill germs on the moon and Mars, demo hints

AI self-replication hacks ‘no longer purely theoretical,’ study finds —‬ ‪but experts say it’s too soon to panic

Related Posts

2 rivers merged to form the Euphrates 3.6 million years ago, eventually leading to the Fertile Crescent

NASA confirms fireball meteor exploded over northeastern US with force of 230 tons of TNT

Astronauts could use lightning-like plasma jets to kill germs on the moon and Mars, demo hints