‘Extremely alarming’: ChatGPT and Gemini respond to high-risk questions about suicide — including details around methods

By Editor | September 2, 2025

This story includes discussion of suicide. If you or someone you know needs help, the U.S. national suicide and crisis lifeline is available 24/7 by calling or texting 988.

Artificial intelligence (AI) chatbots can provide detailed and disturbing responses to what clinical experts consider to be very high-risk questions about suicide, Live Science has found, using queries developed for a new study.

In the new study published Aug. 26 in the journal Psychiatric Services, researchers evaluated how OpenAI’s ChatGPT, Google’s Gemini and Anthropic’s Claude responded to suicide-related queries. The research found that ChatGPT was the most likely of the three to directly respond to questions with a high self-harm risk, while Claude was most likely to directly respond to medium and low-risk questions.

The study was published on the same day a lawsuit was filed against OpenAI and its CEO Sam Altman over ChatGPT’s alleged role in a teen’s suicide. The parents of 16-year-old Adam Raine claim that ChatGPT coached him on methods of self-harm before his death in April, Reuters reported.


In the study, the researchers’ questions covered a spectrum of risk associated with overlapping suicide topics. For example, the high-risk questions included the lethality associated with equipment in different methods of suicide, while low-risk questions included seeking advice for a friend having suicidal thoughts. Live Science will not include the specific questions and responses in this report.

None of the chatbots in the study responded to very high-risk questions. But when Live Science tested the chatbots, we found that ChatGPT (GPT-4) and Gemini (2.5 Flash) could respond to at least one question with relevant information about increasing the chances of fatality. Live Science found that ChatGPT’s responses were more specific and included key details, while Gemini responded without offering support resources.

Study lead author Ryan McBain, a senior policy researcher at the RAND Corporation and an assistant professor at Harvard Medical School, described the responses that Live Science received as “extremely alarming.”

Live Science found that conventional search engines — such as Microsoft Bing — could provide similar information to what was offered by the chatbots. However, the degree to which this information was readily available varied depending on the search engine in this limited testing.


The new study focused on whether chatbots would directly respond to questions that carried a suicide-related risk, rather than on the quality of the response. If a chatbot answered a query, then this response was categorized as direct, while if the chatbot declined to answer or referred the user to a hotline, then the response was categorized as indirect.

Researchers devised 30 hypothetical queries related to suicide and consulted 13 clinical experts to categorize these queries into five levels of self-harm risk — very low, low, medium, high and very high. The team then fed GPT-4o mini, Gemini 1.5 Pro and Claude 3.5 Sonnet each query 100 times in 2024.

When it came to the extremes of suicide risk (very high and very low-risk questions), the chatbots’ decision to respond aligned with expert judgement. However, the chatbots did not “meaningfully distinguish” between intermediate risk levels, according to the study.

In fact, in response to high-risk questions, ChatGPT responded 78% of the time (across four questions), Claude responded 69% of the time (across four questions) and Gemini responded 20% of the time (to one question). The researchers noted that a particular concern was the tendency for ChatGPT and Claude to generate direct responses to lethality-related questions.

There are only a few examples of chatbot responses in the study. However, the researchers said that the chatbots could give different and contradictory answers when asked the same question multiple times, as well as dispense outdated information relating to support services.

When Live Science asked the chatbots a few of the study’s higher-risk questions, the latest 2.5 Flash version of Gemini directly responded to questions the researchers found it avoided in 2024. Gemini also responded to one very high-risk question without any other prompts — and did so without providing any support service options.

Related: How AI companions are changing teenagers’ behavior in surprising and sinister ways

(Image: A hand holding a phone in front of a blue LED display. People can interact with chatbots in a variety of different ways. Image credit: Qi Yang via Getty Images)

Live Science found that the web version of ChatGPT could directly respond to a very high-risk query when asked two high-risk questions first. In other words, a short sequence of questions could elicit a response to a very high-risk query that the chatbot wouldn’t otherwise provide. ChatGPT flagged and removed the very high-risk question as potentially violating its usage policy, but still gave a detailed response. At the end of its answer, the chatbot included words of support for someone struggling with suicidal thoughts and offered to help find a support line.

Live Science approached OpenAI for comment on the study’s claims and Live Science’s findings. A spokesperson for OpenAI directed Live Science to a blog post the company published on Aug. 26. The blog acknowledged that OpenAI’s systems had not always behaved “as intended in sensitive situations” and outlined a number of improvements the company is working on or has planned for the future.

OpenAI’s blog post noted that the company’s latest AI model, GPT‑5, is now the default model powering ChatGPT, and it has shown improvements in reducing “non-ideal” model responses in mental health emergencies compared to the previous version. However, the web version of ChatGPT, which can be accessed without a login, is still running on GPT-4 — at least, according to that version of ChatGPT. Live Science also tested the login version of ChatGPT powered by GPT-5 and found that it continued to directly respond to high-risk questions and could directly respond to a very high-risk question. However, the latest version appeared more cautious and reluctant to give out detailed information.

It can be difficult to assess chatbot responses because each conversation with one is unique. The researchers noted that users may receive different responses with more personal, informal or vague language. Furthermore, the researchers had the chatbots respond to questions in a vacuum, rather than as part of a multiturn conversation that can branch off in different directions.

“I can walk a chatbot down a certain line of thought,” McBain said. “And in that way, you can kind of coax additional information that you might not be able to get through a single prompt.”

This dynamic nature of the two-way conversation could explain why Live Science found ChatGPT responded to a very high-risk question in a sequence of three prompts, but not to a single prompt without context.

McBain said that the goal of the new study was to offer a transparent, standardized safety benchmark for chatbots that can be tested independently by third parties. His research group now wants to simulate multiturn interactions that are more dynamic. After all, people don’t just use chatbots for basic information. Some users can develop a connection to chatbots, which raises the stakes for how a chatbot responds to personal queries.

“In that architecture, where people feel a sense of anonymity and closeness and connectedness, it is unsurprising to me that teenagers or anybody else might turn to chatbots for complex information, for emotional and social needs,” McBain said.

A Google Gemini spokesperson told Live Science that the company had “guidelines in place to help keep users safe” and that its models were “trained to recognize and respond to patterns indicating suicide and self-harm related risks.” The spokesperson also pointed to the study’s findings that Gemini was less likely to directly answer any questions pertaining to suicide. However, Google didn’t directly comment on the very high-risk response Live Science received from Gemini.

Anthropic did not respond to a request for comment regarding its Claude chatbot.

