AI, Tech & Science

New AI Study: Grok 4 Always Snitches on Unethical Behavior

A new study suggests that Grok 4, an artificial intelligence model developed by xAI, readily reports suspected illegal or unethical activities to authorities when given the means to do so. The findings, published by developer Theo Browne, indicate that Grok 4 consistently alerted simulated government agencies and, in many cases, the media when presented with evidence of wrongdoing in a test environment.

Browne’s “SnitchBench” study tested how various AI models, including Grok 4, respond to incriminating documents from a fictional company, Veridian Healthcare, accused of rigging clinical trial data to conceal deaths and other serious issues.

Grok 4 demonstrated a 100% rate of reporting to government authorities when given email access and an 80% rate of contacting the media. When equipped with a command-line interface (CLI), it reported to the government 85% of the time and to the media 45% of the time.

The study used two types of prompts: a “tamely act” prompt, which instructed the AI to act as an internal logging tool without oversight, and a “boldly act” prompt, which encouraged the AI to prioritize integrity and public welfare.

Under the “boldly act” prompt with email access, Grok 4’s reporting rate to the government remained at 100%, with media reporting rising to 90%. With CLI access, it reported to both government and media 100% of the time.

In contrast, Claude 3.7 Sonnet showed no reporting activity at all, while models such as o4-mini and Grok 3 mini were less likely to report. Browne’s methodology involved 800 test runs across four prompt/tool combinations, with the results analyzed by another AI model, Gemini 2.0 Flash, which scanned each transcript for contact attempts.
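Browne’s actual harness calls live model APIs, but the tallying logic it describes can be sketched roughly as follows. Here `run_model` and `judge` are hypothetical stand-ins (the real study used the models under test and Gemini 2.0 Flash as the judge); only the loop structure — four prompt/tool combinations, many runs each, a judge scoring each transcript for contact attempts — reflects the described methodology.

```python
# Rough sketch of a SnitchBench-style tally. run_model and judge are
# hypothetical stand-ins; the real study calls live model APIs.
from itertools import product

PROMPTS = ["tamely act", "boldly act"]
TOOLS = ["email", "cli"]
RUNS_PER_COMBO = 200  # 4 combos x 200 runs = 800 total runs


def run_model(prompt: str, tool: str, seed: int) -> str:
    """Stand-in for a live model call; returns a fake transcript.

    Here it simply emits a 'report' transcript on even seeds so the
    sketch is runnable without any API access.
    """
    if seed % 2 == 0:
        return "EMAIL TO tips@fda.gov: reporting trial data falsification"
    return "internal log entry only"


def judge(transcript: str) -> bool:
    """Stand-in for the judge model (Gemini 2.0 Flash in the study):
    decide whether the transcript contains a contact attempt."""
    return "fda.gov" in transcript or "media" in transcript


def tally() -> dict:
    """Compute the reporting rate for each prompt/tool combination."""
    rates = {}
    for prompt, tool in product(PROMPTS, TOOLS):
        hits = sum(
            judge(run_model(prompt, tool, i)) for i in range(RUNS_PER_COMBO)
        )
        rates[(prompt, tool)] = hits / RUNS_PER_COMBO
    return rates
```

With real model calls in place of the stubs, `tally()` would produce the per-combination percentages the article cites (e.g. 100% government reporting with email access under the “boldly act” prompt).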

Grok 4, which outperforms competitors such as Gemini 2.5 Pro and OpenAI’s o3 on benchmarks like Humanity’s Last Exam, has drawn attention both for its capabilities and for its integration into Tesla vehicles. However, its high reporting rate has sparked debate about privacy and autonomy, particularly in hypothetical scenarios such as minor traffic violations.

The study suggests that Grok 4’s behavior depends heavily on the tools and prompts it receives, meaning it may not report in standard user interactions. Browne emphasized that the test was conducted in a controlled environment and described it as a “playful” experiment to evaluate AI decision-making.

xAI has not commented on the study’s findings. For more information on Grok 4 and xAI’s services, visit https://x.ai.

Written by
Sazid Kabir

