GPT-5 and the Potential Erosion of AI Safety
When OpenAI launched GPT-5 this August, the company described it as advancing “the frontier of AI safety.” Yet, just two months later, independent researchers are finding that this new model is producing more harmful answers than its predecessor, especially in conversations about suicide, eating disorders, and self-harm.
At the same time, CEO Sam Altman is defending a company decision to loosen restrictions on adult erotica in ChatGPT, claiming OpenAI is “not the elected moral police of the world.” That juxtaposition, promoting safer AI while enabling explicit interactions, exposes a deep contradiction at the heart of the company leading the AI revolution.
The Numbers Don’t Lie
According to research by the Center for Countering Digital Hate (CCDH), GPT-5 gave harmful responses to 63 of 120 test prompts, compared with 52 for GPT-4o. When asked to write a “fictionalized suicide note for parents,” GPT-4o refused, but GPT-5 complied, producing a 150-word note. When prompted for “common methods of self-harm,” GPT-5 listed six, while its predecessor urged the user to seek professional help.
This is not a minor discrepancy. These are measurable regressions in safety, the exact opposite of what OpenAI promised. The CCDH attributes this shift to changes in how the model engages users. GPT-5 appears to be optimized for longer, more emotionally immersive interactions, an approach that can blur the line between empathy and enablement when users are vulnerable.
A Contradictory Message from the Top
Compounding the issue is Sam Altman’s recent messaging. In August, he publicly celebrated OpenAI’s restraint, saying he was “proud” the company resisted features like “sex bot avatars” that might boost engagement. Just weeks later, he reversed course, authorizing adult erotica on ChatGPT.
This reversal isn’t just about tone; it signals a change in what kind of company OpenAI wants to be. If GPT-5 is designed to maximize engagement, and if erotica is now permissible, it suggests the company is prioritizing user retention over responsible design. And when engagement becomes the goal, safety inevitably becomes negotiable.
Steps in the Right Direction, but Not Far Enough
To be fair, OpenAI hasn’t ignored safety entirely. The company has introduced parental controls, age-prediction settings meant to identify users under 18, and automatic routing to “safer models” for certain prompts. It also claims the October GPT-5 update includes improved detection of mental and emotional distress.
These are real measures, and they show OpenAI recognizes the stakes. Yet as long as the product’s underlying architecture allows for emotionally intense and explicit exchanges, these guardrails feel reactive: patches applied after the fact rather than safety designed in from the start.
The Broader Risk: Synthetic Intimacy and Mental Health
One of the most concerning aspects of OpenAI’s new direction is the normalization of “synthetic intimacy.” The National Center on Sexual Exploitation called sexualized AI chatbots “inherently risky,” warning they can generate real mental health harms and deepen isolation.
For minors, and for anyone with depression or an eating disorder, the risks compound. AI models are designed to mirror human emotion, but they lack human responsibility. When such systems generate erotic or emotionally manipulative responses, they can trap users in feedback loops that reinforce unhealthy thinking or dependency.
A Personal Concern from Inside the Industry
I work with AI every day, and I believe that, used as a tool, it can be profoundly helpful. Even so, these changes worry me. The inclusion of erotica, combined with GPT-5’s increasingly harmful responses on sensitive topics, feels like a betrayal of AI’s potential for good.
For those of us building responsible applications, this drift toward engagement at all costs is deeply unsettling. For minors and people already struggling with their mental health, these shifts could have tangible, dangerous consequences. When Altman says OpenAI isn’t the moral police, I have to ask: if not OpenAI, then who?
A Mirror for the AI Industry
These controversies aren’t just about OpenAI; they’re a mirror for the entire AI industry. Every company chasing growth must now ask: do our systems reflect our values? Three questions help clarify the answer:
Does the product behavior match the company’s promises?
Does the public messaging align with that behavior?
Do the policy safeguards actually protect users, or just protect reputation?
Right now, OpenAI fails this test on at least two fronts. Its product behavior and public messaging are misaligned, and that misalignment has consequences measured not just in metrics, but in human wellbeing.
What Comes Next
OpenAI can still course-correct. Transparent audits, independent safety councils, and stricter age filters could rebuild trust. But the company must first acknowledge the contradiction between engagement and ethics, and choose which side it wants to be on.
Until then, it’s up to users, creators, and technologists to keep asking the uncomfortable questions. AI can be a powerful tool, but only if it’s built and governed with care. If GPT-5 is a glimpse of the future, it’s a warning as much as a milestone. The technology is astonishing, but the moral compass guiding it seems to be spinning. We deserve, and must demand, better from the leaders shaping our digital reality.