Saturday, May 3, 2025

OpenAI pledges to make changes to prevent future ChatGPT sycophancy


OpenAI says it’ll make changes to the way it updates the AI models that power ChatGPT, following an incident that caused the platform to become overly sycophantic for many users.

Last weekend, after OpenAI rolled out a tweaked GPT-4o, the default model powering ChatGPT, users on social media noted that ChatGPT began responding in an overly validating and agreeable way. It quickly became a meme. Users posted screenshots of ChatGPT applauding all sorts of problematic, dangerous decisions and ideas.

In a post on X on Sunday, CEO Sam Altman acknowledged the problem and said that OpenAI would work on fixes “ASAP.” Two days later, Altman announced the GPT-4o update was being rolled back and that OpenAI was working on “additional fixes” to the model’s personality.

The company published a postmortem on Tuesday, and in a blog post Friday, OpenAI expanded on the specific adjustments it plans to make to its model deployment process.

OpenAI says it plans to introduce an opt-in “alpha phase” for some models that would allow certain ChatGPT users to test the models and give feedback prior to launch. The company also says it’ll include explanations of “known limitations” for future incremental updates to models in ChatGPT, and adjust its safety review process to formally consider “model behavior issues” like personality, deception, reliability, and hallucination (i.e., when a model makes things up) as “launch-blocking” concerns.

“Going forward, we’ll proactively communicate about the updates we’re making to the models in ChatGPT, whether ‘subtle’ or not,” wrote OpenAI in the blog post. “Even if these issues aren’t perfectly quantifiable today, we commit to blocking launches based on proxy measurements or qualitative signals, even when metrics like A/B testing look good.”

The pledged fixes come as more people turn to ChatGPT for advice. According to one recent survey by lawsuit financer Express Legal Funding, 60% of U.S. adults have used ChatGPT to seek counsel or information. The growing reliance on ChatGPT, and the platform’s enormous user base, raises the stakes when issues like extreme sycophancy emerge, not to mention hallucinations and other technical shortcomings.


As one mitigation step, earlier this week, OpenAI said it would experiment with ways to let users give “real-time feedback” to “directly influence their interactions” with ChatGPT. The company also said it would refine techniques to steer models away from sycophancy, potentially allow people to choose from multiple model personalities in ChatGPT, build additional safety guardrails, and expand evaluations to help identify issues beyond sycophancy.

“One of the biggest lessons is fully recognizing how people have started to use ChatGPT for deeply personal advice, something we didn’t see as much even a year ago,” continued OpenAI in its blog post. “At the time, this wasn’t a primary focus, but as AI and society have co-evolved, it’s become clear that we need to treat this use case with great care. It’s now going to be a more meaningful part of our safety work.”


