AI Agents Are Broken: Can GPT-5 Really Fix Them? | News Nest

Bonisiwe Shabane

OpenAI’s GPT-5 model was meant to be a world-changing upgrade to its wildly popular and precocious chatbot. But for some users, last Thursday’s release felt more like a wrenching downgrade, with the new ChatGPT presenting a diluted personality and making surprisingly dumb mistakes. On Friday, OpenAI CEO Sam Altman took to X to say the company would keep the previous model, GPT-4o, running for Plus users. A new feature designed to seamlessly switch between models depending on the complexity of the query had broken on Thursday, Altman said, “and the result was GPT-5 seemed way dumber.” He promised to implement... Given the hype around GPT-5, some level of disappointment appears inevitable. When OpenAI introduced GPT-4 in March 2023, it stunned AI experts with its incredible abilities.

GPT-5, pundits speculated, would surely be just as jaw-dropping. OpenAI touted the model as a significant upgrade, with PhD-level intelligence and virtuoso coding skills. A system to automatically route queries to different models was meant to provide a smoother user experience. (It could also save the company money by directing simple queries to cheaper models.) OpenAI faces backlash as users complain about broken workflows and losing AI friends.
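The routing idea described above can be sketched as a simple heuristic. Everything here is an illustrative assumption: the model names, the keyword list, and the threshold are made up for the sketch and do not reflect OpenAI's actual router.

```python
# Hypothetical sketch of complexity-based model routing.
# Model names and the scoring heuristic are assumptions, not OpenAI's system.

CHEAP_MODEL = "gpt-5-mini"          # assumed name for a cheaper model
REASONING_MODEL = "gpt-5-thinking"  # assumed name for a reasoning model

def estimate_complexity(query: str) -> float:
    """Crude heuristic: longer queries and reasoning keywords score higher."""
    keywords = ("prove", "derive", "step by step", "debug", "optimize")
    score = min(len(query) / 500, 1.0)
    score += 0.5 * sum(kw in query.lower() for kw in keywords)
    return score

def route(query: str, threshold: float = 0.5) -> str:
    """Send simple queries to the cheap model, hard ones to the reasoning model."""
    return REASONING_MODEL if estimate_complexity(query) >= threshold else CHEAP_MODEL
```

A router like this saves money on easy queries, but as the rollout showed, if the switching logic breaks, every query can land on the weaker model and the whole product "seems way dumber."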

It’s been less than a week since the launch of OpenAI’s new GPT-5 AI model, and the rollout hasn’t been a smooth one. So far, the release has sparked one of the most intense user revolts in ChatGPT’s history, forcing CEO Sam Altman to make an unusual public apology and reverse key decisions. At the heart of the controversy has been OpenAI’s decision to automatically remove access to all previous AI models in ChatGPT (approximately nine, depending on how you count them) when GPT-5 rolled out to... Unlike API users who receive advance notice of model deprecations, consumer ChatGPT users had no warning that their preferred models would disappear overnight, noted independent AI researcher Simon Willison in a blog post. The problems started immediately after GPT-5’s August 7 debut. A Reddit thread titled “GPT-5 is horrible” quickly amassed over 2,000 comments from users expressing frustration with the new release.

By August 8, social media platforms were flooded with complaints about performance issues, personality changes, and the forced removal of older models. Marketing professionals, researchers, and developers all shared examples of broken workflows on social media. “I’ve spent months building a system to work around OpenAI’s ridiculous limitations in prompts and memory issues,” wrote one Reddit user in the r/OpenAI subreddit. “And in less than 24 hours, they’ve made it useless.” OpenAI's CEO, Sam Altman, overpromised on GPT-5, and real-life results are underwhelming, but it looks like a new update is rolling out that might address some of the concerns. GPT-5 is a state-of-the-art model.

In our tests, BleepingComputer found that GPT-5 does really well in coding. It was significantly faster than the other OpenAI models, including o3. However, GPT-5 struggles to be "creative" in writing, and it also often fails to engage its new reasoning capabilities when users expect it to. On top of that, we've observed that GPT-5 often produces short responses even when it's explicitly asked to give more detail. Some believe that GPT-5 is throttling token output to minimize cost, but OpenAI's CEO, Sam Altman, argues that a bug caused unexpected problems with GPT-5. OpenAI is a California-based artificial intelligence research company that develops and deploys AI applications, including the generative AI bot ChatGPT.

One morning, your support queue is on fire. Escalations are up, processes are breaking, and your AI agent suddenly sounds nothing like your brand. You didn’t change a thing, but your LLM provider did. This is the hidden risk of building customer experience (CX) entirely on large language models. Vendors can and do change or retire models overnight. They’re focused on mass consumer appeal, not preserving the quirks, constraints, and emotional tone your business depends on.

The recent GPT‑5 rollout illustrates this vividly. Despite being promoted as a major leap forward, the release triggered one of the most intense user revolts in ChatGPT history. GPT‑5 removed access to legacy models (like GPT‑4o), upended workflows, and felt “colder” or more “robotic” to many users. OpenAI CEO Sam Altman called the rollout “a screw‑up,” issued a rare public apology, and swiftly reinstated GPT‑4o for paid users just a day later. LLM providers routinely push updates for performance gains, but “better in aggregate” doesn’t always mean better for specific workflows or user expectations. The solution isn’t abandoning LLMs (they’re powerful tools) but building resilience into your CX architecture.
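One way to build that resilience is to pin the model version your product was tuned on and fall back through an explicit chain when a provider retires it. The sketch below is a minimal illustration under assumed names; the client interface, model identifiers, and error type are placeholders, not a real SDK.

```python
# Hedged sketch: pin a preferred model version with a fallback chain so a
# provider-side deprecation degrades gracefully instead of breaking CX.
# All names here (models, call_model) are illustrative assumptions.

class ModelUnavailableError(Exception):
    pass

def call_model(model: str, prompt: str, available: set) -> str:
    """Stand-in for a provider API call; raises if the model was retired."""
    if model not in available:
        raise ModelUnavailableError(model)
    return f"[{model}] response to: {prompt}"

def resilient_call(prompt: str, preferred: list, available: set) -> str:
    """Try pinned models in order; surface a clear error only if all are gone."""
    for model in preferred:
        try:
            return call_model(model, prompt, available)
        except ModelUnavailableError:
            continue
    raise ModelUnavailableError("no pinned model available")

# Usage: pin the version your brand voice was tuned on, then fall back.
# reply = resilient_call("Where is my order?", ["gpt-4o", "gpt-5"], {"gpt-5"})
```

The point is not the few lines of code but the contract: your CX layer, not the vendor, decides what happens when a model disappears overnight.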

Hybrid AI, exemplified by platforms like Teneo, can help in several ways. A common and reasonable question is: Why invest in building complex AI Agent architectures if future versions of GPT might eventually include all the necessary functionality out of the box? The answer lies in the observation that progress in transformer architectures—the neural network type behind large language models—appears to be slowing down. Looking at benchmarks designed to evaluate the performance of LLMs, such as the Massive Multitask Language Understanding (MMLU), we observe a noticeable plateau in recent advancements. GPT-4 set a record in 2023 with an impressive 86.4% score, nearly doubling GPT-3's performance from its debut in 2020. However, since GPT-4's release, newer models have shown only marginal improvements compared to the significant leap from GPT-3 to GPT-4.

For example, o1, OpenAI's latest reasoning model, scores around 92.3% on MMLU, only about a six-point increase over GPT-4's 86.4%. This suggests that while advancements continue, the transformative breakthroughs that defined earlier iterations are becoming harder to achieve. One of the key reasons why newer models exhibit only marginal improvements can be found in a recent publication titled No “Zero-Shot” Without Exponential Data. The paper presents evidence that additional training data provides diminishing returns: LLM performance follows a logarithmic trend as data increases. If this trend holds, as the paper's evidence suggests, then LLMs will need exponentially more data in order to improve towards reaching AGI (Artificial General Intelligence). The issue is compounded by the fact that at approximately 15 trillion tokens, current LLM training sets are already approaching the upper limit of high-quality public text available.

For English alone, estimates suggest a maximum range of 40–90 trillion tokens, meaning we are nearing the saturation point of usable and available data. Moreover, historical trends indicate that model data requirements have increased tenfold with each new generation (GPT-2 to GPT-3 to GPT-4 all required 10x or more data). While GPT-5 might still achieve incremental improvements through expanded data collection and minor optimizations, scaling alone is unlikely to sustain the same trajectory for future generations. For models at the GPT-6 level and beyond, achieving meaningful progress will likely require breakthroughs in novel architectures or entirely new paradigms that have yet to be discovered.
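The arithmetic behind this argument can be made concrete. If benchmark score grows roughly logarithmically with training tokens, then each fixed gain in score costs a constant multiplier in data, which is exactly why a tenfold-per-generation data habit runs into the public-text ceiling. The coefficients below are invented to roughly match the GPT-4-era numbers in the text; they are illustrative, not measured values.

```python
import math

# Illustrative arithmetic only: under a log-linear scaling curve,
# score = a + b * log10(tokens), every 10x in data buys the same
# fixed number of points. Coefficients a and b are made up.

def score(tokens: float, a: float = -5.0, b: float = 7.0) -> float:
    """Hypothetical log-linear scaling curve."""
    return a + b * math.log10(tokens)

gpt4_scale = 1e13  # ~10T tokens, order of magnitude only
gain_per_10x = score(10 * gpt4_scale) - score(gpt4_scale)
gain_next_10x = score(100 * gpt4_scale) - score(10 * gpt4_scale)

# Both gains equal b: to keep repeating a GPT-3 -> GPT-4-sized leap,
# you need another 10x of data every time, i.e. exponential data growth,
# while usable high-quality text tops out around 40-90T tokens.
```

Under these assumed coefficients, moving from 10T to 100T tokens (already past most estimates of available high-quality English text) buys the same few points as the previous tenfold jump did, which is the paper's diminishing-returns argument in miniature.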
