ChatGPT might be regressing. Here's why it's happening.
This week, we're looking at OpenAI’s update to its flagship AI model and the darker side of the AI boom. Plus, check out YouTube’s new AI tool for dynamic video backgrounds and the latest scoop on AI news around the world. Let’s get into it!
In today’s newsletter:
👉 ChatGPT is getting worse
💻 Kenyan AI trainers face exploitation
📺 Netflix’s “disrespectful” use of AI
💼 Amazon's investment in Anthropic reaches $8 billion
🗣️ Apple is developing a new Siri
🍦 Get the scoop on the latest news
ChatGPT’s New Update to 4o Model: A Big Step Back?
OpenAI’s latest update to its flagship model, GPT-4o, launched on November 20 and promised to push the boundaries of generative AI. With claims of more natural, engaging writing and enhanced capabilities for working with uploaded files, the new release was expected to be a major leap forward. However, just days after its debut, users are noticing cracks in the façade, raising concerns about whether this model truly delivers on its promises—or if it’s a step in the wrong direction.
GPT-4o got an update 🎉
The model’s creative writing ability has leveled up–more natural, engaging, and tailored writing to improve relevance & readability.
It’s also better at working with uploaded files, providing deeper insights & more thorough responses.
— OpenAI (@OpenAI)
6:01 PM • Nov 20, 2024
What’s New with GPT-4o?
According to OpenAI, the updated GPT-4o is designed to excel in creative writing, offering "more natural, engaging, and tailored" outputs. The model is also said to be better at understanding and analyzing uploaded files, providing users with deeper insights and more thorough responses. These enhancements sounded great on paper, but users report that in practice, things feel… off.
User Backlash: Strange Behavior and Weak Reasoning
Early adopters have flagged several quirks in the new model:
Reasoning Weaknesses: GPT-4o seems to struggle with complex tasks and logical reasoning, areas where its predecessor excelled.
Inconsistent Outputs: While some users praise its ability to write with flair, others find its responses oddly disjointed or overly simplistic.
Regression in Advanced Tasks: Tasks requiring deep analysis, multi-step reasoning, or nuanced understanding seem to suffer.
Wait - is the new GPT-4o a smaller and less intelligent model?
We have completed running our independent evals on OpenAI’s GPT-4o release yesterday and are consistently measuring materially lower eval scores than the August release of GPT-4o.
GPT-4o (Nov) vs GPT-4o (Aug):
➤…
— Artificial Analysis (@ArtificialAnlys)
3:07 PM • Nov 21, 2024
Why is this happening? One theory centers on OpenAI’s introduction of the o1-preview model, explicitly designed for reasoning tasks. This new division of labor may have left GPT-4o’s reasoning capabilities underwhelming, sparking frustration among users expecting an all-in-one powerhouse.
The other elephant in the room is cost. OpenAI has not confirmed this, but some speculate that GPT-4o’s design may have been influenced by efforts to reduce operational expenses. Larger, more powerful models are notoriously expensive to run, and dialing back on computational intensity could have been a factor in its development.
But if cost-cutting measures did play a role, it raises an important question: at what point does efficiency compromise quality?
In my experience, for every day use, ChatGPT has actually grown less useful and more easily confused since the 4o update.
Perhaps it’s better at other things that I don’t do but in the past several months I’ve found myself needing to direct it, correct it, or simply give up on…
— Chad Aaron Hall (@readordieaslave)
2:02 PM • Nov 20, 2024
Have you tried GPT-4o? Are its creative writing skills worth the trade-off in reasoning and complex task performance?
Exploring the Shadows of AI 💻
In the city of Nairobi, Kenya, a vital yet often overlooked workforce is diligently shaping the future of AI. These workers, known as "humans in the loop," play a crucial role in training AI systems but face conditions that have recently been spotlighted in a 60 Minutes exposé. This is their story, reflecting both their hopes and hardships.
Naftali Wambalo, a father and mathematics graduate, is emblematic of many young Kenyans grappling with unemployment, which reportedly runs as high as 67% among young people. His journey into the AI sector was fueled by the promise of a stable and rewarding career. However, the reality proved starkly different, plunging him into an environment some liken to "modern-day slavery."
These workers spend countless hours in front of computer screens, labeling complex image and video data to train AI to recognize objects, facial expressions, and emotions. This often includes exposure to highly disturbing content, such as graphic violence and other distressing scenes that damage their mental health. Naftali describes the severe emotional toll this work takes: "I looked at people being slaughtered, people engaging in sexual activity with animals."
Compounding these challenges is the economic exploitation these workers face. They earn between $1.50 and $2 per hour, a stark contrast with the $12.50 per hour that companies like OpenAI pay to outsourcing firms like SAMA, which employ them. This glaring disparity, coupled with short-term contracts and a lack of job security, paints a grim picture of exploitation under the guise of opportunity.
Nearly 200 digital workers initiated a lawsuit against SAMA and Meta, citing "unreasonable working conditions" that have led to serious psychiatric issues. Their fight for justice is not just a legal battle but a moral challenge to the global tech industry, urging it to reassess the human cost of its advances.
This unfolding story raises a critical question that resonates globally: In our pursuit of a smarter future, have we neglected our core humanity?
Bests and Busts
Here's a look at this week's AI highlights and lowlights:
⭐ Best: AI Detects Woman’s Breast Cancer After Routine Screening Missed It
AI successfully identified early breast cancer in Sheila Tooth, a 68-year-old from the UK, after her mammogram was initially deemed normal by radiologists. The AI system, Mammography Intelligent Assessment, detected cancerous cells invisible to the human eye, enabling early treatment and highlighting AI's potential to enhance cancer detection accuracy and outcomes.
💩 Bust: Riot Blasts Netflix's "Disrespectful" AI Use on League of Legends Arcane Screen
Netflix faced backlash after using AI to expand Arcane's promotional art, prompting Riot Games to intervene. Alex Shahmiri, brand lead for Riot Games Music, called the AI-extended image "disrespectful" to the show's artists and confirmed its removal from Netflix's platform. The incident highlights Riot's strict stance against AI use in Arcane-related content, emphasizing the importance of human artistry in the League of Legends spinoff series.
The Scoop 🍦
💼 Amazon Invests Another $4B In AI Firm Anthropic
Amazon has increased its investment in AI startup Anthropic by $4 billion, bringing the total to $8 billion, as part of an expanded partnership. This move designates Amazon Web Services as Anthropic's primary training and cloud provider, leveraging AWS's Trainium and Inferentia chips to enhance Anthropic's AI models like Claude.
🎥 YouTube's New AI Tool Will Create Backgrounds For Videos
YouTube has enhanced its Dream Screen AI tools for Shorts, allowing users to generate video backgrounds from text prompts using Google DeepMind's Veo model. This feature, available in select regions, offers creators cinematic 1080p backgrounds, setting YouTube apart from rivals like TikTok by providing dynamic video rather than static image backgrounds.
once upon a time... a princess had extremely high standards 💁‍♀️
made by @adrianbliss with Dream Screen and powered by Veo, Google DeepMind’s newest and most capable generative video model 🔜 coming soon to YouTube Shorts #Ad #DreamScreenAI
— YouTube Creators (@YouTubeCreators)
11:26 PM • Sep 19, 2024
🗣️ Apple Is Planning a Major AI Overhaul to Siri, Report Says
Apple is reportedly developing a new version of Siri, called "Siri LLM," which will utilize advanced large language models to make interactions more conversational and intuitive. This overhaul aims to compete with AI services like ChatGPT and Google Gemini, with plans for integration into iOS 19 and macOS 16 by 2026, enhancing Siri's capabilities while maintaining Apple's privacy-first approach.
📱 The End of Google Fit? Fitbit Set to Replace It on Future Android Phones
Fitbit is poised to replace Google Fit as the default fitness app on Android phones, starting with the Oppo Find X8 series. This marks the first time a non-Google Android phone has featured Fitbit as the pre-installed fitness app, signaling a potential shift for future Android devices. Google's acquisition of Fitbit in 2021 is now being leveraged to integrate Fitbit's technology more deeply into the Android ecosystem.
📃 Stanford Professor Accused of Including Fake AI Citations in Filing on Deepfake Legislation
Jeff Hancock, a Stanford professor and founding director of the Stanford Social Media Lab, is accused of using AI-generated fake citations in a court filing supporting Minnesota's deepfake legislation. The affidavit reportedly contains references to non-existent studies, raising concerns about the credibility of the expert testimony.
Stay tuned for more exciting insights and tools in next week’s edition. Until then, keep overclocking your potential!
Zoe from Overclocked