Open-source AI models are crushing benchmarks and challenging the closed-source world, while Mark Zuckerberg says the future isn’t one giant model—it’s billions of personal superintelligences. Let’s dive in ⬇️
In today’s newsletter ↓
🆚 Open-source challengers aim past GPT-4
🧠 Zuckerberg plots a personal AI for everyone
💼 Big Tech’s AI–core business blur sparks antitrust talk
🏀 AI technologists negotiate $100 million contracts
🔬 Weekly Challenge: Stress-test models on OpenRouter
Open-source has never looked so formidable. In the span of two weeks, three heavyweight contenders released models that edge closer to the closed-source titans.
Beijing-based Zhipu released GLM 4.5-9B weights under an Apache-2.0 license. Benchmarks in English and Chinese show 77 percent on MMLU—roughly GPT-3.5 territory—while the context window stretches to 256K tokens, enough to swallow a full textbook. The catch: generation still lags in reasoning depth, and the largest 130B-parameter checkpoint remains research-only.
A surprise repo called Horizon Alpha-36B appeared on Hugging Face with a model card referencing “OpenAI alignment weights.” Researchers quickly noted a GPT-style tokenizer and strong code synthesis scores. OpenAI hasn’t claimed the drop, but speculation is rampant that the company quietly seeded a community test bed before GPT-5. Until provenance is confirmed, major platforms are sandboxing Horizon Alpha to watch for license or safety red flags.
Shanghai’s Moonshot pushed Kimi K2-MoE, a mixture-of-experts model boasting 1 million-token context and 88 percent on GSM8K math—within two points of Gemini 2.5-Pro. Early adopters love Kimi’s long-document Q&A but report slower first-token latency outside mainland China.
Why it matters:
• Sovereignty: Chinese labs pitch open models as a hedge against U.S. chip sanctions.
• Cost curves: Self-hosting GLM 4.5 on eight H100s costs ≈$1.20 per million tokens—half the price of GPT-4o via API (rough math in the sketch below).
• Innovation loops: Open weights let startups fine-tune for niche domains (legal Spanish, quantum chemistry) without waiting for closed vendors.
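That per-token figure is the kind of thing worth sanity-checking yourself. Here’s a minimal back-of-envelope sketch; the GPU rental rate and throughput below are assumptions for illustration, not measured numbers:

```python
# Back-of-envelope: cost per million tokens for a self-hosted inference node.
# Every number here is an illustrative assumption, not a benchmark.
gpus = 8                    # assumed H100 count
usd_per_gpu_hour = 3.00     # assumed cloud rental rate per GPU
tokens_per_second = 5_500   # assumed aggregate throughput for the node

hourly_cost = gpus * usd_per_gpu_hour           # $24.00 per hour
tokens_per_hour = tokens_per_second * 3_600     # ~19.8M tokens per hour
cost_per_million = hourly_cost / (tokens_per_hour / 1_000_000)

print(f"≈ ${cost_per_million:.2f} per million tokens")  # ≈ $1.21 under these assumptions
```

Swap in your own rental rate and a measured tokens-per-second figure, and the comparison against API pricing falls out in one line.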
Yet limitations remain: scattered documentation, inconsistent safety filters, and no guarantee of long-term support. For developers, the decision is no longer open vs. closed but “Which blend of cost, capability, and governance fits my stack?” The arms race just turned cooperative—and chaotic.
In a manifesto titled “Personal Superintelligence,” Mark Zuckerberg laid out Meta’s most ambitious vision yet: an AI that “knows us deeply, understands our goals, and helps us achieve them.” Unlike rivals chasing a monolithic AGI, Meta says the future belongs to billions of individualized models running on-device and in a privacy-aware cloud.
“Today Mark shared Meta’s vision for the future of personal superintelligence for everyone. Read his full letter here: meta.com/superintellige…”
— AI at Meta (@AIatMeta), 1:06 PM • Jul 30, 2025
Here’s why Meta is uniquely positioned to gain a foothold in the ‘personal superintelligence’ race:
• Context-rich hardware: Meta’s next-gen smart-glasses will use cameras and multimodal sensors to feed real-time context into a local Llama variant.
• Edge inference: A pared-down “ego-model” will run on-device, with heavy reasoning off-loaded to data-center LLMs tuned to a user’s preferences.
• Open ecosystem: While acknowledging safety trade-offs, Meta reiterates its commitment to open-sourcing most research checkpoints—arguing “a free society requires visible code.”
On infrastructure, Meta is among the best-positioned tech companies when it comes to compute, talent, and spend. From buying a 49% stake in Scale to luring AI researchers and scientists away from Apple, OpenAI, and other top labs, Zuckerberg is digging deep into his own pockets while breaking ground on new data centers.
The open questions are just as big:
• Privacy drift: Glasses that “see what we see” intensify fears of always-on surveillance.
• Safety scaling: Building guardrails for billions of unique models is uncharted territory.
• Economic shakeup: If everyone wields a negotiation-savvy AI, does anyone have leverage?
Still, the upside is hard to ignore. Imagine an assistant that drafts a pitch in your voice, books travel around your calendar quirks, and nudges you off doom-scrolling—without ever phoning home to a centralized brain. Meta believes getting there first could reshape the consumer-tech pecking order for the next decade.
Goal: Figure out which open-source (or semi-open) model actually fits your daily workflow—without burning hours of guess-and-check.
Here’s what to do:
🔎 Pick Your Fighters – open OpenRouter.ai and bookmark three engines with very different DNA (e.g., Llama-3-70B, Mixtral-8x22B, and Kimi K2-MoE).
✍️ Craft a Nuanced Prompt – start with a tough but realistic task:
“Draft a polite refund email that cites EU consumer-rights regulation 2011/83/EU and offers a store credit alternative.”
⚡ Run Them in Parallel – note first-token latency, full-response time, word count, and whether the model refuses, hallucinates law, or nails the citation (a scripted version of this step follows the list).
🔄 Switch Contexts – recycle the same trio of models on:
• a legal Spanish summary
• a 12-line haiku epic
• Python code to sort a 2 GB CSV
📊 Scorecard It – rate each run 1-5 on speed, accuracy, tone, and cost per 1K tokens.
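If you’d rather script the side-by-side run than juggle browser tabs, here’s a minimal sketch against OpenRouter’s OpenAI-compatible endpoint. The model slugs are illustrative placeholders (check openrouter.ai/models for current IDs), and you’ll need your own OPENROUTER_API_KEY:

```python
# Minimal sketch: time the same prompt against several OpenRouter models.
# Requires `pip install openai` and an OPENROUTER_API_KEY environment variable.
import os
import time
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # OpenRouter's OpenAI-compatible endpoint
    api_key=os.environ["OPENROUTER_API_KEY"],
)

MODELS = [
    "meta-llama/llama-3-70b-instruct",   # illustrative slug
    "mistralai/mixtral-8x22b-instruct",  # illustrative slug
    "moonshotai/kimi-k2",                # illustrative slug
]

PROMPT = (
    "Draft a polite refund email that cites EU consumer-rights regulation "
    "2011/83/EU and offers a store credit alternative."
)

for model in MODELS:
    start = time.perf_counter()
    first_token = None
    chunks = []

    # Stream so first-token latency can be separated from total response time.
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
        stream=True,
    )
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content or ""
        if delta and first_token is None:
            first_token = time.perf_counter() - start
        chunks.append(delta)

    total = time.perf_counter() - start
    first_token = first_token if first_token is not None else total
    text = "".join(chunks)
    print(f"{model}: first token {first_token:.2f}s, "
          f"total {total:.2f}s, {len(text.split())} words")
```

Swap in the step-4 prompts and copy the printed numbers straight into your scorecard; cost per 1K tokens comes from each model’s pricing page on OpenRouter.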
That’s it for the latest in AI news this week! Open-source fireworks and Meta’s personal-AI gamble prove the frontier is as much about philosophy as horsepower. Which side are you on? Hit reply and share.
Zoe from Overclocked