• Grok Buys Cursor, MidJourney Goes Hardware, Hermes Agent & Evaluation-Driven Development
    Jun 26 2026

    MidJourney — the AI image company — just quit image generation to build 50,000 spas that scan your body slice by slice. Then the week got weirder.

    Co-hosts: Shimin Zhang, Dan Lasky, Rahul Yadav.

    ▸ SpaceX buys Cursor — Elon's SpaceX (xAI/"Grok Cursor") is acquiring Cursor for $60B in Class A common stock — a ~60x multiple on ~$1B revenue, largely to buy an enterprise foothold. (Shimin: the first real sign of an AI-tool consolidation phase.)

    ▸ MidJourney goes hardware — the image-gen pioneer is licensing micro-ultrasound chips to build 50,000 body-scan spas (first one: SF, 2027), aiming for a billion scans a month. Fully private, no VC backers, a self-described "community research lab." Each scan produces terabytes per second, roughly 500 hours of HD video for every second of scanning.

    ▸ Tool Shed: Hermes Agent (Nous Research) — the plugin-maximalist opposite of a minimal harness like Pi: built-in memory, a self-learning skill loop, cron scheduling, swappable memory providers, and ~20 chat channels out of the box. Dan: "parachuting in with sixteen crates of supplies and a film crew."

    ▸ Is AI ruining our skills? (Nature) — physicians' precancerous-lesion detection fell from 28.4% to 22.4% once the AI tool was removed; 52 engineers scored 50% on understanding their own code with AI vs 67% without. Cognitive debt is showing up in the data.

    ▸ Claude Code is a video game (Provi.me) — the "one more prompt" loop that keeps you up three hours past bedtime, and why AI finally made B2B SaaS addictive. Plus the "agent dice" repo: roll a natural 20 and a stop hook makes the agent reflect and write itself a skill.

    ▸ Evaluation-Driven Development (Decoding AI) — treat every AI feature as a hypothesis and gate the PR on an offline eval pipeline (built on Opik) instead of unit tests. Gold-standard vs synthetic datasets, code-metric vs LLM-as-judge evaluators, and an "aggression" dial for how big a jerk your reviewer is. (Shimin: Newtonian physics → quantum mechanics.)

    ▸ Two Minutes to Midnight — ChatGPT slips under 50% share (46.4%; Gemini 27.7%, Claude 10.3%), Nvidia raises $25B in its first bond deal since 2021, and Ed Zitron walks OpenAI's FT-verified financials ($38.5B loss in 2025). ~2B users — one in four people on Earth; no 10x left. Clock moved up to 5:00.
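
    The Evaluation-Driven Development item above describes gating a PR on an offline eval score rather than on unit tests. Here is a minimal sketch of that idea in plain Python. It does not use Opik, and every name in it (EvalCase, exact_match, run_offline_eval, gate_pull_request, stub_feature) is hypothetical, as are the four-case gold dataset and the baseline values. It is an illustration under those assumptions, not the article's pipeline.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    """One gold-standard example: an input prompt and the expected answer."""
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Code-metric evaluator: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_offline_eval(
    feature: Callable[[str], str],
    dataset: Sequence[EvalCase],
    evaluator: Callable[[str, str], float] = exact_match,
) -> float:
    """Run the feature over every case and return the mean evaluator score."""
    if not dataset:
        raise ValueError("dataset must contain at least one case")
    scores = [evaluator(feature(case.prompt), case.expected) for case in dataset]
    return sum(scores) / len(scores)


def gate_pull_request(score: float, baseline: float) -> bool:
    """Pass the gate only if the candidate scores at least the baseline."""
    return score >= baseline


# Hypothetical stand-in for the AI feature under test (assumption: in a real
# pipeline this would call a model; here it is a fixed lookup).
def stub_feature(prompt: str) -> str:
    answers = {
        "2+2": "4",
        "capital of France": "Paris",
        "opposite of hot": "cold",
    }
    return answers.get(prompt, "unknown")


# Hypothetical gold-standard dataset; the stub gets three of four cases right.
gold_dataset = [
    EvalCase("2+2", "4"),
    EvalCase("capital of France", "paris"),
    EvalCase("opposite of hot", "Cold"),
    EvalCase("largest planet", "Jupiter"),
]

score = run_offline_eval(stub_feature, gold_dataset)
print(f"eval score: {score:.2f}")
print("gate passed" if gate_pull_request(score, baseline=0.70) else "gate failed")
```

    An LLM-as-judge evaluator would slot in as another function with the same (output, expected) signature. The gate itself stays a plain comparison against a baseline score.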

    ⏱ Chapters

    00:00 Cold Open & Welcome
    01:50 News: SpaceX Buys Cursor for $60B
    04:46 News: MidJourney Pivots to Body-Scan Spas
    11:45 Tool Shed: Hermes Agent (Nous Research)
    19:54 Post-Processing: Is AI Ruining Our Skills? (Nature)
    27:13 Post-Processing: Claude Code Is a Video Game
    35:23 Post-Processing: Evaluation-Driven Development (EDD)
    41:44 Two Minutes to Midnight: ChatGPT Under 50%, Nvidia Debt, OpenAI's Numbers
    55:06 Outro

    🔗 Articles we discussed

    News:
    • SpaceX to acquire Cursor — CNBC: https://www.cnbc.com/2026/06/16/spacex-spcx-cursor-acquisition-ipo.html
    • MidJourney's medical pivot — MidJourney: https://www.midjourney.com/medical/blogpost

    Tool Shed:
    • Hermes Agent docs — Nous Research: https://hermes-agent.nousresearch.com/docs/

    Post-Processing:
    • Is AI ruining our skills? Early results are in — Nature: https://www.nature.com/articles/d41586-026-01947-1
    • Claude Code is a video game — Provi.me: https://provi.me/cc-like-video-games
    • How Evaluation-Driven Development (EDD) works — Decoding AI (Paul Easton & Alejandro Aboy): https://www.decodingai.com/p/5b766861-0001-494f-a37f-4d4eb104dcfa

    Two Minutes to Midnight:
    • ChatGPT's market share slips below 50% for the first time — TechCrunch: https://techcrunch.com/2026/06/16/chatgpts-market-share-slips-below-50-for-first-time/
    • Nvidia seeks to raise over $25B in first bond deal since 2021 — Ars Technica: https://arstechnica.com/ai/2026/06/chipmaker-nvidia-seeks-to-raise-over-25b-in-first-bond-deal-since-2021/
    • Exclusive: OpenAI's financials — Where's Your Ed At (Ed Zitron): https://www.wheresyoured.at/exclusive-openai-financials/

    🎙 About ADI Pod

    ADI Pod (Artificial Developer Intelligence) is a weekly podcast about AI and software development for working developers. We go through hundreds of links and dozens of newsletters each week so you don't have to. Hosts: Shimin Zhang, Dan Lasky, Rahul Yadav. New episodes Tuesdays.

    • https://www.adipod.ai
    • humans@adipod.ai

    If something here changed your mind or gave you something to try on Monday, hit subscribe and leave a comment with what you tried.

    56 mins
  • Fable 5 Ban, Meta's AI Gulag, Elias Thorne & What is Loop Engineering?
    Jun 19 2026
    Three days after Fable 5 launched, the US government banned it — for every foreign national on Earth, including Anthropic's own employees. Then it got weirder.

    This week on ADI Pod: the Fable 5 export ban, Meta's applied-AI "gulag," the Elias Thorne dataset virus, loop engineering, a local DeepSeek V4 demo, the paper that shatters Dunning-Kruger, and NBER's bubble math.

    Co-hosts: Shimin Zhang, Dan Lasky, Rahul Yadav.

    ▸ Fable 5 & Mythos 5, export-banned — a national-security order cut access for all foreign nationals (even Anthropic's own staff) in ~90 minutes, reportedly after an AWS jailbreak claim; likely the end of universal frontier-model access. Shimin had a near-"AI psychosis" moment using it to design a novel drone.

    ▸ Meta's "AI Gulag" — Alexandr Wang's unit drafts laid-off engineers to write puzzles and label data to train Meta's weaker models, on full salary and RSUs; the "gulag" label is a stretch, but the internal drama is real.

    ▸ The Elias Thorne mystery (404 Media) — a lighthouse keeper seeded by ~111 ChatGPT-3.5 chats became a "dataset virus" now in ~88% of AI stories and "authoring" books on Amazon across every lab (Cornell's Hamilton & Mimno).

    ▸ AI is fast, the economy isn't (howfastis.ai) — task horizons double every ~6 months, but weak-link / Theory-of-Constraints bottlenecks (Chad Jones; Goldratt) keep growth near 2%/yr; human judgment is the constraint AI can't yet remove.

    ▸ Loop engineering (Addy Osmani) — six pieces turn a bare /loop (Ralph loop) into a real agent harness: automations, worktrees, skills, plugins/connectors, subagents (split the worker from the reviewer), and memory. It amplifies whatever judgment you bake into your skills.

    ▸ Deep Dive — "Beyond the Steeper Curve" (Christopher Koch) — AI doesn't steepen Dunning-Kruger, it shatters it: "metacognitive decoupling" unglues output quality from self-assessment. Plus the "slop grenade" and the sycophancy trap (No One's Happy).

    ▸ Vibe & Tell — Dan runs DeepSeek V4 Flash locally ("DS4," the dwarf star runner) on a Framework Ryzen 395 Max over ROCm, ~14 tok/s, wired to Pi agent — ~$4,000 of hardware, no cloud.

    ▸ Two Minutes to Midnight — Claude on Apple's foundation-model backend (a commoditization tell), the end of subsidized inference, and an NBER paper pricing genuine insolvency risk into the AI build-out. Clock set back to 5:30.

    ⏱ Chapters

    00:00 Cold Open & Welcome
    02:01 News: The US Government Bans Fable 5 & Mythos 5
    10:35 News: Meta's "AI Gulag" (feat. Rahul)
    14:39 Post-Processing: The Elias Thorne Mystery
    21:12 Post-Processing: AI Is Fast, the Economy Isn't (howfastis.ai)
    29:22 Post-Processing: Loop Engineering (Addy Osmani)
    36:37 Deep Dive: Beyond the Steeper Curve (Dunning-Kruger, Shattered)
    43:46 Deep Dive: Appearing Productive & the Slop Grenade
    51:51 Vibe & Tell: DeepSeek V4 Flash at Home (DS4)
    57:39 Two Minutes to Midnight: Apple Foundation Models, Cheaper Inference, NBER Bubble Math
    1:08:44 Outro

    🔗 Articles we discussed

    News:
    • Fable & Mythos access update — Anthropic: https://www.anthropic.com/news/fable-mythos-access
    • Anthropic lobbies the White House over the Mythos/Fable ban — Axios: https://www.axios.com/2026/06/14/anthropic-white-house-mythos-fable
    • Meta's months-old AI unit is a "soul-crushing gulag," say the engineers stuck inside it — TechCrunch: https://techcrunch.com/2026/06/12/metas-months-old-ai-unit-is-a-soul-crushing-gulag-say-the-engineers-stuck-inside-it/

    Post-Processing:
    • Chatbots keep telling stories about lighthouse keeper Elias Thorne — 404 Media: https://www.404media.co/elias-thorne-chatbots-llms-chatgpt-lighthouse-keeper-story/
    • How Fast Is AI? — Emory Taziki: https://howfastis.ai/
    • Loop Engineering — Addy Osmani: https://addyosmani.com/blog/loop-engineering/

    Deep Dive:
    • Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupling and the Limits of the Dunning-Kruger Metaphor — Christopher Koch (arXiv): https://arxiv.org/html/2603.29681
    • Appearing Productive in the Workplace — No One's Happy: https://nooneshappy.com/article/appearing-productive-in-the-workplace/

    Two Minutes to Midnight:
    • Claude SDK for Apple Foundation Models — Claude Platform docs: https://platform.claude.com/docs/en/cli-sdks-libraries/libraries/apple-foundation-models
    • Can tech companies learn to love cheaper AI models? — TechCrunch: https://techcrunch.com/2026/06/09/can-tech-companies-learn-to-love-cheaper-models/
    • What Investment Data Implies About the AI Transition — NBER Working Paper w35290 (Walter & Walter): https://www.nber.org/system/files/working_papers/w35290/w35290.pdf

    🎙 About ADI Pod

    ADI Pod (Artificial Developer Intelligence) is a weekly podcast about AI and software development for working developers. We go through hundreds of links and dozens of newsletters each week so you don't have to. Hosts: Shimin Zhang, Dan Lasky, Rahul Yadav. New episodes Fridays.

    • https://www.adipod.ai
    • humans@adipod.ai

    If something here changed your mind or gave you...
    1 hr and 10 mins
  • Claude Opus 4.8, Undocumented Claude Code Features, Eval Harness for AI Skills, Pope on AI
    Jun 5 2026

    "Every time you vibe code, you're gonna spend compute to skip the bottleneck of code review." — Shimin's cold open on ep-28. Trading compute for human labor: that's the default that's coming.

    This week on ADI Pod: Claude Opus 4.8 and Anthropic's dynamic-workflow tool, Pope Leo XIV's AI encyclical, a deep read of the Claude Code source code, a Pinterest method for testing whether your AI skills actually fire, two essays on senior engineering and the "dead economy," and a bubble check full of S-1s. Rahul's out this week; the clock moves up to 5:30.

    ▸ Claude Opus 4.8 + the Dynamic Workflow Tool (TechCrunch): a 41-day fast-follow to 4.7. The new "dynamic workflow" is extra-high thinking plus a huge fan-out of coordinated parallel agents — the hosts call it "Gastown, by Anthropic." Dan likes it more than 4.7, but it hallucinated file names that don't exist and ate a full token budget in 25 minutes. Likely a Mythos distill, not a new base model.

    ▸ Pope Leo XIV's AI Encyclical — "Magnifica Humanitas" (Vatican): "On Safeguarding the Human Person in the Time of Artificial Intelligence." The Pope gets that models are grown, not developed, warns against pretending AI is neutral, and ties automation to worker protection. Anthropic's Chris Olah was in the room. Shimin's take: better AI takes than most Fortune 500 CEOs.

    ▸ I Read the Claude Code Source Code (Building Better): the undocumented stuff. A pre-tool-use hook can rewrite a tool's input mid-flight, return allow/deny with a reason, and inject context. Skills take undocumented front-matter (model + effort). Plus where settings.json really lives, and the auto-memory and "dream" toggles.

    ▸ Technique Corner — An Engineer's Guide to Better AI Skills (Pinterest): a test harness for skill invocation — 15 positive prompts, 5 negative, 5 runs each. Codex went 73%→95% with everything combined; Claude went 62%→73% on a single change and got worse when you combined them. Asking the AI to improve the skill didn't help.

    ▸ Post Processing — Is This Sustainable? (Jamie Hurst): seniors absorbed AI's rising stakes before juniors did. You skip the RFC and just build the thing. The scary part: AI depth is perishable in ~18 months; what lasts is taste and judgment.

    ▸ Post Processing — The Dead Economy Theory (Owen McGrann): a turn-by-turn case that replacing workers with AI eats its own market. Peter Thiel, a Valley misread of Nietzsche, and UBI. Shimin pushes back while half-infected by the inevitability virus.

    ▸ Two Minutes to Midnight (SEC + Qazinform): SpaceX's S-1 claims a $26.5T market that's mostly "AI" and says "truth seeking" 39 times. Anthropic overtakes OpenAI as the most valuable AI startup on a $65B Series H (~3x its February mark) plus a confidential S-1. Microsoft pulls Claude Code back to Copilot on cost. Clock -> 5:30.

    ⏱ Chapters

    00:00 Cold Open & Welcome
    02:09 News: Claude Opus 4.8 & the Dynamic Workflow Tool
    08:27 News: Pope Leo XIV's AI Encyclical
    14:17 ToolShed: I Read the Claude Code Source Code
    22:10 Technique Corner: Do Your AI Skills Actually Fire?
    29:15 Post Processing: Is This Sustainable? (Senior Eng in the AI Age)
    35:30 Post Processing: The Dead Economy Theory
    42:12 Two Minutes to Midnight: SpaceX's S-1 & the $26.5T "AI" TAM
    45:50 Two Minutes to Midnight: Anthropic Overtakes OpenAI
    53:35 Outro

    🔗 Articles we discussed

    News:
    • Anthropic releases Opus 4.8 with new dynamic workflow tool — TechCrunch: https://techcrunch.com/2026/05/28/anthropic-releases-opus-4-8-with-new-dynamic-workflow-tool/
    • Magnifica Humanitas (encyclical on AI) — Pope Leo XIV / Vatican: https://www.vatican.va/content/leo-xiv/en/encyclicals/documents/20260515-magnifica-humanitas.html
    • I Read the Claude Code Source Code — Building Better: https://buildingbetter.tech/p/i-read-the-claude-code-source-code

    Technique Corner:
    • An Engineer's Guide to Better AI Skills — Pinterest Engineering: https://medium.com/pinterest-engineering/an-engineers-guide-to-better-ai-skills-implementing-a-testing-process-to-optimize-agent-a000c9c9abcd

    Post Processing:
    • Is This Sustainable? — Jamie Hurst: https://jamiehurst.co.uk/2026-05-24_ai-sustainable
    • The Dead Economy Theory — Owen McGrann: https://www.owenmcgrann.com/p/the-dead-economy-theory

    Two Minutes to Midnight:
    • SpaceX (Space Exploration Technologies) Form S-1 — SEC EDGAR: https://www.sec.gov/Archives/edgar/data/1181412/000162828026036936/spaceexplorationtechnologi.htm
    • Anthropic surpasses OpenAI to become world's most valuable AI startup — Qazinform: https://qazinform.com/news/anthropic-surpasses-openai-to-become-worlds-most-valuable-ai-startup

    🎙 About ADI Pod

    ADI Pod (Artificial Developer Intelligence) is a weekly podcast about AI and software development for working developers. We go through hundreds of links and dozens of newsletters each week so you don't have to. Hosts: Shimin Zhang, Dan Lasky, Rahul Yadav. New episodes Tuesdays.

    • https://www.adipod.ai
    • humans@adipod.ai

    If something here gave you ...
    55 mins
  • OpenAI Beats Musk, Gemini 3.5 Flash & AI Burnout Mitigation
    May 29 2026

    "Sam Altman won in court against Elon Musk. But, really, we all lost." That's the New Yorker headline Dan brought to ep-27 — and the question under it is whether any one person should own AI safety.

    This week on ADI Pod: the OpenAI–Musk verdict and who really owns AI safety, Gemini 3.5 Flash in AI Overviews, a $48K home GPU server, AI burnout from two angles, the "$100M startup in your laptop" myth, the slop grenade, and an IPO squeeze that could funnel ~10% of the major indexes into three AI firms. Rahul's out this week; clock moves back to 6:15.

    ▸ OpenAI v. Musk (The New Yorker): OpenAI wins on a statute-of-limitations technicality. Musk's lawyer argues "we could all die" from AI; the judge notes he'd mean it more if he didn't fund xAI. The courtroom "butt pillows" become the complacency metaphor.

    ▸ Gemini 3.5 Flash: shipped Flash-only into AI Overviews (Dan's bet: it's on TPUs). Mathier than 3.1 but fewer results; the viral "can't search 'disregard'" bug was a harness failure. The pelican it drew looks dressed for a Miami crypto conference (h/t Simon Willison).

    ▸ Hardware Hut — was a $48K GPU server worth it? (rosmine.ai): an ex-FAANG researcher's 6× RTX 6000 Ada rig breaks even near 80% utilization, then ~$125/month and constant riser failures. Shimin's version: a 128GB Mac for local models, or keep paying Anthropic?

    ▸ Technique Corner — AI burnout (Evil Martians + Siddhant Khare): cap parallel agents at 3–4, keep hands on the keyboard, accept 70% and hand-code the rest. Shimin's confession: seven Claude Code sessions after work. Capper: Microsoft cancels Claude Code subs after costs top human devs.

    ▸ Post Processing — Human Bottlenecks (borretti.me): the $100M startup in your laptop stays there because the limiter was always you — judgment, energy, executive function — not the tools.

    ▸ Dan's Rant — the Slop Grenade (noslopgrenade.com): paste raw Claude output at a coworker instead of an answer and you've thrown one. The successor to nohello.com. Fix: lead with your one-line take, then attach the output for the full kaboom.

    ▸ Two Minutes to Midnight (Morningstar + Is AI Profitable Yet?): SpaceX/OpenAI/Anthropic could add ~10% to the Morningstar 100; Nasdaq cut its post-IPO wait from 12 months to ~15 trading days. On isaiprofitable.com only Nvidia is green (+$253B); Amazon leads capex at −$291B. Clock → 6:15.

    🔗 Articles we discussed

    News:
    • OpenAI Won, But We All Lost — The New Yorker: https://www.newyorker.com/news/letter-from-silicon-valley/sam-altman-won-in-court-against-elon-musk-but-really-we-all-lost
    • Gemini 3.5: Frontier Intelligence With Action — Google: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/#gemini-3-5-flash
    • Gemini 3.5 Flash hands-on — Simon Willison: https://simonwillison.net/2026/May/19/gemini-35-flash/

    Hardware Hut:
    • Was My $48K GPU Server Worth It? — rosmine.ai: https://rosmine.ai/2026/05/13/was-my-48k-gpu-worth-it/

    Technique Corner:
    • AI-Assisted Engineers Are Burning Out — Evil Martians: https://evilmartians.com/chronicles/ai-assisted-engineers-are-burning-out-is-this-fine
    • AI Fatigue Is Real — Siddhant Khare: https://siddhantkhare.com/writing/ai-fatigue-is-real

    Post Processing:
    • Human Bottlenecks — borretti.me: https://borretti.me/article/human-bottlenecks

    Dan's Rant:
    • No Slop Grenade: https://noslopgrenade.com

    Two Minutes to Midnight:
    • The SpaceX IPO: How US Index Funds Will Adapt — Morningstar (Zachary Evans): https://global.morningstar.com/en-ca/funds/spacex-ipo-how-us-stock-index-funds-will-adapt
    • Is AI Profitable Yet?: https://isaiprofitable.com/

    🎙 About ADI Pod

    ADI Pod (Artificial Developer Intelligence) is a weekly podcast about AI and software development for working developers. We go through hundreds of links and dozens of newsletters each week so you don't have to. Hosts: Shimin Zhang, Dan Lasky, Rahul Yadav. New episodes Tuesdays.

    • https://www.adipod.ai
    • humans@adipod.ai

    If something here gave you something to try on Monday, subscribe and tell us what you tried.

    • (00:00) - Cold Open & Welcome
    • (02:15) - News: OpenAI Beats Musk — "We All Lost" (The New Yorker)
    • (07:20) - News: Gemini 3.5 Flash & the Pelican Test
    • (13:05) - Hardware Hut: Was a $48K GPU Server Worth It?
    • (19:14) - Technique Corner: AI-Assisted Engineers Are Burning Out
    • (28:27) - Microsoft Cancels Claude Code — the Token Pendulum
    • (30:35) - Post Processing: Human Bottlenecks
    • (36:10) - Raising the First AI-Native Generation
    • (40:05) - Dan's Rant: The Slop Grenade
    • (44:49) - Two Minutes to Midnight: The SpaceX IPO Index Squeeze
    • (49:32) - Is AI Profitable Yet? (Capex by the Billions)
    • (55:20) - Outro
    56 mins
  • LLM Neuroanatomy with David Noel Ng, Forward Deployed Everybody, Preferences Revealed by AI
    May 22 2026
    This week on ADI Pod: Mira Murati's Thinking Machines ships its first product (interaction models), Meta employees fight the mouse-tracking program with flyers, and the Palantir-coined "forward deployed engineer" job title quietly takes over the post-AI engineering org chart. Our sit-down is with Dr. David Noel Ng (https://dnhkng.github.io/) — author of the LLM Neuroanatomy series we covered a few weeks back. He explains why he watched action potentials race down rat neurons at 10,000 fps before getting into LLM interpretability. Dan runs DeepSeek-V4 Flash on a 128 GB Ryzen 395 Max box and vibe-codes an ESP32 home dashboard in C. Deep dive on a paper asking whether AI should obey what you say or what you actually do. And in Two Minutes to Midnight: Cerebras pops 108% on IPO day, Anthropic passes OpenAI on Ramp business data, and we read Andy Hall on the politics of jobless prosperity. Clock stays at 6 minutes.

    Co-hosts: Shimin Zhang, Dan Lasky, Rahul Yadav.

    ▸ Interaction Models — Thinking Machines' first product. A small Qwen 3.5–class "interaction model" runs the UI; a background model handles the heavy lift. They credit the pattern to Qwen, not themselves. Multimodal by default, and surprisingly snappy in the demo.

    ▸ Meta vs. its own employees — Meta posted flyers around its offices reading "don't want to work at the employee data extraction factory?" after announcing it would record keystrokes, mouse movements, and screens to train internal AI. Last episode's "Model Capability Initiative" story got worse.

    ▸ Here Comes Forward Deployed Everybody (Scott Werner / works on my machine) — Salesforce moves to an API-only data model. Palantir's "forward deployed engineer" title (originally called "delta") becomes the new pit-crew role across every department. We argue Jevons paradox vs. just-rebranding-the-least-glamorous-job: 20 marketers + 5 pit crews → 30 marketers + 10 pit crews as productivity rises.

    ▸ Sit-down: Dr. David Noel Ng on LLM Neuroanatomy — fluorescent dyes that change color with membrane voltage, what brain-microchip interfacing taught him about feature attribution, and why interpretability deserves the same first-principles rigor as wet-lab biology. New posts coming.

    ▸ Vibe Intel — Dan runs Antires's specialized DeepSeek-V4 Flash fork of llama.cpp. Q2 quant on the front, full experts on the back, SSD-cached prefills, ~10 tokens/sec on a Ryzen AI Max+ 395 with 128 GB unified memory. Output peg: Sonnet 4.5-ish. Plus an ESP32 / ESPHome dashboard with a several-thousand-line vibe-coded C lambda that, against all reasonable expectation, works.

    ▸ Deep Dive — "Should I State or Should I Show?" (Keaton Ellis & Wanying Huang) — three AIs given the same lottery decisions: prompt-only AI hit 70% match with the human; data-only AI hit 75%; both-AI dropped to the worst of the three because it defaulted to the prompt 66% of the time when prompt and behavior conflicted. Implication: if EU AI Act transparency rules force you to honor stated preferences, you're literally picking the worst-performing model.

    ▸ Two Minutes to Midnight — Cerebras raises $5.5B in IPO, stock pops 108% on day one (claim: ~80× memory throughput vs comparable NVIDIA GPUs). Anthropic now holds 34.4% of Ramp-card-paying businesses, beating OpenAI for the first time. Andy Hall's "Politics of Jobless Prosperity": 2% unemployment jump is the line in the sand for political stability — opens with FDR's 1944 State of the Union. Clock held at 6 minutes; nothing this week jumped the needle.

    ⏱ Chapters

    00:00 Cold Open & Welcome
    02:08 News: Interaction Models from Thinking Machines
    05:57 News: Meta Employees Protest the Mouse-Tracking Program
    10:53 Post-Processing: Here Comes Forward Deployed Everybody (Scott Werner)
    22:50 Sit-Down: Dr. David Noel Ng on LLM Neuroanatomy
    40:05 Vibe N Tell: DeepSeek-V4 Flash at Home on a 395 Max
    45:37 Vibe N Tell: ESP32 Home Dashboards via Vibe Coding
    48:51 Deep Dive: Should I State or Should I Show? (Ellis & Huang)
    1:04:24 Two Minutes to Midnight: Cerebras IPO, Anthropic vs OpenAI, Jobless Prosperity
    1:09:34 Outro

    🔗 Articles we discussed

    News:
    • Interaction Models — Thinking Machines: https://thinkingmachines.ai/blog/interaction-models/
    • Meta employees protest the mouse-tracking program — Engadget: https://www.engadget.com/2172212/meta-employees-are-protesting-the-companys-mouse-tracking-program/

    Post-Processing:
    • Here Comes Forward Deployed Everybody — Scott Werner (works on my machine): https://worksonmymachine.ai/p/here-comes-forward-deployed-everybody

    Sit-Down — Dr. David Noel Ng:
    • Substack: https://dnhkng.substack.com/
    • Site: https://dnhkng.github.io/
    • Rest of David's home AI Lab Build Story: https://dnhkng.github.io/posts/hopper/

    Deep Dive:
    • Should I State or Should I Show? Aligning AI with Human Preferences — Keaton Ellis & Wanying Huang (arXiv): https://arxiv.org/html/2603.29317v1

    Two Minutes to Midnight:
    • Cerebras raises $5.5B, kicks off 2026's IPO season — ...
    1 hr and 10 mins
  • Multi-Agent Patterns for 2026, Anthropic on Colossus, Brockman's Tesla Painting
    May 15 2026

    Anthropic finally fixed the compute crunch — by partnering with the one company OpenAI is currently being sued by. Plus Brockman's deposition journal drops, we unpack why a billion-token context window needs an entirely different GPU architecture, and Phil Schmid's four sub-agent patterns for 2026.


    This week on ADI Pod: Dan, Rahul and Shimin go heavy on the Elon news — the OpenAI lawsuit, the Anthropic + SpaceX/xAI compute deal that just lifted Claude Code's peak-hour limits, and the Wall Street Journal's data on Grok's user base collapsing (now ~1/30th of ChatGPT's). Then we move into the substance: NVIDIA's Rubin CPX architecture and disaggregation, Phil Schmid's four sub-agent patterns and where the agent-teams pattern is headed, Jack Clark's piece on recursive AI research automation, and Simon Willison reluctantly admitting he runs Claude Code with --dangerously-skip-permissions by default.


    We close with bubble watch — and move the clock further from midnight after a week with no major red flags.


    🔗 Articles we discussed


    ▸ How Elon Musk left OpenAI, per Greg Brockman (TechCrunch)

    https://techcrunch.com/2026/05/06/how-elon-musk-left-openai-according-to-greg-brockman/


    ▸ Anthropic raises Claude Code usage limits, credits SpaceX deal (Ars Technica)

    https://arstechnica.com/ai/2026/05/anthropic-raises-claude-code-usage-limits-credits-new-deal-with-spacex/


    ▸ Anthropic-SpaceX AI deal (Wall Street Journal)

    https://www.wsj.com/tech/ai/anthropic-spacex-ai-deal-elon-musk-f86ea369?st=XUQnP7&reflink=desktopwebshare_permalink


    ▸ The road to a billion-token context (CACM)

    https://cacm.acm.org/news/the-road-to-a-billion-token-context/


    ▸ Sub-agent patterns for 2026 — Phil Schmid

    https://www.philschmid.de/subagent-patterns-2026


    ▸ Import AI 455: automating AI research — Jack Clark

    https://importai.substack.com/p/import-ai-455-automating-ai-research


    ▸ Vibe coding and agentic engineering are getting closer than I'd like — Simon Willison

    https://simonwillison.net/2026/May/6/vibe-coding-and-agentic-engineering/


    ▸ You need AI that reduces your maintenance costs — James Shore

    https://www.jamesshore.com/v2/blog/2026/you-need-ai-that-reduces-your-maintenance-costs


    ▸ Anthropic reportedly agrees to pay Google $200B for chips and cloud access (Engadget)

    https://www.engadget.com/2165585/anthropic-reportedly-agrees-to-pay-google-200-billion-for-chips-and-cloud-access/


    ▸ Silicon Valley bets on floating AI data centers powered by ocean waves (Ars Technica)

    https://arstechnica.com/ai/2026/05/silicon-valley-bets-on-floating-ai-data-centers-powered-by-ocean-waves/


    ⏱ Chapters


    00:00 Cold Open & Welcome

    02:15 News: Elon vs OpenAI Trial Drama (Brockman's Journal & The Tesla Painting)

    08:30 News: Anthropic Joins Colossus (SpaceX/xAI Compute Deal)

    13:06 Hardware Hunt: The Road to a Billion-Token Context (NVIDIA Rubin CPX)

    21:56 Technique: Phil Schmid's 4 Sub-Agent Patterns for 2026

    30:11 Post-Processing: Jack Clark — AI Systems Are About to Build Themselves

    45:17 Post-Processing: Simon Willison — Vibe Coding & Agentic Engineering

    55:14 Post-Processing: James Shore — AI That Reduces Maintenance Costs

    1:01:39 Two Minutes to Midnight

    1:12:05 Outro


    🎙 About ADI Pod


    ADI Pod (Artificial Developer Intelligence) is a weekly conversation show about AI and software development. We go through hundreds of links and dozens of newsletters each week so you don't have to. Hosts: Shimin Zhang, Dan Lasky, Rahul Yadav.


    🌐 https://www.adipod.ai

    📧 humans@adipod.ai

    🦋 Shimin on Bluesky: @shiminsky.bsky.social


    If something in this episode changed your mind or gave you something to try on Monday, hit subscribe and leave a comment with what you tried.


    #ADIPod #AINews #ClaudeCode #Anthropic #AICoding

    1 hr and 13 mins
  • OpenAI's Goblin Problem, 10 Lessons When Code Is Cheap, AI Addiction Loop
    May 8 2026
    Why does the leaked Codex CLI system prompt explicitly tell GPT-5.5 to never mention goblins, gremlins, raccoons, trolls, ogres, or pigeons? Why is OpenAI now gating its cyber model the same way it mocked Anthropic for gating Mythos last month? And what does it mean that Dan tried to write a personal project without Claude — and physically couldn't?

    Co-hosts Shimin Zhang, Dan Lasky, and Rahul Yadav cover these and more on ADI Pod #24. This week: GPT-5.5 Cyber's gated release, OpenAI's "Where the Goblins Came From" RLHF post-mortem, Addy Osmani's five patterns for long-running agents, Jesse Vincent's adversarial review prompt, Drew Breunig's 10 lessons for agentic coding, Ivan Turkovic's history of failed attempts to eliminate programmers, Nilay Patel's "software brain" thesis, the Nature paper showing warm AI models lose 10–30 percentage points of accuracy, and a $1.1B raise for an AI lab that wants to train without human data.

    ## In this episode

    ▸ **GPT-5.5 Cyber gating** — Sam Altman called Mythos's gated release "fear-based marketing" two months ago. Now OpenAI is doing the exact same thing with the GPT-5.5 cyber variant. Multi-tier model access (enterprise, government, research preview, cyber) is becoming the default — and Shimin worries the White House is about to add another gate.

    ▸ **The Goblin Problem** — OpenAI's Codex CLI prompt was open-sourced and turned out to include "never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons." OpenAI's "Where the Goblins Came From" post-mortem reveals a textbook RLHF failure: a "nerdy persona" reward signal trained the model to mention goblins in 66.7% of nerdy responses, and the tic propagated through supervised fine-tuning to non-nerdy responses too.

    ▸ **Long-running agents (Addy Osmani / Elevate)** — Five patterns for agents that run for hours or days: checkpoints over zero-or-100 outputs, governing memory like microservices, ambient processing without forced human-in-the-loop, fleet orchestration, and budget circuit-breakers. Bonus: the running gag where Rahul realizes the post is essentially an ad for Google Enterprise Agent Platform.

    ▸ **Adversarial review prompts (Jesse Vincent / superpowers)** — A four-step technique for getting better code review out of agents: invoke "fresh eyes," dispatch competing subagents, promise a reward (a cookie), and threaten disappointment if they don't find N issues.

    ▸ **10 Lessons for Agentic Coding (Drew Breunig)** — Implement to learn, rebuild often, invest in end-to-end tests, document intent, keep specs in sync, find the hard stuff, automate the easy stuff, develop taste, agents amplify experience, and the kicker: agent code is "free as in puppies" — the puppy is free, but you have to feed it and walk it.

    ▸ **The Eternal Promise (Ivan Turkovic)** — A history of attempts to eliminate programmers from COBOL through 4GLs, CASE tools, the Japanese 5th Generation project, no-code/low-code, and now LLMs. Each abstraction layer expanded software jobs rather than replacing them. Shimin's reframe: "Software is calcified business process. Someone has to do the calcifying."

    ▸ **People Do Not Yearn for Automation (Nilay Patel / The Verge)** — Why Gen Z hopefulness about AI dropped to 18% (anger up to 31%), why America is uniquely AI-pessimistic, and what Nilay calls "software brain" — the Silicon Valley assumption that human life can be reduced to data and algorithms. Plus Anuradha Pandey's reframe: stop calling them social media, call them ad platforms.

    ▸ **Warm models lose accuracy** — A Nature paper finds AI models trained for warmth lose 10–30 percentage points of accuracy. A companion study shows humans trust warm models *more* even when they're wrong. Frontier labs now have an explicit incentive to train the warmest model, not the most accurate one. Plus: Richard Dawkins talks to "Claudia" for three days and concludes AI must be conscious.

    ▸ **Dan's Rant — The AI Addiction Loop** — Dan tries to build a Home Assistant TypeScript automation without Claude. Can't. "It felt like they had fundamentally broken my arm in a way that I can't do this task as quickly as I wanted to. That scares me a lot." Shimin: "We're running into the social media addiction loop in three months instead of a decade."

    ▸ **Two Minutes to Midnight** — OpenAI projects ChatGPT Plus dropping from 44M to 9M subscribers in 2026 while scaling the ad-supported tier from 3M to 112M (30×). David Silver raises $1.1B for Ineffable Intelligence — a no-human-data approach inspired by AlphaGo. Scout AI raises $100M for autonomous military vision-language-action models. Bubble Clock held at 4:00 minutes.

    ## Key takeaways

    — Reward hacking can propagate latent persona quirks through fine-tuning in ways the lab itself only catches when users surface them.
    — Memory drift, not raw context size, is the real ceiling for long-running agents. Govern memory like you govern microservices.
    — Code is free as in puppies, ...
    1 hr and 26 mins