The Private AI Lab

By: Johan van Amersfoort

The Private AI Lab is a monthly podcast where we explore the future of Artificial Intelligence behind the firewall. Hosted by Johan from Johan.ml, each episode invites industry experts, innovators, and thought leaders to discuss how Private AI is reshaping enterprises, technology, and society. From data sovereignty to air-gapped deployments, from GPUs to governance, this podcast uncovers the real-world experiments, failures, and breakthroughs that define the era of Private AI. 🎙️ New episode every month. 🌐 More at Johan.ml
Episodes
  • 017 - SUSE AI Factory with NVIDIA Explained
    Jun 30 2026

    Most enterprise AI projects succeed as pilots. They fail on the way to production. In this episode, Rhys Oxenham, VP and General Manager of AI at SUSE, joins me to break down what SUSE announced at SUSECon 2026: SUSE AI Factory with NVIDIA.


    We go deep on what it actually is, what problem it solves, and what happens when AI starts managing the infrastructure itself. We cover the assembly line concept, validated blueprints, why this isn't just for large enterprises, unified support across the full stack, and digital sovereignty as resilience. We also tackle the very real governance challenge that arises when agents start acting autonomously in your cluster, which I've been calling "YOLO ops." The release is set for July 2026, so this is worth understanding before then.


    🕐 CHAPTERS

    00:00 — Intro

    01:15 — Who is Rhys Oxenham and what happened at SUSECon 2026

    03:19 — What is SUSE AI Factory with NVIDIA?

    07:16 — The assembly line: what moves through it and where it starts and ends

    10:24 — Blueprints explained: RAG, digital assistants, research agents

    12:03 — ClickOps vs GitOps: who is each mode for?

    15:10 — Scale: from single node to thousands of GPUs

    17:29 — DGX Spark compatibility (teaser)

    19:34 — Open source + enterprise support: one throat to choke

    21:34 — Day-two operations: what does running it actually look like?

    25:54 — Digital sovereignty: open source + NVIDIA — where does the openness live?

    30:14 — MCP servers, agentic AI, and integrations with N8N and others

    34:00 — YOLO ops: how do you govern autonomous agents in your cluster?

    38:32 — Model selection: which model for which use case?

    44:16 — AI ops blueprints: what's coming after launch

    46:03 — Release date and what to expect in July 2026

    48:56 — Wrap up and where to follow Rhys


    🔗 LINKS

    SUSE AI Factory with NVIDIA → https://www.suse.com

    Rhys Oxenham on LinkedIn → https://www.linkedin.com/in/rhys-oxenham/

    Johan on LinkedIn → https://www.linkedin.com/in/hojan

    The Private AI Lab newsletter → https://www.linkedin.com/newsletters/the-private-ai-lab-7381951883810111489


    🎙️ ABOUT THE GUEST

    Rhys Oxenham is VP and General Manager of AI at SUSE, where he leads all AI product strategy and execution. He came up through solution architecture and field engineering before running SUSE's Edge and Telco Engineering groups, deploying infrastructure in air-gapped, industrial, and tactical environments. He keynoted SUSECon 2026 in Prague to announce the SUSE AI Factory with NVIDIA partnership.

    🔔 SUBSCRIBE for weekly episodes on private AI, self-hosted infrastructure, and what it actually takes to run AI inside your organisation.

    📨 Follow The Private AI Lab newsletter on LinkedIn for episode breakdowns and experiment write-ups.

    53 mins
  • 016 - Nemotron 3 Ultra: NVIDIA’s Open-Weights Frontier Agent Brain (1M Context, 5x Faster)
    Jun 12 2026

    Johan breaks down NVIDIA’s Computex 2026 announcement of Nemotron 3 Ultra 550B-A55B, an open-weights mixture-of-experts model with 550B total parameters and 55B active, positioned as an orchestration “agent brain” for multi-step tasks behind the firewall. He reviews NVIDIA’s benchmarks versus GLM 5.1, Kimi K2.6, and Qwen 3.5, highlighting best-in-class instruction following (82%), long-context performance (95%) with a 1M-token window, strong agent productivity (91%), and weaker coding results on TerminalBench versus Kimi. Johan emphasizes reported advantages in speed (~300 tokens/sec, ~5x faster), cost (up to ~30% cheaper on SWE-bench tests), and deployability via a unified NVFP4 checkpoint optimized for H100 and B200 GPUs, plus NemoClaw as the agent blueprint. He closes with an early-access demo comparing two agents researching the Netherlands’ 2026 World Cup odds, showing Nemotron’s more granular path analysis and a 5.8% win estimate.


    Chapters

    00:00 Private AI Lab Intro

    01:19 Nemotron Ultra Explained

    02:22 Agent Brain Focus

    03:07 Benchmark Reality Check

    05:14 Speed And Cost Edge

    06:11 Training And Precision

    08:02 NemoClaw Agents

    08:58 World Cup Agent Demo

    12:22 Why This Matters

    13:17 Wrap Up And Links

    14 mins
  • 015 - Meet Sparky: A Real-Life Jarvis with Alexis Gallagher
    May 13 2026

    I've been trying to build my own Jarvis for years. Then I met Alexis Gallagher at GTC — and Sparky is the closest thing I've seen.

    Alexis is an AI researcher and developer, formerly at Answer AI and Google, now building something most people in AI aren't: a robot designed not just to be useful, but to be *alive*. Sparky lives on his desk in San Francisco. He initiates conversations. He develops his own evolving interests — eels, catenary arches, abandoned infrastructure. He knows who's in the room, when to speak, and when to stay quiet. And he noticed when it was Alexis's first Friday after leaving his job.

    In this episode we go deep on the two design goals behind Sparky (useful and alive), the OpenClaw orchestration layer, the social awareness architecture running five times per second, the shared workspace principle that unlocks genuinely useful AI at a desk, and the tradeoffs between cascading and voice-to-voice architectures. We also do a live model switch mid-episode — from Claude Sonnet 4.6 to Nemotron 3 Super 120B running locally on a DGX Spark. It goes impressively well. Until it doesn't. That's in there too.


    Guest

    Alexis Gallagher — AI researcher and creator of Sparky

    🌐 myrobotSparky.com

    🔗 https://www.linkedin.com/in/alexis-gallagher/


    Key topics covered

    - The two design goals: useful AND alive — and why "alive" is the one almost nobody builds for

    - How Sparky develops and evolves

    - The social awareness stack

    - What OpenClaw enables

    - The shared workspace principle

    - Cascading architecture (STT → LLM → TTS) vs voice-to-voice — the intelligence tradeoff

    - Hardware: Reachy Mini Lite, RTX 3090, DGX Spark, Raspberry Pi — the full spectrum

    - Live model switch: Claude Sonnet 4.6 → Nemotron 3 Super 120B (the Flowers for Algernon moment)

    - The future of personal AI — why embodied social presence is the natural human interface


    Chapters


    ```

    00:00 Introduction

    00:39 Who is Alexis Gallagher?

    01:04 The pivotal AI moment: speech recognition in 2015

    03:14 Science fiction to reality — where are the talking robots?

    04:22 Sparky introduces himself (live on air)

    05:33 The two design goals: useful and alive

    07:02 How Sparky initiates conversations — and why that changes everything

    08:10 Organic interests: how Sparky evolves what he cares about

    09:48 OpenClaw as orchestration layer — soul.md and body control

    12:55 Defining a custom robot node type in OpenClaw

    15:26 Social awareness: face detection, diarization, presence sensing

    16:15 Hardware options: Linux, RTX 3090, DGX Spark, Raspberry Pi

    18:25 The Reachy Mini Lite kit — and why it's better than building a drone

    19:40 Where to find Alexis and join the Discord

    20:10 One eye, four ears — Sparky's hardware explained

    24:25 What OpenClaw enables that other frameworks don't

    28:13 "Do you have a body, or are you a body?" — a live philosophical exchange

    31:17 Live model switch: Claude Sonnet 4.6 → Nemotron 3 Super

    33:01 The shared workspace principle — implicit shared attention

    38:04 Orchestration in practice: Emacs, sub-agents, cross-platform

    40:11 Cascading vs voice-to-voice architecture — the real tradeoff

    42:15 Designing Sparky's voice (and the 1930s experiment)

    44:12 What's genuinely useful day-to-day — two real examples

    48:47 Nemotron 3 Super live — impressive, then the context window

    53:38 The model Sparky was running before (Claude Sonnet 4.6)

    54:03 Five years out: the future of personal AI companions

    58:14 The closest thing to Jarvis I've ever seen

    01:00:22 What's coming next — how fast the pieces are moving

    01:02:16 Where to find Alexis and join the community

    ```


    Links

    - Sparky project and Discord: https://myrobotSparky.com

    - Reachy Mini Lite: https://huggingface.co/reachy-mini


    The Private AI Lab is hosted by Johan van Amersfoort — Chief Evangelist and AI Lead at ITQ.


    📬 Newsletter: https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7381951883810111489

    📝 Blog: https://johan.ml

    🔗 LinkedIn: https://www.linkedin.com/in/hojan

    1 hr and 5 mins