Platform Engineering Podcast

By: Cory O'Daniel, CEO of Massdriver

The Platform Engineering Podcast is a show about the real work of building and running internal platforms, hosted by Cory O’Daniel, longtime infrastructure and software engineer and CEO/cofounder of Massdriver. Each episode features candid conversations with the engineers, leads, and builders shaping platform engineering today. Topics range from org structure and team ownership to infrastructure design, developer experience, and the tradeoffs behind every “it depends.” Cory brings two decades of experience building platforms, and now spends his time thinking about how teams scale infrastructure without creating bottlenecks or burning out ops. This podcast isn’t about trends. It’s about how platform engineering actually works inside real companies. Whether you're deep into Terraform/OpenTofu modules, building golden paths, or just trying to keep your platform from becoming a dumpster fire, you’ll probably find something useful here.

Copyright 2025 | All Rights Reserved | Massdriver, Inc.
Episodes
  • What Do Service Meshes Actually Solve? (William Morgan, Buoyant/Linkerd)
    Jun 24 2026

    Network calls fail in ways function calls never do - and once a monolith becomes microservices, reliability problems show up fast: retries amplify load, latency spikes cascade, and “what talks to what?” becomes hard to answer.
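
    To make the "retries amplify load" point concrete, here is a rough arithmetic sketch. It is not taken from the episode: the function names, the three-layer call chain, and the retry budget are illustrative assumptions. It shows how per-layer retries multiply in the worst case, and how capped exponential backoff spaces attempts out.

```python
def worst_case_attempts(layers: int, attempts_per_call: int) -> int:
    """Worst-case requests hitting the deepest service for ONE user request.

    Assumes (hypothetically) that every layer in the call chain makes up to
    `attempts_per_call` attempts and that every attempt fails, so each
    layer multiplies the attempts made by the layer above it.
    """
    return attempts_per_call ** layers


def backoff_delays_ms(base_ms: int, factor: int, retries: int, cap_ms: int) -> list[int]:
    """Capped exponential backoff schedule: base, base*factor, ... up to cap."""
    return [min(cap_ms, base_ms * factor ** i) for i in range(retries)]


# Three services in a chain, each allowed 3 attempts per call.
print(worst_case_attempts(layers=3, attempts_per_call=3))  # 27

# Waits before each of 5 retries, starting at 100 ms, doubling, capped at 1 s.
print(backoff_delays_ms(base_ms=100, factor=2, retries=5, cap_ms=1000))
# [100, 200, 400, 800, 1000]
```

    In practice, jitter and a retry budget are usually layered on top of backoff so that many clients do not retry in lockstep; both are left out here to keep the arithmetic visible.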

    William Morgan, co-creator of Linkerd and the person who coined “service mesh,” breaks down what service meshes actually solve for platform teams running Kubernetes at scale. The conversation focuses on practical outcomes: improving reliability between services, getting uniform observability without rewriting every app, and handling gaps Kubernetes doesn’t cover well - like gRPC/HTTP2 load balancing and cross-environment communication.

    Key topics

    • Why reliability is the first “microservices tax” (timeouts, retries, backoff, cascading failure)
    • What Kubernetes does not solve at the networking layer, and where a service mesh fits
    • gRPC/HTTP2 load balancing problems and why L4 balancing can fall short
    • Service-to-service visibility: understanding traffic flows and performance without per-app instrumentation
    • Cost and resilience tradeoffs with multi-AZ Kubernetes on AWS (and how zonal-aware balancing can help)
    • Whether developers should ever need to interact with service mesh configuration
    • Where zero trust and policy controls belong: platform guardrails vs application ownership

    Guest: William Morgan, CEO at Buoyant, Co-Creator of Linkerd

    William Morgan brings a unique take on platform engineering, security, and traffic management in cloud native environments. William’s the mind behind Linkerd, the CNCF graduated service mesh built to make security, observability, and reliability "just work" for modern apps without heavy overhead. With roots as an infrastructure engineer at Twitter, where he was hands-on in the shift to microservices, and experience at Microsoft, Powerset, Adap.tv, and MITRE, William understands operational complexity better than most. His perspective on reducing unpredictable cloud spend with features like Linkerd’s High Availability Zonal Load Balancing is timely for any team wrestling with multi-AZ cloud bills.

    William has hands-on knowledge of MCP (Model Context Protocol), the protocol now critical for securing enterprise AI traffic. He also has strong views on sustainable open source business models, having contributed to open source for over 20 years.

    William Morgan, BlueSky

    Buoyant, Website

    Buoyant, LinkedIn

    Buoyant, YouTube

    Linkerd, GitHub

    Links to interesting things from this episode:

    • The Service Mesh Landscape

    56 mins
  • Continuous Integration at Agentic Velocity with CircleCI’s Rob Zuber
    Jun 10 2026

    When code gets cheaper to produce, feedback becomes the limiting factor - CI, reviews, and the handoffs between tools can quietly slow everything down.

    Rob Zuber breaks down what platform engineers are seeing as teams adopt AI-assisted development: more branch builds, new failure modes, and growing pressure to shorten the loop between “change made” and “change validated.” He focuses on how CI can evolve from a human-first dashboard into a system that agents can interact with directly through APIs, CLIs, and MCP-style interfaces - so fixes can happen faster and with less waiting on manual triage.

    Along the way, Rob and Cory dig into practical questions engineering leaders are wrestling with: how PR review becomes the next major bottleneck, what “agent experience” means in a delivery pipeline, why speed isn’t only about faster compute (it’s also about doing less unnecessary work), and how teams can share learnings so “agentic velocity” doesn’t only benefit a few power users.

    If you’re building or running the systems that ship software, this is a clear look at where CI fits in an AI-accelerated workflow, and what needs to change to keep delivery safe, fast, and sustainable.

    Guest: Rob Zuber, Chief Technology Officer at CircleCI

    Rob Zuber is a 20-year veteran of software startups, a four-time founder, and three-time CTO. Since joining CircleCI, Rob has seen the company through its Series F funding and delivered on product innovation at scale while leading a team of 300+ engineers who are distributed around the globe.

    CircleCI, Website

    CircleCI, LinkedIn

    CircleCI, GitHub

    Links to interesting things from this episode:

    • “The Confident Commit” podcast
    • Wardley Mapping
    • “How one programmer broke the internet by deleting a tiny piece of code.”

    50 mins
  • Durable Execution for Real‑World Failures with Temporal’s Cornelia Davis
    May 27 2026

    A lot of infrastructure and automation fails for ordinary reasons: rate limits, flaky networks, partial permissions, long-running jobs, and retries that vanish when the process restarts. Durable execution is a way to design systems that keep going anyway - without rebuilding a maze of queues, cron jobs, and manual cleanup.

    Cornelia Davis breaks down how durable execution works in practice: writing “normal” code while the runtime provides durable retries, state management, and the ability to pause work, wait for a human or external change (like a quota increase), and resume right where things left off. The conversation connects these ideas to platform engineering realities - Terraform workflows, long provisioning times, and “orphan” resources - and explains how Temporal workflows and activities help teams model failure handling as a first-class part of the system.
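
    The core idea can be sketched in a few lines. This is a toy illustration only, not Temporal's API: the `Journal` class, the JSON file, and the step names are all hypothetical. Each completed step's result is persisted, so a re-run after a failure skips finished work and resumes at the step that broke.

```python
import json
import os
import tempfile
from typing import Callable


class Journal:
    """Hypothetical durable log of completed step results (a JSON file)."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.results: dict[str, object] = {}
        if os.path.exists(path):
            with open(path) as f:
                self.results = json.load(f)

    def run_step(self, name: str, fn: Callable[[], object]) -> object:
        # A step that already completed is replayed from the journal, not re-run.
        if name in self.results:
            return self.results[name]
        result = fn()  # may raise; nothing is recorded in that case
        self.results[name] = result
        with open(self.path, "w") as f:
            json.dump(self.results, f)
        return result


calls = {"create_network": 0, "create_cluster": 0}
quota_raised = False  # stands in for an external change, e.g. a quota increase


def create_network() -> str:
    calls["create_network"] += 1
    return "net-1"


def create_cluster() -> str:
    calls["create_cluster"] += 1
    if not quota_raised:
        raise RuntimeError("quota exceeded")
    return "cluster-1"


def provision(journal: Journal) -> list:
    # Reads like ordinary sequential code; durability comes from run_step.
    net = journal.run_step("create_network", create_network)
    cluster = journal.run_step("create_cluster", create_cluster)
    return [net, cluster]


path = os.path.join(tempfile.mkdtemp(), "journal.json")
try:
    provision(Journal(path))  # first run fails at the second step
except RuntimeError:
    pass

quota_raised = True  # someone approves the quota increase
result = provision(Journal(path))  # fresh process state, same journal file
print(result, calls)
```

    On the second run the network step is served from the journal, so `create_network` runs only once, while `create_cluster` is attempted twice. A real durable execution runtime additionally handles timers, retry policies, and concurrent workers, which this sketch omits.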

    You’ll also hear why this approach is showing up in AI engineering: long-running agent workflows, frequent rate limiting, and the need to avoid re-running expensive LLM calls when something breaks near the end.

    Guest: Cornelia Davis, Developer Advocate at Temporal Technologies and author of “Cloud Native Patterns”

    Cornelia Davis is a Developer Advocate at Temporal, where she brings more than three decades of experience as a software technologist to help engineers build resilient, scalable systems. Known for her pragmatic blend of hands-on coding, technical strategy, and customer collaboration, Cornelia is passionate about helping developers unlock the full potential of modern cloud-native architectures. Previously, she served as VP of Technology at Pivotal, where she played a key role in shaping Cloud Foundry and enabling enterprise cloud transformations. Whether she’s writing code, presenting at conferences, or whiteboarding with teams, Cornelia is driven by a singular goal: empowering developers to build better software. Outside of tech, she recharges on the yoga mat or in the kitchen, where she brings the same creativity and focus to her practice.

    Temporal, Website

    Temporal, GitHub

    Temporal Community, GitHub

    Temporal’s AI-assisted development tools

    Links to interesting things from this episode:

    • Temporal Developer Skill
    • “Cloud Native Patterns” by Cornelia Davis

    46 mins