  • How One Engineer Recalled a Wrong Production Deploy in 8 Seconds
    Jun 30 2026
    In this episode, Lucas and Luna dive into the story of an engineer at a mid-size SaaS company who accidentally deployed a breaking change to production on a Friday afternoon. Instead of a panicked 20-minute scramble to revert via Git, she had already set up a one-line rollback script — a single shell command that restored the previous deployment image in under 10 seconds. The hosts break down how she built a pre-deployment safety net: an immutable release tag, a pre-push hook that verified the tag existed in the container registry, and a simple 'rollback.sh' that ran 'kubectl set image' from a known-good manifest. They discuss why most teams focus on deployment speed but ignore rollback speed, and how this engineer's approach — treating rollback as a first-class operation — saved her team 40 cumulative hours over the next quarter. Luna questions whether the script could handle database migrations; Lucas explains the pattern of 'expand-contract' migrations that separate schema changes from code deployments. The episode closes on a forward-looking note about chaos engineering and deliberately testing rollback paths. #RollbackScript #ProductionDeploy #IncidentResponse #DevOps #Kubernetes #ShellScripting #EngineeringCulture #SiteReliability #ContinuousDeployment #ImmutableReleases #PrePushHook #DatabaseMigrations #ExpandContract #ChaosEngineering #FexingoBusiness #BusinessPodcast #SoftwareEngineeringPodcast #Technology Keep every episode free: buymeacoffee.com/fexingo
    7 mins
  • How an Engineer Reduced Docker Image Size by 95 Percent
    Jun 29 2026
    In episode 81 of The Software Engineering Podcast, Lucas and Luna explore how a senior DevOps engineer at a mid-sized SaaS company shrunk Docker images from 1.8 GB to under 90 MB — cutting deployment times by 70 percent and saving $12,000 a year in storage costs. They walk through the specific techniques used: multi-stage builds, Alpine base images, distroless runtimes, and layer optimisation. The conversation also touches on how these principles apply beyond containers, from CI pipelines to serverless deployments. If you've ever wondered why your Docker images are bloated or how to shrink them without breaking anything, this episode gives you a concrete playbook. #Docker #ContainerOptimisation #DevOps #MultiStageBuilds #AlpineLinux #DistrolessImages #LayerCaching #CI_CD #CloudCostOptimisation #SoftwareEngineering #Technology #FexingoBusiness #BusinessPodcast #EngineeringBestPractices #SaaSInfrastructure #DockerImageSize #SeniorDevOps #DeploymentSpeed Keep every episode free: buymeacoffee.com/fexingo
    9 mins
  • How One Engineer Cut Logging Costs 90 Percent Without Losing Observability
    Jun 29 2026
    Episode 80 of The Software Engineering Podcast dives into a specific cost-optimization story: how a senior engineer at a mid-size fintech company reduced their cloud logging bill by 90 percent — from $80,000 per month to under $8,000 — without sacrificing the signal their on-call team relied on. Lucas and Luna walk through the technical decisions: switching from structured JSON logging to a custom binary format with protobuf, implementing a two-tier retention policy that kept high-cardinality metrics hot for only 24 hours, and writing a smart sampling layer that preserved 100 percent of error traces while dropping 95 percent of repetitive success logs. They discuss the trade-offs — longer query times on cold storage, the learning curve for the team, and the initial pushback from developers used to grep-friendly logs. The episode ends with a practical framework any team can adapt: measure your log volume per service, identify the noisiest sources, and ask whether every field in every log line earns its storage cost. #SoftwareEngineering #CloudCosts #Observability #Logging #Fintech #Protobuf #BinaryFormat #Sampling #RetentionPolicy #CostOptimization #EngineeringPodcast #FexingoBusiness #BusinessPodcast #Technology #LogManagement #Scalability #DevOps #SiteReliability Keep every episode free: buymeacoffee.com/fexingo
    9 mins
  • How One Engineer Repaired a Corrupt Git Repository Without Losing History
    Jun 28 2026
    When a junior developer force-pushed over a shared branch and corrupted the entire git history, most teams would panic. In this episode, Lucas and Luna break down how one engineer rescued a 14-month-old codebase using git reflog, filter-repo, and careful cherry-picking. They walk through the specific commands, the decision tree for when to rewrite history versus when to accept it, and the single backup practice that saved the team from losing 900 commits. If you've ever wondered what to do when git itself seems broken, this is the episode for you. #Git #VersionControl #GitReflog #GitFilterRepo #CodeRecovery #EngineeringBestPractices #DevOps #SourceControl #SoftwareEngineering #Technology #FexingoBusiness #BusinessPodcast #CodeRescue #HistoryRewrite #CherryPick #ForcePush #Debugging #DeveloperTools Keep every episode free: buymeacoffee.com/fexingo
    8 mins
  • How One Engineer Debugged a Kubernetes Pod Eviction That Wiped 5000 Jobs
    Jun 28 2026
    In this episode of The Software Engineering Podcast with Fexingo, Lucas and Luna dive into a production nightmare: a Kubernetes cluster that silently evicted over 5000 batch jobs over three weekends. They walk through how one engineer at a data processing startup traced the root cause to a subtle interaction between kubelet resource reservation defaults and a misconfigured eviction threshold. Learn how she used Prometheus metrics, a custom admission webhook, and a prioritization framework to prevent it from happening again. A masterclass in debugging distributed systems under pressure. #Kubernetes #PodEviction #DevOps #SiteReliabilityEngineering #DistributedSystems #BatchProcessing #Prometheus #AdmissionWebhook #DataProcessing #ProductionDebugging #CloudNative #SRE #EngineeringResilience #IncidentResponse #FexingoBusiness #BusinessPodcast #Technology #SoftwareEngineering Keep every episode free: buymeacoffee.com/fexingo
    8 mins
  • How One Engineer Fixed a Sidekiq Memory Bloat That Crunched Servers Every 72 Hours
    Jun 27 2026
    Episode 77 of The Software Engineering Podcast digs into a deceptively simple bug: a Sidekiq worker that ballooned in memory every 72 hours, forcing the ops team to restart it manually. Lucas and Luna walk through how one engineer discovered the culprit—a cached ActiveRecord relation that never cleared—and how a single call to `.reload` cut memory usage by 80 percent. They discuss lazy evaluation pitfalls in Ruby, the importance of profiling in production, and why a ten-line fix can save a team six figures in infrastructure costs. If you've ever fought a memory leak that only shows up after days of uptime, this episode is for you. #Sidekiq #RubyOnRails #MemoryLeak #BackgroundJobs #ActiveRecord #LazyEvaluation #RubyMemoryProfiling #DerailedBenchmarks #MemoryBloat #72HourBug #ProductionDebugging #EngineeringStory #SoftwareEngineering #Technology #FexingoBusiness #BusinessPodcast #CodeQuality #PerformanceOptimization Keep every episode free: buymeacoffee.com/fexingo
    8 mins
  • How One Engineer Handled a Database Migration With Zero Downtime Using Flyway
    Jun 27 2026
    In episode 76 of The Software Engineering Podcast, Lucas and Luna dive into the story of a senior engineer at a mid-sized e-commerce company who migrated a critical PostgreSQL database from a single instance to a replicated cluster without any downtime. The migration involved 200 GB of data, 50 tables, and a tight deadline. The engineer used Flyway for schema versioning, pglogical for replication, and a careful cutover strategy that included read-only mode, dual writes, and a final switch. They walk through the step-by-step approach, the pitfalls that were avoided (like schema drift and replication lag), and the key lesson: safe migrations are about orchestration, not just tools. If you've ever dreaded a database migration, this episode is for you. #DatabaseMigration #Flyway #PostgreSQL #ZeroDowntime #Engineering #Technology #SoftwareEngineering #pglogical #SchemaVersioning #Ecommerce #DevOps #DataEngineering #Production #MigrationStrategy #Database #FexingoBusiness #TechPodcast #Code Keep every episode free: buymeacoffee.com/fexingo
    5 mins
  • How One Engineer Rewrote a Legacy Database Without Downtime
    Jun 26 2026
    Episode 75 of The Software Engineering Podcast tells the true story of a senior engineer at a mid-sized logistics firm who migrated a 15-year-old PostgreSQL monolith to a sharded, horizontally scalable architecture without a single minute of planned downtime. The project touched 4.3 million lines of trigger code and 2,800 stored procedures. By combining logical replication, a write-ahead log change data capture pipeline, and a phased cutover with canary reads, the team moved 12 terabytes of data incrementally over six weeks. The episode breaks down the exact strategy: how they avoided dual-write complexity, handled schema drift, and rolled back within 90 seconds when a hot-spot partition caused latency spikes. Lucas and Luna discuss the tradeoffs between trigger-based replication and streaming replication, why the team chose NOT to use an ORM abstraction layer, and what happened when a foreign key constraint broke the CDC pipeline at 2 AM. This is a deep, practical look at legacy database modernisation for engineers facing similar migrations. #LegacyDatabaseMigration #PostgreSQL #DatabaseSharding #ChangeDataCapture #ZeroDowntimeMigration #LogicalReplication #SoftwareEngineering #TechPodcast #DatabaseArchitecture #ProductionEngineering #LucasAndLuna #FexingoBusiness #BusinessPodcast #EngineeringBestPractices #DataMigration #Postgres #WriteAheadLog #CanaryDeployments Keep every episode free: buymeacoffee.com/fexingo
    8 mins