• How Data Teams Are Using Data Profiling for Pipeline Quality
    Jul 4 2026
    In Episode 90 of The Data Business Podcast, Lucas and Luna dive into the emerging practice of proactive data profiling—running automated checks on raw data before it enters transformation pipelines. They use a concrete case: a mid-sized e-commerce company that reduced data breakage incidents by 60 percent in three months by implementing pre-ingestion profiling checks using an open-source framework called Great Expectations. The hosts discuss why traditional monitor-and-fix approaches are insufficient, how profiling shifts data teams from reactive firefighting to preventive quality control, and the specific metrics teams should track (null rate thresholds, distribution drift, schema change detection). They also touch on the trade-offs: added latency versus prevented downstream chaos, and how this fits into the broader data contract movement covered in earlier episodes. A practical, example-driven look at a technique that's quietly becoming table stakes for serious data teams. #DataProfiling #DataQuality #GreatExpectations #PipelineQuality #DataEngineering #ProactiveMonitoring #DataObservability #SchemaDrift #DataValidation #ETL #DataInfrastructure #BusinessIntelligence #DataTeam #PreIngestion #DataGovernance #FexingoBusiness #BusinessPodcast #DataBusiness Keep every episode free: buymeacoffee.com/fexingo
    12 mins
  • How Data Teams Are Using Cost Attribution for Cloud Spend
    Jul 3 2026
    Episode 89 of The Data Business Podcast dives into a topic every data team grapples with: cloud cost attribution. Lucas and Luna break down how modern data teams are moving beyond simple aggregate billing to granular cost allocation per pipeline, per query, and per user. They explore a real-world case where a mid-market fintech used Snowflake's resource monitors and cost allocation tags to cut its data warehouse bill by 30 percent in three months. Along the way, they discuss the tension between engineering autonomy and financial accountability, the pitfalls of chargeback models, and why the most effective approach is a hybrid of showback and budget-based guardrails. If you're a data leader trying to justify cloud spend to the finance team, this episode offers a practical framework for turning cost attribution from a blame game into a collaboration tool. #CloudCost #CostAttribution #Snowflake #FinOps #DataEngineering #Showback #Chargeback #CloudSpend #DataInfrastructure #CostOptimization #DataTeam #BusinessAndTechnology #Analytics #FexingoBusiness #BusinessPodcast #DataBusiness #Podcast #TechOps Keep every episode free: buymeacoffee.com/fexingo
    7 mins
  • How Data Teams Are Using Data Observability for Trust
    Jul 3 2026
    Episode 88 of The Data Business Podcast explores how data teams are adopting data observability to rebuild trust in their pipelines. Lucas and Luna dive into a specific case: a mid-sized e-commerce company that reduced data incident detection time from hours to under four minutes using a three-pillar observability stack—freshness, volume, and schema monitoring. They discuss the cost of bad data, the tension between speed and reliability, and why observability is becoming a non-negotiable layer for modern data platforms. The episode also covers how observability tools integrate with existing data catalogs and lineage systems, and why treating data pipelines like software systems changes the game for data engineers. No fluff, just a concrete look at one team's journey from firefighting to proactive data quality management. #DataObservability #DataQuality #DataTrust #DataPipelines #DataEngineering #DataCatalogs #DataLineage #FreshnessMonitoring #VolumeMonitoring #SchemaMonitoring #DataIncidents #MonteCarloData #FexingoBusiness #BusinessPodcast #BusinessAndTechnology #DataInfrastructure #Analytics #InformationProducts Keep every episode free: buymeacoffee.com/fexingo
    11 mins
  • How Data Teams Are Using dbt for Cost Optimization
    Jul 2 2026
    Episode 87 of The Data Business Podcast explores how data teams are leveraging dbt not just for transformation but for cost optimization. Lucas kicks off with a specific example: a mid-market fintech company that reduced its Snowflake compute bill by 18 percent within two quarters by implementing dbt models designed to monitor and flag inefficient queries. Luna digs into the tactical details: how the team set up cost-per-model tracking using dbt's metadata and Snowflake's query history. They discuss the challenges of attribution when queries span multiple models, the role of materialization strategy in cost control, and why some teams are building custom dbt packages to surface cost metrics in their BI dashboards. Lucas and Luna also explore the tension between developer autonomy and cost governance, and how dbt's lineage features help teams understand the cost impact of upstream changes. The episode gives listeners a concrete framework for tying dbt models to cloud cost observability, with practical advice on where to start, even for teams already deep into their dbt implementation. #dbt #CostOptimization #Snowflake #DataTeams #CloudCosts #DataEngineering #FinOps #DataInfrastructure #Analytics #SQL #DataTransformation #Metadata #CostGovernance #Materialization #Lineage #BusinessAndTechnology #FexingoBusiness #BusinessPodcast Keep every episode free: buymeacoffee.com/fexingo
    11 mins
  • How Data Teams Are Using Data Mesh for Domain Ownership
    Jul 2 2026
    Episode 86 of The Data Business Podcast explores how data teams are adopting data mesh principles to distribute ownership across domains. Lucas and Luna examine a real case: a mid-sized e-commerce company that moved from a centralized data team to domain-specific data products, cutting time-to-insight by 40 percent. They discuss the role of data contracts in enabling this shift, the challenges of federated governance, and how this approach changes cost allocation. The hosts also touch on the cultural shift required for domain teams to take ownership of their data products. If you're running a data team or building data infrastructure, this episode offers practical lessons on scaling data responsibilities without creating bottlenecks. #DataMesh #DomainOwnership #DataProducts #DataContracts #FederatedGovernance #DataArchitecture #DataEngineering #Analytics #DataInfrastructure #ECommerce #CostAllocation #DataCulture #DataStrategy #BusinessTechnology #Business #Technology #FexingoBusiness #BusinessPodcast Keep every episode free: buymeacoffee.com/fexingo
    9 mins
  • How Data Teams Are Using Reverse ETL for Operational Analytics
    Jul 1 2026
    Reverse ETL is becoming a critical tool for data teams looking to push insights from their warehouse into operational systems like CRMs, ad platforms, and support tools. In this episode, Lucas and Luna dig into a real case: how a mid-market e-commerce company used reverse ETL to sync customer churn predictions straight into Salesforce, reducing manual data exports and improving retention campaign response rates by 22 percent. They unpack the architecture, common pitfalls like data latency and schema mismatches, and why this pattern is different from traditional ETL. If your team is debating whether to build or buy a reverse ETL pipeline, this one's for you. #ReverseETL #OperationalAnalytics #DataEngineering #DataInfrastructure #DataPipelines #CustomerChurn #Salesforce #ETL #DataWarehouse #BusinessIntelligence #DataTeam #Retention #Analytics #DataOps #DataScience #BusinessAndTechnology #FexingoBusiness #BusinessPodcast Keep every episode free: buymeacoffee.com/fexingo
    8 mins
  • Why Data Teams Are Adopting Observable Pipelines for Trust
    Jul 1 2026
    Most data teams trust their dashboards until a mismatch in the numbers costs the company real money. In this episode, Lucas and Luna explore how a mid-market fintech called SynapsePay built observable pipelines that capture every transformation step: not just whether the job ran, but exactly how the data changed. They walk through the team's early struggles with duplicate transaction records, its shift from job-level monitoring to row-level observability, and the concrete metrics it now tracks: data freshness scores, distribution drift alerts, and schema change detection. Lucas explains why traditional data quality checks catch only the problems you already know to look for, while observability surfaces the silent anomalies that break downstream models. Luna pushes back on whether small teams can afford the tooling, and Lucas shares how one open-source approach using dbt and Great Expectations plus a lightweight event log kept the team's monthly compute bill under $400. If you manage or work on a data team, this episode gives you a practical framework for moving from 'the pipeline ran' to 'the pipeline ran correctly.' #DataObservability #PipelineTrust #SynapsePay #Fintech #DataEngineering #DataQuality #ObservablePipelines #RowLevelLineage #dbt #GreatExpectations #DataFreshness #SchemaDetection #DistributionDrift #DataCulture #BusinessAndTechnology #FexingoBusiness #BusinessPodcast #DataTeam Keep every episode free: buymeacoffee.com/fexingo
    9 mins
  • How Data Teams Are Using Vector Embeddings for Semantic Search
    Jun 30 2026
    Episode 83 of The Data Business Podcast dives into the practical uses of vector embeddings for semantic search in enterprise data environments. Lucas and Luna explore how companies like Shopify have leveraged embeddings to power product discovery and internal knowledge retrieval, reducing search-to-purchase time by 12 percent. They break down the technical trade-offs between dense and sparse embeddings, the cost of storing high-dimensional vectors, and why data teams are now embedding everything from customer support tickets to internal documentation. With a focus on real-world implementation details including approximate nearest neighbor algorithms and vector database choices, this episode equips operators and builders with a clear framework for deciding whether semantic search is worth the infrastructure investment. #VectorEmbeddings #SemanticSearch #DataInfrastructure #MachineLearning #Shopify #ApproximateNearestNeighbor #VectorDatabase #Pinecone #Milvus #DenseEmbeddings #SparseEmbeddings #NaturalLanguageProcessing #DataEngineering #BusinessTechnology #SearchOptimization #FexingoBusiness #BusinessPodcast #TheDataBusinessPodcast Keep every episode free: buymeacoffee.com/fexingo
    10 mins