Beyond the Prompt: Building Robustness in the Age of AI Agents

Episode Overview

In this episode, we dive into the insights of data scientist and entrepreneur Luiz Felipe Mendes as he explores the shifting landscape of Artificial Intelligence. We move beyond simple LLM prompts to discuss the rise of AI Agents—autonomous programs that don't just talk but act. We also tackle the critical need for ML Prediction Robustness, examining why large-scale systems like those at Meta and iFood require more than just good engineering to stay reliable.

Key Discussion Points

Defining the AI Agent: Understanding how agents differ from standard chatbots by using external APIs and iterative loops to achieve complex goals.
Agentic Workflows: A look at Andrew Ng’s theories on "agentic workflows," where AI systems use feedback loops—such as one agent writing code while another tests it—to improve quality autonomously.
The "Reality Check" on Autonomy: A candid discussion on the current limitations of agents, including their struggles with long-term task tracking, limited context windows, and the ongoing necessity of human supervision.
The Pillar of Robustness: Why technically "functional" models can still fail in production due to the stochastic nature of data.
Engineering for Reliability: A breakdown of Meta’s approach to robustness, focusing on these critical areas:
- Model & Feature Robustness: Detecting anomalies (like a car priced at 10 reais) before they break a system.
- Label & Prediction Robustness: Ensuring distributions remain consistent over time.
- ML Interpretability: Using tools like SHAP values to peer inside the "black box" of complex models.
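As a rough illustration of the feature and prediction checks listed above, here is a minimal Python sketch. It is not Meta's implementation: the function names, the plausible price range, and the bin edges are all assumptions made for the example. The first function flags implausible feature values (such as a car listed at 10 reais), and the second computes a population stability index (PSI), one common way to compare a current distribution against a baseline.

```python
import math


def out_of_range(values, low, high):
    """Return the values that fall outside the plausible range [low, high]."""
    return [v for v in values if not (low <= v <= high)]


def psi(expected, actual, edges):
    """Population stability index between two samples over shared bin edges.

    `edges` are ascending cut points; a sample value lands in the bin
    numbered by how many edges it is greater than or equal to.
    """
    def proportions(sample):
        counts = [0] * (len(edges) + 1)
        for x in sample:
            counts[sum(x >= e for e in edges)] += 1
        # A small floor keeps empty bins from producing log(0).
        return [max(c / len(sample), 1e-6) for c in counts]

    p = proportions(expected)
    q = proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical data: car prices in reais, and model scores from two periods.
prices = [45_000, 10, 82_000]
suspicious = out_of_range(prices, low=5_000, high=2_000_000)

baseline_scores = [0.1, 0.2, 0.3, 0.4]
current_scores = [0.7, 0.8, 0.9, 0.9]
drift = psi(baseline_scores, current_scores, edges=[0.25, 0.5, 0.75])
```

A PSI of zero means the two binned distributions match; larger values indicate more drift. Any alerting threshold would need to be chosen per system.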

Major Takeaways

Iterative vs. Direct: The power of AI today lies in "agentic" workflows that allow for self-correction.
Constant Vigilance: ML systems are core components of modern products and require continuous monitoring of features, labels, and predictions to remain robust.
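The self-correcting loop behind "agentic" workflows, where one role writes code and another tests it, can be sketched in a few lines of Python. This is only a sketch under stated assumptions: `stub_writer` is a hypothetical stand-in for a real LLM call, `run_tests` and `refine` are names invented for this example, and the test itself is a toy.

```python
from typing import Callable, Optional, Tuple


def run_tests(source: str) -> Optional[str]:
    """Tester role: run the candidate and return an error message, or None if it passes."""
    namespace = {}
    try:
        exec(source, namespace)
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def refine(write: Callable[[Optional[str]], str], max_rounds: int = 3) -> Tuple[str, int]:
    """Alternate writer and tester until the tests pass or the round budget runs out."""
    feedback = None
    for round_number in range(1, max_rounds + 1):
        candidate = write(feedback)
        feedback = run_tests(candidate)
        if feedback is None:
            return candidate, round_number
    raise RuntimeError(f"no passing candidate after {max_rounds} rounds: {feedback}")


def stub_writer(feedback: Optional[str]) -> str:
    """Hypothetical stand-in for an LLM: the first draft is buggy, the revision is fixed."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


code, rounds = refine(stub_writer)
```

The `max_rounds` cap is a deliberate design choice: it keeps the loop bounded so that a failing run surfaces to a human instead of iterating indefinitely.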

Resources Mentioned

Luiz Felipe Mendes’ "Weekly Readings" series.
Andrew Ng’s lecture on AI Agentic Workflows.
MIT Technology Review: "What are AI agents?".
Meta’s engineering blog on ML prediction robustness.

This podcast was generated from the following posts:

https://lfomendes.medium.com/weekly-reading-ai-agents-8414e387bfd8

https://lfomendes.medium.com/weekly-reading-metas-approach-to-machine-learning-prediction-robustness-fae46957cf41
