Episodes

  • Mechanism design: Building smarter AI agents from the fundamentals, Part 1
    2025/05/20

    What if we've been approaching AI agents all wrong? While the tech world obsesses over large language models (LLMs) and prompt engineering, there's a foundational approach that could revolutionize how we build trustworthy AI systems: mechanism design.

    This episode kicks off an exciting series where we're building AI agents "the hard way"—using principles from game theory and microeconomics to create systems with predictable, governable behavior. Rather than hoping an LLM can magically handle complex multi-step processes like booking travel, Sid and Andrew explore how to design the rules of the game so that even self-interested agents produce optimal outcomes.

    Drawing from our conversation with Dr. Michael Zargham (Episode 32), we break down why LLM-based agents struggle with transparency and governance. The "surface area" for errors expands dramatically when you can't explain how decisions are made across multiple steps. Instead, mechanism design creates clear states with defined optimization parameters at each stage—making the entire system more reliable and accountable.

    We explore the famous Prisoner's Dilemma to illustrate how individual incentives can work against collective benefits without proper system design. Then we introduce the Vickrey-Clarke-Groves mechanism, which ensures AI agents truthfully reveal preferences and actively participate in multi-step processes—critical properties for enterprise applications.
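
    To make the mechanism concrete, here is a minimal sketch of VCG in Python. The outcomes and valuations are made up for illustration; with a single item for sale, the mechanism reduces to the Vickrey second-price auction.

      # A minimal sketch of the Vickrey-Clarke-Groves (VCG) mechanism over a
      # discrete set of outcomes. Valuations are hypothetical; in an agent
      # system they would be each agent's reported preferences.
      def vcg(outcomes, valuations):
          def best(agents):
              # The outcome maximizing total reported value for these agents.
              return max(outcomes,
                         key=lambda o: sum(valuations[a][o] for a in agents))

          agents = list(valuations)
          chosen = best(agents)
          payments = {}
          for a in agents:
              others = [b for b in agents if b != a]
              # Others' welfare had agent a been absent, minus their welfare
              # under the chosen outcome: the externality a imposes.
              without_a = sum(valuations[b][best(others)] for b in others)
              with_a = sum(valuations[b][chosen] for b in others)
              payments[a] = without_a - with_a
          return chosen, payments

      # Two agents competing for one item: VCG becomes a second-price auction.
      vals = {"alice": {"alice_wins": 10, "bob_wins": 0},
              "bob":   {"alice_wins": 0,  "bob_wins": 7}}
      print(vcg(["alice_wins", "bob_wins"], vals))
      # -> ('alice_wins', {'alice': 7, 'bob': 0})

    Because each agent pays exactly the externality it imposes on everyone else, truthfully reporting values is a dominant strategy, which is the truthfulness property highlighted above.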

    Beyond technical advantages, this approach offers something profound: a way to preserve humanity in increasingly automated systems. By explicitly designing for values, fairness, and social welfare, we're not just building better agents—we're ensuring AI serves human needs rather than replacing human thought.

    Subscribe now to follow our journey as we build an agentic travel system from first principles, applying these concepts to real business challenges. Have questions about mechanism design for AI? Send them our way for future episodes!

    What did you think? Let us know.

    Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

    • LinkedIn - Episode summaries, shares of cited articles, and more.
    • YouTube - Was it something that we said? Good. Share your favorite quotes.
    • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
    37 min
  • Principles, agents, and the chain of accountability in AI systems
    2025/05/08

    Dr. Michael Zargham provides a systems engineering perspective on AI agents, emphasizing accountability structures and the relationship between principals who deploy agents and the agents themselves. In this episode, he brings clarity to the often misunderstood concept of agents in AI by grounding them in established engineering principles rather than treating them as mysterious or elusive entities.

    Show highlights

    • Agents should be understood through the lens of the principal-agent relationship, with clear lines of accountability
    • True validation of AI systems means ensuring outcomes match intentions, not just optimizing loss functions
    • LLMs by themselves are "high-dimensional word calculators," not agents - agents are more complex systems with LLMs as components
    • Guardrails provide deterministic constraints ("musts" or "shalls") versus constitutional AI's softer guidance ("shoulds") — see the sketch after this list
    • Systems engineering approaches from civil engineering and materials science offer valuable frameworks for AI development
    • Authority and accountability must align - people shouldn't be held responsible for systems they don't have authority to control
    • The transition from static input-output to closed-loop dynamical systems represents the shift toward truly agentic behavior
    • Robust agent systems require both exploration (lab work) and exploitation (hardened deployment) phases with different standards
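
    As a sketch of the "musts versus shoulds" distinction (referenced in the guardrails point above), here is a minimal deterministic guardrail in Python. The model function and the transfer-limit policy are hypothetical stand-ins, not any particular framework's API.

      MAX_TRANSFER = 1_000  # a hard "shall not exceed" policy limit

      def model(prompt: str) -> dict:
          # Hypothetical stand-in for an LLM proposing a structured action.
          return {"action": "transfer", "amount": 5_000}

      def guardrail(proposal: dict) -> dict:
          # Deterministic constraint: violations are rejected outright,
          # not merely discouraged by prompt-level guidance.
          if proposal["action"] == "transfer" and proposal["amount"] > MAX_TRANSFER:
              return {"action": "reject", "reason": "exceeds transfer limit"}
          return proposal

      print(guardrail(model("pay the vendor")))
      # -> {'action': 'reject', 'reason': 'exceeds transfer limit'}

    Constitutional-style guidance would phrase the limit as a "should" inside the prompt and hope the model complies; the guardrail enforces it in code no matter what the model outputs.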

    Explore Dr. Zargham's work

    • Protocols and Institutions (Feb 27, 2025)
    • Comments Submitted by BlockScience, University of Washington APL Information Risk and Synthetic Intelligence Research Initiative (IRSIRI), Cognitive Security and Education Forum (COGSEC), and the Active Inference Institute (AII) to the Networking and Information Technology Research and Development National Coordination Office's Request for Comment on The Creation of a National Digital Twins R&D Strategic Plan NITRD-2024-13379 (Aug 8, 2024)

    What did you think? Let us know.

    Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

    • LinkedIn - Episode summaries, shares of cited articles, and more.
    • YouTube - Was it something that we said? Good. Share your favorite quotes.
    • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
    46 min
  • Supervised machine learning for science with Christoph Molnar and Timo Freiesleben, Part 2
    2025/03/27

    Part 2 of this series could easily have been renamed “AI for science: The expert’s guide to practical machine learning.” We continue our discussion with Christoph Molnar and Timo Freiesleben to look at how scientists can apply the supervised machine learning techniques from the previous episode to their research.

    Introduction to supervised ML for science (0:00)

    • Welcome back to Christoph Molnar and Timo Freiesleben, co-authors of “Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box”

    The model as the expert? (1:00)

    • Evaluation metrics have profound downstream effects on all modeling decisions (see the sketch after this list)
    • Data augmentation offers a simple yet powerful way to incorporate domain knowledge
    • Domain expertise is often undervalued in data science despite being crucial
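
    A synthetic illustration of the first point above (ours, not from the episode): on imbalanced data, the choice of metric alone can reverse which model looks better.

      import numpy as np

      rng = np.random.default_rng(0)
      n = 10_000
      y = (rng.random(n) < 0.05).astype(int)   # 5% positives: a rare event

      pred_a = np.zeros(n, dtype=int)          # model A: always say "negative"
      # Model B: a hypothetical detector catching 80% of positives
      # at the cost of a 10% false-positive rate.
      pred_b = np.where(y == 1, rng.random(n) < 0.8,
                        rng.random(n) < 0.1).astype(int)

      def accuracy(y, p):
          return (y == p).mean()

      def recall(y, p):
          return p[y == 1].mean()

      for name, p in [("always-negative", pred_a), ("detector", pred_b)]:
          print(name, f"accuracy={accuracy(y, p):.3f}", f"recall={recall(y, p):.3f}")
      # Accuracy crowns the useless model (~0.95 vs ~0.90); recall (0 vs ~0.8)
      # reverses the ranking. The metric, not the data, decides the "winner".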

    Measuring causality: Metrics and blind spots (10:10)

    • Causality approaches in ML range from exploring associations to inferring treatment effects

    Connecting models to scientific understanding (18:00)

    • Interpretation methods must stay within realistic data distributions to yield meaningful insights
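
    A toy illustration of this point (ours, not from the episode): when two features are tightly correlated, a naive interpretation sweep queries the model at combinations the data never contains.

      import numpy as np

      rng = np.random.default_rng(0)
      x1 = rng.normal(size=2_000)
      x2 = x1 + rng.normal(scale=0.1, size=2_000)   # x2 tracks x1 closely
      X = np.column_stack([x1, x2])

      # A naive partial-dependence-style sweep: force x1 to a grid value
      # while leaving x2 untouched, then count impossible combinations.
      for v in [-3.0, 0.0, 3.0]:
          X_swept = X.copy()
          X_swept[:, 0] = v
          off = np.abs(X_swept[:, 0] - X_swept[:, 1]) > 0.5
          print(f"x1={v}: {off.mean():.0%} of queries break the x1-x2 link")
      # Even at x1=0 most queries violate the correlation; at x1=±3 nearly
      # all do, so the model's answers there say little about the real world.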

    Robustness across distribution shifts (26:40)

    • Robustness requires understanding what distribution shifts affect your model
    • Pre-trained models and transfer learning provide promising paths to more robust scientific ML

    Reproducibility challenges in ML and science (35:00)

    • Reproducibility challenges differ between traditional science and machine learning

    Go back and listen to part one of this series for the conceptual foundations that support these practical applications.

    Check out Christoph and Timo's book “Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box,” available online now.

    What did you think? Let us know.

    Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

    • LinkedIn - Episode summaries, shares of cited articles, and more.
    • YouTube - Was it something that we said? Good. Share your favorite quotes.
    • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
    42 min
  • Supervised machine learning for science with Christoph Molnar and Timo Freiesleben, Part 1
    2025/03/25

    Machine learning is transforming scientific research across disciplines, but many scientists remain skeptical about using approaches that focus on prediction over causal understanding.

    That’s why we are excited to have Christoph Molnar return to the podcast with Timo Freiesleben. They are co-authors of “Supervised Machine Learning for Science: How to Stop Worrying and Love Your Black Box.” We will talk about the perceived problems with automation in certain sciences and find out how scientists can use machine learning without losing scientific accuracy.

    • Different scientific disciplines have varying goals beyond prediction, including control, explanation, and reasoning about phenomena
    • Traditional scientific approaches build models from simple to complex, while machine learning often starts with complex models
    • Scientists worry about using ML due to lack of interpretability and causal understanding
    • ML can both integrate domain knowledge and test existing scientific hypotheses
    • "Shortcut learning" occurs when models find predictive patterns that aren't meaningful
    • Machine learning adoption varies widely across scientific fields
    • Ecology and medical imaging have embraced ML, while other fields remain cautious
    • Future directions include ML potentially discovering scientific laws humans can understand
    • Researchers should view machine learning as another tool in their scientific toolkit
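
    As a toy demonstration of the shortcut-learning point above (synthetic data, our own illustration): a spurious feature tracks the label during training but not at test time, and the model rides it.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(0)

      def make_data(n, shortcut_reliability):
          y = rng.integers(0, 2, n)
          signal = y + rng.normal(0, 2.0, n)   # weak but genuine signal
          # A spurious feature (think: image background) that agrees with
          # the label with the given probability.
          spurious = np.where(rng.random(n) < shortcut_reliability, y, 1 - y)
          return np.column_stack([signal, spurious.astype(float)]), y

      X_tr, y_tr = make_data(5_000, 0.95)   # shortcut works during training
      X_te, y_te = make_data(5_000, 0.50)   # shortcut breaks at test time

      clf = LogisticRegression().fit(X_tr, y_tr)
      print("train accuracy:", clf.score(X_tr, y_tr))  # high, via the shortcut
      print("test accuracy:", clf.score(X_te, y_te))   # collapses toward what
                                                       # the weak signal allows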

    Stay tuned! In part 2, we'll shift the discussion with Christoph and Timo to talk about putting these concepts into practice.


    What did you think? Let us know.

    Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

    • LinkedIn - Episode summaries, shares of cited articles, and more.
    • YouTube - Was it something that we said? Good. Share your favorite quotes.
    • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
    27 min
  • The future of AI: Exploring modeling paradigms
    2025/02/25

    This episode unpacks AI's major modeling paradigms. We emphasize the importance of modeling practices, how the paradigms interact, and how they should be weighed against each other before you act: using the right tool for the right job is key. We hope you enjoy these examples of where the strongest AI and machine learning techniques already show up in your routine today.

    More AI agent disruptors (0:56)

    • Proxy from London start-up Convergence AI
    • Another hit to OpenAI: this product is available for free, unlike OpenAI’s Operator.

    AI Paris Summit - What's next for regulation? (4:40)

    • [Vice President] Vance tells Europeans that heavy regulation can kill AI
    • The US federal administration is withdrawing from the previous trend of sweeping big-tech regulation of modeling systems.
    • The EU is pushing to reduce bureaucracy but not regulatory pressure

    Modeling paradigms explained (10:33)

    • As companies look for an edge in high-stakes computations, we’ve seen best-in-class teams rediscovering expert-system-based techniques that modern computing power is breathing new life into.
      • Paradigm 1: Agents (11:23)
      • Paradigm 2: Generative (14:26)
      • Paradigm 3: Mathematical optimization (regression) (18:33)
      • Paradigm 4: Predictive (classification) (23:19)
      • Paradigm 5: Control theory (24:37; see the sketch below)
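
    Of the five, control theory may be the least familiar to ML practitioners, so here is a minimal closed-loop sketch (a toy thermostat with made-up numbers) showing the feedback pattern that separates it from one-shot prediction.

      # A minimal proportional controller: measure, compare to a setpoint,
      # act, repeat. All numbers are synthetic.
      setpoint = 21.0   # desired room temperature (deg C)
      temp = 15.0       # current room temperature
      k_p = 0.4         # proportional gain
      leak = 0.1        # heat lost toward a 10-degree ambient each step

      for step in range(10):
          error = setpoint - temp                # measure and compare
          heat = max(0.0, k_p * error)           # act on the error
          temp += heat - leak * (temp - 10.0)    # simple room dynamics
          print(f"step {step}: temp={temp:.2f}, error={error:.2f}")
      # Settles near 18.8, short of the setpoint: the steady-state offset
      # that motivates the integral term in full PID control.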

    The right modeling paradigm for the job? (28:05)


    What did you think? Let us know.

    Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

    • LinkedIn - Episode summaries, shares of cited articles, and more.
    • YouTube - Was it something that we said? Good. Share your favorite quotes.
    • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
    34 min
  • Agentic AI: Here we go again
    2025/02/01

    Agentic AI is the latest foray into big-bet promises for businesses and society at large. While promising autonomy and efficiency, AI agents raise fundamental questions about their accuracy, governance, and the potential pitfalls of over-reliance on automation.

    Does this story sound vaguely familiar? Hold that thought. This discussion about the over-under of certain promises is for you.


    Show Notes


    The economics of LLMs and DeepSeek R1 (00:00:03)

    • Reviewing recent developments in AI technologies and their implications
    • Discussing the impact of DeepSeek’s R1 model on the AI landscape and on NVIDIA


    The origins of agentic AI (00:07:12)

    • Status quo of AI models to date: Is big tech backing away from the promise of generative AI?
    • Agentic AI designed to perceive, reason, act, and learn


    Governance and agentic AI (00:13:12)

    • Examining the tension between cost efficiency and performance risks [LangChain State of AI Agents Report]
    • Highlighting governance concerns related to AI agents


    Issues with agentic AI implementation (00:21:01)

    • Considering the limitations of AI agents and their adoption in the workplace
    • Analyzing real-world experiments with AI agent technologies, like Devin


    What's next for complex and agentic AI systems (00:29:27)

    • Offering insights on the cautious integration of these systems in business practices
    • Encouraging a thoughtful approach to leveraging AI capabilities for measurable outcomes

    What did you think? Let us know.

    Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

    • LinkedIn - Episode summaries, shares of cited articles, and more.
    • YouTube - Was it something that we said? Good. Share your favorite quotes.
    • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
    30 min
  • Contextual integrity and differential privacy: Theory vs. application with Sebastian Benthall
    2025/01/07

    What if privacy could be as dynamic and socially aware as the communities it aims to protect? Sebastian Benthall, a senior research fellow at NYU’s Information Law Institute, shows us just how complex privacy is. He draws on Helen Nissenbaum’s work on contextual integrity and on concepts from differential privacy to explain that complexity. Our talk covers how privacy is not just about protecting data but also about following the social norms of different situations, from healthcare to education, and how those norms can reshape privacy regulation in significant ways.

    Show notes

    Intro: Sebastian Benthall (0:03)

    • Research: Designing Fiduciary Artificial Intelligence (Benthall, Shekman)
    • Integrating Differential Privacy and Contextual Integrity (Benthall, Cummings)

    Exploring differential privacy and contextual integrity (1:05)

    • Discussion about the origins of each subject
    • How can differential privacy and contextual integrity reinforce each other? (A minimal sketch of the differential-privacy side follows this list.)
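
    For the differential-privacy side, here is a minimal sketch of the Laplace mechanism, the textbook construction for making a numeric query epsilon-differentially private; the data below are synthetic.

      import numpy as np

      rng = np.random.default_rng(0)

      def dp_count(data, predicate, epsilon):
          # A counting query has sensitivity 1 (one person changes it by at
          # most 1), so Laplace noise of scale 1/epsilon gives epsilon-DP.
          true_count = sum(predicate(x) for x in data)
          return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

      ages = [34, 29, 41, 58, 23, 47, 39, 62, 31, 55]
      print(dp_count(ages, lambda a: a >= 40, epsilon=0.5))
      # A noisy answer near the true count of 5; smaller epsilon, more noise.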

    Accepted context or legitimate context? (9:33)

    • Does context develop from what society accepts over time?
    • Approaches to determine situational context and legitimacy

    Next steps in contextual integrity (13:35)

    • Is privacy as we know it ending?
    • Areas where integrated differential privacy and contextual integrity can help (Cummings)

    Interpretations of differential privacy (14:30)

    • Not a silver bullet
    • New questions posed by NIST about its application

    Privacy determined by social norms (20:25)

    • Game theory and its potential for understanding social norms

    Agents and governance: what will ultimately decide privacy? (25:27)

    • Voluntary disclosures and the biases they can introduce toward the groups least concerned with privacy
    • Avoiding self-fulfilling prophecy from data and context

    What did you think? Let us know.

    Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

    • LinkedIn - Episode summaries, shares of cited articles, and more.
    • YouTube - Was it something that we said? Good. Share your favorite quotes.
    • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
    33 min
  • Model documentation: Beyond model cards and system cards in AI governance
    2024/11/09

    What if the secret to successful AI governance lies in understanding the evolution of model documentation? In this episode, our hosts challenge the common belief that model cards marked the start of documentation in AI. We explore model documentation practices, from their crucial beginnings in fields like finance to their adaptation in Silicon Valley. Our discussion also highlights the important role of early modelers and statisticians in advocating for a complete approach that includes the entire model development lifecycle.

    Show Notes

    Model documentation origins and best practices (1:03)

    • Documenting a model is a comprehensive process that requires giving users and auditors a clear understanding of the following (a sketch of these questions as a structured record follows this list):
      • Why was the model built?
      • What data goes into a model?
      • How is the model implemented?
      • What does the model output?
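
    A minimal sketch of how those four questions might be captured as a structured record; the field names and example values are hypothetical, not a formal standard.

      from dataclasses import dataclass

      @dataclass
      class ModelDoc:
          why_built: str          # business purpose and intended use
          input_data: list[str]   # sources, features, and known gaps
          implementation: str     # algorithm, training, and validation
          outputs: str            # what the scores mean, and their limits

      doc = ModelDoc(
          why_built="Prioritize loan applications for manual review.",
          input_data=["application form fields", "bureau credit history"],
          implementation="Gradient-boosted trees; 5-fold CV; quarterly retrain.",
          outputs="Risk score in [0, 1]; uncalibrated for thin-file applicants.",
      )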


    Model cards - pros and cons (7:33)

    • Model cards for model reporting, Association for Computing Machinery
    • Evolution from this research to Google's definition to today
    • How the market perceives them vs. what they are
    • Why the analogy “nutrition labels for models” needs a closer look


    System cards - pros and cons (12:03)

    • To their credit, OpenAI system cards somewhat bridge the gap between proper model documentation and a model card.
    • They contain detailed descriptions of evaluation methodologies along with results; extra points for reporting red-teaming results
    • They include third-party opinions on the social and ethical implications of releasing the model


    Automating model documentation with generative AI (17:17)

    • Finding the right balance of automation within a strong governance strategy
    • Generative AI can assist with editing and personal workflow


    Improving documentation for AI governance (23:11)

    • As the model expert, engage from the beginning by writing the bulk of the model documentation by hand.
    • The exercise of documenting your models solidifies your understanding of the model's goals, values, and methods for the business

    What did you think? Let us know.

    Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:

    • LinkedIn - Episode summaries, shares of cited articles, and more.
    • YouTube - Was it something that we said? Good. Share your favorite quotes.
    • Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.
    28 min