Trustworthy AI Agents: From Guardrails to Hardline Policy with Swept’s Shane Emmons

In this episode of The Innovators & Investors Podcast, host Kristian Marquez sits down with Shane Emmons, founder and CEO of Swept, to explore the complexities and challenges surrounding AI trust and reliability. Shane explains why AI systems can behave unpredictably due to their probabilistic nature and the inherent uncertainty in their predictions. He delves into how Swept tackles these challenges by assessing AI agents’ consistency and enforcing hardline policies to prevent harmful or unintended outcomes, especially in sensitive fields like healthcare. They discuss the evolution of AI agents, from handling simple tasks to managing complex workflows through specialized sub-agents, highlighting recent breakthroughs that enhance reliability and practical business applications. Shane also shares valuable advice for those looking to deepen their understanding of AI, emphasizing hands-on experimentation with popular models like ChatGPT and Claude. The conversation addresses the balance between AI’s benefits and risks, including ongoing efforts to detect misuse and “jailbreak” attempts, drawing parallels with cybersecurity. Looking ahead, Shane paints a picture of AI’s transformative potential over the next decade—not just automating busy work but enabling new creative and strategic endeavors. The episode closes with Shane reflecting on his career journey and key lessons learned while building Swept, offering encouragement for businesses to pragmatically deploy AI while managing associated risks. This discussion offers a comprehensive look at AI’s current realities and future opportunities for innovators and investors alike.

Transcript

Kristian: Welcome to the Innovators & Investors podcast. I’m Kristian Marquez, founder and CEO of Finstrat Management. At Finstrat, we provide CFO-led accounting, finance, and reporting services to help clients monetize their business. Stick around to the end to learn how you can be our next guest. On to the show.

Kristian: Today’s guest is Shane Emmons, founder and CEO of Swept. Shane, welcome.

Shane: Thanks for having me. Happy to be here.

Kristian: Let’s start with Swept. What problem are you solving?

Shane: Swept tackles what we call the trust problem in AI. For teams building or buying AI agents and agentic software, behavior isn’t like the last 30 years of deterministic systems: it’s sometimes consistent and sometimes unpredictable. We help companies understand that variability, improve consistency, plug the gaps, and get alerted when the model drifts.

Kristian: Why can AI be unpredictable? What’s at the root?

Shane: People use terms like non-determinism, stochasticity, and probability. In simple terms, AI makes educated guesses based on training data — plus a little randomness for creativity. That creates power and flexibility, but it also introduces risk. We started Swept after seeing an AI agent cause real harm to teens seeking help. We realized that without monitoring and constraint, this becomes a pattern.
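A minimal sketch of the "educated guess plus a little randomness" idea, assuming a toy table of next-word scores. The words, scores, and the function name sample_with_temperature are made up for illustration and are not taken from any real model:

```python
import math
import random

def sample_with_temperature(scores, temperature, rng):
    """Pick one candidate from {word: score}, with randomness set by temperature."""
    words = list(scores)
    # Lower temperature sharpens the distribution; higher temperature flattens it.
    scaled = [scores[w] / temperature for w in words]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(words, weights=weights, k=1)[0]

# Hypothetical scores for the next word in a reply.
scores = {"rest": 2.0, "hydrate": 1.5, "exercise": 0.5}
rng = random.Random(0)

cold = {sample_with_temperature(scores, 0.01, rng) for _ in range(50)}
warm = {sample_with_temperature(scores, 1.5, rng) for _ in range(50)}
print(cold)  # only the top-scoring word
print(warm)  # typically several different words
```

The same input produces one stable answer at a very low temperature and a spread of answers at a higher one, which is the variability described above.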

Kristian: Is this the tech architecture or data quality? Or both?

Shane: Both. AI exists where knowledge isn’t perfect. If we had perfect rules, we’d use algorithms. AI is fundamentally probabilistic — it works where fixed logic can’t. That’s a limitation and a feature.

Kristian: Let’s define general intelligence and superintelligence.

Shane:

General intelligence: AI that matches human capability across knowledge tasks.
Superintelligence: AI that exceeds all humans combined.

That’s the moonshot some labs are chasing.

Kristian: When you say “prediction,” is that shorthand for outcome generation?

Shane: Yes. Even “1+1=2” is technically a prediction in a language model, because it has seen that pattern countless times. Tool use changes this: when the model calls a calculator, it stops guessing and computes. For uncertain problems like drug interactions or stock moves, AI still predicts. It can draw on far more context than a human, but the output is still probabilistic.
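To illustrate the difference between guessing and computing, here is a rough sketch of a calculator tool that an agent could call instead of predicting the answer. The name calculator_tool is hypothetical, and this version only supports the four basic operations on plain numbers:

```python
import ast
import operator

# Only plain arithmetic is allowed; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def calculator_tool(expression):
    """Deterministically evaluate an arithmetic expression such as '1+1'."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")
    return evaluate(ast.parse(expression, mode="eval").body)

print(calculator_tool("1+1"))  # 2
```

Unlike a sampled answer, the tool returns the same result every time it is given the same expression.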

Kristian: How does Swept solve for this in practice?

Shane: We don’t evaluate raw foundation models; they’re too open-ended. We evaluate agents — systems scoped to a specific workflow (e.g., quoting manufacturing jobs). That constrains behavior. We then run proprietary evaluation tools to measure consistency and policy adherence. Think of us as a professor scoring the exam, not a model trying to outperform the system.
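Swept's evaluation tools are proprietary, so the following is only a generic sketch of what "measuring consistency" can mean: run the same prompt through an agent several times and report how often the runs agree. The names consistency_score and make_stub_agent, and the quoting example, are assumptions for illustration:

```python
import itertools
from collections import Counter

def consistency_score(agent, prompt, runs=20):
    """Return the most common answer and the fraction of runs that produced it."""
    answers = [agent(prompt) for _ in range(runs)]
    most_common_answer, count = Counter(answers).most_common(1)[0]
    return most_common_answer, count / runs

def make_stub_agent(outputs):
    """Stand-in for a real agent: replays a fixed cycle of outputs."""
    cycle = itertools.cycle(outputs)
    return lambda prompt: next(cycle)

# Hypothetical quoting agent that gives a different price one time in four.
agent = make_stub_agent(["$120", "$120", "$120", "$135"])
answer, score = consistency_score(agent, "Quote job 17", runs=20)
print(answer, score)  # $120 0.75
```

A score well below 1.0 on a narrowly scoped workflow is a signal that the agent needs tighter constraints or review.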

Kristian: Like guardrails?

Shane: Hardened guardrails. Guardrails can be bypassed. We implement enforced policies. For example, if the model suggests a drug dosage beyond a limit, we block it — not just warn about it.
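A minimal sketch of the difference between warning and blocking, assuming a single numeric limit checked in ordinary code outside the model. The class and function names, the drug name, and the 100 mg limit are all hypothetical and carry no clinical meaning:

```python
from dataclasses import dataclass

class PolicyViolation(Exception):
    """Raised when a proposed action breaks a hard limit."""

@dataclass(frozen=True)
class DosePolicy:
    drug: str
    max_mg: float

def enforce_dose(policy, proposed_mg):
    """Return the dose only if it is within the limit; otherwise block it."""
    if proposed_mg > policy.max_mg:
        raise PolicyViolation(
            f"{policy.drug}: {proposed_mg} mg exceeds limit of {policy.max_mg} mg"
        )
    return proposed_mg

policy = DosePolicy(drug="example_drug", max_mg=100.0)
print(enforce_dose(policy, 50.0))  # 50.0

try:
    enforce_dose(policy, 150.0)
except PolicyViolation as err:
    print("blocked:", err)
```

Because the check raises instead of logging, an over-limit suggestion never reaches the user unless a caller explicitly handles the violation.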

Kristian: People still jailbreak models. How do you stay ahead?

Shane: Two sides:

Fire prevention — guardrails, safety rules, good training.
Firefighting — detect and stop violations in real time.

We handle the second, firefighting. Think of a spam filter for AI misuse: detect jailbreak attempts, prompt leakage, and dangerous actions, and stop them. And secure external tools; don’t rely on the AI to “behave.”
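One way to read "secure external tools" is to authorize every tool call in code that sits outside the model. The sketch below assumes a simple allowlist; the tool names, argument names, and authorize_tool_call are hypothetical:

```python
# Hypothetical allowlist: tool name -> argument names the tool may receive.
ALLOWED_TOOLS = {
    "lookup_order": {"order_id"},
    "send_quote": {"customer_id", "amount"},
}

def authorize_tool_call(tool_name, arguments, allowed=ALLOWED_TOOLS):
    """Decide outside the model whether a requested tool call may run."""
    if tool_name not in allowed:
        return False, f"tool '{tool_name}' is not on the allowlist"
    unexpected = set(arguments) - allowed[tool_name]
    if unexpected:
        return False, f"unexpected arguments: {sorted(unexpected)}"
    return True, "ok"

print(authorize_tool_call("lookup_order", {"order_id": "A-17"}))  # (True, 'ok')
print(authorize_tool_call("delete_database", {}))
```

Even if a jailbreak convinces the model to request a dangerous action, the request is refused because the gate does not depend on the model's judgment.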

Kristian: Your background?

Shane: CS undergrad + grad work in AI 25+ years ago, focused on deploying AI in production. A decade as a data scientist in insurance — saw real-world exploits, including one case costing $1M from a model flaw agents discovered. Then 15 years in startups, including CTO scaling from thousands to tens of millions of users.

Kristian: How can someone upskill without going back to school?

Shane:

Start using Claude / ChatGPT / Gemini as thinking partners — not just writing helpers.
Upload real work, ask “what am I missing?”
Role-play scenarios, challenge assumptions.
Connect to tools and try workflow automation.
Try agent builders. The real value comes when AI does work, not when it just chats.

You don’t need data science — you need intuition for probabilistic systems and good prompting habits.

Kristian: Large vs small models?

Shane: Large = more knowledge and reliability. Small = cheaper and faster, but potentially less capable. Smart path:

Start large
Validate performance
Distill into a small task-specific model

That’s how Apple runs dozens of micro-models on-device.
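The conversation does not say which distillation method is meant. One common approach, shown here purely as an assumption, trains the small "student" model to match the large "teacher" model's softened output probabilities by minimizing a KL divergence. The logits below are invented:

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities, softened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and student's softened distributions."""
    p = softmax(teacher_logits, temperature)  # teacher (target)
    q = softmax(student_logits, temperature)  # student (being trained)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher = [4.0, 1.0, 0.5]
print(distillation_loss(teacher, [4.0, 1.0, 0.5]))  # 0.0 when the student matches
print(distillation_loss(teacher, [1.0, 3.0, 0.5]))  # positive when it does not
```

In a real pipeline this loss would be minimized over many task-specific examples after the large model's performance has been validated.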

Kristian: Future outlook?

Shane:

Next 12–24 months: Agents doing real work with tool-calling and planning sub-agents.
5 years: Routine knowledge work automated; humans shift to oversight and creative decisions.
10 years: Breakthroughs likely; uncertainty high. Oversight will remain critical.

Kristian: Lately news mentions “AI slop.” What is that?

Shane: Low-effort AI content — bland blogs and art flooding the internet. It’s the uncanny valley of content. Algorithms will penalize it over time. Quality will win.

Kristian: How reliable are agents at simple vs complex tasks?

Shane: There has been massive improvement recently thanks to planner plus specialist agents. If you deeply understand your workflow, agents can execute reliably in segments. ROI depends on compute cost versus human labor and oversight cost. Swept minimizes review by surfacing only outliers, and that’s where the economics work.
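How Swept picks outliers is not described here. As a generic illustration, the sketch below flags results that sit far from the median so that a human reviews only those. The function name flag_outliers, the threshold, and the sample quotes are assumptions:

```python
import statistics

def flag_outliers(values, threshold=3.0):
    """Return indexes of values whose median-based deviation score exceeds threshold."""
    median = statistics.median(values)
    deviations = [abs(v - median) for v in values]
    mad = statistics.median(deviations)  # median absolute deviation
    if mad == 0:
        # All typical values are identical; anything different is an outlier.
        return [i for i, v in enumerate(values) if v != median]
    return [i for i, d in enumerate(deviations) if d / mad > threshold]

# Hypothetical quote amounts produced by an agent for similar jobs.
quotes = [100, 102, 98, 101, 99, 250]
print(flag_outliers(quotes))  # [5]
```

Only the sixth quote would be sent for human review, while the other five pass through without manual effort.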

Kristian: Do agents get “rewarded” like humans?

Shane: Yes — at training time via reinforcement learning and fine-tuning. In production, we prefer objective success metrics and completion signals over “endless chat engagement.”

Kristian: Any do-overs at Swept?

Shane: We spent months chasing VC instead of building fundamentals. Once we focused on fundamentals, investors came to us. Stay grounded in the business, not the pitch.

Kristian: Anyone you want to acknowledge?

Shane:

Dave DuPont (TeamSnap) — gave me a huge opportunity early on.
Keith Merron — my executive coach; helped me lead authentically.

Kristian: Anything we didn’t cover?

Shane: Deploy more AI — and if there’s risk, get help de-risking it. That’s our mission.

Kristian: Where can listeners reach you?

Shane: swept.ai

Email: shane@swept.ai or hello@swept.ai

Kristian: Shane Emmons, founder & CEO of Swept. I’m Kristian Marquez. Thanks for listening to the Innovators & Investors Podcast.

Kristian (closing):

Want to be a guest? Visit podcast.finstratmgt.com.

Share this episode, tag great guests, use #InnovatorsAndInvestors.

Subscribe and leave a review — it really helps.

Learn more at finstratmgt.com and follow us on LinkedIn and X (@finstratmgt).
