GenAI Risk in Financial Services: What the FS AI RMF Says About Hallucinations, Deepfakes, and Prompt Injection

Traditional model risk frameworks address models that produce consistent outputs from consistent inputs. GenAI broke that assumption. A large language model serving customer inquiries at a bank can produce different answers to the same question, fabricate regulatory citations that don't exist, or respond to carefully crafted inputs by bypassing its own safety controls.

The Federal Reserve's SR 11-7, published in 2011, established model risk management standards for statistical and machine learning models. It covers model validation, governance, and monitoring. But those standards were written for models with predictable, deterministic behavior. GenAI's distinct risks each call for distinct controls.

In February 2026, the Cyber Risk Institute and 108 financial institutions published the FS AI RMF, the first sector-specific framework to codify GenAI risks in a financial regulatory context. This post examines the GenAI-specific controls it establishes and what they mean for institutions deploying generative AI in production.

Hallucination Risk: When AI Fabricates Financial Advice

GenAI hallucination in financial services carries consequences beyond embarrassment. A customer-facing AI that fabricates a regulatory citation, quotes an incorrect interest rate, or describes product features that don't exist puts the institution in regulatory, legal, and reputational jeopardy.

The risk is structural. Large language models generate outputs by predicting probable next tokens, not by retrieving verified facts. Every output carries some probability of fabrication, regardless of benchmark performance.

The FS AI RMF addresses hallucination through several control categories:

Output validation controls require systematic verification of AI-generated content before it reaches customers or informs decisions. This goes beyond prompt engineering. It means building validation layers that check outputs against verified data sources, flag low-confidence responses, and route uncertain outputs to human reviewers.

Grounding requirements mandate that GenAI outputs connect to verified, authoritative data sources. For a banking assistant, this means tying responses to actual product documentation, current rate sheets, and verified regulatory text.

Escalation paths define what happens when an AI system produces outputs it cannot verify. The system must escalate to human review with documented criteria for when and how escalation occurs.
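
To make the pattern concrete, here is a minimal sketch of how these three controls might compose in code. The source index, confidence threshold, and names are illustrative assumptions, not anything the FS AI RMF prescribes:

```python
from dataclasses import dataclass
from enum import Enum

class Route(Enum):
    SEND = "send_to_customer"
    ESCALATE = "route_to_human_review"

@dataclass
class ModelOutput:
    answer: str
    citations: list[str]  # source IDs the model claims to rely on
    confidence: float     # scorer-derived, 0..1

# Assumption: an institution-maintained index of verified sources,
# e.g. product docs, current rate sheets, regulatory text.
VERIFIED_SOURCES = {"rate_sheet_2026_02", "product_doc_savings_v3"}

CONFIDENCE_FLOOR = 0.85  # assumption: institution-defined threshold

def route_output(output: ModelOutput) -> tuple[Route, str]:
    # Grounding: an answer with no citations cannot be verified.
    if not output.citations:
        return Route.ESCALATE, "no grounding citations provided"
    # Grounding: every cited source must exist in the verified index.
    unknown = [c for c in output.citations if c not in VERIFIED_SOURCES]
    if unknown:
        return Route.ESCALATE, f"unverified sources: {unknown}"
    # Output validation: low-confidence responses go to a human reviewer.
    if output.confidence < CONFIDENCE_FLOOR:
        return Route.ESCALATE, f"confidence {output.confidence:.2f} below floor"
    return Route.SEND, "validated"
```

In production, the grounding check would be a claim-level matcher against retrieval results rather than a set lookup, but the routing logic stays the same: anything the system cannot verify goes to a human.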

Systematic hallucination detection requires monitoring infrastructure. Prompt engineering reduces how often hallucinations occur; monitoring catches the ones that still do.
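
What that monitoring might look like, in minimal form; the window size and alert rate are assumptions an institution would tune:

```python
from collections import deque

class HallucinationMonitor:
    """Rolling-window monitor over production outputs.

    Assumption: each response is recorded as flagged (escalated or
    caught by a detector) or clean; an alert fires when the flagged
    rate over the window exceeds a bound.
    """
    def __init__(self, window: int = 1000, alert_rate: float = 0.02):
        self.results = deque(maxlen=window)
        self.alert_rate = alert_rate

    def record(self, flagged: bool) -> bool:
        self.results.append(flagged)
        rate = sum(self.results) / len(self.results)
        return rate > self.alert_rate  # True -> page the on-call reviewer
```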

Prompt Injection and Security: The Financial Attack Surface

Prompt injection represents an attack category that traditional application security does not address. An attacker embeds instructions in seemingly normal input, and the AI system follows the injected instructions instead of its intended ones.

In financial services, the attack vectors are specific and consequential:

Data extraction: Crafted inputs coax the AI into revealing customer information, internal system details, or proprietary data accessible through its context or tools.

Control bypass: Injected instructions override safety guardrails. The AI approves transactions it shouldn't, provides unauthorized access, or generates responses that slip past compliance filters.

Output manipulation: Attackers manipulate advisory or analytical outputs, potentially affecting investment recommendations, risk assessments, or compliance determinations.

The FS AI RMF requires institutions to treat GenAI as a distinct security surface:

Input sanitization: Systematic filtering and validation of all inputs before they reach the model. This parallels SQL injection prevention, but natural language has no clean boundary between instructions and data, which makes the problem technically harder.
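
A first-pass screen might look like the sketch below. The deny patterns and length bound are illustrative, and pattern matching alone cannot solve prompt injection; it is one layer among several:

```python
import re

# Assumption: a deny-pattern screen as a first-pass filter only.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|any|previous) (instructions|rules)", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"system prompt", re.I),
]

MAX_INPUT_CHARS = 2000  # assumption: length bound for a support query

def screen_input(user_text: str) -> tuple[bool, str]:
    """Return (allowed, reason) for an inbound message before it
    reaches the model. Blocked inputs are logged for red-team review."""
    if len(user_text) > MAX_INPUT_CHARS:
        return False, "input exceeds length bound"
    for pattern in INJECTION_PATTERNS:
        if pattern.search(user_text):
            return False, f"matched deny pattern: {pattern.pattern}"
    return True, "passed screen"
```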

Adversarial testing: Regular red-team testing targeting prompt injection vulnerabilities. Traditional penetration testing does not cover this attack category.

Security boundaries: Architectural controls that limit what an AI system can access and what actions it can take, regardless of instructions received. The model's permissions are enforced at the system level, not the prompt level.
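
The key design point is that the permission check never consults model output. A minimal sketch, with illustrative tool names:

```python
from typing import Callable

def lookup_rate(product: str) -> dict:
    # Illustrative read-only tool backed by verified data.
    return {"product": product, "apy": "see current rate sheet"}

TOOL_REGISTRY: dict[str, Callable] = {"lookup_rate": lookup_rate}
ALLOWED_TOOLS = {"lookup_rate"}  # read-only scope granted to this assistant

def execute_tool_call(tool: str, args: dict) -> dict:
    """Enforce permissions at the system level, outside the prompt.
    No instruction the model receives, injected or otherwise, can widen
    this allowlist, because the check never reads model output."""
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"tool '{tool}' is outside this system's boundary")
    return TOOL_REGISTRY[tool](**args)
```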

Deepfakes and Information Integrity: The KYC/Fraud Challenge

Synthetic media can now defeat verification systems that rely on surface-level authenticity checks. Three critical financial services processes are exposed:

Identity verification (KYC): Video-based identity verification can be defeated by synthetically generated faces that pass liveness checks, and voice-based authentication by cloned voice samples.

Fraud detection: Synthetic documents, from fabricated pay stubs to falsified identification, can pass automated screening that checks format and consistency without detecting fabrication.

Document authentication: Contracts, regulatory filings, and correspondence can be synthetically altered to preserve formatting while changing substantive content.

The FS AI RMF addresses these risks through three control types:

Detection requirements: Deploying AI-based deepfake detection alongside verification processes.

Authentication hardening: Multi-factor verification that does not rely solely on visual or audio confirmation.

Information integrity controls: Establishing provenance chains for critical documents and communications.
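
Provenance chains in particular lend themselves to a simple implementation pattern: hash each document at intake and bind each record to the one before it, so later alteration breaks the chain. A minimal sketch, with illustrative field names:

```python
import hashlib
import json
import time

def _hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class ProvenanceChain:
    """Assumption: a hash-chained intake log for critical documents."""

    def __init__(self):
        self.entries: list[dict] = []

    def record(self, doc_id: str, content: bytes) -> dict:
        prev = self.entries[-1]["entry_hash"] if self.entries else "genesis"
        entry = {
            "doc_id": doc_id,
            "content_hash": _hash(content),
            "prev_entry_hash": prev,
            "timestamp": time.time(),
        }
        # Each entry's hash covers the previous entry's hash, so no
        # record can be altered without invalidating everything after it.
        entry["entry_hash"] = _hash(json.dumps(entry, sort_keys=True).encode())
        self.entries.append(entry)
        return entry

    def verify(self, doc_id: str, content: bytes) -> bool:
        # A re-hashed document must match the hash recorded at intake.
        return any(e["doc_id"] == doc_id and e["content_hash"] == _hash(content)
                   for e in self.entries)
```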

Agentic AI: When Financial Systems Act Autonomously

Agentic AI introduces a risk category with limited precedent in financial services: autonomous systems that take actions with real consequences. An AI agent that executes transactions, modifies customer records, or initiates compliance workflows operates fundamentally differently from one that generates text for human review.

The FS AI RMF establishes controls across three dimensions:

Authorization boundaries define what an agent can and cannot do without human approval. Transaction limits, customer-affecting actions, and system modifications all require explicit boundary definitions, not implied ones derived from prompts or training.

Rollback mechanisms ensure autonomous actions can be reversed. Financial transactions, account modifications, and compliance determinations made by agents must have documented reversal procedures.

Accountability chains establish responsibility for autonomous actions. An agent that approves a loan, executes a trade, or flags a transaction must connect back to a human who is accountable for that outcome.
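
A minimal sketch of how the three dimensions might interact in code; the approval limit, action types, and ledger are illustrative assumptions, not the framework's mechanics:

```python
from dataclasses import dataclass
import uuid

AUTO_APPROVE_LIMIT = 1_000.00  # assumption: agent may act alone below this

@dataclass
class AgentAction:
    action_id: str
    kind: str               # e.g. "payment", "record_update"
    amount: float
    accountable_owner: str  # human responsible for this agent's outcomes
    reversed: bool = False

LEDGER: list[AgentAction] = []  # accountability chain: every action logged

def propose_action(kind: str, amount: float, owner: str) -> tuple[str, AgentAction]:
    action = AgentAction(str(uuid.uuid4()), kind, amount, owner)
    LEDGER.append(action)  # log before anything executes
    # Authorization boundary: above the limit, a human must approve.
    if amount > AUTO_APPROVE_LIMIT:
        return "pending_human_approval", action
    return "executed", action

def rollback(action_id: str) -> None:
    # Rollback mechanism: documented, auditable reversal of the action.
    for action in LEDGER:
        if action.action_id == action_id:
            action.reversed = True
            return
    raise KeyError(f"no recorded action {action_id}")
```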

Autonomous AI errors in financial services carry immediate, material consequences. A chatbot providing incorrect information is a liability event. An agent executing unauthorized transactions is a financial loss. The gap between what agentic AI can do and what it should do needs explicit, enforced boundaries.

The AI Trustworthiness Principles Applied to GenAI

The FS AI RMF defines seven characteristics of trustworthy AI. Each carries specific obligations when applied to generative models:

  1. Valid and reliable: Systematic testing against accuracy benchmarks, with ongoing validation that accounts for model updates and changing data distributions.

  2. Safe: GenAI must not produce outputs that cause financial harm. Safety controls must account for the probabilistic nature of generative outputs.

  3. Secure and resilient: GenAI must withstand adversarial attacks, including prompt injection, and maintain security under stress.

  4. Accountable and transparent: Clear accountability for GenAI outputs. Institutions must explain their AI governance processes to regulators and customers.

  5. Explainable and interpretable: Sufficient explanation for AI-driven decisions, especially for model risk teams evaluating these systems. Full explainability remains technically challenging for large language models, but the obligation persists.

  6. Privacy-enhanced: Protection of personal data in training, fine-tuning, retrieval, and output generation. This includes preventing the model from memorizing or reproducing private information.

  7. Fair with harmful bias managed: Testing for discriminatory patterns, with attention to protected classes under fair lending and consumer protection laws.

These principles extend the NIST AI RMF trustworthiness characteristics into a financial services regulatory context. Each maps to concrete control objectives within the framework's 230 total controls.

Operationalizing GenAI Controls with Swept AI

The GenAI risk controls in the FS AI RMF require operational infrastructure. Policy documents and governance committees do not catch hallucinations or detect prompt injection.

At Swept AI, we've built our platform to address these GenAI-specific risks:

Evaluate catches risks before deployment. Our evaluation layer tests for hallucination rates, bias patterns, and security vulnerabilities before a GenAI system enters production. Standardized scorecards aligned to FS AI RMF control objectives cover the framework's pre-deployment requirements.

Supervise detects risks in production. Our supervision layer monitors GenAI outputs in real time, detecting hallucination patterns, prompt injection attempts, and output drift. Outputs that deviate from expected behavior trigger escalation paths, exactly as the framework requires.

Certify generates compliance evidence. Our certification capabilities produce audit-ready documentation mapping GenAI operations to specific FS AI RMF control objectives.

Traditional model risk frameworks don't cover hallucination, prompt injection, deepfakes, or autonomous agent behavior. SR 11-7 was written for a different class of model. The 108 institutions that built the FS AI RMF recognized this gap and codified what controlling GenAI in financial services requires. The operational challenge remains: turning those controls into monitoring, detection, and enforcement that runs in production, every day.
