The AI Check In
Posts
🛟 Your AI Can Be Tricked: New Risks for U.S. Banks

🛟 Your AI Can Be Tricked: New Risks for U.S. Banks

U.S. banks are rapidly deploying large language models (LLMs) across compliance, client service, and risk, but these systems can be manipulated through prompt injection attacks to leak confidential data or override controls. Regulators and boards now face an urgent mandate: secure AI systems as rigorously as capital models before trust and oversight unravel.

Abbie Widin
July 15, 2025

Hello, Abbie Widin here 👋,

Rewrite the rules. Outmaneuver competitors. Stay two steps ahead of regulators. AI Check In delivers sharp, globally relevant intelligence on AI governance, financial risk, and capital automation.

Already briefing strategy, risk, and compliance leaders at the world’s most influential banks, including JPMorgan, Citi, and Bank of America.

What to expect this edition:

🛟 Need to Know: Can Your AI Be Tricked into Leaking Sensitive Data?
🥷 Deep Dive: Prompt Injection, Jailbreaking, and Training Set Exposure in Finance
🔭 Trends to Watch: Specialized Teams, Vendor Risks, and Code-Based Governance
⚔️ Vendor Spotlight: Lakera Guard & PromptArmor

The AI race is on. Let the games begin.

🛡️ Enjoying AI Check In? Forward this to a colleague who needs to stay two steps ahead.

📬 Not subscribed yet? Sign up here to receive future editions directly.

🛟 Need to Know: Can Your AI Be Tricked into Leaking Sensitive Data?

Large language models (LLMs) are now critical infrastructure across U.S. banking, used in customer service, compliance, and internal analysis. Unlike traditional software, LLMs are designed to follow instructions, making them vulnerable to prompt injection and jailbreak attacks.

A single compromised prompt can force an LLM to reveal proprietary models, client data, or internal audit notes. The Federal Reserve, OCC, and FDIC emphasize that AI models must align with existing risk frameworks. The SEC and CFPB warn that manipulated outputs can trigger enforcement under consumer protection laws.

By 2026, audits are expected to require formal LLM security policies, threat modeling documentation, prompt integrity testing, and incident response protocols. Banks should also consider fairness and bias standards such as NIST AI RMF and ISO/IEC 42001.

🥊 Your Move

Require formal LLM threat modeling and prompt injection tests in board-level risk reviews.
Implement clear incident disclosure protocols covering clients, regulators, and internal stakeholders.
Validate that fairness and bias audits are integrated into all AI governance processes.

🥷 Deep Dive: Prompt Injection, Jailbreaking, and Training Set Exposure in Finance

The Hidden Vulnerability in LLMs

Large language models (LLMs) are being integrated across U.S. banks to support client service, compliance analysis, internal audit, and document summarization. However, LLMs remain vulnerable to prompt injection and jailbreak attacks. Unlike traditional exploits that target software code, these attacks manipulate the instructions that an LLM interprets.

How Prompt Attacks Work

Prompt injection can be direct, where an attacker submits crafted text to override safety instructions. It can also be indirect, embedding malicious instructions in external documents or data sources that the model accesses, such as PDFs or metadata. Jailbreaking uses a series of carefully constructed prompts to circumvent guardrails and force an LLM to reveal or generate content it should restrict.

There’s a great free course here (not an affiliate link) if you want to see what this looks like for yourself. https://learnprompting.org/courses/intro-to-prompt-hacking

Lessons from Other Industries

In other sectors, prompt injection attacks have led to high-profile failures. In 2023, an Air Canada chatbot offered unauthorized refunds due to a manipulated prompt, leading to legal and reputational damage. An automotive service chatbot from Chevrolet has been tricked into selling a vehicle for $1. Travel and hospitality bots have been forced to disclose internal workflows and contract details when manipulated by user inputs.

Unique Risks for Banks

In banking, the consequences are more severe. A successful prompt injection can expose confidential client data, proprietary trading strategies, internal policy documents, or audit plans. Document summarizers using retrieval-augmented generation (RAG) are particularly at risk because they retrieve external documents and incorporate them into model outputs, creating opportunities for hidden instructions.

What the Data Shows

Academic studies underscore the risks. "AgentDojo" simulations on synthetic banking agents showed a 20–50% success rate for prompt injection attacks. Broader tests reported an average 56% attack success rate across 36 model architectures (arXiv:2410.23308). These tests demonstrate that even well-guarded LLMs can be manipulated without code-level intrusions.

Citi’s Proactive Example

Citi has taken a documented stance on generative AI governance. The firm states that no generative AI applications are deployed into production without passing data-exfiltration risk checks. Citi has also invested in Lakera, a vendor specializing in prompt injection defense. While Citi has not disclosed specific technical details of its mitigation architecture (such as exact prompt sanitization workflows or context segmentation), it emphasizes continuous oversight and multi-layer security checks before deployment.

Layers of Defense Banks Can Use

Forward-thinking banks are adopting several approaches to mitigate these risks.

RAG workflows with strict filtering separate factual document retrieval from generative processing, reducing the chance of introducing manipulated content.
Prompt sanitization and filtering involve systematically cleaning user inputs to remove hidden instructions before they reach the model.
Role-based context isolation limits each model’s permissions and task scope, decreasing potential impact if one is compromised.
Red-teaming and adversarial testing simulate attacks to identify vulnerabilities before external threats can exploit them.

Governance and Disclosure Requirements

Beyond technical defenses, governance structures are critical. The Federal Reserve, OCC, and FDIC recommend including AI systems in enterprise risk frameworks. Regulators expect documented threat modeling, continuous testing logs, and incident response playbooks. Banks are also advised to adopt fairness and bias standards like NIST AI RMF and ISO/IEC 42001 to reduce legal and reputational exposure tied to unintended model behavior.

Incident disclosure is an additional governance requirement. In other sectors, delayed or incomplete disclosure has compounded reputational fallout. Financial institutions should establish clear protocols for rapid internal escalation, external regulatory notification, and transparent communication to affected clients when AI-related incidents occur.

Staff Training Imperative

Internal training and operational readiness are equally important. Staff must be trained to understand the signs of prompt-based manipulation and to respond quickly to suspicious AI outputs. Governance teams should document override procedures and decision rights to prevent escalation failures.

The combined approach of technical mitigation, governance integration, incident readiness, and staff training forms a comprehensive defense posture. While no system can eliminate prompt injection risk entirely, banks that actively integrate these measures position themselves to minimize operational disruptions and regulatory penalties.

🥊 Your Move

Integrate prompt injection red-teaming and RAG-based risk testing into quarterly AI audit cycles.
Require documented pre-deployment risk checks for all generative AI tools, including external vendor certifications.
Develop and approve incident escalation and disclosure protocols specific to AI system failures.

🔭 Trends to Watch: Specialized Teams, Vendor Risks, and Code-Based Governance

Banks are building dedicated LLM security and governance teams to address prompt injection, model integrity, and continuous monitoring. Large institutions such as JPMorgan and Citi have begun integrating these functions separately from traditional cybersecurity groups.

Vendor risk is under increasing review. Banks are adding LLM-specific security clauses to contracts and performing continuous monitoring of third-party AI providers.

Subscribe to keep reading

This content is free, but you must be subscribed to The AI Check In to continue reading.

Already a subscriber?Sign in.Not now