- The AI Check In
- Posts
- 🛟 Leading Banks Hire Customer-Facing AI Agents
🛟 Leading Banks Hire Customer-Facing AI Agents
These agents don’t just answer questions. They negotiate refunds, approve payments, and push products. Capital One, JPMorgan, and Upstart are already scaling them and reaping rewards.
👋 Welcome, thank you for joining us here :)
The AI Check In is your weekly power play to navigate AI governance and strategy in banking and finance.
What to expect this edition:
This week, we start our exploration of AI agents in banks. This week’s focus is on external or customer-facing agents, next week, we look at internal AI agents.
- 🛟 Need to Know: Did your AI say the right thing to your customer? 
- 🥷 Deep Dive: From chatbots to autonomous agents - how U.S. banks are rewriting customer interaction 
- ⚔️ Vendor Spotlight: Kasisto’s KAI Platform 
- 📈 Trends to Watch: The horizon for customer-facing AI agents 
Rewrite the rules. Outmaneuver competitors. Stay two steps ahead of regulators.
Let the games begin.
🛟 Need to Know: Did your AI say the right thing to your customer?

AI agents are now speaking on your behalf. They greet your customers, offer product guidance, and might retain memory across sessions. A few are even making decisions.
The Federal Reserve expects clear oversight of AI risks. The Basel Committee requires agents involved in fraud or credit risk to be auditable and fair. But real control lies in:
- Maintaining a live inventory of deployed agents 
- Monitoring escalation failure rates 
- Defining the outer limits of agent autonomy, including memory expiry protocols, decision boundaries, and data retention settings 
Customers (and others) are trying to trick your AI agents for their advantage. Prompt injection attacks are frequent. A 2025 SailPoint study found 23% of firms had AI agents manipulated into revealing credentials and 80% observed unintended actions.
Recent 2025 incidents reinforce the risk: NYC’s small business chatbot gave unlawful HR advice, while Cursor’s AI support bot fabricated a non-existent login policy. Both eroded trust and exposed their operators to legal and reputational fallout.
If your AI suggests the wrong product to the wrong customer, you may face Reg BI or ECOA consequences.
🥊 Your Move:
- Demand agent logs that capture handoffs, fallbacks, and failures. 
- Confirm every agent discloses that it is AI and also what it does with data. 
- Audit product recommendation logic for Reg BI, UDAAP, and ECOA exposure. 
🛡️ Enjoying AI Check In? Forward this to a colleague who needs to stay two steps ahead.
📬 Not subscribed yet? Sign up here to receive future editions directly.
🥷 Deep Dive: From Chatbots to Autonomous Agents—How U.S. Banks Are Rewriting Customer Interaction

In 2025, U.S. banks are undergoing a sharp transition from scripted chatbots to autonomous AI agents capable of dynamic reasoning, memory management, and third-party tool integration.
Unlike chatbots that follow scripts, agents operate in cycles: they interpret a task, plan steps, call tools or APIs, evaluate the result, and loop until resolved or escalated.
These agents are now shaping customer engagement, driving cost reductions. This raises complex questions of oversight, suitability, and brand trust.
From Scripted Support to Dynamic Intelligence
Traditional chatbots are being phased out. In their place, AI agents built using frameworks like LangChain and AutoGen can reason across multiple steps, retrieve internal documents via RAG (retrieval-augmented generation), and escalate intelligently.
Where early bots faltered on edge cases, today’s agents can handle complex queries, like explaining loan eligibility, troubleshooting a failed transfer, or flagging suspicious charges, without falling back to humans.
Leading banks are adopting LLM-specific red-teaming methods to test AI agents before deployment. These include:
- Prompt injection attacks 
- Simulation of biased user personas 
- Testing ethical override triggers (e.g., customer asks agent to recommend risky behavior) 
Without these controls, agents risk drifting from acceptable use boundaries under live conditions.
Case Study 1: JPMorgan Chase

JPMorgan is deploying AI agents in customer-facing roles at scale. The bank reports material operational gains, with early systems driving significant reductions in service costs and faster resolution times.
These agents are integrated into live support environments, surfacing tailored scripts, actions, or answers in real time.
“We’re using AI to anticipate call center intent and resolve common customer issues, like sending out a new debit card, without human intervention.”
Capital One

Capital One describes AI as central to its digital strategy, with “customer-facing intelligent digital agents” deployed to reduce friction and personalize banking interactions. Generative AI is being embedded to improve both service delivery and decision support in real time.
“We’re integrating generative AI to enhance customer servicing experiences, and improving our intelligent digital agent performance.”
Upstart

Whilst not a traditional bank, Upstart’s model shows what an early player in AI-first operations looks like. Its generative AI platform automates 90% of loan decisions with no human intervention and incorporates autonomous verification flows:
“Upstart’s platform handles over 90% of loans with no human intervention.”
Adoption Lag is Human-Driven Not Tech-Driven
Despite the gains, not all internal stakeholders are convinced.
Employee feedback channels (including internal surveys and Glassdoor reviews) reveal that some teams find AI agents rigid, opaque, or inconsistent in complex workflows. Like SAP in a prior era, agent acceptance often hinges on how seamlessly they integrate with human culture and processes, not just how fast they respond.
Training, role clarity, and override protocols are emerging as key to successful rollout.
Governance and Compliance Considerations
As AI agents gain autonomy, governance requirements deepen. Banks must address multiple vectors of operational risk:
- Auditability: Ensure every agent output can be traced—back to the prompt, tools invoked, and decision paths taken. 
- Escalation Protocols: Define clear thresholds where agents must defer to human review, especially on financial advice or eligibility. 
- Memory Expiry: Implement time- or trigger-based memory limits to reduce context drift and misapplied personalization. 
- Agent QA: Build pre-deployment testing frameworks that simulate edge cases, adversarial prompts, and abnormal user behavior mirroring LLM red-teaming. 
- Regulatory Alignment: Stay compliant with Reg BI, ECOA, UDAAP, and emerging state-level consumer protections. 
Exec teams and boards need visibility not just into the model, but into the agent runtime context: what decisions it’s making, what tools it can access, and how it evolves with each customer interaction.
