🛟 The AI Model Drifted. No One Told the Bank.
Capital One tracks drift daily. JPMorgan rechecks outputs with agents. If your LLM quietly fails, will you catch it in time? Regulators still expect validation, even when vendors don’t disclose model updates. Boards need monitoring pipelines or plausible deniability.
Hello, Abbie Widin here 👋,
AI Check In delivers sharp, globally relevant intelligence on AI governance, financial risk, and capital automation.
We brief strategy, risk, and compliance leaders at the world’s most influential banks, including JPMorgan, Citi, and Bank of America.
What to expect this edition:
🛟 Need to Know: The Model Is Getting Dumber and No One Told Compliance
🥷 Deep Dive: Why Foundation Model Drift Matters More Than Accuracy
📈 Trends to Watch: Model Drift Gets Technical
⚔️ Vendor Spotlight: Giskard and TruEra on Monitoring Black-Box Drift
Rewrite the rules. Outmaneuver competitors. Stay two steps ahead of regulators.
Let the games begin.
Breaking: AI Action Plan Implications For Banks
Last week, Trump’s 2025 U.S. AI Action Plan greenlit aggressive AI deployment. But regulators haven’t relaxed a single expectation under SR 11-7, UDAAP, or ECOA.
The upside? First movers scale faster, crush cost curves, and lock in data moats. The downside? Drift, vendor opacity, and silent errors trigger fines, lawsuits, or worse.
The risk and reward are yours alone. Without AI, you fall behind. If your AI fails, you hold the liability.
🛟 Need to Know: The Model Is Getting Dumber and No One Told Compliance

Foundation models like GPT-4, Claude, and Gemini are now embedded in U.S. banking operations. They summarize call transcripts, classify disputes, and support legal and compliance reviews. But recent studies document performance degradation over time. A Stanford-led analysis found GPT-4’s accuracy on a prime-number identification task, solved via multi-step reasoning, fell from 97.6% to 2.4% in three months. Claude has shown self-induced drift within long, complex sessions involving high cognitive load or frequent context switching.
Such drift has material consequences. Banks face exposure in legal summarization, internal model documentation, and suitability recommendations, each tied to UDAAP, ECOA, or Reg BI. And because these models are accessed via third-party APIs, banks receive no alerts when a model is updated or its performance degrades.
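No vendor-neutral update feed exists today, but a lightweight in-house check can surface silent swaps. The sketch below is a minimal illustration, not a vendor API: `ModelFn`, `fingerprint`, and `detect_silent_update` are hypothetical names, and `model_fn` is assumed to wrap your provider’s SDK as a plain prompt-to-text call. It hashes temperature-zero responses to frozen canary prompts and treats a changed hash as a possible silent model update. (Temperature-zero decoding is not guaranteed to be byte-for-byte deterministic, so production checks usually compare normalized or scored outputs rather than raw hashes.)

```python
import hashlib
import json
from typing import Callable

# Hypothetical adapter: wrap whatever vendor SDK you use behind a
# plain prompt -> text callable so the check stays vendor-agnostic.
ModelFn = Callable[[str], str]

# Frozen canary prompts with stable, checkable answers.
CANARY_PROMPTS = [
    "Is 17077 a prime number? Answer 'yes' or 'no' only.",
    "Classify this dispute as 'fee' or 'fraud': a customer contests a $42 monthly service charge.",
]

def fingerprint(model_fn: ModelFn) -> str:
    """Hash normalized temperature-zero outputs on the canary set.
    A changed hash suggests the vendor silently updated the model."""
    outputs = [model_fn(p).strip().lower() for p in CANARY_PROMPTS]
    blob = json.dumps(outputs).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

def detect_silent_update(model_fn: ModelFn, baseline_path: str = "model_fingerprint.txt") -> bool:
    """Compare today's fingerprint to the stored baseline; store it on first run."""
    current = fingerprint(model_fn)
    try:
        with open(baseline_path) as f:
            baseline = f.read().strip()
    except FileNotFoundError:
        with open(baseline_path, "w") as f:
            f.write(current)
        return False  # first run only establishes the baseline
    if current != baseline:
        print(f"ALERT: fingerprint changed {baseline[:8]} -> {current[:8]}; possible silent model update")
    return current != baseline
```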
In April, the OCC flagged third-party AI auditability as a supervisory priority. In May, the Fed confirmed SR 11-7 applies to foundation models used in regulated decisions. Audit and Risk Committees should expect drift risk to surface in examiner reviews within 6–12 months.
🥊 Your Move
Require monthly validation of model outputs used in compliance, credit, or legal functions (a minimal harness is sketched after this list).
Update SR 11-7 documentation to include foundation model drift.
Renegotiate vendor contracts to mandate update notices and test-before-release clauses.
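What “monthly validation” can look like in practice: a frozen, compliance-reviewed golden set, re-scored against the live endpoint on a schedule, with results archived as SR 11-7 evidence. A minimal sketch, assuming the same hypothetical `model_fn` wrapper as above; the `GoldenCase` record type is invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class GoldenCase:
    prompt: str
    expected: str  # reviewed reference label, e.g. "fee" or "fraud"

def monthly_validation(model_fn, golden_set: list[GoldenCase],
                       baseline_accuracy: float, tolerance: float = 0.05) -> dict:
    """Score the live model on a frozen golden set and flag drift when
    accuracy falls more than `tolerance` below the baseline recorded in
    your SR 11-7 model documentation."""
    hits = sum(
        1 for case in golden_set
        if model_fn(case.prompt).strip().lower() == case.expected.strip().lower()
    )
    accuracy = hits / len(golden_set)
    return {
        "accuracy": accuracy,
        "baseline": baseline_accuracy,
        "drifted": accuracy < baseline_accuracy - tolerance,
    }
```

Exact string match is the crudest possible grader; it suits label-style tasks like dispute classification, while summarization and suitability outputs typically need rubric scoring or a second-model review. The discipline is the point: a fixed test set, a documented baseline, and an alert threshold agreed with model risk management.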
🛡️ Enjoying AI Check In? Forward this to a colleague who needs to stay two steps ahead.
📬 Not subscribed yet? Sign up here to receive future editions directly.
🥷 Deep Dive: Why Foundation Model Drift Matters More Than Accuracy

In 2025, U.S. banks increasingly rely on foundation models such as GPT‑4 and Claude to draft compliance summaries, process customer queries, and surface risk insights. However, a growing body of research documents significant performance volatility over time.
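Volatility only becomes auditable once it is measured. If your pipeline already attaches a quality score to each output (a rubric grade, reviewer rating, or confidence value), a simple two-sample test between time windows can flag a shift before headline accuracy moves. A minimal sketch using SciPy’s Kolmogorov-Smirnov test; the function name and alpha default are illustrative:

```python
from scipy.stats import ks_2samp

def score_distribution_drift(baseline_scores: list[float],
                             current_scores: list[float],
                             alpha: float = 0.01) -> bool:
    """Two-sample Kolmogorov-Smirnov test on per-output quality scores
    from two time windows. A p-value below alpha indicates the score
    distribution likely shifted, even if average accuracy looks stable."""
    statistic, p_value = ks_2samp(baseline_scores, current_scores)
    return p_value < alpha
```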