SR 11-7 AI Compliance Guide

SR 11-7 is the Federal Reserve's Supervisory Guidance on Model Risk Management, issued in 2011 (the OCC adopted the same guidance as Bulletin 2011-12) and now the de facto standard for how financial institutions are expected to govern their models. Once AI agents started making credit decisions, flagging fraud, and pricing risk, they became "models" under SR 11-7. Most banks know this. Most have not yet architected their AI systems to comply.

This guide covers the exact technical requirements, the common failure points, and the architecture patterns that make AI agents examination-ready.

1. What SR 11-7 Considers a "Model"

SR 11-7 defines a model broadly: "a quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates."

Under this definition, every AI agent that influences a business decision in financial services is a model. This includes:

Credit scoring and underwriting agents
Fraud detection and AML screening agents
Market risk and portfolio optimization agents
Customer service agents that determine account actions
Document processing agents that extract terms from contracts
Any agent whose output feeds into a regulated decision

2. The Three Pillars of SR 11-7

Pillar 1: Model Development and Implementation

Every model must have:

Sound design: documented methodology, theoretical basis, and limitations
Rigorous testing: not just accuracy metrics, but also stress testing, sensitivity analysis, out-of-sample validation, and testing against alternative approaches
Documented implementation: code review, version control, change management procedures

The AI problem: Most agent frameworks have none of this. The model is a prompt template in a Python file, the "testing" is a developer clicking through three examples, and the implementation documentation is a Slack message saying "deployed to prod."

Pillar 2: Model Validation (Independent Review)

Every model requires independent validation by a party not involved in development:

Conceptual soundness evaluation
Outcomes analysis (comparing predictions to actuals)
Back-testing and benchmarking
Assessment of limitations and compensating controls

The AI problem: How do you independently validate a non-deterministic agent? Standard model validation assumes reproducible outputs, but an AI agent can give different answers to the same question. You need statistical validation frameworks: distribution analysis of outputs, boundary testing, adversarial probing, and consistency scoring across runs.
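As one illustration of "consistency scoring across runs", the sketch below calls an agent repeatedly on the same input and reports how often it returns its most common answer. The names `consistency_score` and `validate_consistency`, the 20-run default, and the 0.9 threshold are all assumptions made for this example; SR 11-7 does not prescribe them. It also assumes the agent's output has already been normalized to a comparable label such as "approve" or "deny".

```python
from collections import Counter
from typing import Callable, Sequence


def consistency_score(outputs: Sequence[str]) -> float:
    """Fraction of runs that agree with the most common output (1.0 = fully consistent)."""
    if not outputs:
        raise ValueError("need at least one output")
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / len(outputs)


def validate_consistency(
    agent: Callable[[str], str],
    prompt: str,
    runs: int = 20,          # illustrative default, not a regulatory number
    threshold: float = 0.9,  # illustrative default, set per risk tier in practice
) -> tuple[float, bool]:
    """Run the agent `runs` times on one input and compare the score to a threshold."""
    outputs = [agent(prompt) for _ in range(runs)]
    score = consistency_score(outputs)
    return score, score >= threshold
```

A validator would run this over a fixed set of test inputs and record the per-input scores in the validation report, rather than relying on a single pass/fail run.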

Pillar 3: Model Governance

A governance framework that covers the entire model lifecycle:

Model inventory: every agent catalogued with risk tier, owner, use case, and dependencies
Change management: formal approval process for prompt changes, model upgrades, data source modifications
Ongoing monitoring: performance degradation detection, drift monitoring, anomaly alerting
Annual review: re-validation and governance attestation
Escalation procedures: clear paths for model failures, regulatory inquiries, and incident response
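
A model inventory can start as a very small data structure. The sketch below is one possible shape for an inventory entry carrying the fields listed above, plus a check for entries whose annual review is overdue. The class name `InventoryEntry`, the field names, and the convention that tier 1 is the highest risk are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class InventoryEntry:
    agent_id: str
    owner: str
    use_case: str
    risk_tier: int        # assumption: 1 = highest risk
    last_validated: date
    dependencies: list[str] = field(default_factory=list)

    def review_overdue(self, today: date, max_age_days: int = 365) -> bool:
        """True when the last validation is older than the review interval."""
        return today - self.last_validated > timedelta(days=max_age_days)


def overdue_entries(inventory: list[InventoryEntry], today: date) -> list[str]:
    """Return the IDs of agents whose periodic review is overdue."""
    return [entry.agent_id for entry in inventory if entry.review_overdue(today)]
```

In practice the inventory would live in a database with access controls; the point here is only that every agent has an owner, a tier, and a validation date that can be queried.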

3. Where Most Banking AI Fails SR 11-7

Failure 1: No Model Inventory

Banks deploy AI agents across departments (risk, operations, customer service, compliance), and nobody maintains a central registry. When examiners ask "how many AI models do you have in production?", the answer is "we don't know." This is an immediate finding.

Failure 2: Non-Deterministic Validation

Traditional model validation assumes: same input → same output. AI agents violate this assumption. Banks try to validate agents with the same methods they use for logistic regression, and the results are meaningless. You need probabilistic validation frameworks.
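
One way to make validation probabilistic is to stop asking "did the agent pass?" and ask "what pass rate can we claim with confidence?". The sketch below computes a Wilson score interval for an observed pass rate over repeated trials and accepts the agent only if the lower bound clears a required rate. The function names and the choice of the Wilson interval are assumptions for this example, not something SR 11-7 specifies.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial pass rate (z = 1.96 is roughly 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("require trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


def meets_threshold(successes: int, trials: int, required_rate: float) -> bool:
    """Accept only if the lower confidence bound is at or above the required rate."""
    lower, _ = wilson_interval(successes, trials)
    return lower >= required_rate
```

With 95 passes in 100 trials the lower bound is a little under 0.89, so this check would support a required rate of 0.85 but not 0.90, even though the observed rate is 0.95.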

Failure 3: No Audit Trail

Most agent frameworks produce logs, not audit trails. Logs tell you what happened. Audit trails prove it to a regulator. The difference is immutability, completeness, and traceability. A ClickHouse append-only table with cryptographic hash chains is an audit trail. A rotating JSON log file is not.
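
The hash-chain idea can be shown without any database. In the minimal sketch below, each record stores the hash of the previous record, so altering any earlier record breaks verification from that point on. It uses an in-memory list in place of an append-only table, and the function names and record fields are hypothetical; a production system would also need durable storage, access control, and key management.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash: str, record: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(chain: list[dict], record: dict) -> None:
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {"record": record, "prev_hash": prev_hash, "hash": _digest(prev_hash, record)}
    )


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry makes this return False."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Note that a chain like this detects tampering but does not prevent it; periodically anchoring the latest hash somewhere the engineering team cannot rewrite is what turns detection into evidence.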

Failure 4: Prompt Changes Without Governance

A developer changes a system prompt and deploys to production in the same afternoon. Under SR 11-7, this is an ungoverned model change. Every prompt modification must go through the same change management process as a model parameter change in a traditional risk model.
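
A simple technical control for this is to fingerprint each prompt and refuse to deploy any prompt whose fingerprint has not been approved. The sketch below assumes a hypothetical set of approved fingerprints maintained by the change management process; the function names are illustrative.

```python
import hashlib


def prompt_fingerprint(prompt: str) -> str:
    """Stable SHA-256 fingerprint of the exact prompt text."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def is_deployable(prompt: str, approved_fingerprints: set[str]) -> bool:
    """Allow deployment only when this exact prompt text has been approved."""
    return prompt_fingerprint(prompt) in approved_fingerprints
```

Because the fingerprint covers the exact text, even a one-word edit produces a new fingerprint and has to go back through approval.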

4. The Examination-Ready Architecture

Here is the technical architecture that satisfies SR 11-7 for AI agents:

Model Registry: Centralized catalog of every agent, including risk tier, owner, approval status, validation date, and dependencies. Queryable by examiners.
Immutable Audit Trails: ClickHouse-backed append-only storage for every agent input, output, reasoning chain, confidence score, and human override. Cryptographic hash chains for tamper detection.
Probabilistic Validation Suite: Automated testing pipeline that measures output distribution stability, boundary behavior, adversarial robustness, and consistency scores. Runs continuously, not just at deployment.
Change Management Gates: Every prompt change, model update, or data source modification requires formal approval, impact assessment, and re-validation trigger.
OPA Policy Enforcement: Open Policy Agent rules that encode regulatory constraints as machine-readable policies. The agent cannot execute a decision that violates an active policy.
Human-in-the-Loop Framework: Configurable confidence thresholds that route low-confidence or high-impact decisions to human review with full context.
Examiner Dashboard: Read-only interface for regulators to query audit trails, view model documentation, and inspect governance records without engineering support.
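
To make the human-in-the-loop component concrete, the sketch below routes a decision to human review when the agent's confidence is below a threshold or the exposure is above a limit. The `Decision` fields, the 0.85 confidence floor, and the 10,000 amount limit are assumptions chosen for illustration; real thresholds would come from the risk tier assigned in the model registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    agent_id: str
    action: str
    confidence: float  # agent-reported score in [0.0, 1.0]
    amount: float      # monetary exposure of the decision


def route(
    decision: Decision,
    min_confidence: float = 0.85,      # illustrative threshold
    max_auto_amount: float = 10_000.0,  # illustrative limit
) -> str:
    """Return "human_review" for low-confidence or high-impact decisions."""
    if decision.confidence < min_confidence or decision.amount > max_auto_amount:
        return "human_review"
    return "auto_execute"
```

Whatever the routing outcome, the decision and the route taken would both be written to the audit trail so that human overrides can be reconstructed later.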

5. What Examiners Look For

Based on recent OCC and Fed examination patterns, here is what examiners focus on when reviewing AI systems:

"Show me your model inventory": they want a complete list within minutes, not days
"Show me the validation report for this agent": they want independent validation, not self-assessment
"Walk me through a decision": they want end-to-end traceability from input to output to audit record
"What changed since last examination?": they want governed change management with approval records
"What happens when this agent is wrong?": they want escalation procedures, human override records, and incident history

If you cannot answer these questions in real time with documented evidence, you will receive findings.

6. Your Compliance Roadmap

Month 1: Agent inventory and risk classification
Month 2: Audit trail infrastructure (ClickHouse + hash chains)
Month 3: Validation framework and initial assessments
Month 4: Change management and OPA policy gates
Month 5: Human oversight framework and examiner dashboard
Month 6: Full governance documentation and examination dry run

Dioval Group specializes in making AI agents examination-ready for SR 11-7, EU AI Act, and SOC 2. We perform the regulatory gap assessment, build the compliance infrastructure, and deliver the documentation examiners expect.