SR 11-7 AI Compliance Guide
How to make your AI agents examination-ready under the Federal Reserve's Model Risk Management framework, before regulators arrive.
By Aurelio Dioval · May 2026 · 14 min read
SR 11-7 is the Federal Reserve's Supervisory Guidance on Model Risk Management, issued jointly with the OCC in 2011 and now the de facto standard for how financial institutions govern their models. When AI agents started making credit decisions, flagging fraud, and pricing risk, they became "models" under SR 11-7. Most banks know this. Most have not yet architected their AI systems to comply.
This guide covers the exact technical requirements, the common failure points, and the architecture patterns that make AI agents examination-ready.
1. What SR 11-7 Considers a "Model"
SR 11-7 defines a model broadly: "a quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates."
Under this definition, every AI agent that influences a business decision in financial services is a model. This includes:
- Credit scoring and underwriting agents
- Fraud detection and AML screening agents
- Market risk and portfolio optimization agents
- Customer service agents that determine account actions
- Document processing agents that extract terms from contracts
- Any agent whose output feeds into a regulated decision
2. The Three Pillars of SR 11-7
Pillar 1: Model Development, Implementation, and Use
Every model must have:
- Sound design: documented methodology, theoretical basis, and limitations
- Rigorous testing: not just accuracy metrics, but stress testing, sensitivity analysis, out-of-sample validation, and benchmarking against alternative approaches
- Documented implementation: code review, version control, change management procedures
The AI problem: Most agent frameworks have none of this. The model is a prompt template in a Python file, the "testing" is a developer clicking through three examples, and the implementation documentation is a Slack message saying "deployed to prod."
Pillar 2: Model Validation (Independent Review)
Every model requires independent validation by a party not involved in development:
- Conceptual soundness evaluation
- Outcomes analysis (comparing predictions to actuals)
- Back-testing and benchmarking
- Assessment of limitations and compensating controls
The AI problem: How do you independently validate a non-deterministic agent? Standard model validation assumes reproducible outputs, but an AI agent can give different answers to the same question. Validation therefore has to be statistical: distribution analysis of outputs, boundary testing, adversarial probing, and consistency scoring across repeated runs.
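One way to make consistency scoring across runs concrete is to replay the same input many times and measure how often the agent lands on its most common answer. The sketch below is illustrative only: `consistency_score` and `make_stub_agent` are hypothetical names, and the stub stands in for a real agent call so the example is reproducible.

```python
import itertools
from collections import Counter
from typing import Callable, Sequence


def consistency_score(agent: Callable[[str], str], prompt: str, runs: int = 20):
    """Call the agent `runs` times on one input.

    Returns (share of runs matching the most common output, full tally).
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    tally = Counter(agent(prompt) for _ in range(runs))
    modal_count = tally.most_common(1)[0][1]
    return modal_count / runs, tally


def make_stub_agent(sequence: Sequence[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a real agent: cycles through fixed outputs."""
    outputs = itertools.cycle(sequence)
    return lambda prompt: next(outputs)


# Stub that answers "refer" on every fourth call, "approve" otherwise.
stub = make_stub_agent(["approve", "approve", "approve", "refer"])
score, tally = consistency_score(stub, "applicant 123", runs=20)
print(score, dict(tally))  # 15 of the 20 runs agree on "approve"
```

In practice a validator would run this over a held-out set of inputs and report the distribution of scores per risk tier, since a single average can hide the inputs where the agent is least stable.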
Pillar 3: Governance, Policies, and Controls
A governance framework that covers the entire model lifecycle:
- Model inventory: every agent catalogued with risk tier, owner, use case, and dependencies
- Change management: formal approval process for prompt changes, model upgrades, data source modifications
- Ongoing monitoring: performance degradation detection, drift monitoring, anomaly alerting
- Annual review: re-validation and governance attestation
- Escalation procedures: clear paths for model failures, regulatory inquiries, and incident response
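For the ongoing-monitoring item above, one measure long used in credit risk is the population stability index (PSI), which compares the share of decisions per outcome bin at validation time against the share seen in production. SR 11-7 does not prescribe PSI; this is a minimal sketch, and the function name, bins, and example shares are assumptions.

```python
import math
from typing import Sequence


def population_stability_index(
    baseline: Sequence[float], current: Sequence[float], eps: float = 1e-6
) -> float:
    """PSI over matching bins, each given as a proportion of total volume.

    PSI = sum((current - baseline) * ln(current / baseline)) across bins.
    Shares are floored at `eps` so an empty bin does not cause log(0).
    """
    if len(baseline) != len(current):
        raise ValueError("baseline and current must have the same number of bins")
    psi = 0.0
    for b, c in zip(baseline, current):
        b, c = max(b, eps), max(c, eps)
        psi += (c - b) * math.log(c / b)
    return psi


# Hypothetical shares of approve / refer / decline outcomes.
validation_mix = [0.60, 0.25, 0.15]
production_mix = [0.45, 0.30, 0.25]
print(round(population_stability_index(validation_mix, production_mix), 4))
```

A commonly cited rule of thumb treats PSI above roughly 0.1 as worth investigating and above roughly 0.25 as a significant shift, but the alert thresholds belong in the institution's own monitoring policy.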
3. Where Most Banking AI Fails SR 11-7
Failure 1: No Model Inventory
Banks deploy AI agents across departments (risk, operations, customer service, compliance), and nobody maintains a central registry. When examiners ask "how many AI models do you have in production?", the answer is "we don't know." This is an immediate finding.
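A central registry does not need to be elaborate to answer that question. The sketch below assumes a simple in-memory inventory; `AgentRecord`, `ModelInventory`, the field names, and the 365-day validation window are all hypothetical choices for illustration, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class RiskTier(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass(frozen=True)
class AgentRecord:
    agent_id: str
    owner: str
    use_case: str
    risk_tier: RiskTier
    in_production: bool
    last_validated: Optional[date] = None
    dependencies: tuple = ()


class ModelInventory:
    def __init__(self) -> None:
        self._records = {}

    def register(self, record: AgentRecord) -> None:
        # Refuse duplicates so each agent has exactly one inventory entry.
        if record.agent_id in self._records:
            raise ValueError(f"{record.agent_id} is already registered")
        self._records[record.agent_id] = record

    def in_production(self) -> list:
        return [r for r in self._records.values() if r.in_production]

    def overdue_for_validation(self, today: date, max_age_days: int = 365) -> list:
        # Production agents never validated, or validated too long ago.
        return [
            r for r in self.in_production()
            if r.last_validated is None
            or (today - r.last_validated).days > max_age_days
        ]


inventory = ModelInventory()
inventory.register(AgentRecord("fraud-screen-01", "Fraud Ops", "card fraud triage",
                               RiskTier.HIGH, True, date(2025, 1, 10)))
inventory.register(AgentRecord("doc-extract-02", "Legal Ops", "contract term extraction",
                               RiskTier.MEDIUM, False))
print(len(inventory.in_production()))
```

The point of the structure is that "how many agents are in production?" and "which ones are overdue for validation?" become one-line queries instead of a multi-day survey.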
Failure 2: Non-Deterministic Validation
Traditional model validation assumes: same input → same output. AI agents violate this assumption. Banks try to validate agents with the same methods they use for logistic regression, and the results are meaningless. You need probabilistic validation frameworks.
Failure 3: No Audit Trail
Most agent frameworks produce logs, not audit trails. Logs tell you what happened. Audit trails prove it to a regulator. The difference is immutability, completeness, and traceability. A ClickHouse append-only table with cryptographic hash chains is an audit trail. A rotating JSON log file is not.
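To show what the hash chain adds, here is a minimal sketch using only the Python standard library. It keeps the chain in an in-memory list rather than in ClickHouse, and the function names and record fields are assumptions for illustration. Each entry's hash covers the previous entry's hash, so editing any earlier record breaks verification of the chain.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    """Append a record whose hash covers the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, payload),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash from the start; any edit breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


trail = []
append_entry(trail, {"agent": "uw-01", "input_id": "app-123", "output": "refer"})
append_entry(trail, {"agent": "uw-01", "input_id": "app-124", "output": "approve"})
print(verify_chain(trail))  # True for an untouched chain

trail[0]["payload"]["output"] = "approve"  # simulate after-the-fact tampering
print(verify_chain(trail))  # False: the first entry no longer matches its hash
```

A production design would also need to anchor the latest hash somewhere the writer cannot alter, since a party who can rewrite the whole chain can also recompute every hash.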
Failure 4: Prompt Changes Without Governance
A developer changes a system prompt and deploys to production in the same afternoon. Under SR 11-7, this is an ungoverned model change. Every prompt modification must go through the same change management process as a model parameter change in a traditional risk model.
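One lightweight way to enforce this at runtime is to fingerprint each approved prompt and refuse to load anything else. The sketch below is an assumption about how such a gate could look: `PromptChangeGate`, `UnapprovedPromptError`, and the ticket IDs are hypothetical, and a real gate would read approvals from the change management system of record.

```python
import hashlib


class UnapprovedPromptError(Exception):
    """Raised when a prompt has no recorded change approval."""


def prompt_fingerprint(prompt: str) -> str:
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


class PromptChangeGate:
    def __init__(self) -> None:
        # Maps prompt fingerprint -> approval ticket ID.
        self._approved = {}

    def approve(self, prompt: str, ticket_id: str) -> None:
        """Record that this exact prompt text passed change management."""
        self._approved[prompt_fingerprint(prompt)] = ticket_id

    def check(self, prompt: str) -> str:
        """Return the approval ticket for the prompt, or refuse to proceed."""
        ticket_id = self._approved.get(prompt_fingerprint(prompt))
        if ticket_id is None:
            raise UnapprovedPromptError("prompt text has no approval on record")
        return ticket_id


gate = PromptChangeGate()
gate.approve("You are a credit underwriting assistant.", "CHG-1001")

# The approved text loads and yields its ticket for the audit record.
print(gate.check("You are a credit underwriting assistant."))

# Any edit, however small, changes the fingerprint and is blocked.
try:
    gate.check("You are a credit underwriting assistant. Be lenient.")
except UnapprovedPromptError as exc:
    print("blocked:", exc)
```

Returning the ticket ID lets the agent stamp every decision with the approval that authorized the prompt it ran under.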
4. The Examination-Ready Architecture
Here is the technical architecture that satisfies SR 11-7 for AI agents:
- Model Registry: Centralized catalog of every agent, including risk tier, owner, approval status, validation date, and dependencies. Queryable by examiners.
- Immutable Audit Trails: ClickHouse-backed append-only storage for every agent input, output, reasoning chain, confidence score, and human override. Cryptographic hash chains for tamper detection.
- Probabilistic Validation Suite: Automated testing pipeline that measures output distribution stability, boundary behavior, adversarial robustness, and consistency scores. Runs continuously, not just at deployment.
- Change Management Gates: Every prompt change, model update, or data source modification requires formal approval, impact assessment, and re-validation trigger.
- OPA Policy Enforcement: Open Policy Agent rules that encode regulatory constraints as machine-readable policies. The agent cannot execute a decision that violates an active policy.
- Human-in-the-Loop Framework: Configurable confidence thresholds that route low-confidence or high-impact decisions to human review with full context.
- Examiner Dashboard: Read-only interface for regulators to query audit trails, view model documentation, and inspect governance records without engineering support.
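The Human-in-the-Loop item in the list above can be sketched as a small routing function. The threshold values, field names, and the `route` function are hypothetical; the actual cutoffs would come from the institution's risk appetite and the agent's risk tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    agent_id: str
    action: str
    confidence: float  # agent-reported confidence in [0, 1]
    amount: float      # monetary exposure of the decision


def route(
    decision: Decision,
    min_confidence: float = 0.90,       # assumed threshold
    max_auto_amount: float = 50_000.0,  # assumed impact limit
) -> str:
    """Send low-confidence or high-impact decisions to a human reviewer."""
    if decision.confidence < min_confidence:
        return "human_review"
    if decision.amount > max_auto_amount:
        return "human_review"
    return "auto_execute"


print(route(Decision("uw-01", "approve", 0.97, 12_000.0)))   # auto_execute
print(route(Decision("uw-01", "approve", 0.80, 12_000.0)))   # human_review
print(route(Decision("uw-01", "approve", 0.97, 250_000.0)))  # human_review
```

Keeping the thresholds as explicit parameters means a change to either one is itself a governed configuration change that can be approved and audited.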
5. What Examiners Look For
Based on recent OCC and Fed examination patterns, here is what examiners focus on when reviewing AI systems:
- "Show me your model inventory": they want a complete list within minutes, not days
- "Show me the validation report for this agent": independent validation, not self-assessment
- "Walk me through a decision": end-to-end traceability from input to output to audit record
- "What changed since last examination?": governed change management with approval records
- "What happens when this agent is wrong?": escalation procedures, human override records, incident history
If you cannot answer these questions in real time with documented evidence, you will receive findings.
6. Your Compliance Roadmap
- Month 1: Agent inventory and risk classification
- Month 2: Audit trail infrastructure (ClickHouse + hash chains)
- Month 3: Validation framework and initial assessments
- Month 4: Change management and OPA policy gates
- Month 5: Human oversight framework and examiner dashboard
- Month 6: Full governance documentation and examination dry run
Dioval Group specializes in making AI agents examination-ready for SR 11-7, EU AI Act, and SOC 2. We perform the regulatory gap assessment, build the compliance infrastructure, and deliver the documentation examiners expect.