Case Study — Audit Intelligence

CARE — an AI-augmented audit workspace where the AI does the reading, and the auditor keeps the decision

For a top-5 national health insurer's payment-integrity program. Fourteen AI components analyze every claim through an 8-stage pipeline; auditors work the output through a conversational review surface that remembers what they asked yesterday and captures every override as training data.

Client: Top-5 National Health Insurer
Industry: Healthcare, Payment Integrity
My Role: Lead Discovery + App Build
Timeline: 2026
Stack: Claude Code · LangChain · GCP
Engagement: Deloitte Consulting
01 — The Challenge

Claims audit at scale — but human attention doesn't scale

The Problem

  • Auditors opened 17 tabs to work one claim — policy PDF, claim form, provider history, precedent database
  • Audit decisions captured the verdict but lost the reasoning
  • AI recommendations were batched offline — auditors couldn't ask follow-ups
  • Override decisions were a compliance checkbox, not a training signal
  • Platform health was invisible — no way to catch a drifting agent until the SLA broke
  • Four audiences, four disconnected tools

The Shift

  • One workspace: AI summary, findings, similar cases, decision panel — in one view
  • Inline conversational review — Clara answers the auditor's actual question, in context
  • Override justification as labeled training data for the learning loop
  • AgentOps dashboard: 14 components + 8 stages + feedback loops, fully legible
  • Same data model surfaces differently for Auditor · Manager · Executive · Platform Ops
  • Audit trail that captures every reasoning step, not just the verdict
$30–40M · Estimated Annual Savings
14 · AI Components
91%+ · AI Decision Accuracy
4 · Persona Workspaces
02 — User Research

Four audiences, one pipeline

Same 14-component pipeline underneath; four very different surfaces on top. The auditor owns the decision; the manager owns the pattern; the executive owns the outcome; platform ops owns the machine.

Auditor
Senior Claims Auditor
Reviews flagged claims, validates AI findings, makes the final call — approve, deny, or escalate.
Pain: The prior tool meant 17 tabs and 40 minutes per claim, with half that time spent hunting for precedent and policy.
Manager
Audit Team Lead
Reviews auditor decisions, approves rule configurations, monitors override patterns across the team.
Pain: No way to tell if the team is agreeing with the AI or routinely overriding it — both signals matter.
Executive
VP, Payment Integrity
Reports portfolio outcomes, justifies AI investment with performance metrics at board level.
Pain: Needs one number for the C-suite, the metric trail for the CFO, and the auditor override rate for the risk committee.
AgentOps
AI Platform Lead
Monitors the 14-component pipeline, catches bottlenecks, tunes agents when override patterns surface.
Pain: When the SLA slips, need to know immediately: which agent? Which stage? Token cost spike? Queue backup?
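One way to picture "one pipeline, four surfaces" is a single review record that each persona aggregates differently. The sketch below is illustrative only: the Review fields, the view_for helper, and the auditor names are assumptions, not the production data model, and the auditor's own claim-by-claim view is left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    """One reviewed claim. Field names are hypothetical."""
    auditor: str
    component: str          # pipeline component that produced the recommendation
    ai_recommendation: str
    auditor_decision: str

    @property
    def is_override(self) -> bool:
        return self.ai_recommendation != self.auditor_decision


def override_rate(reviews: list[Review]) -> float:
    """Share of reviews where the auditor disagreed with the AI."""
    if not reviews:
        return 0.0
    return sum(r.is_override for r in reviews) / len(reviews)


def override_rate_by(reviews: list[Review], field: str) -> dict[str, float]:
    """Override rate grouped by one attribute of the review record."""
    groups: dict[str, list[Review]] = defaultdict(list)
    for r in reviews:
        groups[getattr(r, field)].append(r)
    return {key: override_rate(group) for key, group in groups.items()}


def view_for(persona: str, reviews: list[Review]) -> dict[str, float]:
    """Same records, different aggregation per audience."""
    if persona == "manager":      # override patterns across the team
        return override_rate_by(reviews, "auditor")
    if persona == "executive":    # one portfolio-level number
        return {"portfolio": override_rate(reviews)}
    if persona == "agentops":     # which component is being overridden
        return override_rate_by(reviews, "component")
    raise ValueError(f"unknown persona: {persona}")


# Synthetic sample data, for illustration only.
SAMPLE = [
    Review("auditor_a", "Audit Agent", "deny", "deny"),
    Review("auditor_a", "Audit Agent", "deny", "approve"),
    Review("auditor_b", "Quality Check", "approve", "approve"),
    Review("auditor_b", "Audit Agent", "deny", "deny"),
]
```

With the synthetic sample, view_for("manager", SAMPLE) groups by auditor, while view_for("agentops", SAMPLE) groups the same four records by component.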
03 — MVP Walkthrough

Four live moments: the interactions, not just screenshots

Rather than flat images, each of the four moments below is a live React reconstruction from the prototype, with the same typing animation, the same conversational flow, and the same override gate that auditors see in the real product.

Moment 01 · Platform anatomy

Fourteen components, eight stages, one claim in motion

Every claim passes through the same pipeline. Audit Agent reads the codes, Quality Check flags the outliers, Orchestration routes by risk score. By the time it reaches Human Review, the auditor is inheriting context — not a blank queue.

Stage 01 · Intake · tool
Stage 02 · Validation · tool
Stage 03 · Orchestration · service
Stage 04 · Audit Agent · agent
Stage 05 · Quality Check · agent
Stage 06 · Assignment · service
Stage 07 · Human Review · human
Stage 08 · Feedback · agent
Component types: AI agent · Service · Tool · Human
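The eight stages can be written down as plain data. This is a minimal sketch: the stage names and types come from the walkthrough, while the Stage class and both helper functions are hypothetical and say nothing about how the real orchestration is implemented.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    TOOL = "tool"
    SERVICE = "service"
    AGENT = "agent"
    HUMAN = "human"


@dataclass(frozen=True)
class Stage:
    number: int
    name: str
    kind: Kind


# Stage names and kinds as listed in the walkthrough.
PIPELINE = (
    Stage(1, "Intake", Kind.TOOL),
    Stage(2, "Validation", Kind.TOOL),
    Stage(3, "Orchestration", Kind.SERVICE),
    Stage(4, "Audit Agent", Kind.AGENT),
    Stage(5, "Quality Check", Kind.AGENT),
    Stage(6, "Assignment", Kind.SERVICE),
    Stage(7, "Human Review", Kind.HUMAN),
    Stage(8, "Feedback", Kind.AGENT),
)


def stages_before(name: str) -> list[str]:
    """Stages a claim has already passed through when it reaches `name`."""
    names = [s.name for s in PIPELINE]
    return names[: names.index(name)]


def count_by_kind() -> dict[str, int]:
    """How many stages of each type the pipeline contains."""
    counts: dict[str, int] = {}
    for s in PIPELINE:
        counts[s.kind.value] = counts.get(s.kind.value, 0) + 1
    return counts
```

Calling stages_before("Human Review") lists the six upstream stages whose output the auditor inherits as context.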
Moment 02 · Conversational review

Auditors don't scroll documentation. They ask.

Every workspace embeds Clara, the audit assistant. She surfaces the AI's reasoning, cites similar cases, and answers what the auditor actually wants to know — “were any of these overturned on appeal?”. The conversation is the documentation.

Ask Clara · CLM-2024-8598
Moment 03 · The case dossier

Summary, findings, precedent — pre-assembled by the time the auditor arrives

Before the auditor clicks into a claim, the AI has already done the heavy lift. A TL;DR summary (with a specific recommendation and confidence), confidence-ranked AI findings, deterministic rule-based findings from the Config engine, and the most similar precedent from the last 90 days. Switch tabs to see what each surface does — and notice that 50% of the similar cases were overturned on appeal, which is itself the signal the auditor weighs.

CLM-2024-8598 · Amount $4,280 · DOS 02/12/2024 · Provider NPI 1023456789

AI Recommendation: DENY · Modifier 59 unbundling · 87% confidence
  • Policy triggered: MFA-2024-14 §3.2 (bundled pair + modifier)
  • Documentation gap: Operative note missing separate-site reference
  • Precedent: 2 of 4 similar cases overturned — both on documentation
  • Recommended reason code: CO-B20
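Two pieces of the dossier are simple computations: ordering findings by confidence and reporting how often similar cases were overturned. The sketch below assumes hypothetical Finding and Precedent records. The precedent IDs and the per-finding confidence values are placeholders, not data from the engagement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    text: str
    confidence: float  # 0.0 to 1.0


@dataclass(frozen=True)
class Precedent:
    claim_id: str
    overturned_on_appeal: bool


def rank_findings(findings: list[Finding]) -> list[Finding]:
    """Highest-confidence finding first."""
    return sorted(findings, key=lambda f: f.confidence, reverse=True)


def overturn_rate(precedents: list[Precedent]) -> float:
    """Share of similar cases that were overturned on appeal."""
    if not precedents:
        return 0.0
    overturned = sum(p.overturned_on_appeal for p in precedents)
    return overturned / len(precedents)


# Placeholder data mirroring "2 of 4 similar cases overturned".
SIMILAR = [
    Precedent("PRECEDENT-A", True),
    Precedent("PRECEDENT-B", True),
    Precedent("PRECEDENT-C", False),
    Precedent("PRECEDENT-D", False),
]

# Confidence values here are invented for the example.
FINDINGS = [
    Finding("Documentation gap: operative note missing separate-site reference", 0.74),
    Finding("Policy triggered: bundled pair + modifier", 0.87),
]
```

Here overturn_rate(SIMILAR) evaluates to 0.5, the 50% figure the auditor weighs against the recommendation.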
Moment 04 · Human override as labeled data

Agreement is easy. The override is where accountability lives.

When the auditor aligns with the AI, one click confirms and the denial reason pre-populates. When they override — approve a claim the AI said to deny, or escalate what it approved — a justification is required. That note flows to the learning loop as labeled training data. Try it:

AI Recommendation: DENY · Modifier 59 unbundling · Confidence 87%
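The override gate reduces to one rule: agreement passes straight through, disagreement needs a note. The following is a minimal sketch under that assumption. The function name, the record fields, and the exception type are hypothetical, and the real product may validate more than this.

```python
from dataclasses import dataclass
from typing import Optional

DECISIONS = {"approve", "deny", "escalate"}


class JustificationRequired(ValueError):
    """Raised when an override is submitted without a note."""


@dataclass(frozen=True)
class ReviewRecord:
    claim_id: str
    ai_recommendation: str
    auditor_decision: str
    is_override: bool
    justification: Optional[str]  # becomes the label text for the learning loop


def record_decision(
    claim_id: str,
    ai_recommendation: str,
    auditor_decision: str,
    justification: Optional[str] = None,
) -> ReviewRecord:
    """Accept agreement with one click; require a note on any override."""
    for value in (ai_recommendation, auditor_decision):
        if value not in DECISIONS:
            raise ValueError(f"unknown decision: {value}")

    is_override = auditor_decision != ai_recommendation
    note = (justification or "").strip()

    # Whitespace-only notes do not count as a justification.
    if is_override and not note:
        raise JustificationRequired("an override needs a justification note")

    return ReviewRecord(
        claim_id=claim_id,
        ai_recommendation=ai_recommendation,
        auditor_decision=auditor_decision,
        is_override=is_override,
        justification=note or None,
    )
```

Raising on a missing note, rather than silently saving the decision, is what keeps an override from ever being recorded without its reasoning.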
04 — Impact

Decision speed up, override loop feeding the model

$30–40M estimated annual savings

Higher denial accuracy plus a lower appeals-overturn rate contribute an estimated $30–40M in recovered annual spend for the payer — the largest impact lever in the payment-integrity program this year.

Review time down 42%

Auditors spend minutes per claim instead of tens of minutes — the AI brief, similar-cases panel, and inline chat remove the hunt for context.

91.2% AI-human agreement

When the AI recommends and the auditor decides, they align about 9 times out of 10. The remaining 8.8% of decisions, the disagreements, are the most valuable training signal the model gets.
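For reference, an agreement rate of this kind is the share of decisions where the auditor's call matches the AI's recommendation. The sketch below assumes a simple list of (AI recommendation, auditor decision) pairs. The sample is synthetic and only chosen to reproduce the 91.2% figure, not drawn from program data.

```python
def agreement_rate(pairs: list[tuple[str, str]]) -> float:
    """Fraction of (ai_recommendation, auditor_decision) pairs that match."""
    if not pairs:
        raise ValueError("no decisions to score")
    agreed = sum(1 for ai, human in pairs if ai == human)
    return agreed / len(pairs)


# Synthetic log: 912 agreements and 88 overrides out of 1,000 decisions.
SAMPLE_LOG = [("deny", "deny")] * 912 + [("deny", "approve")] * 88

rate = agreement_rate(SAMPLE_LOG)
disagreement = 1 - rate
```

On this sample the agreement rate is 0.912 and the disagreement share is about 0.088, which is the slice that feeds the learning loop.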

Override is never silent

Every disagreement with the AI requires a justification note. Not compliance theater — the note becomes labeled data for the learning loop.

05 — Reflection

Designing AI that works with experts, not in place of them

The hardest design question on this case wasn't “how do we show the AI output?” — it was “what happens when the auditor disagrees?” That moment of disagreement is where an AI-augmented workflow either earns trust or loses it. Too frictionless, and overrides become thoughtless; too heavy, and auditors route around the tool. Putting a required justification on the override — and then telling the auditor where that note actually goes (the learning loop) — reframed it from compliance theater to collaboration with the model.

Built with Claude Code: Led discovery and built the application on the same keyboard, with Claude Code for the UI and LangChain agents on GCP for the reasoning layer. No handoff gap between design spec and shipped prototype.
Interaction frame: AI is an auditor on your bench, not a replacement — every surface answers that question before it answers anything else
Override as data: The blocker moment (override justification) is also the best data we get — treat it like a labeled-data pipeline, not a compliance form
Pipeline visibility: A 14-component multi-agent system is invisible to end-users by design, but fully legible to AgentOps when it drifts