4-Step AI Audit Catches 'Almost Right' Errors
Dylan Davis
the gist
Dylan Davis outlines a four-step process for high-stakes tasks such as contract review or vendor due diligence: use separate AI chats to extract factual claims from an output, validate each claim against the source material under four labels (supported, conflicts, no proof, needs human judgment), and rewrite the output accordingly.
The Breakthrough
Dylan Davis developed a four-step audit process that uses fresh AI conversations to break AI-generated content into verifiable claims, check each claim against the source material under four labels, and rewrite the content to fix subtle misrepresentations.
What Actually Worked
- Finish the initial AI-generated artifact (document, Excel sheet, PowerPoint deck) through iteration until it is ready to ship, then decide whether the stakes (financial, legal, reputational) warrant the audit.
- In a new chat with a high-end model such as Claude 3 Opus or GPT-4o, paste this claim-extraction prompt: "I want you to break this write up into small factual claims. A claim is one fact that can be checked... list out all the factual claims... create a table with three columns: claim number, exact claim, what source you actually pulled it from that can prove this." Attach or paste both the AI output and the source material (a scripted sketch of all three chats follows this list).
- In another new chat, validate the claims against the source with this prompt: "Your goal here is to check all the claims against the source material... use four labels: supported, conflicts, no proof, needs human judgment... for each claim: label, exact source line or short quote, one-sentence reason." Each label maps to an action: keep supported claims, replace conflicts with the source's facts, remove or soften no-proof claims, and flag needs-human-judgment items for manual review.
- In a final new chat, rewrite with this prompt: "Rewrite the original writeup using the audit results below... only use the original writeup as the base... keep the same structure and style... for supported: keep; conflicts: use source facts; no proof: remove or soften; needs human: treat as uncertain." Attach the original writeup and the audit table.
- For extreme stakes, rotate models across the steps (e.g., Claude 3 Opus for finish and rewrite, GPT-4o for the split, Gemini 1.5 Pro for the check) so that no single model's biases go unchecked; the sketch below takes a model per step for exactly this reason.
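To make the three-chat flow concrete, here is a minimal sketch of the pipeline, assuming the OpenAI Python SDK; the `run_chat` and `audit` helpers are illustrative names (not from the video), and the prompts are condensed from the ones quoted above.

```python
# Minimal sketch of the three audit chats as independent API calls; each
# single-turn call is a fresh conversation, mirroring the "new chat" rule.
# Assumes the OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY;
# run_chat/audit are illustrative helpers, not names from the source.
from openai import OpenAI

client = OpenAI()

def run_chat(model: str, prompt: str) -> str:
    """One fresh, single-turn conversation: no history carries over."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def audit(
    writeup: str,
    source: str,
    split_model: str = "gpt-4o",
    check_model: str = "gpt-4o",
    rewrite_model: str = "gpt-4o",
) -> str:
    # Chat 1 - split the writeup into small, checkable factual claims.
    claims = run_chat(
        split_model,
        "Break this writeup into small factual claims. A claim is one fact "
        "that can be checked. Create a table with three columns: claim "
        "number, exact claim, and the source that can prove it.\n\n"
        f"WRITEUP:\n{writeup}\n\nSOURCE:\n{source}",
    )
    # Chat 2 - check every claim against the source with the four labels.
    audit_table = run_chat(
        check_model,
        "Check all the claims against the source material. Use four labels: "
        "supported, conflicts, no proof, needs human judgment. For each "
        "claim give: label, exact source line or short quote, one-sentence "
        f"reason.\n\nCLAIMS:\n{claims}\n\nSOURCE:\n{source}",
    )
    # Chat 3 - rewrite the original, applying the label-to-action rules.
    return run_chat(
        rewrite_model,
        "Rewrite the original writeup using the audit results below. Only "
        "use the original writeup as the base; keep the same structure and "
        "style. Supported: keep. Conflicts: use source facts. No proof: "
        "remove or soften. Needs human judgment: treat as uncertain.\n\n"
        f"ORIGINAL:\n{writeup}\n\nAUDIT:\n{audit_table}",
    )
```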
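The same sketch covers the model-rotation step: pass a different model id per stage. The ids below are placeholders, and rotating across vendors as Davis suggests would in practice need one client per provider rather than the single client above.

```python
# Extreme-stakes variant: a different model audits each stage (placeholder
# ids; cross-vendor rotation needs a client per provider in practice).
final = audit(writeup, source,
              split_model="gpt-4o",
              check_model="gemini-1.5-pro",
              rewrite_model="claude-3-opus")
```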
Context
The process targets high-stakes uses of AI where the subtle errors in 'almost right' outputs can mislead, such as contract reviews, vendor due diligence, investment analysis, or proposals; for the roughly 90% of tasks that are low-risk, skipping the audit avoids wasted effort. In one example, an AI summary claimed revenue grew 18% 'mainly driven by enterprise customers' (no proof; softened to an uncertain statement) and that the 'sales team became more efficient' (conflicts with the source; corrected).
Notable Quotes
- "The most dangerous AI answer isn't the one that's completely wrong... the dangerous one is the answer that's almost right."
- "Simply asking AI 'Are you sure?' doesn't actually work."
- "90% of the use cases aren't needed for this detailed audit process... optimized for those tasks that are high stakes either financially legally or whatever else."