DeepSec CLI Secures AI-Generated Codebases

AI LABS, 2026-05-08

The Gist

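Vercel's DeepSec automates security scans of large repositories by combining regex pre-filtering with batched Claude Code analysis across four commands (init, scan, process, report). Its reported false positive rate is 10 to 20 percent, and when known vulnerabilities are listed in info.md it focuses on issues beyond them.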

The Breakthrough

Vercel released DeepSec, a CLI security harness that filters codebases with regex patterns for vulnerabilities and runs batched parallel investigations using Claude Code Opus 4.7 at max effort or GPT 5.5 at x-high reasoning.

What Actually Worked

Developers execute deepsec init to create a .deepsec folder, install dependencies, and generate info.md via a pasted prompt in Claude Code; this file details project overview, authentication flow, threat models, project patterns, and known false positives.
The deepsec scan command runs regex matching on all files to identify and list security-sensitive ones quickly without agent involvement.
deepsec process splits the filtered files into batches of about five, assembles a fresh framework-specific prompt with the project info for each batch, and processes the batches in parallel. By default it uses a Claude Code subscription (no API key needed); it can also use configured keys for the Claude agent SDK or Codex agent SDK. It resumes after errors and estimates token costs.
deepsec report merges, deduplicates, and normalizes findings into JSON and Markdown reports categorized by severity, including source lines, commit introducer, responsible author, confidence, recommended fixes, and reproduction steps.
The optional deepsec revalidate command cross-checks findings for false positives, and the export step writes per-issue files, ordered by priority, into a findings folder.
The video provides a Claude Code skill that automates the full workflow (init to export) from one prompt, bundling assets, evals, reference scripts, and gap coverage.
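The scan, process, and report steps above can be sketched in a few lines of Python. This is only an illustration of the general technique (cheap regex filter, batches of about five, parallel workers, deduplicated findings sorted by severity), not DeepSec's actual implementation. The patterns, the function names, the finding fields, and the `investigate` stub that stands in for the agent call are all hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical patterns; DeepSec's real rule set is not shown in the source.
SENSITIVE_PATTERNS = [
    re.compile(r"\beval\s*\("),
    re.compile(r"(?i)password|secret|api[_-]?key"),
    re.compile(r"(?i)\bselect\b.+\bfrom\b"),
]

def scan(root: Path) -> list[Path]:
    """Cheap pass with no agent involved: keep files matching any pattern."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, skip it
        if any(p.search(text) for p in SENSITIVE_PATTERNS):
            hits.append(path)
    return hits

def batches(items, size=5):
    """Split the filtered files into groups of at most `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def investigate(batch):
    """Stand-in for the expensive agent call; returns one finding per file."""
    return [
        {"file": str(p), "title": "needs review", "severity": "medium"}
        for p in batch
    ]

def process(files, workers=4):
    """Run the batches in parallel and flatten the per-batch findings."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(investigate, batches(files))
        return [f for batch_findings in results for f in batch_findings]

SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def report(findings):
    """Deduplicate on (file, title), then sort by severity."""
    unique = {(f["file"], f["title"]): f for f in findings}
    return sorted(
        unique.values(),
        key=lambda f: SEVERITY_ORDER.get(f["severity"], 99),
    )
```

Calling `report(process(scan(Path("."))))` would run the whole sketch on the current directory. The point of the ordering is that the regex pass touches every file but costs no tokens, so the agent only sees the files that survive the filter.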

Before / After

DeepSec's reported false positive rate is 10 to 20 percent on tested codebases. On an OWASP practice app with 10 documented vulnerabilities, DeepSec surfaced three findings because info.md listed those vulnerabilities as known and directed it to look beyond them. A direct Claude Code review identified 39 issues initially, which narrowed to 13 after scoping instructions. On a second app, with no pre-known vulnerabilities in info.md, DeepSec found nine scoped issues with severity levels and fixes, while Claude found 39 unscoped issues (13 once scoped). DeepSec missed runtime issues such as CORS misconfigurations and dynamic logic flaws.

Context

AI coding tools accelerate app development but introduce security risks at higher rates, including agents deleting entire projects or production databases and leaks like Apple's internal claude.md. DeepSec addresses this with a systematic, scalable workflow optimized for large repos: cheap regex pre-filtering keeps expensive agent token usage down, batching enables parallelism, git metadata adds accountability, and structured outputs feed human or agent fixes. The video runs the full setup on macOS with a Claude Code CLI subscription, tests DeepSec on real vulnerable apps, iterates on what it missed, and packages a skill for repeatability, filling the gaps left by ad-hoc agent reviews.
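The git metadata mentioned above (the commit that introduced a line and its author) can be recovered with `git blame --porcelain`. The following is a minimal sketch of that idea, not DeepSec's own code; the function names `parse_blame_porcelain` and `blame_line` are hypothetical.

```python
import subprocess

def parse_blame_porcelain(output: str) -> dict:
    """Extract commit hash and author from porcelain blame output for one line."""
    lines = output.splitlines()
    # The first header line starts with the commit hash.
    info = {"commit": lines[0].split()[0]}
    for line in lines[1:]:
        if line.startswith("author "):
            info["author"] = line[len("author "):]
        elif line.startswith("author-mail "):
            info["author_mail"] = line[len("author-mail "):].strip("<>")
    return info

def blame_line(path: str, line_no: int) -> dict:
    """Ask git who last changed a single line of a tracked file."""
    out = subprocess.run(
        ["git", "blame", "--porcelain", "-L", f"{line_no},{line_no}", "--", path],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_blame_porcelain(out)
```

Inside a git repository, `blame_line("app.py", 42)` would return the commit and author for line 42 of a tracked file named `app.py` (the file name here is only an example). Attaching that record to each finding is one way a report could point at the responsible change.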

Notable Quotes

"The reported false positive rate sits around 10 to 20 percent, which is strong for an AI security testing workflow at this scale."
"DeepSec expected an app where the 10 vulnerabilities are already known and it only focused on issues besides them because they were already known."
"Claude tends to identify other issues in addition to the scope along the way. It does not solely focus on the scoped issues that DeepSec was specifically designed for."

Content References

No external books, papers, reports, podcasts, datasets, or events receive detailed review, citation, or recommendation. Tools like Claude Code and Codex appear as implementation backends.