What Is SPECA?
SPECA is a specification-driven security auditing framework built by NyxFoundation that turns natural-language specs into typed security properties and then runs structured proof-attempt audits against implementations. It is one of the best specification-driven auditing tools for security researchers, smart contract auditors, and application security teams. Its 2026 paper reports recovery of all 15 in-scope vulnerabilities in the Sherlock Ethereum Fusaka contest, plus 4 novel bugs confirmed by developer fix commits.
Quick Overview
| Attribute | Details |
|---|---|
| Type | Specification-Driven Security Auditing Framework |
| Best For | Security researchers, smart contract auditors, and application security teams |
| Language/Stack | Python 3.11+, Node.js CLI, uv, Anthropic Claude Code, MCP |
| License | MIT |
| GitHub Stars | N/A |
| Pricing | Open-Source |
| Last Release | N/A |
Who Should Use SPECA?
- Security researchers validating high-value code paths from a written specification, not just a pattern database.
- Smart contract auditors reviewing protocol invariants where the spec exists but implementation drift is subtle and expensive.
- Application security and platform teams that need repeatable triage, JSON artifacts, and phase-by-phase evidence for internal review.
- Teams using Claude-based agent workflows that want a deterministic scope file and auditable outputs instead of free-form prompting.
Not ideal for:
- Projects without a usable spec, scope document, or target metadata.
- Teams that only want one-shot scanner alerts with no review loop or artifact trail.
- Pipelines that cannot tolerate a Python 3.11, uv, and Claude Code setup.
Key Features of SPECA
- Spec-to-property translation: SPECA derives explicit, typed security properties from natural-language scope and spec material instead of relying on detector signatures. That makes the audit target clear before any code reasoning starts.
- Proof-attempt reasoning: Each candidate issue is evaluated as a failed attempt to prove an invariant, which is materially different from generic linting or taint warnings. This is why SPECA can explain why a finding survived the pipeline.
- Multi-phase audit pipeline: The repository exposes separate phases such as spec discovery, subgraph extraction, property generation, code resolution, audit map creation, and review. That structure gives you checkpoints instead of a single opaque score.
- Deterministic JSON artifacts: Outputs land in outputs/<phase>_PARTIAL_*.json, which makes the run easy to diff in CI and easy to inspect after the fact. You can browse these artifacts with speca-cli browse.
- Benchmark-backed evaluation: The repo includes RQ1, RQ2, and RQ2b harnesses plus paper figures, so SPECA is not just a demo pipeline. The 2026 paper positions it against real contest and benchmark data, including RepoAudit C/C++.
- CLI-first onboarding: speca-cli exposes doctor, init, run, and browse, which is a sane interface for developers who want terminal control rather than a web-only workflow. The TUI is the right level of abstraction for bootstrapping a target quickly.
- Human validation guardrails: SPECA explicitly marks findings as candidate vulnerabilities that must be checked by a human auditor before reporting. That keeps the framework honest and avoids pretending model output is final truth.
SPECA vs Alternatives
| Tool | Best For | Key Differentiator | Pricing |
|---|---|---|---|
| SPECA | Spec-first security audits with evidence-rich findings | Turns written requirements into typed audit checks and proof attempts | Open-source |
| CodeQL | Large-scale code query analysis across many languages | Mature query engine and deep ecosystem for static analysis | Free for OSS, paid for some private use |
| Semgrep | Fast CI scanning and rule authoring | Lightweight rules and quick developer adoption | Freemium |
| Slither | Solidity and EVM contract analysis | Solidity-native detectors with a fast local workflow | Open-source |
Choose SPECA when the spec matters as much as the code and you want a traceable audit trail, not just a detector hit. Choose CodeQL when your team already maintains query packs and needs broad language coverage at scale. Choose Semgrep when the priority is fast CI feedback. Choose Slither when the codebase is Solidity-heavy and you want a purpose-built EVM analyzer.
If you are building a broader review loop, SPECA pairs well with OpenSwarm for multi-agent coordination and OpenTrace for keeping evidence and reasoning paths visible across phases.
How SPECA Works
SPECA works by converting a natural-language scope into a typed audit model, then using that model to drive implementation analysis. The repo makes this concrete with BUG_BOUNTY_SCOPE.json and TARGET_INFO.json, which act as the control plane for what gets audited, what assumptions are allowed, and which target is in scope.
The core design is a transformation pipeline: spec text becomes candidate properties, properties become code-linked assertions, and assertions become proof attempts against the implementation. That is a better fit for security review than generic source scanning because it keeps the audit aligned with explicit invariants rather than incidentally discovered patterns.
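The chain from spec text to proof attempt can be pictured with a small data model. The sketch below is illustrative only: the class names, fields, and the attempt_proof helper are hypothetical and are not taken from the SPECA codebase, and a toy check over sample states stands in for the model-driven reasoning the real pipeline performs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    PROVED = "proved"
    CANDIDATE = "candidate"  # proof attempt failed; needs human review


@dataclass(frozen=True)
class SecurityProperty:
    """Hypothetical typed property derived from one spec sentence."""
    prop_id: str
    spec_excerpt: str
    kind: str  # illustrative label, e.g. "invariant"


@dataclass(frozen=True)
class CodeAssertion:
    """A property bound to a concrete (hypothetical) code location."""
    prop: SecurityProperty
    location: str
    check: Callable[[dict], bool]  # stand-in for real reasoning


def attempt_proof(assertion: CodeAssertion, states: list[dict]) -> Verdict:
    """Return PROVED only if the check holds on every sample state."""
    for state in states:
        if not assertion.check(state):
            return Verdict.CANDIDATE
    return Verdict.PROVED


prop = SecurityProperty("P1", "Balances must never be negative.", "invariant")
assertion = CodeAssertion(prop, "ledger.py:withdraw", lambda s: s["balance"] >= 0)

print(attempt_proof(assertion, [{"balance": 5}, {"balance": 0}]).value)   # proved
print(attempt_proof(assertion, [{"balance": 5}, {"balance": -1}]).value)  # candidate
```

Keeping the spec excerpt on the property is what lets a surviving candidate be traced back to the sentence that motivated it.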
The orchestration layer is intentionally inspectable. Intermediate artifacts are written out phase by phase, which means you can review a failed subgraph extraction, a weak property, or an overreaching code resolution step instead of waiting for a final binary answer. If you already run agentic investigations with OpenSwarm, SPECA fits that style because each stage is observable and replayable.
npx speca-cli@latest doctor
npx speca-cli@latest init
npx speca-cli@latest run --target 04
The doctor command checks the local toolchain, init scaffolds the scope files, and run launches the selected target workflow. After the run finishes, you should expect partial JSON artifacts under outputs/, which you can inspect directly or compare against later runs in CI.
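Comparing artifacts across runs could look roughly like the sketch below. It assumes both runs were copied into separate directories and that artifact file names are stable between runs, which the article does not confirm. The directory arguments and the compare_runs helper are hypothetical, and nothing here depends on the JSON schema inside the files.

```python
import json
from pathlib import Path


def load_canonical(path: Path) -> str:
    """Re-serialize JSON with sorted keys so key order alone never counts as a change."""
    return json.dumps(json.loads(path.read_text(encoding="utf-8")), sort_keys=True)


def compare_runs(baseline_dir: str, current_dir: str) -> dict[str, list[str]]:
    """Report artifact files that were added, removed, or changed between two runs."""
    base = {p.name: p for p in Path(baseline_dir).glob("*.json")}
    curr = {p.name: p for p in Path(current_dir).glob("*.json")}
    shared = sorted(base.keys() & curr.keys())
    return {
        "added": sorted(curr.keys() - base.keys()),
        "removed": sorted(base.keys() - curr.keys()),
        "changed": [
            name for name in shared
            if load_canonical(base[name]) != load_canonical(curr[name])
        ],
    }
```

In CI, a non-empty "changed" or "removed" list would be a reasonable trigger for a manual look rather than an automatic failure, since candidate findings still need human review.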
Pros and Cons of SPECA
Pros:
- Spec-anchored analysis reduces the chance that your audit drifts away from what the protocol or project actually promised.
- Interpretability is strong because each candidate maps back to a phase artifact, not just a final alert string.
- Benchmark evidence is credible: the 2026 paper reports all 15 in-scope Sherlock Fusaka vulnerabilities recovered and 4 novel bugs found.
- Repository structure is practical for engineering teams because the scripts, prompts, benchmarks, tests, and website are separated cleanly.
- CLI workflow makes it easy to automate runs in shells, scripts, and CI pipelines.
- Open-source licensing under MIT makes adoption straightforward for research and internal tooling.
Cons:
- Specification quality matters; if the scope is vague, SPECA can only reason over vague constraints.
- Setup overhead is real because the full workflow expects Python, uv, Claude Code, and MCP wiring.
- Human review is mandatory, so SPECA shortens the audit loop but does not replace an experienced auditor.
- SPECA is a research artifact first, so some teams will need integration work before it feels like a polished commercial product.
- Scope-driven coverage can miss bugs that sit outside the written requirements, which is the trade-off for higher precision.
Getting Started with SPECA
npm install -g @anthropic-ai/claude-code
npx speca-cli@latest doctor
npx speca-cli@latest init
npx speca-cli@latest run --target 04
The first command installs the Claude Code dependency that the full workflow expects. The next three check the local toolchain, scaffold the scope files, and execute a target. If you want the full repo-based orchestrator instead of the CLI wrapper, clone the repository, run uv sync, and set up MCP before calling the Python phase runner.
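A quick local check of the prerequisites can save a failed first run. The sketch below is a rough stand-in and does not reflect how speca-cli doctor is implemented. The executable names (node, npx, uv, claude) are assumptions about how each tool appears on PATH, and the preflight helper is hypothetical.

```python
import shutil
import sys

# Assumed executable names for Node.js, uv, and Claude Code on PATH.
REQUIRED_TOOLS = ("node", "npx", "uv", "claude")


def preflight(min_python=(3, 11), tools=REQUIRED_TOOLS) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {sys.version.split()[0]}")
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print(f"missing: {issue}")
    if not issues:
        print("environment looks ready")
```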
After the initial run, inspect outputs/<phase>_PARTIAL_*.json and use speca-cli browse to review the generated artifacts. The first target should be small enough to validate the environment, then you can iterate by refining BUG_BOUNTY_SCOPE.json and TARGET_INFO.json until the findings match your audit goal.
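For a quick look at what a run produced, the artifacts can be grouped by phase using the file name pattern mentioned above. This is a minimal sketch that assumes only the outputs/<phase>_PARTIAL_*.json naming. It makes no assumption about the JSON schema, and the helper names are hypothetical.

```python
import json
import re
from collections import defaultdict
from pathlib import Path

# Naming pattern from the article: <phase>_PARTIAL_<anything>.json
PARTIAL_RE = re.compile(r"^(?P<phase>.+?)_PARTIAL_.*\.json$")


def artifacts_by_phase(outputs_dir: str) -> dict[str, list[Path]]:
    """Group partial artifact files by the phase prefix in their file name."""
    grouped = defaultdict(list)
    for path in sorted(Path(outputs_dir).glob("*_PARTIAL_*.json")):
        match = PARTIAL_RE.match(path.name)
        if match:
            grouped[match.group("phase")].append(path)
    return dict(grouped)


def summarize(outputs_dir: str) -> None:
    """Print one line per artifact with its top-level JSON type and size."""
    for phase, paths in artifacts_by_phase(outputs_dir).items():
        for path in paths:
            try:
                data = json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError:
                print(f"{phase}: {path.name} (unreadable JSON)")
                continue
            size = len(data) if isinstance(data, (list, dict)) else 1
            print(f"{phase}: {path.name} ({type(data).__name__}, {size} top-level entries)")


if __name__ == "__main__":
    summarize("outputs")
```

A phase that produces no files, or only empty ones, is a useful early signal that the scope files need refinement before the later phases are worth reading.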
Verdict
SPECA is the strongest option for specification-first security audits when you have a clear scope and a human reviewer in the loop. Its biggest strength is the typed proof-attempt pipeline, and its main caveat is setup plus scope quality. If you need traceable, spec-grounded findings, SPECA is worth adopting.



