VulnClaw: AI-driven goal-oriented penetration testing and automation tool

VulnClaw is an automated penetration testing platform built on LLM agents and a plugin toolchain, emphasizing goal-driven convergence and evidence-level verification. It suits authorized testing and teaching, but licensing, model dependency, and runtime isolation risks need attention.

GitHub: Unclecheng-li/VulnClaw | Updated: 2026-06-30 | Branch: main | Stars: 1.1K | Forks: 171

LLM Agent | Penetration Testing | CLI + Web UI | Plugin/Skill System

💡 Deep Analysis

How does the evidence-level anti-hallucination gate work, and what are its boundary conditions and weaknesses in practice?

Core Analysis

Mechanism Overview: VulnClaw accepts a “flag/vulnerability confirmed” claim only if the claimed string appears verbatim in a real tool output (HTTP response, python_execute output, or packet capture); otherwise the claim is discarded or marked unverified.

Technical Strengths

Strict prevention of fabricated conclusions: Verbatim matching greatly reduces the chance of an LLM claiming a false win, which is especially valuable with weaker models.
Traceable evidence: Accepted conclusions are backed by specific tool outputs for reporting and auditing.

Boundary Conditions & Weaknesses

Output truncation/tampering: WAFs, proxies, or timeouts can make evidence incomplete, causing true findings to be marked unverified.
Encoding/format discrepancies: Escaping, Base64, compression, or chunked responses can break verbatim matching.
Multi-step evidence aggregation: Some PoCs require stitching evidence across multiple responses; single-response matching is insufficient.

Practical Recommendations

Make matching strategies configurable (regex, decode-and-match, cross-response aggregation) to reduce false negatives.
Persist raw tool outputs for manual verification; when necessary, allow relaxed matching with confidence annotations.
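
The matching strategies recommended above can be sketched as a small layered gate. This is an illustrative sketch only, not VulnClaw's actual implementation: the function names (`check_claim`, `_base64_fragments`), the return labels, and the naive "concatenate all outputs" aggregation rule are assumptions made for the example. Each relaxed layer returns a distinct label so that a report can carry a confidence annotation.

```python
import base64
import binascii
import re


def _base64_fragments(output: str) -> list:
    """Decode any Base64-looking tokens in a tool output (hypothetical helper)."""
    decoded = []
    # Candidate tokens: runs of 16+ Base64 alphabet characters plus padding.
    for token in re.findall(r"[A-Za-z0-9+/]{16,}={0,2}", output):
        try:
            decoded.append(base64.b64decode(token, validate=True).decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid Base64 text, ignore it
    return decoded


def check_claim(claim: str, tool_outputs: list) -> str:
    """Return 'verified', 'verified-decoded', 'verified-aggregated', or 'unverified'."""
    # 1. Strict gate: the claim appears verbatim in one raw output.
    if any(claim in out for out in tool_outputs):
        return "verified"
    # 2. Decode-and-match: the claim appears inside a Base64-decoded fragment.
    for out in tool_outputs:
        if any(claim in fragment for fragment in _base64_fragments(out)):
            return "verified-decoded"
    # 3. Cross-response aggregation (assumed rule): the claim is split
    #    across consecutive outputs and reappears when they are joined.
    if claim in "".join(tool_outputs):
        return "verified-aggregated"
    return "unverified"
```

The relaxed labels are weaker evidence than a strict match, since joining outputs can create strings that no single response contained, so findings carrying them are good candidates for manual review.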

Important: The evidence gate is effective against hallucination, but atomic string matching alone is not sufficient. Adjust matching rules to the environment and keep manual review paths.

Summary: The evidence gate is valuable for reducing false positives, but it needs robust output handling and configurable matching to cope with real-world network and encoding complexities.

As a new user, what is the learning curve and common pitfalls for VulnClaw, and what are clear best practices?

Core Analysis

Onboarding Difficulty: VulnClaw is powerful but has a moderate-to-high learning curve. Users need pentest fundamentals (fingerprinting, injection, PoC validation), LLM configuration knowledge (provider, api_key, model), and container/network debugging skills (Docker networking, host.docker.internal).

Common Pitfalls

Hallucinations and false positives still occur: Even with evidence gates, truncated or misconfigured tool outputs can cause valid findings to be discarded or misjudged.
Docker networking traps: localhost inside containers refers to the container itself; scanning host or other containers requires special network setup.
Resource & cost overruns: LLM calls and long-running cycles can incur significant API and compute costs.
python_execute risk: Built-in execution is not a strong isolation sandbox and may leak information or run unsafe commands.
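
The Docker networking pitfall above can be illustrated with a small helper that rewrites loopback targets before a scan. This is a sketch under stated assumptions: `rewrite_for_container` and its parameters are hypothetical names, not part of VulnClaw, and the caller is assumed to know whether it is running inside a container. On Linux hosts, `host.docker.internal` is usually not defined by default and is commonly added with `--add-host=host.docker.internal:host-gateway`.

```python
from urllib.parse import urlsplit, urlunsplit

# Hostnames that resolve to the container itself when used inside one.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def rewrite_for_container(url: str, in_container: bool,
                          host_alias: str = "host.docker.internal") -> str:
    """Point loopback URLs at the Docker host when running in a container.

    Hypothetical helper: userinfo in the URL (user:pass@) is dropped.
    """
    parts = urlsplit(url)
    if not in_container or parts.hostname not in LOOPBACK_HOSTS:
        return url  # nothing to rewrite
    netloc = host_alias if parts.port is None else f"{host_alias}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))
```

For example, `rewrite_for_container("http://localhost:8080/login", in_container=True)` yields a URL on `host.docker.internal:8080`, while the same call with `in_container=False` returns the input unchanged.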

Best Practices

Phase your onboarding: Start with quick/recon in an authorized isolated lab to watch Fact/Intent flows before enabling exploits or python_execute.
Set clear boundaries: Use TUI/config to set allow/deny actions, port/path whitelists, and dry-run mode.
Persist logs & evidence: Enable session persistence and export raw tool outputs for manual verification and auditing.
Control costs: Monitor LLM API usage, pick appropriate models per target, and tune timeouts/concurrency.
Sandbox execution: Disable python_execute in untrusted environments or run it inside controlled containers/sandboxes.
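
As a rough illustration of the "controlled execution" idea above, the sketch below runs a snippet in a separate interpreter with a timeout, a throwaway working directory, and secret-looking environment variables removed. It is not how VulnClaw's python_execute works, and it is not a security sandbox: `run_snippet` and the `SENSITIVE_MARKERS` list are assumptions for the example, and real isolation still needs a container or VM.

```python
import os
import subprocess
import sys
import tempfile

# Assumed heuristic: drop environment variables whose names look like secrets.
SENSITIVE_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def run_snippet(code: str, timeout_s: float = 5.0) -> dict:
    """Run Python code in a child interpreter with basic guard rails."""
    env = {
        name: value for name, value in os.environ.items()
        if not any(marker in name.upper() for marker in SENSITIVE_MARKERS)
    }
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                capture_output=True, text=True,
                timeout=timeout_s, cwd=workdir, env=env,
            )
        except subprocess.TimeoutExpired:
            return {"status": "timeout", "stdout": "", "stderr": ""}
    status = "ok" if proc.returncode == 0 else "error"
    return {"status": status, "stdout": proc.stdout, "stderr": proc.stderr}
```

Keeping the raw stdout and stderr in the result also supports the evidence-persistence practice listed above, since the unmodified output can be stored for later manual verification.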

Important: Do not treat automation output as final. Manually verify any critical finding against raw evidence.

Summary: Stepwise validation, least privilege, and evidence verification significantly reduce onboarding risks and common failures.

How do I extend or integrate VulnClaw into existing toolchains (plugins, MCP services, report export)?

Core Analysis

Extension Points: VulnClaw provides a low-coupling plugin system and MCP (tool) abstractions to integrate custom detection logic, browser automation, and packet-replay capabilities into the main pipeline; findings are merged into structured reports and runnable PoCs.

Integration Methods

Plugins (vulnclaw/plugins/): Implement specific vulnerability checks. Plugin results merge into SessionState.findings and participate in the blackboard Fact write-back.
MCP service integration: Wrap existing capture/browser automation services as MCPs (implementing fetch/chrome-devtools/burp interfaces) so the agent can invoke them remotely and treat outputs as Facts.
Report & PoC export: The agent can emit structured Markdown reports and Python PoCs for import into bug trackers or manual review.

Implementation Notes

Preserve verifiable outputs: Integrated tools must return raw outputs (HTTP bodies, captures, python outputs) to trigger the evidence gate.
Follow Fact/Intent protocol: Plugins should write confirmed conclusions back as Facts rather than emitting unverified claims in reports.
Version/compatibility testing: Validate compatibility across LLM providers and MCP versions and test behavior under different models.
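
The first two notes above (return raw outputs, write back only confirmed conclusions) can be sketched with a toy blackboard. Everything here is hypothetical: `Finding`, `Blackboard`, and `write_back` are illustrative names, and the real plugin interface and `SessionState` structure in the repository may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    title: str
    claim: str       # exact string the plugin says proves the issue
    raw_output: str  # unmodified tool output (HTTP body, capture, script output)


@dataclass
class Blackboard:
    facts: list = field(default_factory=list)
    unverified: list = field(default_factory=list)

    def write_back(self, finding: Finding) -> bool:
        """Promote a finding to a Fact only if its claim is in the raw output."""
        if finding.claim and finding.claim in finding.raw_output:
            self.facts.append(finding)
            return True
        self.unverified.append(finding)  # kept for manual review, not reported
        return False
```

A plugin that summarizes or truncates its tool output before returning it would fail this check even for a genuine finding, which is why integrated tools need to pass raw output through.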

Important: Prioritize the completeness and persistence of tool outputs during integration; otherwise the evidence-driven closed loop loses its effectiveness.

Summary: Using plugins and MCP abstractions, VulnClaw can be integrated with existing toolchains and produce auditable findings and PoCs, provided integrated tools deliver consistent, verifiable outputs.

In real red-team or long-term monitoring scenarios, how should VulnClaw's continuous pentesting and persistent session features be used, and what risks must be controlled?

Core Analysis

Capability Summary: VulnClaw supports continuous pentesting and persistent sessions (e.g., default 100 rounds/period × 10 periods) and retains failure memories in persistent mode to evolve payloads and strategies across cycles.

Benefits

Long-term trend detection: Tracks configuration or vulnerability surface changes over time and generates periodic reports for asset management.
Incremental bypass capability: Failure classification and L0-L4 escalation help the agent progressively try more sophisticated bypasses across cycles.

Risks & Controls

Business impact: Long-running, frequent probes may trigger alerts or affect production. Limit them to authorized windows and low-frequency probing.
Ongoing costs: Periodic runs incur continuous LLM and compute costs. Set budgets and auto-pause thresholds.
Data & privilege risks: Persisted sessions may contain sensitive outputs. Use access controls and encrypted storage.
Accumulation of false positives/misoperations: Automation can accumulate unverified conclusions. Schedule regular human reviews and state cleanup.

Operational Recommendations

Use a “low-frequency probes + periodic deep testing” model: low-cost models for heartbeats; trigger high-cost deep pentests only on detected changes.
Enforce approvals & alerts for enabling exploits or python_execute in cycles.
Configure budgets and auto-pause policies based on tokens/cost/time thresholds; export intermediate reports on pause.
Harden auditability and storage security: encrypt persisted data and restrict access, preserving an operation audit trail.
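
The budget and auto-pause recommendation above can be sketched as a small guard object that a periodic run checks before each round. This is a minimal sketch, not a VulnClaw feature: `BudgetGuard`, its fields, and the threshold values in the usage note are assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BudgetGuard:
    max_tokens: int
    max_cost_usd: float
    max_seconds: float
    tokens_used: int = 0
    cost_usd: float = 0.0
    started_at: float = field(default_factory=time.monotonic)

    def record(self, tokens: int, cost_usd: float) -> None:
        """Add the usage reported for one LLM call."""
        self.tokens_used += tokens
        self.cost_usd += cost_usd

    def exceeded(self, now: Optional[float] = None) -> Optional[str]:
        """Return which limit was hit ('tokens', 'cost', 'time'), or None."""
        elapsed = (time.monotonic() if now is None else now) - self.started_at
        if self.tokens_used >= self.max_tokens:
            return "tokens"
        if self.cost_usd >= self.max_cost_usd:
            return "cost"
        if elapsed >= self.max_seconds:
            return "time"
        return None
```

A run loop would call `record` after every model call and stop as soon as `exceeded()` returns a reason, exporting an intermediate report at that point, for example with `BudgetGuard(max_tokens=200_000, max_cost_usd=5.0, max_seconds=3600)`.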

Important: Continuous features are powerful but risky. Use approvals, budgets, and isolation when applying them in red-team or monitoring contexts.

Summary: With low-frequency probing, targeted deep tests, budget controls, and audit governance, VulnClaw’s persistence is valuable for long-term assessment, but only under strict risk and compliance controls.

How should I configure LLM provider, cost controls, and concurrency to get predictable results in real tests?

Core Analysis

Context: VulnClaw supports multiple LLM providers and customizable base_url/model, with defaults for long-running periodic runs (e.g., 100 rounds/period × 10 periods). Model choice, concurrency, and loop configuration directly impact detection quality and cost.

Technical Recommendations

Tiered model strategy: Use low-cost/low-latency models for broad reconnaissance (fingerprinting, port/dir enumeration). When a high-value Intent is found, switch to a higher-quality model for exploitation and report generation.
Budget & period limits: Do not enable long-term periodic runs without budgets. Set per-session max rounds, total token/API limits, and per-target time budgets.
Concurrency & timeout control: Limit concurrent requests to avoid hitting target rate limits or triggering WAFs; set sensible timeouts and retry policies per tool.
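
The concurrency and timeout recommendation above can be sketched with `asyncio`. This is a generic pattern rather than VulnClaw's scheduler: `run_limited` is a hypothetical helper that takes zero-argument coroutine factories (one per probe), caps how many run at once, and returns `None` for any probe that exceeds its timeout.

```python
import asyncio


async def run_limited(factories, max_concurrency: int, timeout_s: float) -> list:
    """Run probe coroutines with a concurrency cap and a per-probe timeout."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(factory):
        async with semaphore:  # at most max_concurrency probes in flight
            try:
                return await asyncio.wait_for(factory(), timeout=timeout_s)
            except asyncio.TimeoutError:
                return None  # timed out; caller decides whether to retry

    # gather preserves input order, so results line up with factories.
    return await asyncio.gather(*(guarded(f) for f in factories))
```

Starting with a low cap and raising it only after confirming the target tolerates the load keeps probing below typical rate-limit and WAF thresholds.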

Practical Steps

Run vulnclaw config provider <provider> and vulnclaw config set llm.api_key to configure the provider.
Configure session.engine, session.max_rounds, and budget; cap concurrency.
Run quick/recon to triage targets, then enable higher-quality models and exploits for high-value findings.
Monitor LLM API usage in real time and set alerts to pause runs when cost thresholds are exceeded.

Important: High-quality models increase exploit success but also cost, so balance the choice against target value.

Summary: Use a “lightweight collection + strong exploitation” tiered approach, enforce budgets and concurrency limits, and monitor usage/alerts to keep detection effective and costs predictable.

✨ Highlights

Goal-driven solver that avoids fixed-round blind loops
Evidence-level anti-hallucination gate that requires real tool outputs
Built-in CLI with optional local Web UI and Docker support
Plugin-based vuln detection and 21 built-in penetration skills
Depends on external LLM providers and API keys; cost and privacy should be evaluated
Includes high-risk python_execute capabilities with potential sandbox/isolation concerns
Repository license and contribution status unknown; legal/compliance risk for commercial use

🔧 Engineering

Automates full flow from reconnaissance to report generation using LLM Agent + MCP toolchain
Uses blackboard state-space search and OODA loop to achieve goal-driven convergence
Supports multiple LLM providers, plugin extensibility, and persistent testing modes

⚠️ Risks

Highly sensitive to external model stability and cost; results depend on model quality and quota
Unknown license and apparent lack of contributors (contributors=0); enterprise adoption requires clear legal and maintenance responsibilities
Built-in remote execution/script features introduce runtime security and isolation risks; strict authorization and sandboxing required

👥 For who?

Authorized pentesters, red teams, CTF players, and security education organizations
Suitable for intermediate-to-advanced users with LLM experience who can bear API costs and implement risk controls