OpenClaw + ZeroLeaks: Understanding a 2/100 Security Score (And What It Means for Your Agents)
IMPORTANT
TL;DR: A public ZeroLeaks security audit gave OpenClaw (formerly Clawdbot/Moltbot) a 2/100 security score, with 84% of extraction attempts successfully pulling the full system prompt and 91% of prompt‑injection attacks achieving control over agent behavior. Combined with a separate 10/10 “critical risk” rating, this means any OpenClaw agent exposed to untrusted input should be treated as if an attacker can read its secrets and make it run arbitrary actions.
Security researchers, red‑teamers, and even mainstream tech outlets are now describing OpenClaw as a “security dumpster fire.” Yet the tool remains extremely popular because it is one of the few agents that can actually take actions on your behalf—checking in for flights, managing email, moving files, or even interacting with wallets.
This post unpacks what the ZeroLeaks (often misspelled “ZeroLinks”) report really says, how to interpret a 2/100 score in practice, and concrete steps to harden or contain OpenClaw if you still decide to use it.
1. What Is ZeroLeaks and What Did It Test?
ZeroLeaks is a specialized LLM/agent security scanner designed to probe prompt‑injection resilience, secret exfiltration, and unsafe tool usage in AI systems. In January 2026, security engineer Lucas Valbuena publicly ran OpenClaw through ZeroLeaks and shared the results on X:
| Metric | Result |
|---|---|
| Overall Security Score | 2/100 |
| Critical Risk Score | 10/10 |
| Extraction Rate | 84% |
| Prompt Injection Success | 91% |
| System Prompt Leaked | First turn |
Valbuena notes that the leak gave complete access to OpenClaw’s internal configuration, including files such as SOUL.md and AGENTS.md, the definitions of all installed skills, and memory files.
Other practitioners and commentators amplified the findings, calling OpenClaw a “security dumpster fire” and warning that anyone interacting with an OpenClaw‑based agent could potentially access and manipulate its entire internal state.
2. How Bad Is a 2/100 Security Score in Practice?
On paper, “2/100” sounds terrible. In practice, the underlying failure modes are even worse:
- Almost no boundary between public input and internal secrets. If 84% of extraction attempts succeed, an attacker talking to your agent has a very high chance of seeing its system prompt, tool configuration, and memory.
- Prompt injection works most of the time. With a 91% success rate, attackers can usually override whatever safety or business logic you thought you encoded in the system prompt.
- No requirement for sophisticated techniques. Reports and video walkthroughs show that basic, well‑known attack patterns are enough to break OpenClaw’s defenses.
- High‑impact capabilities behind those prompts. OpenClaw is often wired to sensitive tools: shells, file systems, browsers, and wallets. When injection succeeds, attackers do not just change text outputs—they can trigger system‑level actions.
A YouTube deep‑dive walking through the ZeroLeaks report highlights exactly this combination: a perfect 10/10 critical‑risk score paired with a tiny 2/100 security score and extremely high injection success rate. The presenter concludes that OpenClaw is “very, very vulnerable” and easily manipulated into revealing sensitive information or executing attacker‑controlled instructions.
CAUTION
If you let untrusted users or content talk to an OpenClaw agent, assume it will eventually leak secrets and run malicious actions.
3. Why Do OpenClaw’s ZeroLeaks Results Look So Bad?
The ZeroLeaks findings are not a random fluke; they reflect deeper design choices:
- Prompt‑centric control without strong guards. OpenClaw relies heavily on system prompts and markdown files (e.g., SOUL.md, AGENTS.md) to define behavior. Without robust filtering or verification, adversaries can trick the model into exposing or rewriting those rules.
- Over‑powered tool integrations. Many default or community recipes grant OpenClaw shell access, broad file‑system access, and deep browser control. This maximizes convenience but turns every successful prompt injection into a potential RCE event.
- Unvetted skills and marketplace plugins. Security analyses found over 400 malware‑infected skills on OpenClaw’s ClawHub marketplace, including the most‑downloaded add‑on, which functioned as a malware delivery vehicle.
- Lack of sandboxing and privilege separation. OpenClaw typically runs with the same privileges as the user account on the host system. There is little in the way of OS‑level containment if the agent is compromised.
Taken together, these factors mean that any weaknesses in prompt‑level defenses propagate directly into the operating system and network.
TIP
For a deeper understanding of prompt injection risks, see our Prompt Injection Defense guide.
4. Real‑World Risk Scenarios: What Attackers Can Actually Do
Combining the ZeroLeaks results with what is known about OpenClaw’s ecosystem yields several realistic attack paths:
4.1 Exfiltrating Secrets from Configuration and Memory
Because ZeroLeaks can extract system prompts and internal files in most trials, an attacker talking to your agent can often see:
- API keys and secrets embedded in prompts or markdown files
- Internal URLs, database connection strings, and admin endpoints
- Descriptions of privileged tools and their capabilities
Once those are exposed, the attacker can pivot to direct API calls or targeted exploitation of downstream systems.
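To gauge what such a leak would expose in your own setup, you can scan prompt and markdown files for embedded credentials before an attacker does. The following is a minimal sketch, not a complete scanner. The three patterns and the `find_secrets` helper are illustrative assumptions, and the sample text is made up.

```python
import re

# Illustrative patterns only; real secret scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|token)\b\s*[:=]\s*['\"]?([A-Za-z0-9_\-]{16,})"
    ),
    "url_with_credentials": re.compile(
        r"\b[a-z][a-z0-9+.\-]*://[^\s:/@]+:[^\s@/]+@[^\s]+"
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


if __name__ == "__main__":
    # Made-up stand-in for the contents of a prompt or markdown file.
    sample = (
        "You are a helpful agent.\n"
        "api_key = sk_test_0123456789abcdef0123\n"
        "Database: postgres://admin:hunter2@db.internal:5432/app\n"
    )
    for lineno, name in find_secrets(sample):
        print(f"line {lineno}: possible {name}")
```

Any hit is a value an attacker could read after a successful extraction, so it is a candidate for rotation and removal from the file.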
4.2 Remote‑Controlling the Host via Shell and Filesystem Tools
In many OpenClaw setups:
- The agent can run shell commands (bash, powershell, etc.).
- It can read and write arbitrary files under the user’s home directory.
- It may have access to browser sessions or password stores.
With a 91% injection success rate, an adversary can frequently convince the agent to:
- Download and execute malware
- Add SSH keys or create new system users
- Search for and exfiltrate credential files, wallet backups, or SSH keys
- Modify scripts or infrastructure‑as‑code repos to embed backdoors
4.3 Compromising Agents Through Social Platforms and Marketplaces
The risk is amplified when OpenClaw agents are wired into public content feeds or unvetted skills:
- On platforms like Moltbook, attackers could use compromised or impersonated agents to post prompt‑injection payloads that other agents will automatically ingest. Learn more about the Moltbook database breach.
- On the ClawHub marketplace, users install community‑contributed skills that have already been found to deliver malware and steal crypto or credentials.
In both cases, there is no strong trust boundary between “social content” or “skills” and the agent’s privileged actions.
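One way to introduce the missing trust boundary is to track where each piece of context came from and refuse privileged tools once untrusted content has entered the session. The sketch below illustrates that idea under stated assumptions: the `Trust` labels, the `Session` class, and the tool names are hypothetical and are not part of OpenClaw.

```python
from dataclasses import dataclass, field
from enum import Enum


class Trust(Enum):
    OPERATOR = "operator"      # typed by the person who owns the agent
    UNTRUSTED = "untrusted"    # feeds, web pages, email, third-party skills


# Hypothetical tool names; substitute whatever your deployment exposes.
PRIVILEGED_TOOLS = {"shell", "write_file", "wallet_transfer"}


@dataclass
class Session:
    # Each entry records the provenance and text of one piece of context.
    context: list[tuple[Trust, str]] = field(default_factory=list)

    def ingest(self, text: str, trust: Trust) -> None:
        """Add content to the session together with its provenance label."""
        self.context.append((trust, text))

    def may_call(self, tool: str) -> bool:
        """Deny privileged tools once any untrusted content is in context."""
        tainted = any(trust is Trust.UNTRUSTED for trust, _ in self.context)
        return not (tainted and tool in PRIVILEGED_TOOLS)
```

The check is deliberately coarse: a single untrusted post disables every privileged tool for the rest of the session. That costs convenience, but the decision no longer depends on the model resisting the injection.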
5. ZeroLeaks as a Signal for Security Baselines
For security and platform teams, the ZeroLeaks scores provide a quantifiable baseline:
- Any agent framework scoring single digits out of 100 on security should not be exposed to untrusted users or content in production.
- A 10/10 critical‑risk rating indicates that compromise has high potential impact (data theft, financial loss, or full host takeover), not just cosmetic misbehavior.
- High extraction and injection success rates show that current mitigations are insufficient, even for well‑known, commodity attack patterns.
This does not mean OpenClaw can never be used, but it does mean:
- It belongs in isolated lab environments, not as a first‑class citizen in production infrastructure.
- Any connection to the public internet, social platforms, or third‑party skills must be treated as an adversarial interface.
- Organizations should prefer safer alternatives or add strong external control planes (gateways, policy engines, verification layers) around any OpenClaw deployment.
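The baseline above can be turned into an automated release gate. This is a sketch under assumptions: only the single-digit rule comes from this section. The `ScanResult` fields, the tier names, and the remaining thresholds are illustrative and would need to be set by your own risk policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanResult:
    security_score: int      # 0-100, higher is better
    critical_risk: int       # 0-10, higher is worse
    extraction_rate: float   # share of extraction attempts that succeeded
    injection_rate: float    # share of injection attempts that succeeded


def exposure_tier(result: ScanResult) -> str:
    """Map scan metrics to the widest exposure this sketch would permit."""
    # From the text: single-digit scores stay away from untrusted input.
    if result.security_score < 10:
        return "isolated-lab-only"
    # Assumed thresholds, chosen for illustration only.
    if (result.critical_risk >= 8
            or result.extraction_rate > 0.10
            or result.injection_rate > 0.10):
        return "trusted-input-only"
    return "eligible-for-review"


if __name__ == "__main__":
    openclaw = ScanResult(security_score=2, critical_risk=10,
                          extraction_rate=0.84, injection_rate=0.91)
    print(exposure_tier(openclaw))
```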
Emerging tools and platforms that measure hallucination rates, verify AI outputs, and provide governance layers for brand or factual accuracy can complement scanners like ZeroLeaks by focusing on output trustworthiness, not just direct exploitability.
6. Hardening and Containing OpenClaw: Practical Steps
If you still want to experiment with OpenClaw despite the ZeroLeaks findings, treat it like any high‑risk binary from the internet.
6.1 Isolate the Runtime Environment
- Run OpenClaw in a dedicated VM or container, not on your main workstation.
- Use a minimal, hardened host (for example, a small cloud VM with strict firewall rules) rather than a laptop full of secrets.
- Disable passwordless sudo and ensure the OpenClaw process runs under a restricted user.
- Avoid mounting host directories with sensitive data into containers that run OpenClaw.
A common pattern is to deploy OpenClaw on a locked‑down cloud instance (for example, a low‑cost DigitalOcean Droplet dedicated to AI agents) behind a VPN or private network, so that compromises do not spill into the rest of your infrastructure.
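If you use Docker for isolation, the points above translate into a handful of standard `docker run` flags. The helper below only builds the argument list and does not start anything. It is a sketch: the image name, scratch path, user ID, and resource limits are placeholders, not values from OpenClaw's documentation.

```python
import shlex
from pathlib import Path


def hardened_docker_args(image: str, scratch_dir: str) -> list[str]:
    """Build a `docker run` argv with a restrictive default posture."""
    scratch = Path(scratch_dir).resolve()
    home = Path.home().resolve()
    # Refuse mounts that would expose the home directory or anything above it.
    if scratch == home or scratch in home.parents:
        raise ValueError(f"refusing to mount {scratch}")
    return [
        "docker", "run", "--rm",
        "--read-only",                        # immutable root filesystem
        "--cap-drop=ALL",                     # drop all Linux capabilities
        "--security-opt=no-new-privileges",   # block setuid escalation
        "--network=none",                     # no network access at all
        "--user=1000:1000",                   # run as a non-root user
        "--pids-limit=256",
        "--memory=1g",
        "--tmpfs=/tmp",
        f"--volume={scratch}:/work",          # the only writable host path
        image,
    ]


if __name__ == "__main__":
    # Placeholder image name and scratch directory.
    args = hardened_docker_args("openclaw-sandbox:local", "/srv/openclaw-scratch")
    print(shlex.join(args))
```

Note that `--network=none` also blocks calls to a hosted model API. An agent that needs one would require a restricted network with an egress allowlist instead, which is outside the scope of this sketch.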
6.2 Minimize Available Tools and Credentials
- Start from no tools and no secrets. Only grant:
- Specific filesystem access where absolutely required
- Limited shell commands (if any)
- Single‑purpose API keys with tight scopes and rate limits
- Store secrets in a separate, managed vault service rather than embedding them in prompts or markdown files.
- Use different keys for experiments versus production systems.
For comprehensive hardening guidance, see our Gateway Hardening Guide.
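As one way to apply the "limited shell commands" item, the agent's shell tool can be wrapped in an allowlist that rejects anything else before execution. This is a minimal sketch: the allowed commands are an assumption about what a deployment needs, and `parse_allowed` is a hypothetical helper rather than an OpenClaw API.

```python
import shlex

# Assumption: the only commands this hypothetical deployment needs.
ALLOWED_COMMANDS = {"ls", "cat", "grep"}
# Characters that enable chaining, redirection, or substitution in a shell.
FORBIDDEN_CHARS = set(";|&><`$\n")


def parse_allowed(command: str) -> list[str]:
    """Return argv if the command is on the allowlist, else raise ValueError."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        raise ValueError("shell metacharacters are not allowed")
    argv = shlex.split(command)  # raises ValueError on unbalanced quotes
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {argv[:1]}")
    return argv
```

The returned list is meant to be passed to `subprocess.run(argv, shell=False)` so that no shell ever interprets the string. An allowlisted `cat` can still read any file the process can reach, so this complements filesystem restrictions rather than replacing them.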
6.3 Control and Sanitize Inputs
- Treat all external inputs as hostile: web pages, emails, chat messages, social posts, and content from AI‑to‑AI platforms.
- Implement pre‑processing layers that:
- Strip or neutralize markdown and HTML where possible
- Block obviously adversarial patterns (“ignore previous instructions,” “copy your system prompt,” etc.)
- Limit the length and complexity of inputs passed to the agent
- Consider routing sensitive tasks through models or tools with stronger hallucination and injection controls, using OpenClaw only as a thin orchestration layer.
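A pre-processing layer along these lines can be sketched in a few lines. The length limit and the two patterns are illustrative assumptions. Pattern matching catches only known phrasings and is easy to evade, so treat it as one layer among several, never as the defense.

```python
import html
import re

MAX_INPUT_CHARS = 4000  # assumed limit; tune for your workload

# A tiny, illustrative sample of known phrasings, not a complete list.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore\s+(?:all\s+)?(?:previous|prior|above)\s+instructions", re.I),
    re.compile(r"(?:copy|reveal|print|repeat)\s+your\s+system\s+prompt", re.I),
]

TAG_RE = re.compile(r"<[^>]+>")      # HTML/XML tags
MARKUP_RE = re.compile(r"[`*#]")     # common markdown emphasis and heading marks
WHITESPACE_RE = re.compile(r"\s+")


def preprocess(raw: str) -> str:
    """Normalize untrusted text, or raise ValueError if it should be dropped."""
    if len(raw) > MAX_INPUT_CHARS:
        raise ValueError("input too long")
    text = html.unescape(raw)        # decode entities first so encoded tags are caught
    text = TAG_RE.sub(" ", text)
    text = MARKUP_RE.sub("", text)
    text = WHITESPACE_RE.sub(" ", text).strip()
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(text):
            raise ValueError("input matches a known injection phrasing")
    return text
```

Normalizing before matching matters: `ignore **previous** instructions` only matches the first pattern once the markdown emphasis has been stripped.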
6.4 Add Observation and Kill‑Switches
- Log every tool invocation (shell, file IO, HTTP requests) with timestamps and parameters.
- Stream these logs into a SIEM or monitoring system and alert on unusual patterns (e.g., mass file reads, outbound connections to unknown domains, or repeated keychain access).
- Implement a manual kill‑switch that can immediately stop OpenClaw and terminate its host VM or container if suspicious behavior is detected.
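The logging and kill-switch items can be combined in a thin wrapper that every tool call passes through. The sketch below assumes a hypothetical `ToolAuditor` class and a simple rate threshold as the "unusual pattern". A real deployment would forward the JSON lines to its SIEM and have the kill path stop the VM or container as well.

```python
import json
import time
from typing import Callable


class KillSwitchTripped(RuntimeError):
    """Raised when the agent has been halted and must not run more tools."""


class ToolAuditor:
    def __init__(self, sink: Callable[[str], None], max_calls_per_minute: int = 30):
        self.sink = sink                  # receives one JSON line per event
        self.max_calls = max_calls_per_minute
        self.recent: list[float] = []     # monotonic timestamps of recent calls
        self.killed = False

    def kill(self, reason: str) -> None:
        """Manual or automatic stop; all later tool calls are refused."""
        self.killed = True
        self.sink(json.dumps({"ts": time.time(), "event": "kill", "reason": reason}))

    def record(self, tool: str, params: dict) -> None:
        """Log a tool invocation; call this before the tool actually runs."""
        if self.killed:
            raise KillSwitchTripped("agent is halted")
        self.sink(json.dumps(
            {"ts": time.time(), "event": "tool_call", "tool": tool, "params": params}
        ))
        now = time.monotonic()
        self.recent = [t for t in self.recent if now - t < 60] + [now]
        if len(self.recent) > self.max_calls:
            self.kill("tool-call rate exceeded")
            raise KillSwitchTripped("tool-call rate exceeded")
```

Passing `print` as the sink gives JSON lines on stdout. An operator can call `kill("manual stop")` at any time, after which `record` refuses every further call.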
6.5 Keep OpenClaw Away from Wallets and Production Keys
Given the malware already discovered on ClawHub and the ease of prompt injection, never:
- Connect OpenClaw directly to wallets holding significant value.
- Give it unrestricted access to exchange APIs that can withdraw or trade assets.
- Let it manage production database credentials, root cloud keys, or core CI/CD pipelines.
If you want agents involved in financial or production workflows, design narrowly scoped, audited automation paths with multiple layers of approval and verification.
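As a rough illustration of such a path, the check below lets an action through only if it targets an allowlisted destination, stays under a fixed limit, and has approval from two distinct humans. Every name and value here is a hypothetical policy setting, and the check must run outside the agent so that a prompt cannot alter it.

```python
from dataclasses import dataclass, field

# All values below are hypothetical policy settings.
ALLOWED_DESTINATIONS = {"vendor-payments"}
PER_ACTION_LIMIT = 100.0
REQUIRED_APPROVERS = 2


@dataclass
class ActionRequest:
    destination: str
    amount: float
    approvers: set[str] = field(default_factory=set)

    def approve(self, reviewer: str) -> None:
        # A set, so the same reviewer approving twice still counts once.
        self.approvers.add(reviewer)


def may_execute(req: ActionRequest) -> bool:
    """True only if destination, amount, and approvals all satisfy policy."""
    return (
        req.destination in ALLOWED_DESTINATIONS
        and 0 < req.amount <= PER_ACTION_LIMIT
        and len(req.approvers) >= REQUIRED_APPROVERS
    )
```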
WARNING
Review our Security Audit Checklist to ensure your deployment meets minimum security standards.
7. Frequently Asked Questions About OpenClaw’s ZeroLeaks Results
Is OpenClaw “unsafe” by design?
OpenClaw is designed for flexibility and power, not for strong isolation or policy enforcement. Its authors and community largely optimized for “agents that can really do things,” leaving security hardening to users. The ZeroLeaks results show that this tradeoff currently leaves huge attack surfaces open.
Can updates or patches fix the 2/100 score?
In principle, yes—future versions could:
- Refuse to reveal system prompts
- Enforce stricter tool usage policies
- Integrate with external security layers and scanners
However, meaningful improvement would require architectural changes and continuous re‑testing. Until independent audits show substantially better scores, organizations should assume the current risk profile remains.
Is “ZeroLinks” different from ZeroLeaks?
Most public references to OpenClaw’s audit point to ZeroLeaks—the name used in published reports and social posts. “ZeroLinks” appears to be a common misspelling rather than a separate product. When evaluating tools or citing results, use the correct name and, where possible, link to the original reports.
Should I deploy OpenClaw in production today?
Given:
- The 2/100 security score and 10/10 critical‑risk rating
- Evidence of malware‑infected skills in its ecosystem
- Weak isolation combined with broad access to host resources
…OpenClaw is not suitable as a primary production automation layer on unsegmented infrastructure. If you do experiment with it in business settings, contain it aggressively and avoid giving it direct access to crown‑jewel systems or data.
8. Key Takeaways for Securing AI Agents After the ZeroLeaks Report
- Metrics like “2/100” and “10/10 critical risk” are not abstract; they map to concrete, exploitable behaviors. In OpenClaw’s case, they mean secrets leakage and remote‑control potential under realistic adversarial prompts.
- Prompt injection is not a niche academic problem. With a 91% success rate, attackers can reliably override what you thought your agent would or would not do.
- Agent power must be balanced with isolation and governance. If a framework makes it easy to run shell commands and move money, you must compensate with OS‑level sandboxing, minimal privilege, and external verification.
- Security posture is ecosystem‑wide. OpenClaw’s risk is magnified by insecure platforms like Moltbook and malware‑laden marketplaces like ClawHub.
- Use tools like ZeroLeaks as ongoing guardrails. Periodically re‑scan your agents and workflows as you add new skills or tools, and complement exploit‑focused scanners with hallucination‑ and truth‑measurement platforms to keep outputs reliable.
Handled correctly, the OpenClaw + ZeroLeaks story can be a turning point: a high‑profile incident that pushes teams to treat AI agents as serious software supply‑chain components, not harmless toys. The sooner organizations adapt their security models to that reality, the fewer costly surprises they will face as agent ecosystems grow.
Related Resources
SecureMolt Guides
- AI Agent Security Fundamentals
- Gateway Hardening Guide
- Security Audit Checklist
- Prompt Injection Defense
- Migrating to OpenClaw
Related Articles
- Moltbook Database Breach: Why OpenClaw + Moltbook Can No Longer Be Trusted
- Moltbook: When AI Agents Get Their Own Social Network
- OpenClaw: The Final Name
- OpenClaw CVE Vulnerabilities
Knowledge is the first line of defense. Now you know. 🦞