We Scanned 141 MCP Servers. Here's What We Found.
Model Context Protocol (MCP) is having a moment. In the past year, it has gone from an Anthropic-internal spec to the de facto standard for giving AI agents access to real tools - filesystems, databases, APIs, cloud infrastructure. Claude, Cursor, and a growing list of AI clients all support it. Developers are shipping MCP servers for everything from Slack to Kubernetes to your local shell.
With that growth comes an obvious question: how secure are these servers?
We built Oxvault Scanner to answer it. It’s an open-source static analysis tool purpose-built for MCP server security - not a general-purpose linter, but a scanner that understands MCP-specific risk patterns: what it means when an MCP tool leaks environment variables, what path traversal looks like in a filesystem-access server, what “hardcoded credential” means when your tool is running inside an AI agent.
Last week we ran it against 141 publicly available MCP servers on GitHub. Here’s what we found.
What We Scanned
The 141 servers span the range of what’s actually deployed in the wild:
- Infrastructure tools: AWS MCP Servers, Cloudflare MCP, Kubernetes MCP, Terraform MCP
- Developer tools: GitHub MCP Server, Desktop Commander MCP, Sentry MCP, Salesforce DX MCP
- Data and search: Exa MCP, Firecrawl MCP, MongoDB MCP, PostgreSQL MCP, Supabase MCP
- Productivity: Slack MCP, Notion MCP, Google Workspace MCP, Asana MCP, HubSpot MCP
- AI-adjacent: Composio MCP, Browserbase MCP, Playwright MCP
All of these are open-source repos - public code that anyone can inspect. We cloned each one, ran the scanner, and collected the results. No paid tiers, no closed-source servers. This sweep used scanner v0.3.3, which includes false positive fixes applied after our initial 67-server sweep - in particular, tighter rules for test fixtures, minified bundles, and scanner blocklists that were producing noise in the earlier run.
The scanner runs static analysis only - no dynamic execution, no network calls during scanning. Every finding reflects what’s in the code.
The Numbers
| Metric | Result |
|---|---|
| Servers scanned | 141 |
| Servers with HIGH+ findings | ~71 (50%) |
| Confirmed CRITICAL findings | 135 |
| Precision (HIGH+CRITICAL) | 93% |
| CVE detection rate | 12/12 (100%) |
50% of MCP servers have at least one HIGH or CRITICAL finding. This number reflects the scanner after false positive fixes - tighter rules that no longer fire on test fixtures, minified bundle artifacts, or scanner blocklists. The 93% precision means that when the scanner reports a HIGH or CRITICAL finding, it’s real almost every time.
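The 135 confirmed CRITICAL findings across 141 servers work out to just under one critical per server scanned (135 / 141 ≈ 0.96). Since only about 71 servers have any HIGH+ finding at all, the servers that do have criticals average closer to two each. Several things stand out across both sweeps.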
What We Found
Hardcoded credentials are real and they’re out there
The scanner found 19 hardcoded credential patterns across the sweep. Most are test fixtures (mocked tokens in test utilities) and configuration constant names - not real secrets. But not all of them.
In the Cloudflare MCP repository, in a file called apps/demo-day/frontend/script.js, we found this:
Authorization: 'Bearer 8gmjguywgvsy2hvxnqpqzapwjq896ke3',
High confidence. That is a 32-character alphanumeric Bearer token committed in the source code of a demo application in the Cloudflare MCP repository. It follows the Cloudflare Workers token format. This isn’t a placeholder like test-token or YOUR_API_KEY_HERE. It’s a real credential that should be rotated.
Committed credentials in demo apps are one of the most common sources of real-world secret exposure. Engineers build a demo, hardcode a token to get it working, push it - and it stays there long after the demo is done.
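To make the placeholder-versus-real distinction concrete, here is a minimal sketch of how a check for committed Bearer tokens might separate the two. It is not Oxvault’s actual rule: the function name findBearerTokens, the placeholder list, and the 16-character threshold are all assumptions chosen for illustration.

```javascript
// Hypothetical placeholder list; a real scanner would carry a longer one.
const PLACEHOLDERS = new Set(["test-token", "your_api_key_here", "changeme"]);

// Matches "Bearer <token>" where the token is 16+ characters drawn from a
// typical token alphabet. The length threshold is an arbitrary assumption.
const BEARER_RE = /Bearer\s+([A-Za-z0-9._~+\/-]{16,})/g;

// Returns every Bearer token in `source` that does not look like a placeholder.
function findBearerTokens(source) {
  const hits = [];
  for (const match of source.matchAll(BEARER_RE)) {
    const token = match[1];
    // Skip well-known placeholder strings.
    if (PLACEHOLDERS.has(token.toLowerCase())) continue;
    // Skip tokens that are all letters or all digits; random tokens
    // almost always mix both.
    if (!/[A-Za-z]/.test(token) || !/[0-9]/.test(token)) continue;
    hits.push({ index: match.index, token });
  }
  return hits;
}
```

A heuristic like this can only say that a string looks like a credential. Whether the token is live still has to be confirmed by the owner of the account, which is why rotation is the safe default.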
A path traversal risk in one of the most-installed MCP servers
Desktop Commander MCP is among the most widely used MCP servers. It gives AI agents access to your filesystem and terminal. Its installation instructions point directly to npx @wonderwhy-er/desktop-commander.
The scanner flagged three instances of startsWith()-based path checks in Desktop Commander’s filesystem access code, including:
// In filesystem-handlers.js:
if (filePath === '~' || filePath.startsWith('~/') || filePath.startsWith(`~${path.sep}`)) {
// In filesystem.js:
const subdirCheck = normalizedPathToCheck.startsWith(normalizedAllowedDir + path.sep);
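The issue: startsWith() used as a bare path containment check is bypassable. If your allowed directory is /home/user/projects, a path like /home/user/projects-evil/secret.txt passes startsWith('/home/user/projects'). Appending a trailing separator before comparing, as the filesystem.js check does, closes that specific hole, but the check is then only as strong as the normalization that precedes it: unresolved ../ segments or symlinks can still point outside the allowed directory. path.resolve() handles the former but does not follow symlinks; that takes fs.realpath(). The correct pattern is to resolve both paths first, then compare with a trailing separator.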
For a tool that explicitly exposes filesystem access to an AI agent, a containment bypass is a meaningful risk. An AI agent operating on user-provided input - say, a repository URL or a filename from a web search result - could be manipulated to access paths outside the intended scope.
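The difference between the naive check and a resolved one can be sketched in a few lines of Node.js. This is an illustration under stated assumptions, not Desktop Commander’s code: isInsideDir is a hypothetical helper, and it uses path.relative() rather than a trailing-separator comparison, which is an equivalent way to express the same containment test.

```javascript
const path = require("node:path");

// Naive check: a plain string prefix test, shown only for comparison.
function naiveIsInside(allowedDir, candidate) {
  return candidate.startsWith(allowedDir);
}

// Hypothetical helper: true only if `candidate` is `allowedDir` itself or
// lives underneath it. Both paths are resolved to absolute, normalized
// form first, so "../" segments cannot walk out of the directory.
function isInsideDir(allowedDir, candidate) {
  const base = path.resolve(allowedDir);
  const target = path.resolve(base, candidate);
  const rel = path.relative(base, target);
  if (rel === "") return true; // same directory
  // A relative path that climbs upward (or is absolute, which can happen
  // across Windows drives) means the target is outside the base.
  const climbsOut = rel === ".." || rel.startsWith(".." + path.sep);
  return !climbsOut && !path.isAbsolute(rel);
}
```

With these definitions, naiveIsInside("/home/user/projects", "/home/user/projects-evil/secret.txt") accepts the sibling directory while isInsideDir rejects it. Note that path.resolve() does not follow symlinks. A server that needs to defend against symlinked paths would also have to canonicalize with fs.realpathSync() (which requires the path to exist) before comparing.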
AWS’s MCP server intentionally runs exec()
The AWS MCP Servers monorepo ships a server called aws-diagram-mcp-server that generates AWS architecture diagrams. It includes a file called _sandbox_runner.py that contains this:
exec( # nosec B102 nosem
and
exec(code, namespace) # nosec B102 nosem
AWS engineers are clearly aware this is risky - the nosec comment explicitly suppresses the bandit security scanner warning. The design is intentional: the server accepts user-provided Python code (to describe an architecture) and executes it in a restricted namespace to generate a diagram.
The security question isn’t whether exec() is present - it’s whether the restricted namespace actually amounts to a sandbox. Python exec() with a custom namespace can still be escaped through builtins manipulation, __import__, __class__.__mro__ chains, and other techniques. Safely executing untrusted Python is a known hard problem, and dedicated sandboxing libraries exist to tackle it.
This is the kind of finding where the scanner is useful not because it found something obviously broken, but because it surfaces a pattern that warrants manual review. Does the sandbox hold up? That requires a human to evaluate.
Environment variable exposure: real noise, real signal
The scanner found 777 environment variable patterns (CWE-526) - by far the largest category. Most of these are benign: reading process.env.PORT, checking process.env.NODE_ENV, using process.env.API_KEY to configure a client. That’s normal MCP server behavior.
A smaller subset warranted attention. In Desktop Commander MCP, the scanner found this in setup-claude-server.js:
return `wsl-${process.env.WSL_DISTRO_NAME || "unknown"}`;
return process.env.TERM_PROGRAM.toLowerCase();
These values get returned from the setup function - potentially surfacing system environment information to the MCP response. WSL_DISTRO_NAME reveals whether the server is running under WSL. TERM_PROGRAM reveals the terminal application. In isolation, low risk. As part of a fingerprinting or recon chain, slightly more interesting.
In Sentry MCP, the scanner correctly flagged:
return Boolean(process.env.ANTHROPIC_API_KEY);
This returns a boolean indicating whether an API key is configured - not the key itself. The scanner’s mcp-env-leakage rule fires because an env var is being evaluated and the result returned. This is a case where the scanner is technically correct but the risk is low: true or false leaks presence of a key, not the key.
Deep Dives
The Scanner Blocklist Problem
One of the more interesting false positive patterns we found in AWS MCP Servers: the scanner flagged code inside aws_diagram_mcp_server/scanner.py - which is itself a security scanner embedded in the MCP server that checks generated code for dangerous patterns. The file contains a list like:
('exec(', 'exec'),
('os.system(', 'os.system'),
('pickle.loads(', 'pickle.loads'),
Our scanner flagged these as mcp-code-eval (CWE-94) and mcp-unsafe-deserialization (CWE-502) findings. It matched the pattern text inside the blocklist as if it were live code.
This is a known challenge in static analysis: string literals that describe dangerous patterns look identical to those patterns to a regexp-based scanner. The fix is contextual analysis - understanding that a string inside a list literal is not a function call. Pattern-matching tools that can’t distinguish these will always produce noise in security-aware codebases that use denylist strings.
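A lightweight approximation of that contextual step is to blank out string literal contents before running the pattern. The sketch below is an assumption-laden illustration, not how Oxvault implements its fix: maskStringLiterals and flagsExec are hypothetical names, and the masking ignores triple-quoted strings, template literals, and quotes inside comments, all of which a real parser would handle.

```javascript
// Replace the contents of single- and double-quoted string literals with
// spaces, keeping the quotes and line breaks so offsets stay stable.
// Assumption: no triple-quoted strings, template literals, or comments
// containing quote characters.
function maskStringLiterals(source) {
  let out = "";
  let quote = null; // quote character of the literal we are inside, if any
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (quote === null) {
      if (ch === '"' || ch === "'") quote = ch;
      out += ch;
    } else if (ch === "\\" && i + 1 < source.length) {
      out += "  "; // mask the backslash and the escaped character
      i++;
    } else if (ch === quote) {
      quote = null;
      out += ch;
    } else {
      out += ch === "\n" ? "\n" : " ";
    }
  }
  return out;
}

// Hypothetical rule: a call to exec( that appears outside any string literal.
const EXEC_CALL = /\bexec\s*\(/;

function flagsExec(source) {
  return EXEC_CALL.test(maskStringLiterals(source));
}
```

Under this sketch, the blocklist entry ('exec(', 'exec') no longer matches because its text has been masked, while a live call such as exec(code, namespace) still does.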
Why Minified Bundles Are a Scanning Anti-Pattern
Exa MCP ships a Smithery deployment bundle at .smithery/shttp/index.cjs. The scanner returned multiple HIGH findings against this file, including:
"new Function() - dynamic code generation: || ${s} === \"boolean\" || ${t} === null`)"
new Function() in a minified bundle almost always comes from AJV, the JSON Schema validator. AJV generates validator functions dynamically using new Function() for performance. It’s an intentional design decision in one of the most-audited npm packages in existence.
The lesson here is that scanning minified or bundled files with pattern-matching rules produces a high proportion of false positives. The patterns that indicate risk in source code - dynamic imports, function construction, eval - appear routinely in minified third-party library bundles for entirely legitimate reasons. The signal-to-noise ratio drops significantly.
When we filter out findings from minified files and committed node_modules, the finding count drops substantially - but the remaining findings are much more likely to be real.
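A filter of this kind can be sketched as a simple pre-scan check. The function name shouldSkipFile and both thresholds below are assumptions made for illustration. They are not the values Oxvault uses, and line-length heuristics will misclassify some files in either direction.

```javascript
// Hypothetical thresholds; tune them against your own corpus.
const MAX_AVG_LINE_LENGTH = 200;
const MAX_LINE_LENGTH = 1000;

// Returns true if the file looks like vendored or minified code that a
// pattern-matching scanner should skip.
function shouldSkipFile(filePath, content) {
  // Vendored dependencies, on either path separator.
  const segments = filePath.split(/[\\/]/);
  if (segments.includes("node_modules")) return true;

  // Conventionally named minified files.
  if (/\.min\.(js|cjs|mjs)$/.test(filePath)) return true;

  // Heuristic for unlabelled bundles: very long lines.
  const lines = content.split("\n");
  const longest = lines.reduce((max, line) => Math.max(max, line.length), 0);
  const average = content.length / lines.length;
  return longest > MAX_LINE_LENGTH || average > MAX_AVG_LINE_LENGTH;
}
```

Skipping a bundle does not make its dependencies safe. It only moves the question to the source packages, where dependency auditing is a better fit than pattern matching.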
Path Traversal: The Pattern That Keeps Coming Back
Path traversal via startsWith() appeared in multiple servers, not just Desktop Commander. The same issue appeared in Sentry MCP’s CORS utility (though in that context it’s not a security issue - URL path matching, not filesystem containment). The rule is catching a real vulnerability class but firing in contexts where it doesn’t apply.
The underlying issue is that startsWith() is used for two very different purposes:
- Filesystem path containment - “is this file inside the allowed directory?” (dangerous if not preceded by path.resolve())
- URL prefix matching - “does this route start with /api?” (safe by design)
A scanner that understands these contexts would only fire on the first case. A regexp-based scanner can’t easily distinguish them.
What This Means
For MCP server developers:
- Audit your committed credentials. Run git log -S "Bearer" --all and similar patterns on your repos. Test fixtures with realistic-looking tokens get committed more often than people think.
- If your server does filesystem access, use path.resolve() before any containment check. The startsWith() bypass is a real vulnerability class with known exploits.
- If you’re running user-controlled code (like aws-diagram-mcp-server does), verify your sandboxing rigorously. Python namespace-based sandboxes have a documented history of escapes.
- Don’t commit secrets in demo apps. git-secrets and similar pre-commit hooks exist for exactly this reason.
For AI developers using MCP:
The MCP ecosystem is young and moving fast. Not every server you install has been security-reviewed. When you give an MCP server access to your filesystem, shell, or cloud credentials, you’re trusting a code path that may have issues the developer hasn’t thought through. Running a scanner before installing unfamiliar MCP servers is a reasonable precaution.
For the ecosystem:
50% of servers having HIGH+ findings with 93% precision means the signal is real: roughly half of the servers we scanned have security findings worth reviewing. The earlier 76% number from our first sweep included significant noise from overly broad rules - the refined number is more useful precisely because you can trust it.
As MCP adoption grows and more servers gain access to sensitive infrastructure - production databases, cloud APIs, enterprise systems - the security bar needs to rise.
About Oxvault
Oxvault Scanner is an open-source MCP security scanner. You can run it on any MCP server:
# Install
curl -fsSL https://oxvault.dev/install.sh | sh
# Scan any MCP server
oxvault scan github:org/mcp-server
The scanner is free, runs entirely locally, and doesn’t send your code anywhere. It’s available as a GitHub Action (oxvault/scan-action@v1) for CI/CD integration.
Repository: github.com/oxvault/scanner
We publish all findings from this sweep in the repo. If you maintain one of the servers we scanned and want to discuss the findings, open an issue or reach out.