Research Report · First published April 2026 · Continuously updated

State of MCP Package Security

By Michael K Onyekwere

Numbers below regenerate every hour from the live dataset. Last recomputed 2026-06-08.

Primary research from continuous monitoring of 1,498 Model Context Protocol packages published to npm. This report summarises what 26,161 scans across that set reveal about the ecosystem's score distribution, finding prevalence, capability surface, and real maintainer response patterns. Machine-readable companion at /api/ecosystem/stats.

Packages monitored: 1,498
Scans on record: 26,161
Mean score: 92.4 (median 95)
Public advisories: 61, including 36 high / 23 low

Where the ecosystem stands

The headline is that most published MCP packages score well. Out of 1,000 scored packages, mean score is 92.4 of 100 and the median is 95. Risk categorisation is dominated by LOW (922 packages, 92%), with a long tail of packages at ELEVATED or worse that represents the material supply-chain surface a consumer actually has to worry about.

Risk distribution

Risk level	Packages	Share
LOW	922	92%
MODERATE	58	6%
ELEVATED	19	2%
HIGH	1	0%
CRITICAL	0	0%

Score distribution

Score range	Packages
90-100	795
80-89	147
70-79
60-69
50-59
40-49
30-39
20-29
10-19
0-9

Interpretation: the concentration at 90-100 reflects the MCP ecosystem's current publisher skew. Most packages are small, recent, and do not bundle risky behaviours. The packages below 70 are what the Policy Gate is designed to catch before they land in a repo.

What the scanner finds most

Across the most recent 500 scans, the scanner produced 598 findings total (80% of scans returned at least one finding). The most common finding is the absence of provenance attestations, a verifiability gap rather than a vulnerability. The tail is where real security issues concentrate.

Finding type	Count	Share
no_provenance	376	63%
command_injection	74	12%
no_repository	62	10%
install_script	52	9%
excessive_dependencies	15	3%
unsafe_eval	10	2%
no_license	4	1%
hardcoded_secret	3	1%
sensitive_file_access	2	0%

Finding severity distribution

Severity	Count	Share
critical
high
medium		13%
low	482	81%

Severity reflects scanner v2.1 context-aware downgrade. Findings flagged by a regex match are reduced in severity when a known sanitizer wrapper (e.g. validateCommand, execFile with array args) or a test-fixture context is detected within the same code region. This removes the bulk of false positives that would otherwise dominate the HIGH column.
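The downgrade logic described above can be sketched roughly as follows. This is a minimal illustration, not the scanner's actual rule set: the regexes, the three-line window, the one-step downgrade per mitigator, and the names classify and downgrade are all assumptions made for the example.

```python
import re

# Hypothetical patterns for illustration; the real scanner's rules are not shown in this report.
SHELL_EXEC = re.compile(r"\bexec(?:Sync)?\s*\(")
SANITIZERS = re.compile(
    r"\bvalidateCommand\s*\(|\bexecFile(?:Sync)?\s*\(\s*[^,]+,\s*\["
)
TEST_PATH = re.compile(r"(^|/)(tests?|__tests__|fixtures)/|\.(test|spec)\.[jt]s$")
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def downgrade(severity, steps=1):
    """Move a severity down the scale, never below 'low'."""
    idx = SEVERITY_ORDER.index(severity)
    return SEVERITY_ORDER[max(0, idx - steps)]


def classify(path, source, window=3):
    """Return (line_number, severity) for each regex hit in one source file.

    A hit starts at 'high'. It drops one level if a sanitizer pattern appears
    within `window` lines of the hit, and one more level if the file path
    looks like a test or fixture.
    """
    findings = []
    lines = source.splitlines()
    for i, line in enumerate(lines):
        if not SHELL_EXEC.search(line):
            continue
        severity = "high"
        region = "\n".join(lines[max(0, i - window): i + window + 1])
        if SANITIZERS.search(region):
            severity = downgrade(severity)
        if TEST_PATH.search(path):
            severity = downgrade(severity)
        findings.append((i + 1, severity))
    return findings
```

Under these assumptions, a bare exec(userInput) in src/ stays at high, the same call preceded by validateCommand drops to medium, and the sanitized call inside a test file drops to low.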

Capability surface across the ecosystem

Capability classification runs across MCP tool definitions extracted from each package's source. These are the powers a consuming agent inherits when it installs a package. 574 monitored packages have classified capability surfaces on file. The counts below are packages with that capability present (a single package can declare many).

Capability	Packages
search_index	316
database_access	197
network_egress	195
email_messaging	150
secrets_access	139
filesystem_read	112
memory_state
browser_automation
repo_read
cloud_infra
filesystem_write
shell_exec

The distribution is instructive for consumers: a majority of classified packages declare search, database, network, or email capabilities, so installing an arbitrary MCP server is more likely than not to grant at least one of those powers to any agent using it. The Policy Gate surfaces capability additions between versions so a consumer can decide whether a v1.4 → v1.5 bump that adds email_messaging was intended.
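A capability diff between two versions reduces to a set difference. The sketch below assumes each version's capabilities are available as a list of the labels used in this report; the function names and the REVIEW_ON_ADD set are hypothetical and do not describe the Policy Gate's implementation.

```python
# Capabilities that this sketch treats as scope changes when newly added.
# The choice of labels here is an assumption for illustration.
REVIEW_ON_ADD = {"email_messaging", "filesystem_write", "shell_exec"}


def capability_additions(old_caps, new_caps):
    """Capabilities present in the new version but absent from the old one."""
    return sorted(set(new_caps) - set(old_caps))


def needs_review(old_caps, new_caps):
    """Subset of the additions that should block an automatic upgrade."""
    return [cap for cap in capability_additions(old_caps, new_caps)
            if cap in REVIEW_ON_ADD]


v1_4 = ["search_index", "network_egress"]
v1_5 = ["search_index", "network_egress", "email_messaging"]
print(capability_additions(v1_4, v1_5))  # ['email_messaging']
```

Removals are ignored on purpose in this sketch: a version that drops a capability narrows the trust envelope rather than widening it.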

Install-script prevalence

Of 871 recently scanned packages, 72 (8%) publish at least one install-time script (preinstall, postinstall, or install). Most are benign (version banners, setup scripts), but the pattern is a classic supply-chain vector: any code in those scripts executes on npm install before the consumer has a chance to inspect the package. For MCP consumers who run agents in production, install-script packages deserve a manual review step.
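Checking a package for install-time scripts only requires reading the scripts field of its package.json. The following is a minimal sketch; the function name install_scripts is an assumption, and it covers only the three hooks named above.

```python
import json

# The three lifecycle hooks that npm runs during installation.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_scripts(package_json_text):
    """Return the install-time hooks declared in a package.json document."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts")
    if not isinstance(scripts, dict):
        return {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


manifest = '{"name": "example-mcp", "scripts": {"postinstall": "node setup.js", "test": "jest"}}'
print(install_scripts(manifest))  # {'postinstall': 'node setup.js'}
```

Consumers who want to defer the decision can install with npm's --ignore-scripts flag and run any reviewed script by hand afterwards.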

Advisory cadence

AgentScore publishes public advisories when a monitored package materially worsens (a score drop, a new high-severity finding, or a capability addition that changes the trust envelope). A total of 61 advisories have been published to date, including 36 high-severity and 23 low-severity. Recent publications:

Date	Severity	Package
2026-05-23	low	hemmabo-mcp-server
2026-05-23	high	chrome-devtools-mcp
2026-05-25	low	@axon-trading/mcp
2026-05-26	high	safari-mcp
2026-05-26	critical	zentric-protocol-mcp
2026-05-26	high	codeloop-mcp-server
2026-05-28	high	vidlens-mcp
2026-06-04	low	@zereight/mcp-gitlab
2026-06-06	low	codefizz-editor-agent
2026-06-06	low	@mthines/reaper-mcp

Advisory feed: /security/advisories (HTML) and /security/advisories/rss.xml (machine-readable RSS).

Case studies from this period

Numbers describe the ecosystem's shape. Cases describe how it actually responds when a finding lands in front of a maintainer. Three from the reporting period:

Redis pinned every MCP dependency after our scan

Five MCP packages installed via unpinned npx -y in RedisInsight. Two days from our scan report to redis/RedisInsight#5763 closed with every MCP version pinned.

Agions shipped security fixes to taskflow-ai in 48h, then went further

HIGH command_injection and install_script findings. Maintainer released v3.0.2 in 48h with validateCommand wrapper, then v4.0.0 two days later with seven capabilities deleted from the tool surface. Four-day arc from scan report to architectural cleanup.

fa-mcp-sdk: live credentials in a published tarball

A published npm package shipping an entire config file of production secrets (OpenAI key, Active Directory password, Consul tokens, Postgres superuser credentials, JWT key). Five versions republished after our April 19 private disclosure still contained the same file. Escalated to security@npmjs.com on April 22. Latest verification on May 6 still found the file present in fa-mcp-sdk@0.4.74. A standing reminder that scanner findings labelled "hardcoded_secret" are rare (3 of the 598 findings in the latest sample, roughly 0.5%) but consequential when real.

Maintainer response patterns

Across the disclosures and capability-change reports we have filed since launch, four distinct positive-response shapes have emerged on the public record. Each is a different kind of proof point about what disclosure actually buys you in this ecosystem. All four were verified by re-reading the GitHub threads, commit dates, and npm release timelines on May 9, 2026.

1. Consumer-prevention pin

Redis (RedisInsight#5763, April 14). After we flagged five MCP packages installed via npx -y with no pinning, the maintainer shipped commit f0c887c7 (“chore: pin MCP dependency versions to prevent supply chain attacks”) plus a Dependabot cooldown config the same day. Three-day turnaround. See /case-study/redis.

2. Author-side fix

Agions (taskflow-ai#6, April 22-24). Issue closed in 48 hours, followed by v4.0.0 structural cleanup release. Public class-level disclosure produced shipped fixes, not just acknowledgement. See /case-study/agions.

3. Technical pushback that improves the scanner

HomenShum / nodebench-mcp (April 26). Security-engineering maintainer reviewed the report and split it into real findings (three shell-exec sites refactored to argv-based spawn) and false positives (better-sqlite3 db.exec matched the shell-exec detector). Both sides shipped: their package hardened, our scanner gained sanitizer mitigators in commit 4ee2659. Score moved 55 to 85. See /case-study/nodebench.

4. Corrected false-positive, gracious response

IgorGanapolsky / ThumbGate (#975, April 20-May 7). Scanner over-fired on the maintainer's own test fixtures and detection regexes (it is itself a security tool). Caught and corrected within hours. Path-aware mitigator f6c1af0 shipped days later; future scans recovered automatically (25 to 55). Maintainer engaged three weeks later, committed to clearer test-corpus labelling.

The shapes are not interchangeable. Pinning is consumer-side defence. Author-side fixes are remediation. Technical pushback improves both sides. Corrected false-positives demonstrate the iteration loop. Treating any single response as “winning” misframes what each one actually proves. A fifth shape is the one we encounter most often: no response. The fa-mcp-sdk package has republished seven times since the original disclosure with no maintainer acknowledgement; cases like that route to the GitHub Security Lab CVE pipeline instead.

What this means for MCP consumers

Pin your MCP dependencies. npx -y and unpinned npm specs pull whatever is latest at install time. Any maintainer compromise propagates without warning. This is the Redis lesson.
Re-evaluate capability changes at bumps. A v1.4 to v1.5 bump that adds email_messaging, filesystem_write, or shell_exec is a scope change, not a routine update. The Policy Gate surfaces these automatically.
Treat install scripts as a manual-review gate. 8% of packages in this sample publish one. Most are benign. The Policy Gate flags their presence so a human decides.
Watch the advisory feed. Score drops and finding additions on packages already in your inventory are the early-warning signal. RSS: /security/advisories/rss.xml.
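The pinning recommendation above can be checked mechanically. The sketch below treats a spec as pinned only when it names an exact semver version. The helper names and the example-mcp package are hypothetical, and the npx handling is a simplification that takes the first non-flag argument as the package spec.

```python
import re

# Exact version such as 1.4.0 or 2.0.0-beta.1; ranges and tags do not match.
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def split_spec(spec):
    """Split 'name@version' into (name, version); a leading '@' marks a scope."""
    at = spec.rfind("@")
    if at <= 0:
        return spec, None
    return spec[:at], spec[at + 1:]


def is_pinned(spec):
    """True only when the spec carries an exact version."""
    _, version = split_spec(spec)
    return bool(version and EXACT_VERSION.match(version))


def unpinned_npx_package(args):
    """Return the package spec from an npx argument list if it is not pinned."""
    for arg in args:
        if not arg.startswith("-"):
            return None if is_pinned(arg) else arg
    return None


print(unpinned_npx_package(["-y", "example-mcp"]))        # example-mcp
print(unpinned_npx_package(["-y", "example-mcp@1.4.0"]))  # None
```

Tags such as latest and ranges such as ^1.4.0 count as unpinned here because both resolve at install time to whatever the registry currently serves.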

Methodology

Discovery sweeps npm via keyword search (keywords:mcp-server, keywords:model-context-protocol), broad text search filtered to MCP-relevant results, and dependency-reverse search across several MCP SDKs (@modelcontextprotocol/sdk, fastmcp, mcp-framework, @mcp-ui/server). Enrollment requires a minimum weekly-downloads threshold. Enrolled packages are rescanned on a continuous cadence, with real-time change detection via the npm registry feed. The scanner is static analysis only: it downloads published tarballs, analyses metadata and source in memory, and does not execute code or inspect runtime behaviour. Full methodology including finding definitions, severity rules, and OWASP MCP Top 10 coverage map is at /methodology. The underlying dataset is queryable at /api/ecosystem/stats (JSON, revalidated hourly). Findings sample size for distribution tables: 500 most recent scans. Full scan count to date: 26,161.
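The keyword-search leg of discovery can be approximated against the public npm registry search endpoint. This is a sketch under assumptions, not the pipeline used for this report: it covers only the two keyword queries named above, fetches a single page per query, and skips the text search, the reverse-dependency search, and the download threshold. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_ENDPOINT = "https://registry.npmjs.org/-/v1/search"
QUERIES = ("keywords:mcp-server", "keywords:model-context-protocol")


def search_url(query, size=250, offset=0):
    """Build a registry search URL for one query and one page of results."""
    params = urlencode({"text": query, "size": size, "from": offset})
    return f"{SEARCH_ENDPOINT}?{params}"


def package_names(payload):
    """Extract package names from a registry search response body."""
    return [obj["package"]["name"] for obj in payload.get("objects", [])]


def fetch_json(url):
    """Fetch and decode one JSON document over HTTP."""
    with urlopen(url) as response:
        return json.load(response)


def discover(fetch=fetch_json):
    """Union of package names across all keyword queries, first page only."""
    seen = set()
    for query in QUERIES:
        seen.update(package_names(fetch(search_url(query))))
    return sorted(seen)
```

Passing the fetch function as a parameter keeps the sketch testable offline; a fuller sweep would page through results with the offset parameter until no more objects are returned.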

Use the Policy Gate in your repo

One YAML block. Free for public repos. Auto-provisions via GitHub OIDC.

Report generated 2026-06-08. Watch feed last updated 2026-06-08.