Precision changelog
The AgentScore scanner is a regex-based detector. Regex cannot tell a database method call apart from a shell exec, or a test fixture apart from a real credential, without context. Scanner v2.1 ships a mitigator system that scans a window around each match for known sanitizer wrappers or test-fixture markers and downgrades the severity when one fires.
The list below is every mitigator the scanner has gained, paired with the public report that motivated it. Each entry was driven by a real maintainer interaction, not a hypothetical edge case.
Scanner version: v2.1
Current ruleset digest: 3185eb87b4ce
How mitigators work
When a finding fires, the scanner looks at a 2,000-character window around the match. If a sanitizer pattern (like validateCommand, execFile, or a database-shaped .exec()) or a test-fixture marker matches within that window, the severity is downgraded. The original finding stays visible with an annotation showing which mitigator fired and where, so you can audit the call yourself.
Downgrade targets: command_injection, unsafe_eval, and sensitive_file_access drop to LOW. hardcoded_secret drops to MEDIUM, because a placeholder still discloses information about test infrastructure even though it is not a credential.
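The window scan and downgrade described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the scanner's source: the function name applyMitigators, the shape of the finding object, and the choice to centre the 2,000 characters on the match are all hypothetical. Only the three patterns and the severity targets come from this page.

```javascript
// Hypothetical sketch; names and structure are assumptions, not the scanner's code.
const WINDOW_CHARS = 2000; // assumed to be split evenly before and after the match

// Three of the sanitizer patterns listed on this page.
const MITIGATORS = [
  { name: "validateCommand", pattern: /\bvalidateCommand\b/ },
  { name: "execFile", pattern: /\bexecFile\b/ },
  {
    name: "db-shaped .exec",
    pattern: /\b(db|database|conn|connection|client|pool|prepared|stmt|sql|query|knex|prisma)\.\s*exec\s*\(/i,
  },
];

// Downgrade targets as stated in the text.
const DOWNGRADED_SEVERITY = {
  command_injection: "LOW",
  unsafe_eval: "LOW",
  sensitive_file_access: "LOW",
  hardcoded_secret: "MEDIUM",
};

function applyMitigators(source, finding) {
  const half = WINDOW_CHARS / 2;
  const start = Math.max(0, finding.index - half);
  const end = Math.min(source.length, finding.index + finding.length + half);
  const windowText = source.slice(start, end);

  for (const { name, pattern } of MITIGATORS) {
    const hit = pattern.exec(windowText);
    if (hit && finding.rule in DOWNGRADED_SEVERITY) {
      // Keep the original finding and annotate it instead of dropping it.
      return {
        ...finding,
        originalSeverity: finding.severity,
        severity: DOWNGRADED_SEVERITY[finding.rule],
        mitigator: { name, offset: start + hit.index },
      };
    }
  }
  return finding; // no mitigator in range: severity unchanged
}

const source = "if (validateCommand(cmd)) { exec(cmd); }";
const finding = {
  rule: "command_injection",
  severity: "HIGH",
  index: source.indexOf("exec("),
  length: 5,
};
const result = applyMitigators(source, finding);
console.log(result.severity, result.mitigator);
```

Returning an annotated copy rather than filtering the finding out mirrors the behaviour described above: the match stays visible, and the annotation records which mitigator fired and at what offset.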
Precision is bounded by what regex can express. A renamed sanitizer or a database method that does not match the variable-name list still produces a false positive. Real data-flow analysis is on the v2.2 roadmap.
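A small illustration of that limit, using patterns copied from this page. The code snippets being tested (store.exec, guardShell) are hypothetical examples, not taken from any scanned repository.

```javascript
// Patterns copied from this changelog.
const dbVariableExec = /\b(db|database|conn|connection|client|pool|prepared|stmt|sql|query|knex|prisma)\.\s*exec\s*\(/i;
const sqlKeywordExec = /\.exec\s*\(\s*[`'"]?\s*(SELECT|INSERT|UPDATE|DELETE|CREATE|ALTER|DROP|PRAGMA|VACUUM|BEGIN|COMMIT|ROLLBACK|TRUNCATE|REPLACE|MERGE|GRANT|REVOKE|EXPLAIN)\b/i;
const sanitizerNames = [
  /\bvalidateCommand\b/,
  /\bsanitize(?:Command|Args|Input)?\b/i,
  /\bisAllowedCommand\b/i,
];

const isDbExec = (code) => dbVariableExec.test(code) || sqlKeywordExec.test(code);
const hasSanitizer = (code) => sanitizerNames.some((pattern) => pattern.test(code));

// Hypothetical snippets for illustration.
const listedVariable = "db.exec(schemaSql)";
const unlistedVariable = "store.exec(schemaSql)"; // same call, variable name not on the list
const renamedSanitizer = "if (guardShell(cmd)) { exec(cmd); }"; // guard exists, name not recognised

console.log(isDbExec(listedVariable));       // true
console.log(isDbExec(unlistedVariable));     // false
console.log(hasSanitizer(renamedSanitizer)); // false
```

In the last two cases no mitigator fires, so the finding would keep its original severity even though the code may be safe.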
Recent mitigator additions
Database .exec and eval-in-message-strings (HomenShum/nodebench-ai#8)
The maintainer reviewed two HIGH findings against the source, confirmed three real command_injection sites, and refactored them to argv-based spawn. The maintainer also correctly identified the unsafe_eval finding as a false positive: the regex matched better-sqlite3's db.exec(`SQL`) and the literal word eval inside a recommendations.push string.
Patterns added
/\.exec\s*\(\s*[`'"]?\s*(SELECT|INSERT|UPDATE|DELETE|CREATE|ALTER|DROP|PRAGMA|VACUUM|BEGIN|COMMIT|ROLLBACK|TRUNCATE|REPLACE|MERGE|GRANT|REVOKE|EXPLAIN)\b/i
An SQL keyword immediately after .exec( is a strong signal that this is a database method, not child_process.exec.
/\b(db|database|conn|connection|client|pool|prepared|stmt|sql|query|knex|prisma)\.\s*exec\s*\(/i
Database-shaped variable names calling .exec: better-sqlite3, pg, mysql2, prisma raw queries.
/\/[a-zA-Z]*[Ee]val[A-Z][a-zA-Z]*\.(js|ts|mjs|cjs)\b/
File names in which eval refers to an evaluation flow, not JavaScript eval, such as selfEvalTools.js. As written, the pattern requires an uppercase letter after Eval, so names that end in Eval (llmJudgeEval.js, pipelineEval.js) are not matched.
/\b(?:recommendations?|messages?|errors?|warnings?|notes?)\s*\.\s*push\s*\(\s*[`'"]/i
Strings pushed onto an array named recommendations, messages, errors, warnings, or notes are message text, not executable code.
/\b(console|logger|log|debug|info|warn|error|trace)\s*\.\s*[a-z]+\s*\(\s*[`'"][^`'"]*\beval\b/i
console.log and similar logging calls that emit strings containing the word eval.
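To see what these five patterns do and do not catch, the sketch below runs them against a few sample strings. The patterns are copied from the list above; the label names, the firedMitigators helper, and the sample file paths and snippets are hypothetical and only serve the illustration.

```javascript
// Patterns copied from the list above; the labels are made up for this example.
const MITIGATORS = [
  ["sql-keyword-exec", /\.exec\s*\(\s*[`'"]?\s*(SELECT|INSERT|UPDATE|DELETE|CREATE|ALTER|DROP|PRAGMA|VACUUM|BEGIN|COMMIT|ROLLBACK|TRUNCATE|REPLACE|MERGE|GRANT|REVOKE|EXPLAIN)\b/i],
  ["db-variable-exec", /\b(db|database|conn|connection|client|pool|prepared|stmt|sql|query|knex|prisma)\.\s*exec\s*\(/i],
  ["eval-in-filename", /\/[a-zA-Z]*[Ee]val[A-Z][a-zA-Z]*\.(js|ts|mjs|cjs)\b/],
  ["message-push", /\b(?:recommendations?|messages?|errors?|warnings?|notes?)\s*\.\s*push\s*\(\s*[`'"]/i],
  ["log-with-eval", /\b(console|logger|log|debug|info|warn|error|trace)\s*\.\s*[a-z]+\s*\(\s*[`'"][^`'"]*\beval\b/i],
];

// Returns the labels of every pattern that matches the given text.
function firedMitigators(text) {
  return MITIGATORS.filter(([, pattern]) => pattern.test(text)).map(([name]) => name);
}

// Hypothetical sample inputs.
const samples = [
  "db.exec(`CREATE TABLE runs (id INTEGER)`)",
  "child_process.exec(userCommand)",
  "src/tools/selfEvalTools.js",
  "src/tools/llmJudgeEval.js",
  'recommendations.push("Avoid eval on untrusted input")',
  'console.log("skipping eval step")',
];

for (const sample of samples) {
  console.log(sample, "->", firedMitigators(sample));
}
```

Two results are worth noting. child_process.exec(userCommand) triggers nothing, so a genuine shell exec keeps its severity. llmJudgeEval.js also triggers nothing, because the file-name pattern as written expects a capital letter directly after Eval.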
Declarative test-fixture markers (claude-flow CRITICAL hardcoded_secret in dist/)
claude-flow shipped a manifest-validator with structural test fixtures inside dist/. Existing test_fixture rules expected files to live under tests/ or specs/ — they did not catch shipped-as-data fixtures.
Patterns added
/\babc(123|def|xyz)/i
Canonical fake-credential placeholders. claude-flow used sk-abc123-style test keys.
/\b(123|abcd){3,}/i
Long repeating placeholder sequences like abcdabcdabcd.
/\bexpected(Outcome|Result|Behavior|Behaviour|Verdict|Action|Status)\s*:/
Declarative fixture structure such as { params: { ... }, expectedOutcome: 'deny' }, which is pure data, not exploitable code.
/\b(should|must|will)(Fail|Pass|Reject|Allow|Block|Deny|Throw|Match)\b/i
Test-shaped predicate names sitting near otherwise-dangerous-looking literals.
/\/(?:[a-zA-Z-]*-)?(validator|sanitizer|detector|scanner|denier|filter)\.(js|ts|mjs|cjs)\b/i
Validator and sanitizer source files contain example dangerous strings by design.
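The fixture markers can be exercised the same way. In the sketch below the patterns are copied from the list above, while the label names, the fixtureMarkers helper, and every sample string (including the dist/manifest-validator.js path and the random-looking key) are hypothetical.

```javascript
// Patterns copied from the list above; the labels are made up for this example.
const FIXTURE_MARKERS = [
  ["abc-placeholder", /\babc(123|def|xyz)/i],
  ["repeating-placeholder", /\b(123|abcd){3,}/i],
  ["expected-field", /\bexpected(Outcome|Result|Behavior|Behaviour|Verdict|Action|Status)\s*:/],
  ["test-predicate", /\b(should|must|will)(Fail|Pass|Reject|Allow|Block|Deny|Throw|Match)\b/i],
  ["validator-file", /\/(?:[a-zA-Z-]*-)?(validator|sanitizer|detector|scanner|denier|filter)\.(js|ts|mjs|cjs)\b/i],
];

// Returns the labels of every fixture marker found in the given text.
function fixtureMarkers(text) {
  return FIXTURE_MARKERS.filter(([, pattern]) => pattern.test(text)).map(([name]) => name);
}

// Hypothetical sample inputs.
const samples = [
  'apiKey: "sk-abc123"',
  'token: "abcdabcdabcd"',
  "{ params: { path: '/etc/passwd' }, expectedOutcome: 'deny' }",
  "const shouldReject = true",
  "dist/manifest-validator.js",
  'apiKey: "q7Zp2LmX9vKd"', // no placeholder shape, so no marker fires
];

for (const sample of samples) {
  console.log(sample, "->", fixtureMarkers(sample));
}
```

The last sample shows the intended asymmetry: a key with no placeholder shape and no fixture structure nearby matches none of the markers, so a hardcoded_secret finding on it would not be downgraded.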
Sanitizer wrappers (Agions/taskflow-ai#6)
Maintainer shipped v3.0.2 with a validateCommand wrapper around shell_exec (whitelist + dangerous-pattern detection) within 48h of the scan report. The scanner's HIGH command_injection finding for the same code was no longer the right severity once a guard was in place.
Patterns added
/\bvalidateCommand\b/
Direct match for the wrapper Agions introduced.
/\bsanitize(?:Command|Args|Input)?\b/i
Common naming convention for input sanitizers across the ecosystem.
/\bisAllowedCommand\b/i
Whitelist-style guards.
/\bexecFile\b/
execFile with an args array does not spawn a shell by default, so its arguments are not shell-interpreted. Different posture from exec.
/\bspawn\s*\(\s*[^,)]+,\s*\[/
spawn('cmd', [args]): the array form runs the command without a shell unless the shell option is set.
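As a rough check of which call shapes these sanitizer patterns recognise, the sketch below applies them to a handful of snippets. The patterns are copied from the list above; the label names, the sanitizerHits helper, and the sample snippets are hypothetical.

```javascript
// Patterns copied from the list above; the labels are made up for this example.
const SANITIZER_PATTERNS = [
  ["validateCommand", /\bvalidateCommand\b/],
  ["sanitize-name", /\bsanitize(?:Command|Args|Input)?\b/i],
  ["isAllowedCommand", /\bisAllowedCommand\b/i],
  ["execFile", /\bexecFile\b/],
  ["spawn-array", /\bspawn\s*\(\s*[^,)]+,\s*\[/],
];

// Returns the labels of every sanitizer pattern found in the given text.
function sanitizerHits(text) {
  return SANITIZER_PATTERNS.filter(([, pattern]) => pattern.test(text)).map(([name]) => name);
}

// Hypothetical sample inputs.
const samples = [
  "if (validateCommand(cmd)) exec(cmd)",
  "execFile('git', ['status'])",
  "spawn('git', ['status'])",
  "spawn(cmd, { shell: true })", // options object instead of an args array
  "exec('ls ' + userDir)",       // plain shell exec with concatenation
];

for (const sample of samples) {
  console.log(sample, "->", sanitizerHits(sample));
}
```

The spawn pattern only fires when the second argument starts with an array literal, so spawn(cmd, { shell: true }) is left at its original severity, as is the concatenated exec call.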
Reporting a precision gap
If the scanner flagged something it should not have, or missed something it should have caught, the report goes through public issue forms on the scanner repo. Detection-accuracy reports are not security disclosures and do not need confidentiality.
Real vulnerabilities in AgentScore infrastructure go to security@agentscores.xyz, not these forms.