Evaluation and Analyst Validation

ThreatMapper output is a first-pass structured analysis, not a substitute for analyst validation.

Why Validation Matters

LLM-generated mappings can contain false positives, false negatives, and ambiguous technique assignments. ATT&CK techniques can overlap conceptually, and a report may omit the procedure detail needed for a defensible mapping.

Analyst Review Workflow

Locate the exact source evidence for each proposed technique.
Compare the described procedure with the ATT&CK definition.
Reject mappings supported only by generic intent or unsupported inference.
Record confidence, telemetry requirements, and contradictory evidence.
Validate any suggested group or campaign similarity against non-TTP evidence, such as infrastructure or malware overlap.
Hand validated gaps to detection engineering with an owner and status.

Analysts should validate technique mappings against source evidence, procedure descriptions, telemetry requirements, and ATT&CK definitions.
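The first workflow step, locating the exact source evidence, can be partly automated before an analyst reads anything. The following is a minimal sketch, not ThreatMapper's implementation: it assumes each proposed mapping is a dictionary with hypothetical `technique_id` and `evidence` keys, and it only checks that the quoted evidence appears in the report text. It cannot tell whether the evidence actually supports the technique; that judgment stays with the analyst.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line wrapping does not break matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_ungrounded(report_text: str, mappings: list[dict]) -> list[str]:
    """Return technique IDs whose quoted evidence is missing from the report.

    The 'technique_id' and 'evidence' keys are assumed field names for this
    sketch, not a documented ThreatMapper schema.
    """
    haystack = normalize(report_text)
    ungrounded = []
    for mapping in mappings:
        quote = normalize(mapping.get("evidence", ""))
        # An empty quote or one that is not found verbatim is flagged for review.
        if not quote or quote not in haystack:
            ungrounded.append(mapping["technique_id"])
    return ungrounded


# Illustrative data only.
report = "The actor ran an encoded PowerShell\ncommand and then exfiltrated data."
mappings = [
    {"technique_id": "T1059.001", "evidence": "encoded PowerShell command"},
    {"technique_id": "T1053.005", "evidence": "created a scheduled task"},
]
flagged = find_ungrounded(report, mappings)
print(flagged)
```

Mappings flagged this way are candidates for rejection under the "unsupported inference" rule, but a verbatim match alone is not sufficient grounds to accept a mapping.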

Sample Evaluation Format

Report	Expected Techniques	Extracted Techniques	False Positives	Missed Techniques	Analyst Notes
Sample incident	T1059.001, T1053.005	T1059.001, T1071.001	T1071.001	T1053.005	Validate procedure evidence and update prompt/test case
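The false-positive and missed-technique columns follow directly from set differences between the expected and extracted technique IDs. A minimal sketch, using the sample row above; the function name `score_report` and the result keys are hypothetical:

```python
def score_report(expected: set[str], extracted: set[str]) -> dict[str, list[str]]:
    """Compare analyst-expected technique IDs with extracted ones for one report."""
    return {
        "true_positives": sorted(expected & extracted),
        "false_positives": sorted(extracted - expected),  # extracted but not expected
        "missed": sorted(expected - extracted),           # expected but not extracted
    }


# Values taken from the "Sample incident" row.
row = score_report(
    expected={"T1059.001", "T1053.005"},
    extracted={"T1059.001", "T1071.001"},
)
print(row)
```

This treats a mapping as correct only when the full technique ID, including the sub-technique suffix, matches exactly. A benchmark that gives partial credit for a correct parent technique would need a different comparison.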

Suggested Benchmark Methodology

Build a versioned benchmark from analyst-reviewed reports. Measure technique precision and recall, evidence-grounding quality, confidence calibration, repeatability by provider/model, and time saved after review. Keep the benchmark separate from prompt-development examples and rerun it after provider, prompt, or ATT&CK-version changes.
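As one way to compute the technique precision and recall mentioned above, the sketch below pools counts across all benchmark reports (micro-averaging). The pooling choice and the `precision_recall` name are assumptions of this example; per-report averaging is an equally valid alternative and will give different numbers when reports vary in size.

```python
def precision_recall(cases: list[tuple[set[str], set[str]]]) -> tuple[float, float]:
    """Micro-averaged precision and recall over (expected, extracted) pairs."""
    tp = fp = fn = 0
    for expected, extracted in cases:
        tp += len(expected & extracted)
        fp += len(extracted - expected)
        fn += len(expected - extracted)
    # Guard against empty denominators when nothing was extracted or expected.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# A one-report benchmark built from the sample evaluation row.
benchmark = [
    ({"T1059.001", "T1053.005"}, {"T1059.001", "T1071.001"}),
]
precision, recall = precision_recall(benchmark)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Storing the benchmark version, provider, model, prompt version, and ATT&CK version alongside each result makes reruns comparable after any of those change.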

Detection Engineering Handoff

A useful handoff records ATT&CK technique ID and name, tactic, procedure evidence, required telemetry, detection idea, existing coverage, gap status, priority, validation status, owner, and notes.
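The handoff fields listed above can be captured in a small record type so that every gap reaches detection engineering in the same shape. This is a sketch under stated assumptions: the class name, field names, and status strings such as `"validated"` and `"open"` are hypothetical, and the example values are illustrative rather than taken from a real incident.

```python
from dataclasses import asdict, dataclass


@dataclass
class HandoffRecord:
    """One validated ATT&CK gap handed to detection engineering (hypothetical schema)."""
    technique_id: str
    technique_name: str
    tactic: str
    procedure_evidence: str
    required_telemetry: list[str]
    detection_idea: str
    existing_coverage: str
    gap_status: str
    priority: str
    validation_status: str
    owner: str
    notes: str = ""

    def is_ready(self) -> bool:
        """Ready only when an analyst has validated the mapping and an owner is set."""
        return self.validation_status == "validated" and bool(self.owner.strip())


# Illustrative values only.
record = HandoffRecord(
    technique_id="T1053.005",
    technique_name="Scheduled Task/Job: Scheduled Task",
    tactic="Persistence",
    procedure_evidence="Quoted sentence from the source report goes here.",
    required_telemetry=["Windows Security event 4698"],
    detection_idea="Alert on scheduled task creation by unusual parent processes.",
    existing_coverage="none",
    gap_status="open",
    priority="high",
    validation_status="validated",
    owner="detection-engineering",
)
print(record.is_ready())
print(asdict(record)["technique_id"])
```

Using a fixed record type rather than free-form notes makes it straightforward to export the handoff to a ticketing system or spreadsheet and to reject entries that lack an owner or validation status.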
