⚠️ Evaluation in Progress: Benchmarks are actively being run. Results and coverage are preliminary.
Evaluation Benchmarks

Comprehensive Safety & Risk Evaluation Suite

ARM evaluates frontier AI models across 20+ rigorous benchmarks spanning offensive capabilities, alignment risks, adversarial robustness, and societal harms. All evaluations run through Inspect with standardized protocols for reproducible, comparable results.
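
As an illustration of the standardized protocol, here is a minimal sketch of launching one of these benchmarks through Inspect. It assumes the open-source inspect_ai framework and the inspect_evals task registry are installed; the task and model identifiers are illustrative, not a statement of ARM's exact configuration.

```python
# Minimal sketch: running a single benchmark via Inspect.
# Assumes: pip install inspect-ai inspect-evals, plus any sandbox
# requirements (e.g. Docker for agentic CTF tasks like Cybench).
from inspect_ai import eval

# Inspect resolves the task from the inspect_evals registry and handles
# solving, scoring, and logging, so runs are reproducible and comparable
# across models. Equivalent CLI:
#   inspect eval inspect_evals/cybench --model openai/gpt-4o --limit 10
logs = eval(
    "inspect_evals/cybench",   # illustrative task name
    model="openai/gpt-4o",     # illustrative model identifier
    limit=10,                  # evaluate a small sample while iterating
)
```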

💻 Offensive Cyber Capabilities

CTF challenges, exploitation, and dangerous cyber capabilities

3CB · Cybench · GDM CTF · InterCode · CyberSecEval
🎭 Scheming & Deceptive Alignment

Self-reasoning, stealth behavior, and honesty under pressure

MASK · Agentic Misalignment · GDM Self-reasoning · GDM Stealth · Sycophancy
⚗️ Harmful Agent Capabilities

Hazardous knowledge and potential for direct harm

AgentHarm · WMDP · SOS BENCH · AIR Bench
🛡️ Adversarial Robustness

Jailbreak and prompt injection resistance

StrongREJECT · AgentDojo · Make Me Pay
⚖️ Bias & Fairness

Stereotype bias and fairness metrics

BBQ · BOLD · StereoSet
📊 Calibration & Honesty

Uncertainty calibration and appropriate refusal

SimpleQA · AbstentionBench · XSTest

Featured Benchmarks

Detailed results from key benchmarks for which evaluations are currently available