Company Overview

Anthropic

Founded 2021HQ San Francisco, USA

Anthropic builds Claude models with constitutional AI safeguards and transparency tooling.

Safety Philosophy

Constitutional alignment paired with human feedback-based evaluations.

Capability Focus

Reliable assistant behaviour, interpretable reasoning chains, and anchored refusal policies.

Portfolio Metrics Over Time

Aggregate Inspect results

Benchmark Performance

Safety vs Capability

Model roster

  • Claude 3.5 OpusReleased 2025-03-01
    Safety 82%Capability 74%Composite 88%

    Honesty leader across Inspect pressure tasks.

  • Claude Sonnet 4Released 2024-06-12
    Safety 78%Capability 62%Composite 81%

    Compact alignment-first deployment.

Safety focus

Anthropic emphasises constitutional AI, with Inspect evaluations confirming high honesty rates across reasoning tasks.

Tooling

Expect further interpretability releases; Anthropic aims to ship open oversight dashboards for enterprise partners.