Monitoring Safety, Risk & Alignment of Frontier AI Models
ARM evaluates frontier AI models across critical risk dimensions through comprehensive Inspect eval benchmarks. While many platforms focus on capabilities, ARM provides deep insight into model safety, alignment, and potential risks, from offensive cyber capabilities to deceptive alignment and adversarial robustness.
Composite Risk Index
Aggregated risk assessment combining offensive cyber capabilities (40%), scheming & deception (30%), harmful agent capabilities (15%), adversarial robustness (10%), and bias (5%).
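The weighting above can be sketched as a simple weighted sum. This is an illustrative sketch, not ARM's implementation: the dimension keys and the assumption that every dimension reports a score on a common 0–100 scale are hypothetical.

```python
# Illustrative composite risk index: a weighted sum of per-dimension
# risk scores, assumed to share a common 0-100 scale.
WEIGHTS = {
    "offensive_cyber": 0.40,       # offensive cyber capabilities
    "scheming_deception": 0.30,    # scheming & deception
    "harmful_agent": 0.15,         # harmful agent capabilities
    "adversarial_robustness": 0.10,
    "bias": 0.05,
}

def composite_risk_index(scores: dict[str, float]) -> float:
    """Combine per-dimension risk scores using the stated weights."""
    return sum(weight * scores[dim] for dim, weight in WEIGHTS.items())

# Hypothetical per-dimension scores for one model:
scores = {
    "offensive_cyber": 30.0,
    "scheming_deception": 50.0,
    "harmful_agent": 20.0,
    "adversarial_robustness": 40.0,
    "bias": 10.0,
}
print(composite_risk_index(scores))  # 34.5
```

Because the weights sum to 1.0, the composite stays on the same scale as the per-dimension scores.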
Scheming & Deception Index
Evaluates deceptive alignment, self-reasoning about model situation, stealth behavior, and honesty under pressure through benchmarks like MASK, GDM Self-reasoning, and Sycophancy.
Offensive Cyber Index
Measures dangerous cyber capabilities through CTF challenges, vulnerability exploitation, and code security testing (Cybench, 3CB, CyberSecEval, InterCode).
Adversarial Robustness Index
Tests resistance to jailbreaks, prompt injection attacks, and social engineering through StrongREJECT, AgentDojo, and Make Me Pay benchmarks.
Harmful Agent Capabilities
Assesses potential for direct harm through hazardous knowledge (WMDP), harmful agentic tasks (AgentHarm), and dangerous scientific capabilities (SOS BENCH).
Bias & Calibration
Measures stereotype bias, fairness issues, honesty calibration, and appropriate refusal behavior through BBQ, BOLD, StereoSet, and XSTest.
About the Evaluation Framework
Comprehensive Coverage: ARM runs evaluations across 20+ benchmarks spanning offensive capabilities, alignment risks, adversarial robustness, and societal harms.
Reproducible & Transparent: All evals run through Inspect with fixed temperature, reproducible seeds, and standardized scoring to enable fair cross-model comparisons.
Beyond Capabilities: While most leaderboards focus on what models can do, ARM focuses on what could go wrong: from insider threats and deception to jailbreaks and dual-use risks.
Composite Safety Rankings
Ranked by a composite safety-performance score derived from recent Inspect evaluations.
Safety vs Capability Quadrant
Relative positioning of recent frontier models using aggregated safety and capability indices.
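A minimal way to place a model in the quadrant can be sketched as below. The 0–100 index scale and the 50-point midpoint are illustrative assumptions, not ARM's actual thresholds.

```python
def quadrant(safety: float, capability: float, midpoint: float = 50.0) -> str:
    """Classify a model into one of four safety/capability quadrants,
    assuming both indices lie on a 0-100 scale split at `midpoint`."""
    safe = safety >= midpoint
    capable = capability >= midpoint
    if safe and capable:
        return "high-safety / high-capability"
    if safe:
        return "high-safety / low-capability"
    if capable:
        return "low-safety / high-capability"
    return "low-safety / low-capability"

# Using the indices from the results table below as example inputs:
print(quadrant(92, 15))  # high-safety / low-capability
print(quadrant(54, 52))  # high-safety / high-capability
```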
Frontier performance across benchmarks
Track accuracy over release dates across selected safety benchmarks.
Training compute of notable models
Log-scale scatter with publication dates and compute footprint.
Recent Evaluation Runs
Each evaluation run represents a complete Inspect eval execution with safety scorers, risk metrics, and token usage analysis. Results include risk index calculations across multiple evaluation dimensions.
| Model | Safety Index | Capability Index | Honesty | Accuracy | Tokens (total / reasoning) | Run Time |
|---|---|---|---|---|---|---|
| OpenAI GPT-5 Nano (`openai-gpt-5-nano`) | 92% | 15% | 99% | 0% | 414,131 / 378,752 | |
| xAI Grok 4 Fast Reasoning (`grok-grok-4-fast-reasoning`) | 54% | 52% | 43% | 50% | 392,595 / 137,112 | |
| Google Gemini 2.5 Flash Lite (`google-gemini-2.5-flash-lite-preview-09-2025`) | 68% | 38% | 63% | 33% | 241,301 / 0 | |