Risk assessment

Evals: domains, quality, elicitation

Weighted 55% of category
Anthropic    60%
DeepMind     60%
OpenAI       70%
Meta          2%
xAI           2%
Microsoft     2%
DeepSeek      0%

Evals: accountability

Weighted 25% of category
Anthropic    15%
DeepMind      1%
OpenAI       15%
Meta          0%
xAI           0%
Microsoft     0%
DeepSeek      0%

Adversarial evaluation for alignment

Weighted 10% of category
Anthropic    50%
DeepMind      0%
OpenAI       10%
Meta          0%
xAI           0%
Microsoft     0%
DeepSeek      0%

Model organisms

Weighted 10% of category
Anthropic    75%
DeepMind     10%
OpenAI        0%
Meta          0%
xAI           0%
Microsoft     0%
DeepSeek      0%
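
The four sub-category weights (55%, 25%, 10%, 10%) sum to 100%, which suggests that a company's overall Risk assessment score is a weighted average of its four sub-scores. The page does not state the aggregation rule, so the sketch below is an assumption, and the names `WEIGHTS`, `SCORES`, and `category_score` are hypothetical, introduced only for illustration.

```python
# Sub-category weights, in percent of the Risk assessment category,
# in the order the sub-categories appear on the page:
# evals (domains, quality, elicitation), evals (accountability),
# adversarial evaluation for alignment, model organisms.
WEIGHTS = [55, 25, 10, 10]

# Per-company sub-scores in percent, copied from the lists above.
SCORES = {
    "Anthropic": [60, 15, 50, 75],
    "DeepMind": [60, 1, 0, 10],
    "OpenAI": [70, 15, 10, 0],
    "Meta": [2, 0, 0, 0],
    "xAI": [2, 0, 0, 0],
    "Microsoft": [2, 0, 0, 0],
    "DeepSeek": [0, 0, 0, 0],
}


def category_score(sub_scores, weights=WEIGHTS):
    """Weighted average of sub-scores (ASSUMED aggregation rule, not confirmed by the source)."""
    if len(sub_scores) != len(weights):
        raise ValueError("need exactly one sub-score per weight")
    return sum(w * s for w, s in zip(weights, sub_scores)) / sum(weights)


if __name__ == "__main__":
    for company, sub_scores in SCORES.items():
        print(f"{company}: {category_score(sub_scores):.2f}%")
```

Dividing by the sum of the weights, rather than by a hard-coded 100, keeps the function correct if the weights are ever changed so that they no longer sum to 100.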