Scheming risk prevention

Safety case: process

Weighted 25% of category
Anthropic: 0%
DeepMind: 20%
OpenAI: 0%
Meta: 0%
xAI: 0%
Microsoft: 0%
DeepSeek: 0%

Safety case: accountability

Weighted 25% of category
Anthropic: 0%
DeepMind: 0%
OpenAI: 0%
Meta: 0%
xAI: 0%
Microsoft: 0%
DeepSeek: 0%

Internal deployment protocol

Weighted 20% of category
Anthropic: 0%
DeepMind: 5%
OpenAI: 0%
Meta: 0%
xAI: 0%
Microsoft: 0%
DeepSeek: 0%

External deployment protocol

Weighted 10% of category
Anthropic: 0%
DeepMind: 0%
OpenAI: 0%
Meta: 0%
xAI: 0%
Microsoft: 0%
DeepSeek: 0%

Plan for if an AI is caught scheming

Weighted 10% of category
Anthropic: 0%
DeepMind: 0%
OpenAI: 0%
Meta: 0%
xAI: 0%
Microsoft: 0%
DeepSeek: 0%

Training: remove scheming capabilities

Weighted 3% of category
Anthropic: 0%
DeepMind: 0%
OpenAI: 0%
Meta: 0%
xAI: 0%
Microsoft: 0%
DeepSeek: 0%

Training: adversarial training

Weighted 3% of category
Anthropic: 0%
DeepMind: 0%
OpenAI: 0%
Meta: 0%
xAI: 0%
Microsoft: 0%
DeepSeek: 0%

Training: safe architecture

Weighted 3% of category
Anthropic: 50%
DeepMind: 55%
OpenAI: 55%
Meta: 50%
xAI: 50%
Microsoft: 50%
DeepSeek: 50%
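
The per-criterion weights suggest that each company's category score is some weighted combination of the eight scores above. The page does not state the exact formula, so the following is only a sketch under one assumption: a weighted mean, normalized by the sum of the listed weights (which add up to 99% as displayed, presumably due to rounding). The names `WEIGHTS`, `SCORES`, and `category_score` are hypothetical and introduced here for illustration.

```python
# Sketch only: assumes the category score is a normalized weighted mean.
# The source lists weights and scores but does not confirm this formula.

# Criterion weights in percent, in the order they appear above.
WEIGHTS = {
    "Safety case: process": 25,
    "Safety case: accountability": 25,
    "Internal deployment protocol": 20,
    "External deployment protocol": 10,
    "Plan for if an AI is caught scheming": 10,
    "Training: remove scheming capabilities": 3,
    "Training: adversarial training": 3,
    "Training: safe architecture": 3,
}

# Per-company scores in percent, in the same criterion order as WEIGHTS.
SCORES = {
    "Anthropic": [0, 0, 0, 0, 0, 0, 0, 50],
    "DeepMind": [20, 0, 5, 0, 0, 0, 0, 55],
    "OpenAI": [0, 0, 0, 0, 0, 0, 0, 55],
    "Meta": [0, 0, 0, 0, 0, 0, 0, 50],
    "xAI": [0, 0, 0, 0, 0, 0, 0, 50],
    "Microsoft": [0, 0, 0, 0, 0, 0, 0, 50],
    "DeepSeek": [0, 0, 0, 0, 0, 0, 0, 50],
}


def category_score(scores, weights=WEIGHTS):
    """Return the weighted mean of `scores`, normalized by the total weight."""
    weight_values = list(weights.values())
    if len(scores) != len(weight_values):
        raise ValueError("expected one score per criterion")
    total_weight = sum(weight_values)  # 99 as listed, hence the normalization
    weighted_sum = sum(w * s for w, s in zip(weight_values, scores))
    return weighted_sum / total_weight


if __name__ == "__main__":
    for company, scores in SCORES.items():
        print(f"{company}: {category_score(scores):.1f}%")
```

Under this assumed formula, the low-weight "Training: safe architecture" criterion contributes at most about 3 percentage points to the category, so the two safety-case criteria and the internal deployment protocol dominate the result.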