| Score | Notes |
|---|---|
| 0% | No, but Anthropic has done research on probing |
| 5% | DeepMind is planning to implement monitoring to address risks from deceptive alignment by the time its models have a certain level of instrumental reasoning capabilities |
| 0% | No, but OpenAI has done research on monitoring |
| 0% | Nothing on misalignment risks during internal deployment |
| 0% | Nothing on misalignment risks during internal deployment |
| 0% | Nothing on misalignment risks during internal deployment |
| 0% | Nothing on misalignment risks during internal deployment |

| Score | Notes |
|---|---|
| 50% | Reasoning is in natural language, but no policy or articulation of this principle |
| 55% | Reasoning is in natural language; no policy, but an articulation of this principle, though only in a research paper |
| 55% | Reasoning is in natural language; no policy, but an articulation of this principle, though only in a research blogpost |
| 50% | Reasoning is in natural language, but no policy or articulation of this principle |
| 50% | Reasoning is in natural language, but no policy or articulation of this principle |
| 50% | Reasoning is in natural language, but no policy or articulation of this principle |
| 50% | Reasoning is in natural language, but no policy or articulation of this principle |