25% | No, but the Responsible Scaling Policy and "Planned ASL-3 Safeguards" in "Responsible Scaling Program Updates" are better than nothing
50% | "An Approach to Technical AGI Safety and Security" describes the DeepMind safety team's "anytime" (i.e., possible to implement quickly) plan for addressing misuse and misalignment risks
0% | OpenAI once had the central part of a mainline safety plan, but has not had one since Superalignment was dissolved
0% | Meta hasn't published a plan
0% | xAI hasn't published a plan
0% | Microsoft hasn't published a plan
0% | DeepSeek hasn't published a plan |
10% | Anthropic's RSP says Anthropic might need to pause and that "We will set expectations with internal stakeholders about the potential for such pauses"
5% | DeepMind's Frontier Safety Framework says "A model flagged by an alert threshold may be assessed to pose risks for which readily available mitigations (including but not limited to those described below) may not be sufficient. If this happens, the response plan may involve putting deployment or further development on hold until adequate mitigations can be applied."
0% | OpenAI's Preparedness Framework says some models could "require additional safeguards (safety and security controls) during development, regardless of whether or when they are externally deployed," but nothing more
1% | Meta's Frontier AI Framework says "If a model appears to uniquely enable the execution of a threat scenario we will pause development while we investigate whether barriers to realizing the catastrophic outcome remain," but there are several loopholes
0% | xAI doesn't do this
5% | Microsoft's Frontier Governance Framework says "If, during the implementation of this framework, we identify a risk we cannot sufficiently mitigate, we will pause development and deployment until the point at which mitigation practices evolve to meet the risk," but this is not credible
0% | DeepSeek doesn't do this |