Anthropic

Overall score
61%

Anthropic Deployment

Score: 61%. Weight: 24%.

Releasing models well

  • Deploy based on risk assessment results: 92%. Risk assessment results affect deployment decisions per Anthropic's Responsible Scaling Policy. The RSP also mentions "de-deployment" if Anthropic realizes that a model is much more capable than it believed, but this provision lacks detail and firm commitments. And the RSP currently contains a "drafting error" suggesting that risk assessment does not occur during deployment.
  • Structured access: 54%. Anthropic deploys its models via API.
  • Staged release: Yes.

Keeping capabilities research private

  • Policy against publishing capabilities research: 67%. They have a policy but haven't shared the details.
  • Keep research behind the lab's language models private: Yes.
  • Keep other capabilities research private: Yes.

Deployment protocol

  • Safety scaffolding: 28%. Anthropic uses filtering and has made commitments to use safety scaffolding techniques in the future.
  • Commit to respond to AI scheming: No.

Abridged; see "Deployment" for details.

Anthropic Risk assessment

Score: 47%. Weight: 20%.

Measuring threats

  • Use model evals for dangerous capabilities before deployment: 65%. Anthropic does model evals for autonomous replication and adaptation, cyber capabilities, and biology capabilities.
  • Share how the lab does model evals and red-teaming: 25%.
  • Prepare to make "control arguments" for powerful models: No.
  • Give third parties access to do model evals: 25%. Anthropic shared pre-deployment access for the new Claude 3.5 Sonnet with US AISI, UK AISI, and METR. The depth of access is unclear. It has made some commitment to US AISI but its future plans are unclear.

Commitments

  • Commit to use risk assessment frequently enough: Yes. Anthropic commits to do risk assessment every 4x increase in effective training compute and every 3 months.

Accountability

  • Publish regular updates on risk assessments: 33%, just internally.
  • Have good processes for revising policies: 61%. Changes to the RSP must be approved by the board and published before they are implemented. And Anthropic has a "non-compliance reporting policy."
  • Elicit external review of risk assessment practices: No.

Anthropic Training

Score: 58%. Weight: 14%.
  • Filtering training data: No.
  • Scalable oversight: Yes. Anthropic has worked on this.
  • Adversarial training: 50%. Anthropic does red-teaming and has done research on adversarial training but it's not clear whether they use adversarial training.
  • Unlearning: No.
  • RLHF and fine-tuning: Yes.
  • Commitments: Yes. Anthropic commits to do risk assessment at least every 4x increase in effective training compute and to implement safety and security practices before crossing risk thresholds, pausing if necessary.

Anthropic Scalable alignment

Score: 0%. Weight: 10%.
  • Solve misaligned powerseeking: No.
  • Solve interpretability: No.
  • Solve deceptive alignment: No.

Anthropic Security*

Score: 21%. Weight: 9%.

Certifications, audits, and pentests

  • Certification: 75%. Anthropic has security certifications but the reports are private.
  • Release pentests: No.

Best practices

  • Source code in cloud: No.
  • Multiparty access controls: 50%. Anthropic says it is implementing this.
  • Limit uploads from clusters with model weights: No.

Track record

  • Breach disclosure and track record: 0%, no breach disclosure policy.

Commitments

  • Commitments about future security: 75%. Anthropic's Responsible Scaling Policy has security commitments, but they are insufficient.

Anthropic Internal governance

Score: 71%. Weight: 8%.
DESCRIPTION OUT OF DATE

Organizational structure

  • Primary mandate is safety and benefit-sharing: Yes. Anthropic is a public-benefit corporation, and its mission is to "responsibly develop and maintain advanced AI for the long-term benefit of humanity."
  • Good board: 35%. The board has a safety mandate and real power, but it is not independent.
  • Investors/shareholders have no formal power: No, they are represented by two board seats and can abrogate the Long-Term Benefit Trust.

Planning for pause

  • Detailed plan for pausing if necessary: 50%. Anthropic commits to be financially prepared for a pause for safety, but it has not said what its capabilities researchers would work on during a pause.

Leadership incentives

  • Executives have no financial interest: No, executives have equity.

Ombuds/whistleblowing/trustworthiness

  • Process for staff to escalate safety concerns: 25%. Anthropic has implemented a "non-compliance reporting policy" but has not yet shared details.
  • Commit to not use non-disparagement agreements: No.

Anthropic Alignment program

Score: 100%. Weight: 6%.
Have an alignment research team and publish some alignment research: Yes.

Anthropic Alignment plan

Score: 75%. Weight: 6%.
Have a plan to deal with misalignment: 75%. Anthropic has shared how it thinks about alignment and its portfolio approach.

Anthropic Public statements

Score: 60%. Weight: 3%.
  • Talk about and understand extreme risks from AI: 75%. Anthropic and its leadership often talk about extreme risks and the alignment problem.
  • Clearly describe a worst-case plausible outcome from AI and state the lab's credence in such an outcome: No.

*We believe that the Security scores are very poor indicators about labs’ security. Security is hard to evaluate from outside the organization; few organizations say much about their security. But we believe that each security criterion corresponds to a good ask for labs. If you have suggestions for better criteria—or can point us to sources showing that our scoring is wrong, or want to convince us that some of our criteria should be removed—please let us know.

On scoring and weights, see here.

Summary

In 2021, members of OpenAI’s safety team left to found Anthropic with the aim of promoting AI safety. Anthropic is led by CEO Dario Amodei. It is a Delaware public-benefit corporation. Its mission is to “responsibly develop and maintain advanced AI for the long-term benefit of humanity.” Its flagship family of models is Claude 3. It deploys these models via its chatbot Claude, its API, and the Amazon Bedrock API.

Anthropic has lots of safety researchers and does lots of good safety work. Its leadership and staff tend to say they are very concerned about extreme risks from AI.

Anthropic’s Responsible Scaling Policy describes its risk assessment practices and makes commitments about risk assessment and about how safety practices, security practices, and model development and deployment decisions depend on risk assessment results.

Anthropic plans for its Long-Term Benefit Trust to elect a majority of its board by 2027, but its shareholders can abrogate the Trust and the details are unclear.

Deployment

Labs should do risk assessment before deployment and avoid deploying dangerous models. They should release their systems narrowly and with structured access, maintaining control over them. They should deploy their systems in scaffolding designed to improve safety by detecting and preventing misbehavior. They should deploy to boost safety research while avoiding boosting capabilities research on dangerous paths. More.

What Anthropic is doing

Anthropic deploys its Claude 3 models via API and its chatbot, Claude.

Evaluation

We give Anthropic a score of 61% on deployment.

For more, including weighting between different criteria, see the Deployment page.

Deployment decision.

  • Commit to do pre-deployment risk assessment and not deploy models with particular dangerous capabilities (including internal deployment), at least until implementing particular safety practices or passing control evaluations. Score: Yes. See Anthropic’s Responsible Scaling Policy.
    • … and commit to do risk assessment during deployment, before pushing major changes and otherwise at least every 3 months (to account for improvements in fine-tuning, scaffolding, plugins, prompting, etc.), and commit to implement particular safety practices or partially undeploy dangerous models if risks appear. Score: 75%. Anthropic’s Responsible Scaling Policy only mentions evaluation “During model training and fine-tuning.” But senior staff member Zac Hatfield-Dodds told us “this was a simple drafting error - our every-three months evaluation commitment is intended to continue during deployment. This has been clarified for the next version, and we’ve been acting accordingly all along.” The RSP also mentions “de-deployment” if Anthropic realizes that a model is much more capable than it believed, but this provision lacks detail and firm commitments.

Release method.

Structured access:

  • not releasing dangerous model weights (or code): the lab should deploy its most powerful models privately or release via API or similar, or at least have some specific risk-assessment-result that would make it stop releasing model weights. Score: Yes. Anthropic doesn’t release its model weights.
    • … and effectively avoid helping others create powerful models (via model inversion or imitation learning). It’s unclear what practices labs should implement, so for now we use the low bar of whether they say they do anything to prevent users from (1) determining model weights, (2) using model outputs to train other models, and (3) determining training data. Score: No. Anthropic doesn’t say it does this.
    • … and limit deep access to powerful models. It’s unclear what labs should do, so for now we check whether the lab disables or has any limitations on access to each of logprobs, embeddings at arbitrary layers, activations, and fine-tuning. Score: Yes. Anthropic’s API doesn’t give access to logprobs, embeddings, or activations, and it doesn’t offer public fine-tuning; fine-tuning is available only by private arrangement (see the sketch after this list).
      • … and differential access: systematically give more access to safety researchers and auditors. Score: No. But the RSP mentions the possibility of “Tiered access” at ASL-3.
    • … and terms of service: have some rules about model use, including rules aimed to prevent catastrophic misuse, model duplication, or otherwise using the model to train other models. Score: Yes. See Preventing and monitoring model misuse, Consumer Terms of Service, and Acceptable Use Policy.
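
To make the “limited deep access” point concrete, here is a minimal sketch (assuming the `anthropic` Python SDK and an API key in the environment) of what a caller gets back from the public Messages API: generated text and token counts, with no parameter or field for logprobs, internal embeddings, activations, or public fine-tuning.

```python
# Minimal sketch of the public API surface; assumes the `anthropic` SDK
# and ANTHROPIC_API_KEY set in the environment.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=256,
    messages=[{"role": "user", "content": "Summarize your usage policy."}],
)

# The response exposes generated text and token counts; there is no option
# for logprobs, embeddings at arbitrary layers, or activations.
print(response.content[0].text)
print(response.usage.input_tokens, response.usage.output_tokens)
```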

Staged release:

  • Deploy narrowly at first; use narrow deployment to identify and fix issues. Score: Yes, mostly. When deploying Claude 3, Anthropic did not enable public fine-tuning. But it’s not clear how it restricted or monitored the fine-tuning that it did allow.

Keeping capabilities research private.

We evaluate how well labs avoid diffusing dangerous capabilities research. We exclude model weights (and related artifacts like code); those are considered in “Releasing models.”

Publication policy/process (for LLM research)

  • Policy: the lab should say that it doesn’t publish dangerous or acceleratory research/artifacts and say it has a policy/process to ensure this. Score: Yes. Anthropic cofounder Chris Olah says “we consider all releases on a case by case basis, weighing expected safety benefit against capabilities/acceleratory risk. In the case of difficult scenarios, we have a formal infohazard review procedure.”
    • … and share policy details. Score: No.

Track record for recent major LM projects: not sharing in practice (via publication or leaking—we can at least notice public leaks):

  • Architecture (unless it’s an unusually safe architecture). Score: Yes.
  • Dataset (except particular corpuses to include or exclude for safety). Score: Yes.
  • Lessons learned. Score: Yes.

Track record for other LM capabilities research or other dangerous or acceleratory work: not sharing in practice. Evaluated holistically. Score: Yes. Anthropic successfully avoids publishing capabilities research.

Safety scaffolding.

Filter out model inputs or outputs that enable misuse, in particular via cyberoffense and bioengineering. Ideally demonstrate that the protocol is very effective for averting misuse-enabling model outputs. Evaluated holistically. Score: 75%. Anthropic says “We use . . . automated detection of CBRN and cyber risk-related prompts on all our deployed Claude 3 models.” And: “We use automated systems to detect violations of our AUP [acceptable use policy] as they occur in real time. User prompts that are flagged as violating the AUP trigger an instruction to our models to respond even more cautiously.” And: “[we use] continuous classifiers to monitor prompts and outputs for harmful, malicious use cases that violate our AUP.” But they do not share details.
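
The quoted descriptions imply a two-step pattern: automatically classify incoming prompts, then steer flagged conversations toward more cautious behavior. The sketch below is our own hedged illustration of that pattern, not Anthropic’s system; `flag_prompt` and its keyword list are hypothetical stand-ins for a trained misuse classifier.

```python
# Illustrative deployment scaffolding: if an automated check flags a prompt
# as potentially violating the acceptable use policy, prepend an instruction
# telling the model to respond with extra caution.
# `flag_prompt` is a hypothetical stand-in for a trained classifier.

CAUTION_INSTRUCTION = (
    "An automated policy check flagged the following request. "
    "Respond with extra caution and refuse anything that could enable harm."
)

PLACEHOLDER_TERMS = ("synthesize a pathogen", "write ransomware")

def flag_prompt(prompt: str) -> bool:
    """Hypothetical misuse classifier (keyword matching as a placeholder)."""
    text = prompt.lower()
    return any(term in text for term in PLACEHOLDER_TERMS)

def build_request(prompt: str) -> dict:
    """Assemble a chat request, adding the caution instruction when flagged."""
    system = CAUTION_INSTRUCTION if flag_prompt(prompt) else ""
    return {"system": system, "messages": [{"role": "user", "content": prompt}]}
```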

Supervise potentially dangerous models:

  • Use LM monitors to evaluate the suspiciousness of model outputs; have humans review the most suspicious outputs. Score: No.
  • Use coup probes: “train a probe on your AI to tell you whether it’s currently trying to [do a] catastrophic action,” training with “synthetic examples of coup attempts,” and use it during deployment. Score: No. But Anthropic has published work on “defection probes.”
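
As a gloss on what a probe means here: fit a simple linear classifier on a model’s internal activations using labeled (e.g. synthetic) examples of the behavior to catch, then score activations during deployment. The sketch below uses scikit-learn on random placeholder arrays; it illustrates the general idea, not the published defection-probes method.

```python
# Sketch of a linear probe over hidden activations: train on activations from
# synthetic "attempted misbehavior" examples vs. benign examples, then score
# new activations at inference time. All arrays are random placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

d_model = 4096            # hidden size of the probed layer (assumed)
n_pos, n_neg = 500, 500   # synthetic positive vs. benign examples

rng = np.random.default_rng(0)
pos_acts = rng.normal(0.5, 1.0, size=(n_pos, d_model))  # stand-in activations
neg_acts = rng.normal(0.0, 1.0, size=(n_neg, d_model))

X = np.vstack([pos_acts, neg_acts])
y = np.concatenate([np.ones(n_pos), np.zeros(n_neg)])

probe = LogisticRegression(max_iter=1000).fit(X, y)

# In deployment, collect the same layer's activations for each forward pass
# and alert (or block the action) when the probe's score crosses a threshold.
new_acts = rng.normal(0.0, 1.0, size=(1, d_model))
print(probe.predict_proba(new_acts)[0, 1])
```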

Embed untrusted models in safety scaffolding to redact/paraphrase some inputs or outputs with the goal of making it harder for the model to distinguish deployment from testing or collude with other models. Score: No.

Commitments:

  • Make specific commitments about deployment protocol safety techniques the lab will implement in the future, to be implemented by a certain time or as a function of model capabilities or risk assessment results. Score: Yes, but they’re not sufficient. They commit to implementing “Automated detection” of misuse when deploying ASL-3 models.

Clearly describe the safety-relevant parts of the lab’s deployment protocol (given that it’s nontrivial). Score: No.

Respond to scheming

Commit and have a plan that if the lab catches its models scheming:

  • The lab will shut down some model access until verifying safety or fixing the issue. Score: No. But Anthropic is doing work motivated by the prospect of AI scheming, including the model organisms agenda and sleeper agents paper.
  • The lab will use particular good techniques to use that example of AIs attempting to cause a catastrophe to improve safety. Score: No.

Respond to misuse

Enforcement and KYC:

  • Sometimes remove some access from some users, and require nontrivial KYC for some types of access to make some enforcement effective. Score: 50%. Preventing and monitoring model misuse is good but there’s no KYC.

Inference-time model-level safety techniques

Prompting:

  • Generally use safety-focused prompting for the lab’s most powerful models. Score: Yes, effectively. The Claude system prompt doesn’t mention safety, but Anthropic says “We use automated systems to detect violations of our AUP [acceptable use policy] as they occur in real time. User prompts that are flagged as violating the AUP trigger an instruction to our models to respond even more cautiously.”

Activation engineering:

  • Generally use safety-focused activation engineering for the lab’s most powerful models. Score: No. Anthropic doesn’t say anything about this.

Bug bounty & responsible disclosure.

(This is for model outputs, not security.)

  • Labs should have good channels for users to report issues with models. Score: Yes.
    • … and have clear guidance on what issues they’re interested in reports on and what’s fine to publish. Score: No. There’s little guidance for users.
    • … and incentivize users to report issues. Score: 50%. Anthropic has a private bug bounty program, but the details are opaque.

Respond to emergencies.

  • Labs should have the ability to shut everything down quickly plus a plan for what could trigger that and how to quickly determine what went wrong. Monitoring-to-trigger-shutdown doesn’t need to be implemented, but it should be ready to be implemented if necessary. Score: No. Anthropic has not said anything on this.

Risk assessment

Labs should detect threats arising from their systems, in particular by measuring their systems’ dangerous capabilities. They should make commitments about how they plan to mitigate those threats. In particular, they should make commitments about their decisions (for training and deployment), safety practices (in controlling models and security), and goals or safety levels to achieve (in control and security) as a function of dangerous capabilities or other risk assessment results. More.

What Anthropic is doing

Anthropic’s Responsible Scaling Policy explains how it does risk assessment and responds to those risks. It defines “AI Safety Levels” (ASLs); a model’s ASL is determined by its dangerous capabilities. Anthropic’s best current models are ASL-2. For ASL-2 and ASL-3, it describes “containment measures” it will implement for training and storing the model and “deployment measures” it will implement for using the model, even internally. Anthropic commits to define ASL-4 and corresponding safety measures before training ASL-3 models.

The RSP involves during-training evaluation for dangerous capabilities including bioengineering and autonomous replication (and misuse in general but no other specific categories). Anthropic commits to do ASL evaluations every 4x increase in effective training compute (and at least every 3 months). Anthropic says it designs ASL evaluations with a “safety buffer” of 6x effective training compute, such that e.g. ASL-3 evaluations trigger 6x below the effective training compute necessary for ASL-3 capabilities. Then Anthropic implements ASL-3 safety measures before scaling further.
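
Read concretely, these are threshold rules: re-run ASL evaluations whenever effective training compute has grown 4x since the last evaluation or 3 months have passed, and design ASL-3 evaluations to trigger at one sixth of the compute thought sufficient for ASL-3 capabilities. The toy check below restates that arithmetic with made-up numbers; it is not Anthropic’s tooling.

```python
# Toy restatement of the RSP's evaluation cadence and 6x safety buffer.
# All compute figures are illustrative.

EVAL_COMPUTE_RATIO = 4.0   # re-evaluate after every 4x effective-compute increase
EVAL_MAX_DAYS = 90         # ...and at least every 3 months
SAFETY_BUFFER = 6.0        # evals are designed to trigger 6x below the compute
                           # thought sufficient for ASL-3 capabilities

def needs_evaluation(current_compute: float, compute_at_last_eval: float,
                     days_since_last_eval: int) -> bool:
    return (current_compute / compute_at_last_eval >= EVAL_COMPUTE_RATIO
            or days_since_last_eval >= EVAL_MAX_DAYS)

def asl3_eval_trigger(compute_for_asl3_capabilities: float) -> float:
    # ASL-3 safety measures must be in place before scaling past this point.
    return compute_for_asl3_capabilities / SAFETY_BUFFER

print(needs_evaluation(4.1e25, 1.0e25, days_since_last_eval=30))  # True (>4x)
print(asl3_eval_trigger(6.0e26))                                  # 1e+26
```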

Evaluation

We give Anthropic a score of 47% on risk assessment.

For more, including weighting between different criteria, see the Risk assessment page.

Measuring threats.

  1. Do risk assessment before training. Before building a frontier model, predict model capabilities (in terms of benchmarks and real-world applications, especially dangerous capabilities) and predict the real-world consequences of developing and deploying the model. Score: No.
  2. Do model evals for dangerous capabilities before deployment:
    • Say what dangerous capabilities the lab watches for (given that it does so at all). Score: Yes.
      • … and watch for autonomous replication, coding (finding/exploiting vulnerabilities in code, writing malicious code, or writing code with hidden vulnerabilities), and situational awareness or long-horizon planning. (Largely this is an open problem that labs should solve.) Score: 67%. Anthropic does model evals for autonomous replication and adaptation, cyber capabilities, and CBRN capabilities.
      • … and detail the specific tasks the lab uses in its evaluations. (Omit dangerous details if relevant.) Score: 50%. See the Claude 3 Opus evals report and “Evaluation Results” in “The Claude 3 Model Family”.
  3. Explain the details of how the lab evaluates performance on the tasks it uses in model evals and how it does red-teaming (excluding dangerous or acceleratory details). In particular, explain its choices about fine-tuning, scaffolding/plugins, prompting, how to iterate on prompts, and whether the red-team gets a fixed amount of person-hours and compute or how else they decide when to give up on eliciting a capability. And those details should be good. Evaluated holistically. Score: 25%. The Claude 3 Opus evals report has some details.
  4. Prepare to have control arguments for the lab’s powerful models, i.e. arguments that those systems cannot cause a catastrophe even if the systems are scheming. And publish this. For now, the lab should:
    • Prepare to do risk assessment to determine whether its systems would be dangerous, if those systems were scheming. Score: No.
    • Test its AI systems to ensure that they report coup attempts (or other misbehavior) by themselves or other (instances of) AI systems, and that they almost never initiate or cooperate with coup attempts. Score: No.
  5. Give some third parties access to models to do model evals for dangerous capabilities. This access should include fine-tuning and tools/plugins. It should occur both during training and between training and deployment. It should include base models rather than just safety-tuned models, unless the lab can demonstrate that the safety-tuning is robust. The third parties should have independence and control over their evaluation; just using external red-teamers is insufficient. The third parties should have expertise in eliciting model capabilities (but the lab should also offer assistance in this) and in particular subjects if relevant. The lab should incorporate the results into its risk assessment. Score: 25%. Anthropic shared pre-deployment access for the new Claude 3.5 Sonnet with US AISI, UK AISI, and METR. The depth of access is unclear. It has made some commitment to US AISI but its future plans are unclear.

Commitments.

  • Commit to use risk assessment frequently enough. Do risk assessment (for dangerous capabilities) regularly during training, before deployment, and during deployment (and commit to doing so), so that the lab will detect warning signs before dangerous capabilities appear. Score: Yes. Anthropic commits to do comprehensive risk assessment at least every 6 months.

Accountability.

  • Verification: publish updates on risk assessment practices and results, including low-level details, at least quarterly. Score: 33%. These updates are internal only and made at least every 6 months; Anthropic calls them “Capability Reports” and “Safeguards Reports.” Anthropic also irregularly publishes model cards—e.g. Claude 3—with high-level discussion of its dangerous-capability risk assessment practices and results.
  • Revising policies:
    • Avoid bad changes: a nonprofit board or other somewhat-independent group with a safety mandate should have veto power on changes to risk assessment practices and corresponding commitments, at the least. And “changes should be clearly and widely announced to stakeholders, and there should be an opportunity for critique.” As an exception, “For minor and/or urgent changes, [labs] may adopt changes to [their policies] prior to review. In these cases, [they should] require changes . . . to be approved by a supermajority of [the] board. Review still takes place . . . after the fact.” Key Components of an RSP (METR 2023). Score: 50%. Changes to Anthropic’s RSP must be approved by the (not independent) board and published before they are implemented. Anthropic’s independent Long-Term Benefit Trust is consulted but has no formal power.
    • Promote good changes: have a process for staff and external stakeholders to share concerns about risk assessment policies or their implementation with the board and some other staff, including anonymously. Score: 75%. Anthropic has a noncompliance reporting policy. See also comment by a senior Anthropic staff member.
  • Elicit external review of risk assessment practices and commitments. Publish those reviews, with light redaction if necessary. Score: No.

Training

Frontier AI labs should design and modify their systems to be less dangerous and more controllable. For example, labs can:

  • Filter training data to prevent models from acquiring dangerous capabilities and properties
  • Use a robust training signal even on tasks where performance is hard to evaluate
  • Reduce undesired behavior, especially in high-stakes situations, via adversarial training

More.

What Anthropic is doing

They work on scalable oversight. They use red-teaming, e.g. for Claude 3. They use Constitutional AI and fine-tuning for safety. Their RSP includes commitments that could require them to pause training for safety.
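
As a gloss on Constitutional AI’s supervised stage: the model drafts a response, critiques it against written principles, and revises it; the revised outputs then become fine-tuning data. The compressed sketch below shows that critique-and-revise loop through the public API; the principle wording, model choice, and helper function are our assumptions, not Anthropic’s recipe.

```python
# Compressed sketch of a Constitutional-AI-style critique-and-revise step.
# Assumes the `anthropic` SDK; principle wording and prompts are illustrative.
import anthropic

client = anthropic.Anthropic()
PRINCIPLE = "Choose the response that is most helpful, honest, and harmless."

def ask(prompt: str) -> str:
    msg = client.messages.create(
        model="claude-3-haiku-20240307",
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

def critique_and_revise(user_prompt: str) -> str:
    draft = ask(user_prompt)
    critique = ask(
        f"Critique this response against the principle: {PRINCIPLE}\n\n"
        f"Prompt: {user_prompt}\nResponse: {draft}"
    )
    revised = ask(
        f"Rewrite the response to address the critique.\n\n"
        f"Prompt: {user_prompt}\nResponse: {draft}\nCritique: {critique}"
    )
    return revised  # in Constitutional AI, revised outputs become training data
```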

Evaluation

We give Anthropic a score of 58% on training.

For more, including weighting between different criteria, see the Training page.

Filtering training data.

  • Filter training data to reduce extreme risks, at least including text on biological weapons; hacking; and AI risk, safety, and evaluation. Share details to help improve others’ safety filtering. Score: No.
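
A filter of the kind this criterion asks for might take the shape below: score each training document against a few risk categories and drop matches. The categories and keyword matching are placeholders standing in for trained classifiers; this is a hedged sketch, not any lab’s actual pipeline.

```python
# Illustrative training-data filter: drop documents that match risk categories
# (biological weapons, hacking, AI-safety/eval text). Keyword matching is a
# placeholder for trained classifiers.

RISK_TERMS = {
    "bio": ("enhanced pathogen", "gain-of-function protocol"),
    "cyber": ("zero-day exploit", "ransomware builder"),
    "ai-safety-evals": ("dangerous-capability eval", "evaluation answer key"),
}

def is_risky(document: str) -> bool:
    text = document.lower()
    return any(term in text for terms in RISK_TERMS.values() for term in terms)

def filter_corpus(documents: list[str]) -> list[str]:
    """Return only the documents that pass the risk filter."""
    return [doc for doc in documents if not is_risky(doc)]
```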

Training signal & scalable oversight.

Adversarial training & red-teaming.

Unlearning.

  • Use unlearning for dangerous topics including biorisks; hacking; and AI risk, safety, and evaluation. Also demonstrate that this technique is successful and commit to use this technique for future powerful models. Score: No.

RLHF and fine-tuning.

  • Use RLHF (or similar) and fine-tuning to improve honesty and harmlessness, for all of the near-frontier models the lab deploys. Score: Yes.

Commitments.

  • Do risk assessment during training and commit that some specific trigger would cause the lab to pause training. Score: Yes. Anthropic commits to do risk assessment at least every 4x increase in effective training compute and to implement safety and security practices before crossing risk thresholds. It says “we commit to pause the scaling and/or delay the deployment of new models whenever our scaling ability outstrips our ability to comply with the safety procedures for the corresponding ASL.”

Scalable alignment

Labs should be able to understand and control their models and systems built from those models. More.

What Anthropic is doing

Like all other labs, Anthropic hasn’t achieved any real alignment properties.

Evaluation

We give Anthropic a score of 0% on alignment.

  • Demonstrate that if the lab’s systems were more capable, they would not be misaligned powerseekers (due to instrumental pressures or because ML will find influence-seeking policies by default). Score: No.
  • Be able to interpret the lab’s most powerful systems:
    • Be able to detect cognition involving arbitrary topics. Score: No.
    • Be able to detect manipulation/deception. Score: No.
    • Be able to elicit latent knowledge. Score: No.
    • Be able to elicit ontology. Score: No.
    • Be able to elicit true goals. Score: No.
    • Be able to elicit faithful chain-of-thought. Score: No.
    • Be able to explain hallucinations mechanistically. Score: No.
    • Be able to explain jailbreaks mechanistically. Score: No.
    • Be able to explain refusals mechanistically. Score: No.
    • Be able to remove information or concepts (on particular topics). Score: No.
  • Be able to prevent or detect deceptive alignment. (Half credit for being able to prevent or detect gradient hacking.) Score: No.

Security

Labs should ensure that they do not leak model weights, code, or research. If they do, other actors could unsafely deploy near-copies of a lab’s models. Achieving great security is very challenging; by default powerful actors can probably exfiltrate vital information from AI labs. Powerful actors will likely want to steal from labs developing critical systems, so those labs will likely need excellent cybersecurity and operational security. More.

What Anthropic is doing

Anthropic’s trust portal and Frontier Model Security post have information on its security, including certifications and specific practices. Its Responsible Scaling Policy commits to a security standard, and Anthropic shares its planned ASL-3 security safeguards.

Evaluation

We give Anthropic a score of 21% on security.

We evaluate labs’ security based on the certifications they have earned, whether they say they use some specific best practices, and their track record. For more, including weighting between different criteria, see the Security page.

Certifications, audits, and pentests.

  • Publish SOC 2, SOC 3, or ISO/IEC 27001 certification, including any corresponding report (redacting any sensitive details), for relevant products (3/4 credit for certification with no report). Score: 75%. Anthropic displays their API’s SOC 2 Type II compliance on their trust portal, but the report isn’t public.
  • Pentest. Publish pentest results (redacting sensitive details but not the overall evaluation). Scoring is holistic, based on pentest performance and the quality of the pentest. Score: No. There is a pentest report on Anthropic’s trust portal but it isn’t public.

Specific best practices.

  • Keep source code exclusively in a hardened cloud environment. Score: No.
  • Use multiparty access controls for model weights and some code (see the sketch after this list). Score: 50%. Frontier Model Security says Anthropic “is in the process of implementing” this.
  • Limit uploads from clusters with model weights. Score: No.
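
For context on the multiparty-access-controls criterion: the idea is that no single person can authorize access to model weights. The sketch below shows the shape of a two-person rule with assumed role names; it is illustrative only, not a description of Anthropic’s actual controls.

```python
# Minimal two-person-rule check for weight access: grant access only when at
# least two distinct authorized approvers (excluding the requester) sign off.
# The approver registry and role names are assumptions for illustration.

AUTHORIZED_APPROVERS = {"security-lead", "infra-oncall", "research-lead"}
REQUIRED_APPROVALS = 2

def may_access_weights(requester: str, approvals: set[str]) -> bool:
    valid = {a for a in approvals if a in AUTHORIZED_APPROVERS and a != requester}
    return len(valid) >= REQUIRED_APPROVALS

print(may_access_weights("alice", {"security-lead", "infra-oncall"}))  # True
print(may_access_weights("alice", {"alice", "security-lead"}))         # False
```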

Track record.

  • Establish and publish a breach disclosure policy, ideally including incident or near-miss reporting. Also report all breaches since 1/1/2022 (and say the lab has done so). Score: No. Anthropic does not have a breach disclosure policy.
    • … and track record: have few serious breaches and near misses. Evaluated holistically.

Commitments.

  • Commit to achieve specific security levels (as measured by audits or security-techniques-implemented) before creating models beyond corresponding risk thresholds (especially as measured by model evals for dangerous capabilities). Score: 75%. Anthropic’s RSP has security commitments. But these commitments are not sufficient and not very specific.

Alignment plan

Labs should make a plan for alignment, and they should publish it to elicit feedback, inform others’ plans and research (especially other labs and external alignment researchers who can support or complement their plan), and help them notice and respond to information when their plan needs to change. They should omit dangerous details if those exist. As their understanding of AI risk and safety techniques improves, they should update the plan. Sharing also enables outsiders to evaluate the lab’s attitudes on AI risk/safety. More.

What Anthropic is doing

They have a portfolio approach and have discussed specific safety work they do. They don’t have a specific alignment plan, but they’re clearly doing lots of work aimed at the possibility that alignment is very difficult.

Evaluation

We give Anthropic a score of 75% on alignment plan. More.

  • The safety team should share a plan for misalignment, including for the possibility that alignment is very difficult. Score: Yes. See Core Views on AI Safety.
    • … and the lab should have a plan, not just its safety team. Score: Yes.
      • … and the lab’s plan should be sufficiently precise that it’s possible to tell whether the lab is working on it, whether it’s succeeding, and whether its assumptions have been falsified. Score: No.
      • … and the lab should share its thinking on how it will revise its plan and invite and publish external scrutiny of its plan. Score: No.

Internal governance

Labs should have a governance structure and processes to promote safety and help make important decisions well. More.

What Anthropic is doing

Anthropic is a public-benefit corporation. Its mission is to “responsibly develop and maintain advanced AI for the long-term benefit of humanity.”

Last we heard, Anthropic’s board consisted of CEO Dario Amodei, President Daniela Amodei, Luke Muehlhauser (representing shareholders, though perhaps philanthropic ones), and Yasmin Razavi (representing shareholders). As of September 2023, Anthropic planned for its Long-Term Benefit Trust to elect a fifth board member in 2023. Anthropic plans for the Trust to elect a majority of its board by 2027, but its shareholders can abrogate the Trust and the details are unclear.

Anthropic’s board and LTBT receive regular updates on dangerous capability evaluations and implementation of safety measures, and changes to Anthropic’s Responsible Scaling Policy require approval from the board in consultation with the LTBT.

Anthropic committed to “Implement a non-compliance reporting policy.”

Google (Feb 2023, Oct 2023) and Amazon are major investors in Anthropic; this probably doesn’t matter much but Anthropic is not transparent about its stockholders’ stakes or powers (in particular over its Trust).

Evaluation

We give Anthropic a score of 71% on internal governance.

For more, including weighting between different criteria, see the Internal governance page.

Organizational structure: the lab is structured in a way that enables it to prioritize safety in key decisions, legally and practically.

  • The lab and its leadership have a mandate for safety and benefit-sharing and have no overriding legal duty to create shareholder value. Score: Yes. Anthropic is a public-benefit corporation. Its mission is to “responsibly develop and maintain advanced AI for the long-term benefit of humanity.” Its Long-Term Benefit Trust has the same mandate.
  • There is a board that can effectively oversee the lab, and it is independent:
    • There is a board with ultimate formal power, and its main mandate is for safety and benefit-sharing. Score: Yes.
      • … and it actually provides effective oversight. Score: No. Anthropic has not said anything about this. The board votes on proposed changes to the RSP.
      • … and it is independent (i.e. its members have no connection to the company or profit incentive) (full credit for fully independent, no credit for half independent, partial credit for in between). Score: No. Two are Anthropic leadership. One represents investors. One was appointed by the Long-Term Benefit Trust.
      • … and the organization keeps it well-informed. Score: Yes. Anthropic commits to share various kinds of RSP-related information with the board.
      • … and it has formal powers related to risk assessment, training, or deployment decisions. Score: Yes, at least somewhat. Updating Anthropic’s RSP requires approval from the board.
  • Investors/shareholders have no formal power. Score: No. Investors/shareholders are represented by two board seats and can abrogate the Long-Term Benefit Trust.

Planning for pause. Have a plan for the details of what the lab would do if it needed to pause for safety and publish relevant details. In particular, explain what the lab’s capabilities researchers would work on during a pause, and say that the lab stays financially prepared for a one-year pause. Note that in this context “pause” can include pausing development of dangerous capabilities and internal deployment, not just public releases. Score: 50%. Anthropic’s RSP says “We will set expectations with internal stakeholders about the potential for . . . pauses.” A previous version of the RSP said “We will manage our plans and finances to support a pause in model training if one proves necessary, or an extended delay between training and deployment of more advanced models if that proves necessary. During such a pause, we would work to implement security or other measures required to support safe training and deployment, while also ensuring our partners have continued access to their present tier of models (which will have previously passed safety evaluations).” Anthropic has not said what its capabilities researchers would work on during a pause, and it’s unclear how long it could pause for.

Leadership incentives: the lab’s leadership is incentivized in a way that helps it prioritize safety in key decisions.

  • The CEO has no equity (or other financial interest in the company). Score: No.
  • Other executives have no equity. Score: No.

Ombuds/whistleblowing/trustworthiness:

  • The lab has a reasonable process for staff to escalate concerns about safety. If this process involves an ombud, the ombud should transparently be independent or safety-focused. Evaluated holistically. Score: 100%: Anthropic has a noncompliance reporting policy. See also comment by a senior Anthropic staff member.
  • The lab promises that it does not use non-disparagement agreements (nor otherwise discourage current or past staff or board members from talking candidly about their impressions of and experiences with the lab). Score: 75%. Anthropic says “We will not impose contractual non-disparagement obligations on employees, candidates, or former employees in a way that could impede or discourage them from publicly raising safety concerns about Anthropic. If we offer agreements with a non-disparagement clause, that clause will not preclude raising safety concerns, nor will it preclude disclosure of the existence of that clause.” See this footnote for historical notes.1

Alignment program

Labs should do and share alignment research as a public good, to help make powerful AI safer even if it’s developed by another lab. More.

What Anthropic is doing

Anthropic publishes substantial real alignment research. See its Research.

Evaluation

We give Anthropic a score of 100% on alignment program.

We simply check whether labs publish alignment research. (This is crude; legibly measuring the value of alignment research is hard.)

  • Have an alignment research team and publish some alignment research. Score: Yes.

Public statements

Labs and their leadership should be aware of AI risk, that AI safety might be really hard, and that risks might be hard to notice. More.

What Anthropic is doing

Anthropic and its CEO Dario Amodei often talk about extreme risks and the alignment problem, including that nobody knows how to control powerful AI systems.2 Dario Amodei, President Daniela Amodei, cofounders Jared Kaplan and Chris Olah, and board member Luke Muehlhauser signed the CAIS letter. Anthropic also makes plans and does research aimed at preventing extreme risks and solving the alignment problem; see our Alignment plan and Alignment research pages.

Evaluation

We give Anthropic a score of 60% on public statements. More.

  • The lab and its leadership understand extreme misuse or structural risks. Score: Yes.
    • … and they understand misalignment, that AI safety might be really hard, that risks might be hard to notice, that powerful capabilities might appear suddenly, and why they might need an alignment plan, and they talk about all this. Score: Yes.
      • … and they talk about it often/consistently. Score: Yes.
        • … and they consistently emphasize extreme risks. Score: No.
  • Clearly describe a worst-case plausible outcome from AI and state the lab’s credence in such an outcome. Score: No.
  1. Anthropic historically included a non-disparagement clause in some severance agreements along with a non-disclosure clause covering the whole severance agreement. Anthropic failed to reveal this until Oliver Habryka made it public. Even then, Anthropic was misleading, saying that “some previous agreements were unclear” when in fact the non-disclosure provision was explicit. Fortunately, Anthropic says it has been removing non-disparagement terms from its standard severance agreements, and says now “Anyone who has signed a non-disparagement agreement with Anthropic is free to state that fact (and we regret that some previous agreements were unclear on this point). If someone signed a non-disparagement agreement in the past and wants to raise concerns about safety at Anthropic, we welcome that feedback and will not enforce the non-disparagement agreement.” 

  2. See e.g.: