Anthropic

Risk assessment 49%

Evals: domains, quality, elicitation 60%
Evals: accountability 15%
Adversarial evaluation for alignment 50%
Model organisms 75%

Evals: domains, quality, elicitation

60%

Evals: accountability

15%

Adversarial evaluation for alignment

50%

Model organisms

75%