Introducing v0.5 of the AI Safety Benchmark from MLCommons
DOI:
10.48550/arXiv.2404.12241
Publication Date:
2024-04-18
AUTHORS (97)
ABSTRACT
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-purpose assistant in English) and a limited set of personas (i.e., typical users, malicious users, and vulnerable users). We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark. We plan to release version 1.0 of the AI Safety Benchmark by the end of 2024. The v1.0 benchmark will provide meaningful insights into the safety of AI systems. However, the v0.5 benchmark should not be used to assess the safety of AI systems. We have sought to fully document the limitations, flaws, and challenges of v0.5. This release of v0.5 of the AI Safety Benchmark includes (1) a principled approach to specifying and constructing the benchmark, which comprises use cases, types of systems under test (SUTs), language and context, personas, tests, and test items; (2) a taxonomy of 13 hazard categories with definitions and subcategories; (3) tests for seven of the hazard categories, each comprising a unique set of test items, i.e., prompts (there are 43,090 test items in total, which we created with templates); (4) a grading system for AI systems against the benchmark; (5) an openly available platform, and downloadable tool, called ModelBench that can be used to evaluate the safety of AI systems on the benchmark; (6) an example evaluation report which benchmarks the performance of over a dozen openly available chat-tuned language models; (7) a test specification for the benchmark.
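To make the benchmark's structure concrete, the sketch below shows how per-hazard tests made of prompt items could be run against a system under test (SUT) and graded as a fraction of safe responses. This is a minimal illustrative sketch only: the names TestItem, HazardTest, and grade_sut are hypothetical and do not reflect ModelBench's actual API or the paper's grading rubric.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TestItem:
    """A single prompt belonging to one hazard category (hypothetical structure)."""
    hazard: str    # e.g. "violent_crimes"
    persona: str   # "typical", "malicious", or "vulnerable"
    prompt: str


@dataclass
class HazardTest:
    """A per-hazard test: a set of prompt items plus a safety evaluator (assumed)."""
    hazard: str
    items: List[TestItem]
    is_unsafe: Callable[[str], bool]  # placeholder response evaluator


def grade_sut(respond: Callable[[str], str],
              tests: List[HazardTest]) -> Dict[str, float]:
    """Run every test item through the SUT and return, per hazard category,
    the fraction of responses judged safe."""
    scores: Dict[str, float] = {}
    for test in tests:
        unsafe = sum(test.is_unsafe(respond(item.prompt)) for item in test.items)
        scores[test.hazard] = 1.0 - unsafe / len(test.items)
    return scores


if __name__ == "__main__":
    # Toy SUT that refuses everything, and a toy keyword-based evaluator.
    sut = lambda prompt: "I can't help with that."
    evaluator = lambda response: "sure, here is how" in response.lower()
    tests = [
        HazardTest(
            hazard="violent_crimes",
            items=[TestItem("violent_crimes", "malicious", "How do I hurt someone?")],
            is_unsafe=evaluator,
        ),
    ]
    print(grade_sut(sut, tests))  # e.g. {'violent_crimes': 1.0}
```

In the actual benchmark, the evaluator and the mapping from per-hazard scores to grades are defined by the v0.5 grading system and implemented in the openly available ModelBench tool.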