- Privacy-Preserving Technologies in Data
- Formal Methods in Verification
- Security and Verification in Computing
- Software Testing and Debugging Techniques
- Explainable Artificial Intelligence (XAI)
- Adversarial Robustness in Machine Learning
Massachusetts Institute of Technology
2024
External audits of AI systems are increasingly recognized as a key mechanism for AI governance. The effectiveness of an audit, however, depends on the degree of access granted to auditors. Recent audits of state-of-the-art AI systems have primarily relied on black-box access, in which auditors can only query the system and observe its outputs. However, white-box access to the system's inner workings (e.g., weights, activations, gradients) allows an auditor to perform stronger attacks, more thoroughly interpret models, and conduct fine-tuning....
We introduce DafnyBench, the largest benchmark of its kind for training and evaluating machine learning systems for formal software verification. We test the ability of LLMs such as GPT-4 and Claude 3 to auto-generate enough hints for the Dafny formal verification engine to successfully verify over 750 programs with about 53,000 lines of code. The best model and prompting scheme achieved a 68% success rate, and we quantify how this rate improves when retrying with error message feedback and how it deteriorates with the amount of required code and hints. We hope that...