Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management
FOS: Computer and information sciences
Computer Science - Computation and Language
Artificial Intelligence (cs.AI)
Computer Science - Artificial Intelligence
0202 electrical engineering, electronic engineering, information engineering
02 engineering and technology
Computation and Language (cs.CL)
DOI:
10.48550/arxiv.2108.01764
Publication Date:
2021-01-01
AUTHORS (7)
ABSTRACT
Accepted to the 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks.
Recent advances in Natural Language Processing (NLP), and specifically automated Question Answering (QA) systems, have demonstrated both impressive linguistic fluency and a pernicious tendency to reflect social biases. In this study, we introduce Q-Pain, a dataset for assessing bias in medical QA in the context of pain management, one of the most challenging forms of clinical decision-making. Along with the dataset, we propose a new, rigorous framework, including a sample experimental design, to measure the potential biases present when making treatment decisions. We demonstrate its use by assessing two reference Question-Answering systems, GPT-2 and GPT-3, and find statistically significant differences in treatment between intersectional race-gender subgroups, thus reaffirming the risks posed by AI in medical settings, and the need for datasets like ours to ensure safety before medical AI applications are deployed.