The Mixed Subjects Design: Treating Large Language Models as Potentially Informative Observations
DOI: 10.31235/osf.io/j3bnt_v2
Publication Date: 2025-01-29T10:24:54Z
AUTHORS (3)
ABSTRACT
Large Language Models (LLMs) provide cost-effective but possibly inaccurate predictions of human behavior. Despite growing evidence that predicted and observed behavior are often not interchangeable, there is limited guidance on using LLMs to obtain valid estimates of causal effects and other parameters. We argue that LLM predictions should be treated as potentially informative observations, while human subjects serve as a gold standard in a mixed subjects design. This paradigm preserves validity and offers more precise estimates at a lower cost than experiments relying exclusively on human subjects. We demonstrate and extend prediction-powered inference (PPI), a method that combines predictions and observations. We define the PPI correlation as a measure of interchangeability and derive the effective sample size for PPI. We also introduce a power analysis to optimally choose between informative but costly human subjects and less informative but cheap predictions of human behavior. Mixed subjects designs could enhance scientific productivity and reduce inequality in access to costly evidence.
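To make the idea of combining predictions with observations concrete, the following is a minimal sketch of the basic PPI estimator for a population mean, as it is commonly stated in the prediction-powered inference literature. It is not taken from the paper and does not implement the paper's extensions (the PPI correlation, effective sample size, or power analysis). The function names, the simulated data, and the assumed bias and noise of the "LLM" predictions are all hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def ppi_mean(y_labeled, pred_labeled, pred_unlabeled):
    """Basic PPI point estimate and standard error for a population mean.

    y_labeled:      outcomes observed from human subjects (gold standard)
    pred_labeled:   predictions for those same subjects
    pred_unlabeled: predictions for a separate sample with no human outcomes
    """
    n = len(y_labeled)
    N = len(pred_unlabeled)
    # Rectifier: the average prediction error, measured on the human sample.
    residuals = [y - f for y, f in zip(y_labeled, pred_labeled)]
    estimate = statistics.fmean(pred_unlabeled) + statistics.fmean(residuals)
    # The two terms come from independent samples, so their variances add.
    variance = (statistics.variance(pred_unlabeled) / N
                + statistics.variance(residuals) / n)
    return estimate, math.sqrt(variance)


def classical_mean(y_labeled):
    """Sample mean and standard error using human subjects only."""
    n = len(y_labeled)
    return statistics.fmean(y_labeled), math.sqrt(statistics.variance(y_labeled) / n)


def simulate(n=200, N=5000, seed=0):
    """Hypothetical data: true mean 1.0, predictions biased by +0.3 with extra noise."""
    rng = random.Random(seed)

    def draw():
        y = rng.gauss(1.0, 1.0)              # simulated human outcome
        f = y + 0.3 + rng.gauss(0.0, 0.5)    # simulated biased, noisy prediction
        return y, f

    labeled = [draw() for _ in range(n)]
    pred_unlabeled = [draw()[1] for _ in range(N)]
    y_labeled = [y for y, _ in labeled]
    pred_labeled = [f for _, f in labeled]
    return y_labeled, pred_labeled, pred_unlabeled


if __name__ == "__main__":
    y_l, f_l, f_u = simulate()
    print("human-only:", classical_mean(y_l))
    print("PPI:       ", ppi_mean(y_l, f_l, f_u))
```

In this sketch the predictions are systematically biased, yet the rectifier term removes that bias on average, and the standard error shrinks relative to the human-only estimate because the predictions track the outcomes closely. When predictions are only weakly related to outcomes, the residual variance grows and this basic form offers little or no gain.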