The Mixed Subjects Design: Treating Large Language Models as Potentially Informative Observations
Mixed model
DOI: 10.31235/osf.io/j3bnt_v3
Publication Date: 2025-02-01T16:28:35Z
AUTHORS (3)
ABSTRACT
Large Language Models (LLMs) provide cost-effective but possibly inaccurate predictions of human behavior. Despite growing evidence that predicted and observed behavior are often not interchangeable, there is limited guidance on using LLMs to obtain valid estimates of causal effects and other parameters. We argue that LLM predictions should be treated as potentially informative observations, while human subjects serve as a gold standard in a mixed subjects design. This paradigm preserves validity and offers more precise estimates at a lower cost than experiments relying exclusively on human subjects. We demonstrate, and extend, prediction-powered inference (PPI), a method that combines predictions and observations. We define the PPI correlation as a measure of interchangeability and derive the effective sample size for PPI. We also introduce a power analysis to optimally choose between informative but costly human subjects and less accurate but cheap predictions of human behavior. Mixed subjects designs could enhance scientific productivity and reduce inequality in access to costly evidence.
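To make the idea of combining predictions and observations concrete, the following is a minimal sketch of a PPI-style estimate of a population mean. It is not the authors' implementation: the function name `ppi_mean`, its arguments, and the fixed tuning weight `lam` are illustrative assumptions. The estimate starts from the mean of the cheap LLM predictions and corrects it with the average prediction error measured on the human subjects.

```python
import numpy as np


def ppi_mean(y_human, pred_human, pred_only, lam=1.0):
    """Sketch of a PPI-style mean estimate (hypothetical helper).

    y_human    : outcomes observed from human subjects (gold standard)
    pred_human : LLM predictions for those same human subjects
    pred_only  : LLM predictions for units without human outcomes
    lam        : assumed fixed weight on the predictions; lam=0 ignores
                 them, lam=1 gives the basic (untuned) PPI estimate
    """
    y = np.asarray(y_human, dtype=float)
    f_lab = np.asarray(pred_human, dtype=float)
    f_unl = np.asarray(pred_only, dtype=float)
    n, big_n = len(y), len(f_unl)

    # Mean of the predictions plus the bias correction ("rectifier")
    # estimated on the units where both outcome and prediction exist.
    estimate = lam * f_unl.mean() + (y - lam * f_lab).mean()

    # Variance of the two independent sample means, treating lam as fixed.
    variance = (lam ** 2) * f_unl.var(ddof=1) / big_n \
        + (y - lam * f_lab).var(ddof=1) / n
    return estimate, np.sqrt(variance)


if __name__ == "__main__":
    # Simulated data, purely for illustration.
    rng = np.random.default_rng(0)
    truth = rng.normal(loc=0.5, scale=1.0, size=5200)
    preds = truth + rng.normal(scale=0.5, size=5200)  # noisy predictions
    est, se = ppi_mean(truth[:200], preds[:200], preds[200:])
    print(f"PPI estimate: {est:.3f} (SE {se:.3f})")
```

When the predictions track the human outcomes closely, the correction term has low variance and the standard error falls below that of the human-only mean; when they do not, setting `lam` near zero recovers the human-only estimate.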