All Your Base Are Belong to Us: The Urgent Reality of Unproctored Testing in the Age of LLMs
DOI: 10.1111/ijsa.70005
Publication Date: 2025-03-12T04:44:21Z
AUTHORS (1)
ABSTRACT
The release of new generative artificial intelligence (AI) tools, including large language models (LLMs), continues at a rapid pace. Upon the release of OpenAI's o1 models, I reconducted Hickman et al.'s (2024) analyses examining how well LLMs perform on a quantitative ability (number series) test. GPT‐4 scored below the 20th percentile (compared to thousands of human test takers), but o1 scored at the 95th percentile. In response to these updated findings and Lievens and Dunlop's (2025) article about the effects of LLMs on the validity of pre‐employment assessments, I make an urgent call to action for selection and assessment researchers and practitioners. A recent survey suggests that a proportion of applicants are already using AI tools to complete high‐stakes assessments, and it seems that no current assessments will be safe for long. Thus, I offer possibilities for the future of testing, detail their benefits and drawbacks, and provide recommendations. These possibilities are: increased use of proctoring, adding strict time limits, using LLM detection software, using think‐aloud (or similar) protocols, collecting and analyzing trace data, emphasizing samples over signs, and redesigning assessments to allow LLM use during completion. Several of these possibilities inspire research to modernize assessment. Future research should seek to improve our understanding of how to design valid assessments that allow LLM use, how to effectively use test‐taker trace data, and whether think‐aloud (or similar) protocols can help differentiate experts from novices.