NFDI4DS | UHH-SEMS - Publication Details

ALMs: Authorial Language Models for Authorship Attribution

Keywords: Perplexity, Authorship Attribution, Macro, Stylometry

DOI: 10.48550/arxiv.2401.12005

Publication Date: 2024-01-01


AUTHORS (3)

Weihang Huang

Akira Murakami

Jack Grieve

ABSTRACT

In this paper, we introduce an authorship attribution method called Authorial Language Models (ALMs) that involves identifying the most likely author of a questioned document based on the perplexity of that document calculated for a set of causal language models, each fine-tuned on the writings of one candidate author. We benchmarked ALMs against state-of-the-art systems using the CCAT50 and Blogs50 datasets. We find that ALMs achieves a macro-average accuracy score of 83.6% on Blogs50, outperforming all other methods, and 74.9% on CCAT50, matching the performance of the best method. To assess the performance of ALMs on shorter texts, we also conducted text ablation testing. We found that, to reach a macro-average accuracy of 70%, ALMs needs 40 tokens on Blogs50 and 400 tokens on CCAT50, while reaching 60% requires 20 tokens on Blogs50 and 70 tokens on CCAT50.
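
The decision rule described in the abstract (score the questioned document under one model per candidate author, then pick the author whose model gives the lowest perplexity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: to stay self-contained it swaps the fine-tuned causal language models for add-one-smoothed unigram models, and the function names (`train_unigram`, `perplexity`, `attribute`) and the toy texts are hypothetical, introduced only for illustration.

```python
import math
from collections import Counter


def train_unigram(texts):
    """Count word frequencies over one candidate author's known writings.

    Assumption: a unigram count table stands in for a fine-tuned causal LM.
    """
    return Counter(word for text in texts for word in text.lower().split())


def perplexity(counts, vocab_size, text):
    """Perplexity of `text` under an add-one-smoothed unigram model."""
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("cannot score an empty document")
    total = sum(counts.values())
    # The extra +1 in the denominator reserves probability mass for unseen words.
    log_prob = sum(
        math.log((counts[tok] + 1) / (total + vocab_size + 1)) for tok in tokens
    )
    return math.exp(-log_prob / len(tokens))


def attribute(author_texts, questioned):
    """Return the lowest-perplexity author and all per-author perplexities."""
    models = {author: train_unigram(texts) for author, texts in author_texts.items()}
    vocab = {word for counts in models.values() for word in counts}
    scores = {
        author: perplexity(counts, len(vocab), questioned)
        for author, counts in models.items()
    }
    return min(scores, key=scores.get), scores


# Toy data, invented for this sketch only.
author_texts = {
    "author_a": [
        "the markets rallied as investors bought shares",
        "shares fell after the bank cut rates",
    ],
    "author_b": [
        "my cat slept on the sofa all day",
        "today i baked bread and walked the dog",
    ],
}
predicted, scores = attribute(
    author_texts, "investors sold shares as the bank raised rates"
)
print(predicted, scores)
```

With a real causal language model in place of the unigram table, only `train_unigram` and `perplexity` would change; the argmin over per-author perplexities in `attribute` stays the same.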
