Data Collection and Analysis of Print and Fan Fiction Classification
0202 electrical engineering, electronic engineering, information engineering
02 engineering and technology
Text Classification
400
004
Natural Language Processing
DOI:
10.5220/0010774100003122
Publication Date:
2022-02-17T13:10:18Z
AUTHORS (2)
ABSTRACT
Fan fiction lets genre enthusiasts write their own storylines based on existing print fiction, but it also raises concerns for traditional print publishers, including intellectual property issues. Determining whether a given segment of text is fan fiction or print fiction is an interesting and difficult problem, and classifying unstructured text remains a critical step for many intelligent systems. In this paper we detail how a significant volume of print and fan fiction was obtained. The data is processed using a proposed pipeline and then analysed using various supervised machine learning classifiers. Our results show that, given segments of 5 to 10 sentences, an accuracy of 80-90% can be achieved using traditional approaches. To our knowledge, this is the first study to explore this type of fiction classification problem.
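The abstract does not specify the pipeline or the classifiers used, so the following is only a minimal sketch of the general idea: cut text into fixed-size sentence segments, then label each segment with a traditional supervised classifier. The function and class names (`split_into_segments`, `tokenize`, `NaiveBayes`), the regex-based sentence splitter, the choice of multinomial Naive Bayes with add-one smoothing, and the toy training sentences are all assumptions made for illustration and are not taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict


def split_into_segments(text, sentences_per_segment=5):
    """Split text into consecutive segments of N sentences.

    Assumption: sentences end with '.', '!' or '?' followed by whitespace.
    A real pipeline would likely use a proper sentence tokenizer.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_segment])
        for i in range(0, len(sentences), sentences_per_segment)
    ]


def tokenize(segment):
    """Lowercase the segment and extract alphabetic word tokens."""
    return re.findall(r"[a-z']+", segment.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, segments, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.class_counts = Counter(labels)      # label -> number of segments
        for segment, label in zip(segments, labels):
            self.word_counts[label].update(tokenize(segment))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, segment):
        tokens = tokenize(segment)
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n / total)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data; a real study would use thousands of segments.
train_segments = [
    "The rain fell softly upon the old manor house.",
    "omg the captain and the doctor finally kissed lol",
]
train_labels = ["print", "fan"]

model = NaiveBayes().fit(train_segments, train_labels)
segments = split_into_segments("One. Two! Three? Four. Five. Six.", 5)
```

Calling `model.predict(segment)` on each element returned by `split_into_segments` yields one label per segment. Segment-level accuracy on held-out text is then the fraction of segments labelled correctly, which is the kind of figure the abstract reports.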