Text classification to streamline online wildlife trade analyses
Sample (material)
Wildlife trade
Identification
DOI: 10.1371/journal.pone.0254007
Publication Date: 2021-07-09T17:59:52Z
AUTHORS (7)
ABSTRACT
Automated monitoring of websites that trade wildlife is increasingly necessary to inform conservation and biosecurity efforts. However, e-commerce and wildlife trading websites can contain a vast number of advertisements, an unknown proportion of which may be irrelevant to researchers and practitioners. Given that many wildlife-trade advertisements have an unstructured text format, automated identification of relevant listings has not traditionally been possible, nor attempted. Other scientific disciplines have solved similar problems using machine learning and natural language processing models, such as text classifiers. Here, we test the ability of a suite of text classifiers to extract relevant advertisements from wildlife trade occurring on the Internet. We collected data from an Australian classifieds website where people can post advertisements of their pet birds (n = 16.5k advertisements). We found that text classifiers can predict, with a high degree of accuracy, which listings are relevant (ROC AUC ≥ 0.98, F1 score ≥ 0.77). Furthermore, in an attempt to answer the question ‘how much data is required to have an adequately performing model?’, we conducted a sensitivity analysis by simulating decreases in sample sizes to measure the subsequent change in model performance. From our sensitivity analysis, we found that text classifiers required a minimum sample size of 33% (c. 5.5k listings) to accurately identify relevant listings (for our dataset), providing a reference point for future applications of this sort. Our results suggest that text classification is a viable tool that can be applied to the online trade of wildlife to reduce time dedicated to data cleaning. However, the success of text classifiers will vary depending on the advertisements and websites, and will therefore be context dependent. Further work to integrate other machine learning tools, such as image classification, may provide better predictive abilities in the context of streamlining data processing for wildlife trade related online data.
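The workflow the abstract describes (train a text classifier on labelled listings, score it with F1, then shrink the training sample to see how performance changes) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the multinomial naive Bayes model is a stand-in for their suite of classifiers, the listings in `TRAIN` and `TEST` are invented toy examples, and the names `NaiveBayesText`, `f1_score` and `sensitivity` are hypothetical helpers defined here.

```python
import math
import random
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenisation; real listings would need more cleaning.
    return text.lower().split()


class NaiveBayesText:
    """Stand-in classifier: multinomial naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        # Words never seen in training carry no evidence and are skipped.
        tokens = [w for w in tokenize(text) if w in self.vocab]
        best_class, best_score = None, -math.inf
        for c in self.classes:
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            score = math.log(self.doc_counts[c] / total_docs)
            score += sum(math.log((self.word_counts[c][w] + 1) / denom) for w in tokens)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


def f1_score(y_true, y_pred, positive=1):
    # Harmonic mean of precision and recall for the "relevant" class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def sensitivity(train, test, fractions, seed=0):
    # Refit on a random subsample of the training set at each fraction and
    # score every refit model against the same fixed test set.
    rng = random.Random(seed)
    results = []
    for fraction in fractions:
        k = max(1, round(fraction * len(train)))
        subset = rng.sample(train, k)
        model = NaiveBayesText().fit([t for t, _ in subset], [y for _, y in subset])
        preds = [model.predict(t) for t, _ in test]
        results.append((fraction, f1_score([y for _, y in test], preds)))
    return results


# Invented toy listings (1 = relevant bird advertisement, 0 = irrelevant).
TRAIN = [
    ("hand raised budgie for sale", 1),
    ("pair of cockatiels for sale", 1),
    ("young lorikeet looking for new home", 1),
    ("breeding pair of finches", 1),
    ("large bird cage for sale", 0),
    ("wanted bird seed and feeder", 0),
    ("parrot stand and toys", 0),
    ("aviary wire and cage panels", 0),
]
TEST = [
    ("tame budgie pair", 1),
    ("cockatiels looking for home", 1),
    ("cage and feeder", 0),
    ("bird toys and seed", 0),
]

results = sensitivity(TRAIN, TEST, fractions=[0.25, 0.5, 1.0])
for fraction, score in results:
    print(f"training fraction {fraction:.2f}: F1 = {score:.2f}")
```

On a real dataset the subsampling would typically be repeated many times per fraction and the scores averaged, since a single random draw at a small fraction can be unrepresentative.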
REFERENCES (28)
CITATIONS (17)