Implementing Active Learning in Cybersecurity: Detecting Anomalies in Redacted Emails

Keywords: Subject-matter expert; Supervised Learning
DOI: 10.48550/arxiv.2303.00870 Publication Date: 2023-01-01
ABSTRACT
Research on email anomaly detection has typically relied on specially prepared datasets that may not adequately reflect the type of data that occurs in industry settings. In our research, at a major financial services company, privacy concerns prevented inspection of the bodies of emails and attachment details (although subject headings and attachment filenames were available). This made labeling possible anomalies in the resulting redacted emails more difficult. Another source of difficulty is the high volume of emails combined with the scarcity of resources, making machine learning (ML) a necessity, but also creating a need for more efficient human training of ML models. Active learning (AL) has been proposed as a way to make human training of ML models more efficient. However, the implementation of Active Learning methods is a human-centered AI challenge due to potential human analyst uncertainty, and the labeling task can be further complicated in domains such as the cybersecurity domain (or healthcare, aviation, etc.) where mistakes in labeling can have highly adverse consequences. In this paper we present research results concerning the application of Active Learning to anomaly detection in redacted emails, comparing the utility of different methods for implementing active learning in this context. We evaluate different AL strategies and their impact on resulting model performance. We also examine how ratings of confidence that experts have in their labels can inform AL. The results obtained are discussed in terms of their implications for AL methodology and for the role of experts in model-assisted email anomaly screening.
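To make the idea of an AL query strategy concrete, the following is a minimal sketch of uncertainty sampling, a commonly used strategy. The abstract does not say which strategies the paper compares, so this is an illustration and not the authors' method. The function names, the probabilities, and the 1-to-5 confidence scale are all hypothetical, and turning expert confidence ratings into training weights is only one assumed way such ratings might inform AL.

```python
from typing import List, Sequence


def uncertainty_scores(probabilities: Sequence[float]) -> List[float]:
    """Score items by how close the predicted anomaly probability is to 0.5."""
    # 1.0 means the model is maximally unsure; 0.0 means it is fully certain.
    return [1.0 - 2.0 * abs(p - 0.5) for p in probabilities]


def select_queries(probabilities: Sequence[float], budget: int) -> List[int]:
    """Return the indices of the `budget` most uncertain unlabeled items."""
    scores = uncertainty_scores(probabilities)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


def confidence_weights(ratings: Sequence[int], max_rating: int = 5) -> List[float]:
    """Map expert confidence ratings (1..max_rating) to weights in (0, 1].

    Assumption: a label the expert is unsure about should count for less
    when the model is retrained. The scale is hypothetical.
    """
    return [rating / max_rating for rating in ratings]


# Hypothetical model outputs for five unlabeled redacted emails.
pool_probabilities = [0.02, 0.48, 0.91, 0.55, 0.30]

# Ask the expert to label the two emails the model is least sure about.
queries = select_queries(pool_probabilities, budget=2)

# Hypothetical confidence ratings the expert gave for those two labels.
weights = confidence_weights([5, 2])
```

In a full loop, the newly labeled emails would be added to the training set (optionally with these weights), the model retrained, and the selection repeated until the labeling budget is spent.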