Word sequence kernels
Benchmark (surveying)
Sequence (biology)
Kernel (algebra)
Similarity (geometry)
DOI: 10.5555/944919.944963
Publication Date: 2003-03-01
AUTHORS (4)
ABSTRACT
We address the problem of categorising documents using kernel-based methods such as Support Vector Machines. Since the work of Joachims (1998), there is ample experimental evidence that SVMs using standard word frequencies as features yield state-of-the-art performance on a number of benchmark problems. Recently, Lodhi et al. (2002) proposed the use of string kernels, a novel way of computing document similarity based on matching non-consecutive subsequences of characters. In this article, we propose the use of this technique with sequences of words rather than characters. This approach has several advantages, in particular it is more efficient computationally and it ties in closely with standard linguistic pre-processing techniques. We present some extensions to sequence kernels dealing with symbol-dependent and match-dependent decay factors, and present empirical evaluations of these extensions on the Reuters-21578 datasets.
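
To make the idea in the abstract concrete, the following is a minimal sketch of a gap-weighted subsequence kernel in the style of Lodhi et al. (2002), applied to lists of words instead of characters. It is an illustration only, not the authors' implementation: the function names, the single decay factor `lam`, and whitespace tokenisation are all assumptions made here, and the symbol-dependent and match-dependent decay extensions mentioned in the abstract are not covered.

```python
import math


def word_sequence_kernel(s, t, n, lam):
    """Gap-weighted subsequence kernel of order n over two token lists.

    Each common (possibly non-contiguous) subsequence of n tokens contributes
    lam ** (span in s + span in t). Hypothetical helper, written for
    illustration; runs in O(n * len(s) * len(t)) time.
    """
    ls, lt = len(s), len(t)
    if n < 1 or min(ls, lt) < n:
        return 0.0

    # kp[i][a][b] holds the auxiliary quantity K'_i(s[:a], t[:b]).
    kp = [[[0.0] * (lt + 1) for _ in range(ls + 1)] for _ in range(n)]
    for a in range(ls + 1):
        for b in range(lt + 1):
            kp[0][a][b] = 1.0  # K'_0 is 1 for every pair of prefixes

    for i in range(1, n):
        for a in range(i, ls + 1):
            kpp = 0.0  # running value of K''_i(s[:a], t[:b])
            for b in range(i, lt + 1):
                kpp *= lam  # every extra token of t widens the gap
                if s[a - 1] == t[b - 1]:
                    kpp += lam * lam * kp[i - 1][a - 1][b - 1]
                kp[i][a][b] = lam * kp[i][a - 1][b] + kpp

    # Close each subsequence on a matching final pair of tokens.
    total = 0.0
    for a in range(n, ls + 1):
        for b in range(n, lt + 1):
            if s[a - 1] == t[b - 1]:
                total += lam * lam * kp[n - 1][a - 1][b - 1]
    return total


def normalised_kernel(s, t, n, lam):
    """Length-normalised kernel: K(s,t) / sqrt(K(s,s) * K(t,t))."""
    denom = math.sqrt(
        word_sequence_kernel(s, s, n, lam) * word_sequence_kernel(t, t, n, lam)
    )
    return word_sequence_kernel(s, t, n, lam) / denom if denom > 0 else 0.0


if __name__ == "__main__":
    doc_a = "the cat sat on the mat".split()
    doc_b = "the cat lay on the mat".split()
    print(normalised_kernel(doc_a, doc_b, n=2, lam=0.5))
```

Working on words rather than characters shortens the sequences that enter the dynamic programme, which is one way to read the abstract's claim about computational efficiency. The resulting kernel values could be assembled into a Gram matrix and passed to any SVM implementation that accepts precomputed kernels.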