Named Entity Recognition using Support Vector Machine: A Language Independent Approach
Keywords: Named Entity Recognition (NER); Named Entity (NE); Support Vector Machine (SVM); Bengali; Hindi
Subjects: 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 01 natural sciences; 0105 earth and related environmental sciences
DOI: 10.5281/zenodo.1057979
Publication Date: 2010-03-23
AUTHORS (2)
ABSTRACT
Named Entity Recognition (NER) aims to classify each word of a document into predefined target named entity classes and is nowadays considered fundamental for many Natural Language Processing (NLP) tasks such as information retrieval, machine translation, information extraction, question answering systems, and others. This paper reports on the development of an NER system for Bengali and Hindi using a Support Vector Machine (SVM). Though this state-of-the-art machine learning technique has been widely applied to NER in several well-studied languages, its use for Indian languages (ILs) is very new. The system makes use of the different contextual information of the words along with a variety of features that are helpful in predicting the four named entity (NE) classes: Person name, Location name, Organization name, and Miscellaneous name. We have used annotated corpora of 122,467 tokens of Bengali and 502,974 tokens of Hindi, tagged with the twelve NE classes defined as part of the IJCNLP-08 NER Shared Task for South and South East Asian Languages (SSEAL). In addition, we have manually annotated 150K wordforms of a Bengali news corpus, developed from the web-archive of a leading Bengali newspaper. We have also developed an unsupervised algorithm in order to generate lexical context patterns from the unlabeled corpus. The lexical patterns have been used as features of the SVM in order to improve the system performance. The NER system has been tested with gold standard test sets of 35K and 60K tokens for Bengali and Hindi, respectively. Evaluation results demonstrated recall, precision, and f-score values of 88.61%, 80.12%, and 84.15%, respectively, for Bengali and 80.23%, 74.34%, and 77.17%, respectively, for Hindi. Results show an improvement in f-score of 5.13% with the use of context patterns. Statistical analysis (ANOVA) is also performed to compare the performance of the proposed NER system with that of the existing HMM-based system for both languages.
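The f-score values quoted in the abstract can be cross-checked against the reported recall and precision. The sketch below assumes the balanced F-measure (the harmonic mean of precision and recall, beta = 1). The abstract does not state which variant was used, so this is an assumption, and the names `f_score`, `REPORTED`, and `recompute` are hypothetical helpers introduced only for illustration.

```python
def f_score(recall, precision, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta = 1 gives the balanced F-measure (assumed here; the abstract
    does not name the variant).
    """
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + b2) * precision * recall / denom


# (recall %, precision %, reported f-score %) as quoted in the abstract.
REPORTED = {
    "Bengali": (88.61, 80.12, 84.15),
    "Hindi": (80.23, 74.34, 77.17),
}


def recompute(table=REPORTED):
    """Return the recomputed balanced f-score per language, rounded to 2 places."""
    return {lang: round(f_score(r, p), 2) for lang, (r, p, _) in table.items()}


if __name__ == "__main__":
    for lang, value in recompute().items():
        print(lang, value, "reported:", REPORTED[lang][2])
```

Under this assumption the recomputed values agree with the reported f-scores to two decimal places, which suggests the balanced F-measure was the metric used.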