Named Entity Recognition and Classification for Punjabi Shahmukhi

Keywords: Named Entity Recognition; Benchmark (surveying); Identification; Corpus Linguistics; Named entity; Entity linking; Proper noun
DOI: 10.1145/3383306 Publication Date: 2020-05-04T06:53:19Z
ABSTRACT
Named entity recognition (NER) refers to the identification of proper nouns in natural language text and their classification into named entity types, such as person, location, and organization. Due to the widespread applications of NER, numerous NER techniques and benchmark datasets have been developed for both Western and Asian languages. Even though the Shahmukhi script of Punjabi is used by nearly three fourths of Punjabi speakers worldwide, Gurmukhi has been the main focus of research activities. Specifically, a benchmark NER corpus for Shahmukhi is non-existent, which has thwarted the commencement of NER research for the script. To this end, this article presents the development and specifications of the first-ever NER corpus for Shahmukhi. The newly developed corpus is composed of 318,275 tokens and 16,300 named entities, including 11,147 persons, 3,140 locations, and 2,013 organizations. To establish the strength of our corpus, we have compared it with its counterparts. Furthermore, we have demonstrated the usability of the corpus using five supervised learning techniques, including two state-of-the-art deep learning techniques. The results are compared, and valuable insights about the behaviors of the most effective technique are discussed.
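The corpus figures quoted in the abstract can be cross-checked with a little arithmetic. The following is a minimal sketch, not code from the article: the names `TOTAL_TOKENS`, `ENTITY_COUNTS`, and `entity_shares` are hypothetical, and the per-100-token figure assumes each named entity is counted once, which the abstract does not specify.

```python
# Figures quoted in the abstract; variable names are illustrative assumptions.
TOTAL_TOKENS = 318_275
ENTITY_COUNTS = {"person": 11_147, "location": 3_140, "organization": 2_013}


def entity_shares(counts):
    """Return each entity type's share of all named entities, in percent."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# The three per-type counts should add up to the reported entity total.
total_entities = sum(ENTITY_COUNTS.values())
shares = entity_shares(ENTITY_COUNTS)

# Assumption: one count per entity, so this is entities per 100 tokens.
entities_per_100_tokens = 100.0 * total_entities / TOTAL_TOKENS

print(f"total entities: {total_entities}")
for label, pct in shares.items():
    print(f"{label}: {pct:.1f}%")
print(f"entities per 100 tokens: {entities_per_100_tokens:.2f}")
```

The three per-type counts sum to the reported 16,300 entities, and persons make up roughly two thirds of them, so the corpus is noticeably skewed toward the person class.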