A Dataset of German Legal Documents for Named Entity Recognition
Named Entity Recognition
DOI: 10.48550/arxiv.2003.13016
Publication Date: 2020-01-01
AUTHORS (3)
ABSTRACT
We describe a dataset developed for Named Entity Recognition in German federal court decisions. It consists of approx. 67,000 sentences with over 2 million tokens. The resource contains 54,000 manually annotated entities, mapped to 19 fine-grained semantic classes: person, judge, lawyer, country, city, street, landscape, organization, company, institution, court, brand, law, ordinance, European legal norm, regulation, contract, decision, and literature. The documents were, furthermore, automatically annotated with more than 35,000 TimeML-based time expressions. The dataset, which is available under a CC-BY 4.0 license in the CoNLL-2002 format, was developed for training an NER service in the EU project Lynx.
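Since the abstract names the CoNLL-2002 format, the following is a minimal sketch of how such a file might be read. It assumes the usual CoNLL-2002 conventions: one token per line, the tag in the last whitespace-separated column, BIO-style tags, and a blank line between sentences. The sample sentences, the tag names `COURT` and `PER`, and the helper names `read_conll` and `extract_entities` are illustrative assumptions, not taken from the dataset itself.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]  # list of (token, tag) pairs


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group token/tag lines into sentences; blank lines separate sentences."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        # First column is the token, last column is the entity tag.
        current.append((parts[0], parts[-1]))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence: Sentence) -> List[Tuple[str, str]]:
    """Collect (entity text, label) spans from BIO tags in one sentence."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    label = None
    for token, tag in sentence:
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if starts_new:
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [token], tag[2:]
        elif tag.startswith("I-"):
            tokens.append(token)
        else:  # "O" or any other tag closes the open span
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [], None
    if tokens:
        entities.append((" ".join(tokens), label))
    return entities


# Hypothetical sample in CoNLL-2002 style (not real dataset content).
SAMPLE = """\
Der O
Bundesgerichtshof B-COURT
hat O
entschieden O
. O

Richter O
Max B-PER
Mustermann I-PER
stimmte O
zu O
. O
"""

if __name__ == "__main__":
    for sent in read_conll(SAMPLE.splitlines()):
        print(extract_entities(sent))
```

The actual tag inventory and column layout should be checked against the released files before relying on this sketch.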