A comparison of statistical association measures for identifying dependency-based collocations in various languages.

Association (psychology) Adjective Word Association Parallel corpora

DOI: 10.18653/v1/w19-5107 Publication Date: 2019-09-12T18:39:56Z

AUTHORS (3)

Marcos Garcia

Marcos García Salido

Margarita Alonso-...

ABSTRACT

This paper presents an exploration of different statistical association measures to automatically identify collocations from corpora in English, Portuguese, and Spanish. To evaluate the impact of the association metrics, we manually annotated corpora with three syntactic patterns of collocations (adjective-noun, verb-object, and nominal compounds). We took advantage of the PARSEME 1.1 Shared Task corpora by selecting a subset of 155k tokens in the three referred languages, in which we annotated 1,526 collocations with their corresponding Lexical Functions according to the Meaning-Text Theory. Using the resulting gold standard, we have carried out a comparison between frequency data and several well-known association measures, both symmetric and asymmetric. The results show that the combination of dependency triples with raw frequency information is as powerful as the best association measures in most syntactic patterns and languages. Furthermore, despite the asymmetric behaviour of collocations, directional approaches perform worse than the symmetric ones in the extraction of these phraseological combinations.
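The contrast between raw frequency, symmetric measures, and asymmetric (directional) measures can be illustrated with a minimal Python sketch. The abstract does not name the measures that were compared, so pointwise mutual information (symmetric) and Delta P (directional) are assumed here only as common representatives of each family. The toy triples, the function name `association_scores`, and the dictionary keys are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Toy dependency triples (head, relation, dependent).
# Illustrative data only, not drawn from the paper's corpora.
triples = [
    ("tea", "amod", "strong"),
    ("tea", "amod", "strong"),
    ("tea", "amod", "hot"),
    ("wind", "amod", "strong"),
    ("coffee", "amod", "hot"),
    ("coffee", "amod", "black"),
]


def association_scores(triples, relation):
    """Score every (head, dependent) pair seen under one dependency relation."""
    pairs = [(h, d) for h, r, d in triples if r == relation]
    n = len(pairs)
    pair_freq = Counter(pairs)
    head_freq = Counter(h for h, _ in pairs)
    dep_freq = Counter(d for _, d in pairs)

    scores = {}
    for (h, d), f in pair_freq.items():
        fh, fd = head_freq[h], dep_freq[d]
        # Symmetric measure (assumed example): pointwise mutual information.
        pmi = math.log2(f * n / (fh * fd))
        # Directional measure (assumed example): Delta P in both directions.
        # Delta P(d | h) = P(d | h) - P(d | not h); guard against empty complements.
        dp_dep = f / fh - ((fd - f) / (n - fh) if n > fh else 0.0)
        dp_head = f / fd - ((fh - f) / (n - fd) if n > fd else 0.0)
        scores[(h, d)] = {
            "freq": f,  # raw frequency of the dependency pair
            "pmi": pmi,
            "delta_p_dep_given_head": dp_dep,
            "delta_p_head_given_dep": dp_head,
        }
    return scores


scores = association_scores(triples, "amod")
for pair, values in sorted(scores.items(), key=lambda kv: -kv[1]["freq"]):
    print(pair, values)
```

In this toy data the pair ("coffee", "black") receives different Delta P values in the two directions, while its PMI is a single number, which is the sense in which one measure is directional and the other symmetric.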

CITATIONS (5)
