NFDI4DS | UHH-SEMS - Publication Details

Çağrı Çöltekin

ORCID: 0000-0003-1031-6327

About

Contact & Profiles

A5017594053

Research Areas

Natural Language Processing Techniques
Topic Modeling
Authorship Attribution and Profiling
Speech Recognition and Synthesis
Semantic Web and Ontologies
Text Readability and Simplification
Speech and dialogue systems
Hate Speech and Cyberbullying Detection
Linguistic Variation and Morphology
Language Development and Disorders
Language and cultural evolution
Swearing, Euphemism, Multilingualism
Music and Audio Processing
Phonetics and Phonology Research
Syntax, Semantics, Linguistic Variation
Mental Health via Writing
Freedom of Expression and Defamation
Social Media and Politics
Digital Communication and Language
Language, Metaphor, and Cognition
Sentiment Analysis and Opinion Mining
Linguistics, Language Diversity, and Identity
Hearing Loss and Rehabilitation
Mathematics, Computing, and Information Processing
Legal Language and Interpretation

University of Tübingen
2015-2024

University of Pennsylvania
2018

University of Colorado Boulder
2018

Commonwealth Scientific and Industrial Research Organisation
2018

Nuance Communications (United Kingdom)
2018

California University of Pennsylvania
2018

German Research Centre for Artificial Intelligence
2018

Toyota Technological Institute at Chicago
2018

Uppsala University
2017

University of Groningen
2013-2015

SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)

OPENALEX - Publications

Marcos Zampieri, Preslav Nakov, Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, and 4 more

Marcos Zampieri, Preslav Nakov, Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, Hamdy Mubarak, Leon Derczynski, Zeses Pitenis, Çağrı Çöltekin. Proceedings of the Fourteenth Workshop on Semantic Evaluation. 2020.

10.18653/v1/2020.semeval-1.188 article EN cc-by 2020-01-01

CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

Daniel Zeman, Martin Popel, Milan Straka, Jan Hajič, Joakim Nivre, and 57 more

Daniel Zeman, Martin Popel, Milan Straka, Jan Hajič, Joakim Nivre, Filip Ginter, Juhani Luotolahti, Sampo Pyysalo, Slav Petrov, Martin Potthast, Francis Tyers, Elena Badmaeva, Memduh Gokirmak, Anna Nedoluzhko, Silvie Cinková, Jan Hajič jr., Jaroslava Hlaváčová, Václava Kettnerová, Zdeňka Urešová, Jenna Kanerva, Stina Ojala, Anna Missilä, Christopher D. Manning, Sebastian Schuster, Siva Reddy, Dima Taji, Nizar Habash, Herman Leung, Marie-Catherine de Marneffe, Manuela Sanguinetti, Maria Simi, Hiroshi...

10.18653/v1/k17-3001 article EN cc-by 2017-01-01

Identifying Depression on Reddit: The Effect of Training Data

Inna Pirina, Çağrı Çöltekin

This paper presents a set of classification experiments for identifying depression in posts gathered from social media platforms. In addition to the data gathered previously by other researchers, we collect additional data from the social media platform Reddit. Our experiments show promising results for identifying depression from social media texts. More importantly, however, we show that the choice of corpora is crucial in identifying depression and can lead to misleading conclusions in case of poor choice of data.

10.18653/v1/w18-5903 article EN cc-by 2018-01-01

The ParlaMint corpora of parliamentary proceedings

Tomaž Erjavec, Maciej Ogrodniczuk, Petya Osenova, Nikola Ljubešić, Kiril Simov, and 23 more

This paper presents the ParlaMint corpora containing transcriptions of the sessions of 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project's GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter...

10.1007/s10579-021-09574-0 article EN cc-by Language Resources and Evaluation 2022-02-02

Findings of the VarDial Evaluation Campaign 2023

Noëmi Aepli, Çağrı Çöltekin, Rob van der Goot, Tommi Jauhiainen, Mourhaf Kazzaz, and 5 more

Noëmi Aepli, Çağrı Çöltekin, Rob Van Der Goot, Tommi Jauhiainen, Mourhaf Kazzaz, Nikola Ljubešić, Kai North, Barbara Plank, Yves Scherrer, Marcos Zampieri. Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2023). 2023.

10.18653/v1/2023.vardial-1.25 article EN cc-by 2023-01-01

Tübingen-Oslo at SemEval-2018 Task 2: SVMs perform better than RNNs in Emoji Prediction

Çağrı Çöltekin, Taraka Rama

This paper describes our participation in the SemEval-2018 task Multilingual Emoji Prediction. We participated in both English and Spanish subtasks, experimenting with support vector machines (SVMs) and recurrent neural networks. Our SVM classifier obtained the top rank in both subtasks, with macro-averaged F1-measures of 35.99% for the English and 22.36% for the Spanish data sets. Similar to a few earlier attempts, the results with neural networks were not on par with linear SVMs.

10.18653/v1/s18-1004 article EN cc-by 2018-01-01

Language Discrimination and Transfer Learning for Similar Languages: Experiments with Feature Combinations and Adaptation

Nianheng Wu, Eric DeMattos, Kwok Him So, Pin-zhen Chen, Çağrı Çöltekin

This paper describes the work done by team tearsofjoy participating in the VarDial 2019 Evaluation Campaign. We developed two systems based on Support Vector Machines: SVM with a flat combination of features and SVM ensembles. We participated in all language/dialect identification tasks, as well as the Moldavian vs. Romanian cross-dialect topic identification (MRC) task. Our team achieved first place in German Dialect identification (GDI) and MRC subtasks 2 and 3, second place in the simplified variant of Discriminating between Mainland and Taiwan variation of Mandarin Chinese...

10.18653/v1/w19-1406 article EN 2019-01-01

Using Gabmap

Therese Leinonen, Çağrı Çöltekin, John Nerbonne

Gabmap is a freely available, open-source web application that analyzes the data of language variation, e.g. varying words for the same concepts, varying pronunciations for the same words, or varying frequencies of syntactic constructions in transcribed conversations. Gabmap is an integrated part of CLARIN (see http://portal.clarin.nl). This article summarizes Gabmap's basic functionality, adding material on some new features and reporting on the range of uses to which Gabmap has been put. Gabmap is modestly successful, and its popularity underscores the fact that the study...

10.1016/j.lingua.2015.02.004 article EN cc-by-nc-nd Lingua 2015-03-12

Tübingen system in VarDial 2017 shared task: experiments with language identification and cross-lingual parsing

Çağrı Çöltekin, Taraka Rama

This paper describes our systems and results on VarDial 2017 shared tasks. Besides three language/dialect discrimination tasks, we also participated in the cross-lingual dependency parsing (CLP) task using a simple methodology which we also briefly describe in this paper. For all the discrimination tasks, we used linear SVMs with character and word features. The system achieves competitive results among other systems in the shared task. We also report additional experiments with neural network models. The performance of neural network models was close but always below the corresponding SVM...

10.18653/v1/w17-1218 article EN cc-by 2017-01-01

Using Universal Dependencies in cross-linguistic complexity research

Aleksandrs Berdičevskis, Çağrı Çöltekin, Katharina Ehret, Kilu von Prince, Daniel Ross, and 6 more

Aleksandrs Berdicevskis, Çağrı Çöltekin, Katharina Ehret, Kilu von Prince, Daniel Ross, Bill Thompson, Chunxiao Yan, Vera Demberg, Gary Lupyan, Taraka Rama, Christian Bentz. Proceedings of the Second Workshop on Universal Dependencies (UDW 2018). 2018.

10.18653/v1/w18-6002 article EN cc-by 2018-01-01

SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)

Marcos Zampieri, Preslav Nakov, Sara Brin Rosenthal, Pepa Atanasova, Georgi Karadzhov, and 4 more

We present the results and main findings of SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2020). The task involves three subtasks corresponding to the hierarchical taxonomy of the OLID schema (Zampieri et al., 2019a) from OffensEval 2019. The task featured five languages: English, Arabic, Danish, Greek, and Turkish for Subtask A. In addition, English also featured Subtasks B and C. OffensEval 2020 was one of the most popular tasks at SemEval-2020, attracting a large number of participants across all subtasks and also across all languages....

10.48550/arxiv.2006.07235 preprint EN other-oa arXiv (Cornell University) 2020-01-01
