NFDI4DS | UHH-SEMS - Publication Details

Daryna Dementieva

ORCID: 0000-0003-0929-4140

About

Contact & Profiles

OpenAlex author ID: A5043273387

Research Areas

Topic Modeling
Natural Language Processing Techniques
Hate Speech and Cyberbullying Detection
Spam and Phishing Detection
Text Readability and Simplification
Misinformation and Its Impacts
Advanced Malware Detection Techniques
Authorship Attribution and Profiling
Artificial Intelligence in Healthcare and Education
Speech Recognition and Synthesis
Software Engineering Research
Information Systems and Technology Applications
Statistical and Computational Modeling
Advanced Text Analysis Techniques
Text and Document Classification Technologies
Advanced Data Processing Techniques
Explainable Artificial Intelligence (XAI)
Sentiment Analysis and Opinion Mining
Biomedical Text Mining and Ontologies
Big Data Technologies and Applications
Diverse Scientific Research in Ukraine
Data Quality and Management
Online Learning and Analytics
Foreign Language Teaching Methods
Media Influence and Politics

Technical University of Munich
2022-2024

Skolkovo Institute of Science and Technology
2020-2022

University of Mannheim
2022

Skolkovo Foundation
2020

ChatGPT for good? On opportunities and challenges of large language models for education

OPENALEX - Publications

Enkelejda Kasneci, Kathrin Seßler, Stefan Küchemann, Maria Bannert, Daryna Dementieva and 18 more

10.1016/j.lindif.2023.102274 article EN Learning and Individual Differences 2023-03-09

ChatGPT for Good? On Opportunities and Challenges of Large Language Models for Education

Enkelejda Kasneci, Kathrin Seßler, Stefan Küchemann, Maria Bannert, Daryna Dementieva and 18 more

Large language models represent a significant advancement in the field of AI. The underlying technology is key to further innovations and, despite critical views and even bans within communities and regions, large language models are here to stay. This position paper presents the potential benefits and challenges of educational applications of large language models, from student and teacher perspectives. We briefly discuss the current state of large language models and their applications. We then highlight how these models can be used to create educational content, improve student engagement and interaction,...

10.35542/osf.io/5er8f preprint EN 2023-01-30

ParaDetox: Detoxification with Parallel Data

Varvara Logacheva, Daryna Dementieva, Sergey Ustyantsev, Daniil Moskovskiy, David C. Dale and 3 more

Varvara Logacheva, Daryna Dementieva, Sergey Ustyantsev, Daniil Moskovskiy, David Dale, Irina Krotova, Nikita Semenov, Alexander Panchenko. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

10.18653/v1/2022.acl-long.469 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages

Shamsuddeen Hassan Muhammad, Nedjma Ousidhoum, Idris Abdulmumin, Jan Philip Wahle, Terry Ruas and 43 more

People worldwide use language in subtle and complex ways to express emotions. While emotion recognition -- an umbrella term for several NLP tasks -- significantly impacts different applications in NLP and other fields, most work in the area is focused on high-resource languages. Therefore, this has led to major disparities in research and proposed solutions, especially for low-resource languages that suffer from the lack of high-quality datasets. In this paper, we present BRIGHTER -- a collection of multilabeled emotion-annotated...

10.48550/arxiv.2502.11926 preprint EN arXiv (Cornell University) 2025-02-17

Text Detoxification using Large Pre-trained Neural Models

David Dale, Anton Voronov, Daryna Dementieva, Varvara Logacheva, Olga Kozlova and 2 more

David Dale, Anton Voronov, Daryna Dementieva, Varvara Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021.

10.18653/v1/2021.emnlp-main.629 article EN cc-by Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021-01-01

A Benchmark Dataset to Distinguish Human-Written and Machine-Generated Scientific Papers

Mohamed Hesham Ibrahim Abdalla, Simon Malberg, Daryna Dementieva, E. Mosca, Georg Groh

As generative NLP can now produce content nearly indistinguishable from human writing, it is becoming difficult to identify genuine research contributions in academic writing and scientific publications. Moreover, information in machine-generated text can be factually wrong or even entirely fabricated. In this work, we introduce a novel benchmark dataset containing human-written and machine-generated scientific papers from SCIgen, GPT-2, GPT-3, ChatGPT, and Galactica, as well as papers co-created by humans and ChatGPT. We also experiment with several...

10.3390/info14100522 article EN cc-by Information 2023-09-26

Cross-lingual Evidence Improves Monolingual Fake News Detection

Daryna Dementieva, Alexander Panchenko

Daryna Dementieva, Alexander Panchenko. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop. 2021.

10.18653/v1/2021.acl-srw.32 article EN cc-by 2021-01-01

Fake News Detection using Multilingual Evidence

Daryna Dementieva, Alexander Panchenko

Nowadays, misleading information spreads over the internet at an incredible speed, which can lead to irreparable consequences. As a result, it is becoming more and more essential to combat fake news, especially in the early stages of its origins. Over the past years, a lot of work has been done in this direction. However, all existing solutions have their limitations. One of the main limitations of the current approaches is that the majority of the models are focused only on one language and do not use any multilingual information. In this work, we...

10.1109/dsaa49011.2020.00111 article EN 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA) 2020-10-01

Adam-Smith at SemEval-2023 Task 4: Discovering Human Values in Arguments with Ensembles of Transformer-based Models

Daniel Schroter, Daryna Dementieva, Georg Groh

This paper presents the best-performing approach alias "Adam Smith" for the SemEval-2023 Task 4: "Identification of Human Values behind Arguments". The goal of the task was to create systems that automatically identify the values within textual arguments. We train transformer-based models until they reach their loss minimum or f1-score maximum. Ensembling the models by selecting one global decision threshold that maximizes the f1-score leads to the best-performing system in the competition. Ensembling based on stacking with logistic regressions shows the best performance on an...

10.18653/v1/2023.semeval-1.74 article EN cc-by Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023) 2023-01-01

Toxicity Classification in Ukrainian

Daryna Dementieva, Valeriia Khylenko, Nikolay Babakov, Georg Groh

10.18653/v1/2024.woah-1.19 article EN 2024-01-01

MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages

Daryna Dementieva, Nikolay Babakov, Alexander Panchenko

10.18653/v1/2024.naacl-short.12 article EN 2024-01-01

Multiverse: Multilingual Evidence for Fake News Detection

Daryna Dementieva, Mikhail Kuimov, Alexander Panchenko

The rapid spread of deceptive information on the internet can have severe and irreparable consequences. As a result, it is important to develop technology that can detect fake news. Although significant progress has been made in this area, current methods are limited because they focus only on one language and do not incorporate multilingual information. In this work, we propose Multiverse, a new feature based on multilingual evidence that can be used for fake news detection and improve existing approaches. Our hypothesis that cross-lingual evidence can be used as...

10.3390/jimaging9040077 article EN cc-by Journal of Imaging 2023-03-27

Exploring Cross-lingual Text Detoxification with Large Multilingual Language Models.

Daniil Moskovskiy, Daryna Dementieva, Alexander Panchenko

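Detoxification is a task of generating text in polite style while preserving meaning and fluency of the original toxic text. Existing detoxification methods are monolingual, i.e. designed to work in one exact language. This work investigates multilingual and cross-lingual detoxification and the behavior of large multilingual models in this setting. Unlike previous works, we aim to make large language models able to perform detoxification without direct fine-tuning in a given language. Experiments show that multilingual models are capable of performing multilingual style transfer. However, the tested state-of-the-art models are not able to perform cross-lingual detoxification, and direct fine-tuning on the exact language is currently inevitable...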

10.18653/v1/2022.acl-srw.26 article EN cc-by 2022-01-01

RUSSE-2022: Findings of the First Russian Detoxification Shared Task Based on Parallel Corpora

Daryna Dementieva, Varvara Logacheva, Irina Nikishina, Alena Fenogenova, David C. Dale and 4 more

Text detoxification is the task of rewriting a toxic text into a neutral text while preserving its original content. It has a wide range of applications, e.g. moderation of output of neural chatbots or suggesting a less emotional version of posts on social networks. This paper provides a description of the RUSSE-2022 competition of detoxification methods for the Russian language. This is the first competition which features (i) parallel training data and (ii) manual evaluation. We describe the setup of the competition, the solutions of the participating teams and analyse their performance. In...

10.28995/2075-7182-2022-21-114-131 article EN Computational Linguistics and Intellectual Technologies 2022-06-18

IFAN: An Explainability-Focused Interaction Framework for Humans and NLP Models

E. Mosca, Daryna Dementieva, Tohid Ebrahim Ajdari, Maximilian Kummeth, Kirill Gringauz and 2 more

Edoardo Mosca, Daryna Dementieva, Tohid Ebrahim Ajdari, Maximilian Kummeth, Kirill Gringauz, Yutong Zhou, Georg Groh. Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: System Demonstrations. 2023.

10.18653/v1/2023.ijcnlp-demo.7 article EN cc-by 2023-01-01

Methods for Detoxification of Texts for the Russian Language

Daryna Dementieva, Daniil Moskovskiy, Varvara Logacheva, David Dale, Olga Kozlova and 2 more

We introduce the first study of the automatic detoxification of Russian texts to combat offensive language. This kind of textual style transfer can be used for processing toxic content on social media or for eliminating toxicity in automatically generated texts. While much work has been done for the English language in this field, there are no works on detoxification for the Russian language. We suggest two types of models: an approach based on BERT architecture that performs local corrections and a supervised approach based on a pretrained GPT-2 language model. We compare these methods with several...

10.3390/mti5090054 article EN cc-by Multimodal Technologies and Interaction 2021-09-04

SkoltechNLP at SemEval-2020 Task 11: Exploring Unsupervised Text Augmentation for Propaganda Detection

Daryna Dementieva, Igor L. Markov, Alexander Panchenko

This paper presents a solution for the Span Identification (SI) task in the "Detection of Propaganda Techniques in News Articles" competition at SemEval-2020. The goal of the SI task is to identify specific fragments of each article which contain the use of at least one propaganda technique. This is a binary sequence tagging task. We tested several approaches, finally selecting a fine-tuned BERT model as our baseline model. Our main contribution is an investigation of several unsupervised data augmentation techniques based on distributional...

10.18653/v1/2020.semeval-1.234 article EN cc-by 2020-01-01

Methods for Detoxification of Texts for the Russian Language

Daryna Dementieva, Daniil Moskovskiy, Varvara Logacheva, David Dale, Alexander Panchenko and 2 more

We introduce the first study of automatic detoxification of Russian texts to combat offensive language. Such a kind of textual style transfer can be used, for instance, for processing toxic content in social media. While much work has been done for the English language in this field, it has never been solved for the Russian language yet. We test two types of models - an unsupervised approach based on BERT architecture that performs local corrections and a supervised approach based on a pretrained GPT-2 language model - and compare them with several baselines. In addition, we describe...

10.28995/2075-7182-2021-20-179-190 article EN Kompʹûternaâ lingvistika i intellektualʹnye tehnologii 2021-06-19

Detecting Text Formality: A Study of Text Classification Approaches

Daryna Dementieva, Nikolay Babakov, Alexander Panchenko

Formality is one of the important characteristics of text documents. The automatic detection of the formality level of a text is potentially beneficial for various natural language processing tasks. Before, two large-scale datasets were introduced for multiple languages featuring formality annotation: GYAFC and X-FORMAL. However, they were primarily used for the training of style transfer models. At the same time, the detection of text formality on its own may also be a useful application. This work proposes the first to our knowledge systematic study of formality detection methods based on statistical,...

10.26615/978-954-452-092-2_031 article EN 2023-01-01

Exploring Methods for Cross-lingual Text Style Transfer: The Case of Text Detoxification

Daryna Dementieva, Daniil Moskovskiy, David Dale, Alexander Panchenko

Daryna Dementieva, Daniil Moskovskiy, David Dale, Alexander Panchenko. Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

10.18653/v1/2023.ijcnlp-main.70 article EN cc-by 2023-01-01

Toxicity Classification in Ukrainian

Daryna Dementieva, Valeriia Khylenko, Nikolay Babakov, Georg Groh

The task of toxicity detection is still a relevant task, especially in the context of safe and fair LMs development. Nevertheless, labeled binary toxicity classification corpora are not available for all languages, which is understandable given the resource-intensive nature of the annotation process. Ukrainian, in particular, is among the languages lacking such resources. To our knowledge, there has been no existing toxicity classification corpus in Ukrainian. In this study, we aim to fill this gap by investigating cross-lingual knowledge transfer techniques...

10.48550/arxiv.2404.17841 preprint EN arXiv (Cornell University) 2024-04-27

Demarked: A Strategy for Enhanced Abusive Speech Moderation through Counterspeech, Detoxification, and Message Management

Seid Muhie Yimam, Daryna Dementieva, Tim Fischer, Daniil Moskovskiy, Naquee Rizwan and 6 more

Despite regulations imposed by nations and social media platforms, such as recent EU regulations targeting digital violence, abusive content persists as a significant challenge. Existing approaches primarily rely on binary solutions, such as outright blocking or banning, yet fail to address the complex nature of abusive speech. In this work, we propose a more comprehensive approach called Demarcation, scoring abusive speech based on four aspects -- (i) severity scale; (ii) presence of a target; (iii) context scale; (iv) legal scale -- and suggesting...

10.48550/arxiv.2406.19543 preprint EN arXiv (Cornell University) 2024-06-27
