Jelena Mitrović

ORCID: 0000-0003-3220-8749
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Hate Speech and Cyberbullying Detection
  • Advanced Text Analysis Techniques
  • Sentiment Analysis and Opinion Mining
  • Artificial Intelligence in Law
  • Semantic Web and Ontologies
  • Misinformation and Its Impacts
  • European and International Law Studies
  • Social Media and Politics
  • Comparative and International Law Studies
  • Opinion Dynamics and Social Influence
  • Web Data Mining and Analysis
  • Legal Language and Interpretation
  • Vaccine Coverage and Hesitancy
  • Pregnancy and preeclampsia studies
  • Human Mobility and Location-Based Analysis
  • Authorship Attribution and Profiling
  • Nuclear reactor physics and engineering
  • Expert finding and Q&A systems
  • Linguistics and terminology studies
  • Magnetic confinement fusion research
  • Diverse Scientific and Economic Studies
  • Urban Transport and Accessibility
  • Flow Measurement and Analysis

University of Passau
2017-2024

German Research Centre for Artificial Intelligence
2022-2023

University of Novi Sad
2014-2022

University of Groningen
2021

University of Turin
2021

Institut za Reumatologiju
2018

University of Belgrade
2013-2015

We introduce HateBERT, a re-trained BERT model for abusive language detection in English. The model was trained on RAL-E, a large-scale dataset of Reddit comments in English from communities banned for being offensive, abusive, or hateful that we have curated and made available to the public. We present the results of a detailed comparison between a general pre-trained language model and the retrained version on three English datasets for offensive language, abusive language, and hate speech detection tasks. In all datasets, HateBERT outperforms the corresponding general BERT model. We also discuss a battery of experiments comparing...

10.18653/v1/2021.woah-1.3 article EN cc-by 2021-01-01

Media has a substantial impact on the public perception of events, and, accordingly, the way media presents events can potentially alter the beliefs and views of the public. One of the ways in which bias in news articles can be introduced is by altering word choice. Such a form of bias is very challenging to identify automatically due to the high context-dependence and the lack of a large-scale gold-standard data set. In this paper, we present a prototypical yet robust and diverse data set for media bias research. It consists of 1,700 statements representing various instances...

10.1016/j.ipm.2021.102505 article EN cc-by-nc-nd Information Processing & Management 2021-02-11

In the 2015 migration crisis, thousands of refugees and migrants crossed the border to Hungary, Austria and Germany. The movements of these people are reflected in social media, especially on Twitter. In this paper we present a dataset of 3275 Tweets from the months of September and October 2015. These Tweets are annotated regarding their relevance and whether they contain quantitative movement information of refugees/migrants into these countries. We present the dataset for posterior analysis or as a basis for creating an automated extraction / prediction system.

10.1109/wetice.2019.00039 article EN 2019-06-01

The use of BERT, one of the most popular language models, has led to improvements in many Natural Language Processing (NLP) tasks. One such task is Named Entity Recognition (NER), i.e. the automatic identification of named entities such as location, person, organization, etc. from a given text. It is also an important base step for many NLP tasks such as information extraction and argumentation mining. Even though there is much research done on NER using BERT and other popular language models, the same is not explored in detail when it comes to Legal NLP or Legal Tech. Legal NLP applies...

10.5220/0011749400003393 article EN cc-by-nc-nd Proceedings of the 14th International Conference on Agents and Artificial Intelligence 2023-01-01

Since the first COVID-19 vaccine appeared, there has been a growing tendency to automatically determine public attitudes toward it. In particular, it was important to find the reasons for vaccine hesitancy, since it was directly correlated with pandemic protraction. Natural language processing (NLP) and public health researchers have turned to social media (eg, Twitter, Reddit, and Facebook) for user-created content from which they can gauge public opinion on vaccination. To automatically process such content, they use a number of NLP techniques, most...

10.2196/42261 article EN cc-by Journal of Medical Internet Research 2022-09-29

Web search is a crucial technology for the digital economy. Dominated by a few gatekeepers focused on commercial success, however, web publishers have to optimize their content for these gatekeepers, resulting in a closed ecosystem of search engines as well as the risk of publishers sacrificing quality. To encourage an open search ecosystem and offer users genuine choice among alternative search engines, we propose the development of an Open Web Index (OWI). We outline six core principles for developing and maintaining an open index, based on open data principles, legal...

10.1002/asi.24818 article EN cc-by-nc Journal of the Association for Information Science and Technology 2023-08-07

Background and Objectives: Gestational diabetes mellitus (GDM) may impact both maternal and fetal/neonatal health. The identification of prognostic indicators for GDM may improve risk assessment and the selection of patients for intensive monitoring. The aim of this study was to find potential predictors of adverse pregnancy outcome in GDM and normoglycemic patients by comparing the levels of different biochemical parameters and the values of blood cell count (BCC) between patients with good and adverse outcome. Materials and Methods: The prospective clinical study included 49...

10.3390/medicina60081250 article EN cc-by Medicina 2024-07-31

This paper presents our submission for the SemEval shared task 6, sub-task A on the identification of offensive language. Our proposed model, C-BiGRU, combines a Convolutional Neural Network (CNN) with a bidirectional Recurrent Neural Network (RNN). We utilize word2vec to capture the semantic similarities between words. This composition allows us to extract long term dependencies in tweets and distinguish between offensive and non-offensive tweets. In addition, we evaluate our approach on a different dataset and show that our model is capable of detecting online...

10.18653/v1/s19-2127 article EN cc-by 2019-01-01

10.5220/0010187305150521 article EN cc-by-nc-nd Proceedings of the 14th International Conference on Agents and Artificial Intelligence 2021-01-01

This paper surveys ontological modeling of rhetorical concepts, developed for use in argument mining and other applications of computational rhetoric, projecting their future directions. We include ontological models of argument schemes applying Rhetorical Structure Theory (RST); the RhetFig proposal for modeling; the related RetFig Ontology of Rhetorical Figures in Serbian (developed by two of the authors); and the Lassoing Rhetoric project (developed by another of the authors). The Lassoing Rhetoric venture is interesting for its multifaceted approach to linguistic devices, prominently including...

10.3233/aac-170027 article EN cc-by-nc Argument & Computation 2017-01-01

Accurate identification of insect species is an indispensable and challenging requirement for every entomologist, particularly if the species is involved in disease outbreaks. The European MediLabSecure project designed an identification (ID) exercise available to any willing participant with the aim of assessing and improving knowledge of mosquito taxonomy. The exercise was based on high-definition photomicrographs of mosquitoes (26 adult females and 12 larvae) collected from the western Palaearctic. Sixty-five responses from Europe, North Africa and the Middle East...

10.1051/parasite/2022045 article EN cc-by Parasite 2022-01-01

Generative pre-trained transformers (GPT) have recently demonstrated excellent performance in various natural language tasks. The development of ChatGPT and the recently released GPT-4 model has shown competence in solving complex and higher-order reasoning tasks without further training or fine-tuning. However, the applicability and strength of these models in classifying legal texts in the context of argument mining are yet to be realized and have not been tested thoroughly. In this study, we investigate the effectiveness of GPT-like models,...

10.3389/frai.2023.1278796 article EN cc-by Frontiers in Artificial Intelligence 2023-11-17

The choice of embedding model is a crucial step in the design of Retrieval Augmented Generation (RAG) systems. Given the sheer volume of available options, identifying clusters of similar models streamlines this model selection process. Relying solely on benchmark performance scores only allows for a weak assessment of model similarity. Thus, in this study, we evaluate the similarity of embedding models within the context of RAG systems. Our evaluation is two-fold: We use Centered Kernel Alignment to compare embeddings on a pair-wise level. Additionally, as it is especially pertinent...

10.48550/arxiv.2407.08275 preprint EN arXiv (Cornell University) 2024-07-11

The paper presents a language dependent model for classification of statements into ironic and non-ironic. The model uses various language resources: morphological dictionaries, a sentiment lexicon, a lexicon of markers and a WordNet based ontology. This approach uses various features: antonymous pairs obtained using the reasoning rules over the Serbian WordNet ontology (R), antonymous pairs in which one member has positive sentiment polarity (PPR), polarity of positive sentiment words (PSP), ordered sequence of sentiment tags (OSA), Part-of-Speech tags of words (POS) and irony markers (M). The evaluation was performed on two collections of tweets...

10.1145/3136273.3136298 article EN 2017-09-20

There is evidence that specific segments of the population were hit particularly hard by the Covid-19 pandemic (e.g., people with a migration background). In this context, the impact and role played by online platforms in facilitating the integration or fragmentation of public debates and social groups is a recurring topic of discussion. This is where our study ties in, as we ask: How is vaccination discussed and evaluated in different language communities in Germany on Twitter during the pandemic? We collected all tweets in German, Russian,...

10.17645/mac.v11i1.6058 article EN cc-by Media and Communication 2023-03-27

We introduce an approach to multilingual Offensive Language Detection based on the mBERT transformer model. We download extra training data from Twitter in English, Danish, and Turkish, and use it to re-train the model. We then fine-tuned the model on the provided training data and, in some configurations, implement a transfer learning approach exploiting the typological relatedness between English and Danish. Our systems obtained good results across the three languages (.9036 for EN, .7619 for DA, and .7789 for TR).

10.18653/v1/2020.semeval-1.202 article EN cc-by 2020-01-01