Sanja Štajner

ORCID: 0000-0002-7780-7035
Research Areas
  • Natural Language Processing Techniques
  • Text Readability and Simplification
  • Topic Modeling
  • Sentiment Analysis and Opinion Mining
  • Advanced Text Analysis Techniques
  • Authorship Attribution and Profiling
  • Software Engineering Research
  • Speech and dialogue systems
  • Video Analysis and Summarization
  • Interpreting and Communication in Healthcare
  • Online Learning and Analytics
  • Linguistics and Cultural Studies
  • Financial Markets and Investment Strategies
  • Stock Market Forecasting Methods
  • Auditing, Earnings Management, Governance
  • Linguistics, Language Diversity, and Identity
  • Linguistic Variation and Morphology
  • Spam and Phishing Detection
  • Personality Traits and Psychology
  • Translation Studies and Practices
  • Complex Network Analysis Techniques
  • Misinformation and Its Impacts
  • Machine Learning and Algorithms
  • Digital Innovation in Industries
  • Humor Studies and Applications

Institut für Informationsverarbeitung
2023

University of Stuttgart
2023

Laboratoire d'Informatique de Paris-Nord
2023

Stanford University
2023

Universitat Pompeu Fabra
2022

NortonLifeLock (United States)
2021-2022

GfK (Germany)
2019-2021

University of Mannheim
2017-2020

Université de Lille
2018

Centre National de la Recherche Scientifique
2018

We present the first attempt at using sequence to sequence neural networks to model text simplification (TS). Unlike the previously proposed automated TS systems, our neural text simplification (NTS) systems are able to simultaneously perform lexical simplification and content reduction. An extensive human evaluation of the output has shown that NTS systems achieve almost perfect grammaticality and meaning preservation of output sentences and a higher level of simplification than the state-of-the-art automated TS systems.

10.18653/v1/p17-2014 article EN cc-by 2017-01-01

Goran Glavaš, Sanja Štajner. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2015.

10.3115/v1/p15-2011 article EN cc-by 2015-01-01

Seid Muhie Yimam, Chris Biemann, Shervin Malmasi, Gustavo Paetzold, Lucia Specia, Sanja Štajner, Anaïs Tack, Marcos Zampieri. Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications. 2018.

10.18653/v1/w18-0507 article EN cc-by 2018-01-01

The way in which a text is written can be a barrier for many people. Automatic text simplification is a natural language processing technology that, when mature, could be used to produce texts that are adapted to the specific needs of particular users. Most research in the area of automatic text simplification has dealt with the English language. In this article, we present results from the Simplext project, which is dedicated to automatic text simplification for Spanish. We present a modular system with dedicated procedures for syntactic and lexical simplification that are grounded on the analysis of a corpus manually simplified for people with special needs....

10.1145/2738046 article EN ACM Transactions on Accessible Computing 2015-05-11

Since the late 1990s, automatic text simplification (ATS) was promoted as a natural language processing (NLP) task with great potential to make texts more accessible to people with various reading or cognitive disabilities, and enable their better social inclusion. Large multidisciplinary projects showed promising steps in that direction. Since 2010, the field started attracting more attention but at the cost of major shifts in the system architecture, target audience, and evaluation strategies. Somewhere along the way, the focus has...

10.18653/v1/2021.findings-acl.233 article EN cc-by 2021-01-01

Sanja Štajner, Hannah Béchara, Horacio Saggion. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2015.

10.3115/v1/p15-2135 article EN cc-by 2015-01-01

Even in highly-developed countries, as many as 15-30% of the population can only understand texts written using a basic vocabulary. Their understanding of everyday texts is limited, which prevents them from taking an active role in society and making informed decisions regarding healthcare, legal representation, or democratic choice. Lexical simplification is a natural language processing task that aims to make text understandable to everyone by replacing complex vocabulary and expressions with simpler ones, while...

10.3389/frai.2022.991242 article EN cc-by Frontiers in Artificial Intelligence 2022-09-22

Horacio Saggion, Sanja Štajner, Daniel Ferrés, Kim Cheng Sheang, Matthew Shardlow, Kai North, Marcos Zampieri. Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022). 2022.

10.18653/v1/2022.tsar-1.31 article EN cc-by 2022-01-01

Complex Word Identification (CWI) is an important task in lexical simplification and text accessibility. Due to the lack of CWI datasets, previous works largely depend on Simple English Wikipedia and edit histories for obtaining 'gold standard' annotations, which are of mixed quality and limited to English only. We collect complex words/phrases (CP) for English, German and Spanish, annotated by both native and non-native speakers, and propose language independent features that can be used to train multilingual and crosslingual CWI models. We...

10.26615/978-954-452-049-6_104 article EN 2017-11-10

Automatic detection of the four MBTI personality dimensions from texts has recently attracted noticeable attention from the natural language processing and computational linguistic communities. Despite the large collections of Twitter data for training, the best systems rarely even outperform the majority-class baseline. In this paper, we discuss the theoretical reasons for such low results and present the insights from an annotation study that further shed light on this issue.

10.18653/v1/2021.eacl-main.312 article EN cc-by 2021-01-01

Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bernd Bohnet, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna...

10.18653/v1/2022.emnlp-demos.27 article EN cc-by 2022-01-01

In this paper, we present our participation in the EmoContext shared task on detecting emotions in English textual conversations between a human and a chatbot. We propose four neural systems and combine them to further improve the results. We show that the ensemble can successfully distinguish three emotions (SAD, HAPPY, and ANGRY) and separate them from the rest (OTHERS) in a highly-imbalanced scenario. Our best system achieved a 0.77 F1-score and was ranked fourth out of 165 submissions.

10.18653/v1/s19-2057 article EN cc-by 2019-01-01

Personality profiling has long been used in psychology to predict life outcomes. Recently, automatic detection of personality traits from written messages has gained significant attention in the computational linguistics and natural language processing communities, due to its applicability in various fields. In this survey, we show the trajectory of research towards automatic personality detection, from purely psychology approaches, through psycholinguistics, to the recent purely natural language processing approaches on large datasets automatically extracted from social media. We point out what is lost...

10.18653/v1/2020.coling-main.553 article EN cc-by Proceedings of the 17th international conference on Computational linguistics - 2020-01-01

This study addresses the automatic simplification of texts in Spanish in order to make them more accessible to people with cognitive disabilities. A corpus analysis of original and manually simplified news articles was undertaken in order to identify and quantify relevant operations to be implemented in a text simplification system. The articles were further compared at sentence level by means of automatic feature extraction and various machine learning classification algorithms, using three different groups of features (POS frequencies, syntactic information,...

10.13053/cys-17-2-1530 article EN Computación y Sistemas 2013-06-29

This study explores the possibility of replacing the costly and time-consuming human evaluation of the grammaticality and meaning preservation of the output of text simplification (TS) systems with some automatic measures. The focus is on six widely used machine translation (MT) evaluation metrics and their correlation with human judgements of grammaticality and meaning preservation in text snippets. As the results show a significant correlation between them, we go further and try to classify simplified sentences into: (1) those which are acceptable; (2) those which need minimal post-editing; and (3) those which should be discarded....

10.3115/v1/w14-1201 article EN cc-by 2014-01-01

10.1016/j.eswa.2017.04.005 article EN Expert Systems with Applications 2017-04-06

Sanja Štajner, Marc Franco-Salvador, Simone Paolo Ponzetto, Paolo Rosso, Heiner Stuckenschmidt. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2017.

10.18653/v1/p17-2016 article EN cc-by 2017-01-01

According to the official adult literacy report conducted in 24 highly-developed countries, more than 50% of the adults, on average, can only understand basic vocabulary, short sentences, and basic syntactic constructions. Everyday information found in news articles is thus inaccessible to many people, impeding their social inclusion and informed decision-making. Systems for automatic sentence simplification aim to provide a scalable solution to this problem. In this paper, we propose new state-of-the-art sentence simplification systems for English...

10.1609/aaai.v36i11.21477 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28