Elena Volodina

ORCID: 0000-0003-1935-1321
Research Areas
  • Natural Language Processing Techniques
  • Text Readability and Simplification
  • Topic Modeling
  • Speech and dialogue systems
  • Second Language Acquisition and Learning
  • Semantic Web and Ontologies
  • Lexicography and Language Studies
  • Linguistic Studies and Language Acquisition
  • Educational Methods and Teacher Development
  • Foreign Language Teaching Methods
  • linguistics and terminology studies
  • Material Properties and Applications
  • Mobile Crowdsensing and Crowdsourcing
  • Educational Innovations and Challenges
  • Industrial Engineering and Technologies
  • Discourse Analysis and Cultural Communication
  • Diabetes Management and Research
  • Intelligent Tutoring Systems and Adaptive Learning
  • Psychology of Development and Education
  • Educational Technology and Assessment
  • Language, Communication, and Linguistic Studies
  • Advanced Text Analysis Techniques
  • Expert finding and Q&A systems
  • Speech Recognition and Synthesis
  • Interpreting and Communication in Healthcare

University of Gothenburg
2016-2025

University of California, Los Angeles
2024

Göteborgs Stads
2016-2022

Swedish National Bank
2012-2019

University of Tyumen
2015-2016

Institute for Infocomm Research
2016

Johns Hopkins University
2016

We present the KELLY project and its work on developing monolingual and bilingual word lists for language learning, using corpus methods, for nine languages and thirty-six language pairs. We describe the method and discuss the many challenges encountered. We have loaded the data into an online database to make it accessible for anyone to explore, and we present our own first explorations of it. The focus of the paper is thus twofold, covering pedagogical and methodological aspects of the lists' construction, and a linguistic by-product of the project, the database.

10.1007/s10579-013-9251-2 article EN cc-by Language Resources and Evaluation 2013-09-13

We present approaches for the identification of sentences understandable by second language learners of Swedish, which can be used in automatically generated exercises based on corpora. In this work we merged methods and knowledge from machine learning-based readability research, from rule-based studies of Good Dictionary Examples, and from second language learning syllabuses. The proposed selection methods have also been implemented as a module in a free web-based language learning platform. Users can use different parameters and linguistic filters to personalize...

10.3115/v1/w14-1821 article EN 2014-01-01

This paper introduces MultiGEC, a dataset for multilingual Grammatical Error Correction (GEC) in twelve European languages: Czech, English, Estonian, German, Greek, Icelandic, Italian, Latvian, Russian, Slovene, Swedish and Ukrainian. MultiGEC distinguishes itself from previous GEC datasets in that it covers several underrepresented languages, which we argue should be included in resources used to train models for Natural Language Processing tasks which, as GEC itself, have implications for Learner...

10.1075/ijlcr.24033.mas article EN International Journal of Learner Corpus Research 2025-04-01

The Automated Evaluation of Scientific Writing, or AESW, is the task of identifying sentences in need of correction to ensure their appropriateness in scientific prose. The data set comes from a professional editing company, VTeX, with two aligned versions of the same text, before and after editing, and covers a variety of textual infelicities that proofreaders have edited. While previous shared tasks focused solely on grammatical errors (Dale and Kilgarriff, 2011; Dale et al., 2012; Ng et al., 2013; Ng et al., 2014), this time edits cover...

10.18653/v1/w16-0506 article EN cc-by 2016-01-01

This paper reports on the NLP4CALL shared task on Multilingual Grammatical Error Detection (MultiGED-2023), which included five languages: Czech, English, German, Italian and Swedish. It is the first shared task organized by the Computational SLA working group, whose aim is to promote less represented languages in the fields of Grammatical Error Detection and Correction, and other related fields. The MultiGED datasets have been produced based on second language (L2) learner corpora for each particular language. In this paper we introduce the task as a whole, elaborate...

10.3384/ecp197001 article EN cc-by Linköping electronic conference proceedings 2023-05-16

The article presents the results of a survey on dictionary use in Europe, focusing on general monolingual dictionaries. The survey is the broadest to date, covering close to 10,000 dictionary users (and non-users) in nearly thirty countries. Our survey covers varied user groups, going beyond the students and translators who have tended to dominate such studies thus far. The survey was delivered via an online platform, in language versions specific to each target country. It was completed by 9,562 respondents, over 300 respondents per country on average. The survey consisted...

10.1093/ijl/ecy022 article EN International Journal of Lexicography 2018-10-25

Corpora and web texts can become a rich language learning resource if we have a means of assessing whether they are linguistically appropriate for learners at a given proficiency level. In this paper, we aim at addressing this issue by presenting the first approach for predicting linguistic complexity for Swedish second language learning material on a 5-point scale. After showing that the traditional Swedish readability measure, Läsbarhetsindex (LIX), is not suitable for this task, we propose a supervised machine learning model, based on a range of linguistic features, that can reliably classify...

10.48550/arxiv.1603.08868 preprint EN other-oa arXiv (Cornell University) 2016-01-01
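
For readers unfamiliar with LIX, the readability measure named in the abstract above, the following is a minimal sketch of the standard formula: average sentence length plus the percentage of words longer than six letters. The tokenization rules and the function name `lix` are simplifying assumptions made for illustration and are not taken from the paper.

```python
import re


def lix(text: str) -> float:
    """Compute the LIX readability score of a text.

    LIX = (words / sentences) + 100 * (long words / words),
    where a long word has more than six letters.
    Tokenization here is a simplification (an assumption of this sketch).
    """
    # Words: runs of Unicode letters (digits and underscores excluded).
    words = re.findall(r"[^\W\d_]+", text)
    # Sentences: non-empty chunks between sentence-ending punctuation.
    sentences = [s for s in re.split(r"[.!?:]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    long_words = [w for w in words if len(w) > 6]
    return len(words) / len(sentences) + 100 * len(long_words) / len(words)


if __name__ == "__main__":
    sample = "Jag bor i Göteborg. Universitetet ligger nära."
    print(round(lix(sample), 2))
```

Because LIX depends only on sentence and word length, it takes no account of vocabulary, grammar, or learner proficiency, which may help explain why a surface measure of this kind can fall short for second language material.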

In this paper we present work-in-progress where we investigate the usefulness of previously created word lists to the task of single-word lexical complexity analysis and prediction of the complexity level for learners of Swedish as a second language. The word lists used map each word to a single CEFR level, and the task consists of predicting CEFR levels for unseen words. In contrast to previous work on word-level lexical complexity, we experiment with topics as additional features and show that linking words to topics significantly increases accuracy of classification.

10.18653/v1/w18-0508 article EN cc-by 2018-01-01

We present a framework and its implementation relying on Natural Language Processing methods, which aims at the identification of exercise item candidates from corpora. The hybrid system combining heuristics and machine learning methods includes a number of relevant selection criteria. We focus on two fundamental aspects: linguistic complexity and the dependence of the extracted sentences on their original context. Previous work on exercise generation addressed these two criteria only to a limited extent, and a refined overall candidate sentence...

10.48550/arxiv.1706.03530 preprint EN other-oa arXiv (Cornell University) 2017-01-01

In our study we investigated second and foreign language (L2) sentence readability, an area little explored so far in the case of several languages, including Swedish. The outcome of our research consists of two methods for sentence selection from native language corpora based on Natural Language Processing (NLP) and machine learning (ML) techniques. The two approaches have been made available online within Lärka, an Intelligent CALL (ICALL) platform offering activities for language learners and students of linguistics. Such automatic selection of suitable sentences...

10.14705/rpnet.2013.000164 article EN 2013-11-30

We present a new resource for Swedish, SweLL, a corpus of Swedish Learner essays linked to learners' performance according to the Common European Framework of Reference (CEFR). SweLL consists of three subcorpora: SpIn, SW1203 and Tisus, collected from three different educational establishments. The common metadata for all subcorpora includes age, gender, native languages, time of residence in Sweden, and type of written task. Depending on the subcorpus, learner texts may contain additional information, such as text genres, topics,...

10.48550/arxiv.1604.06583 preprint EN other-oa arXiv (Cornell University) 2016-01-01

This article gives a short introduction to the Swedish Second Language Profile, a tool that visualizes learner language in learner corpora from different angles, such as vocabulary, grammar and morphology. The tool is aimed at research on Second Language Acquisition, development of NLP models, teaching of Swedish as a second language, automatic approaches for language learning, and a number of other fields.

10.3384/ecp205002 article EN cc-by Linköping electronic conference proceedings 2024-01-04