Alexey Romanov

ORCID: 0009-0004-0678-4456
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Authorship Attribution and Profiling
  • Semantic Web and Ontologies
  • Hate Speech and Cyberbullying Detection
  • Mechanics and Biomechanics Studies
  • Biomedical Text Mining and Ontologies
  • Multimodal Machine Learning Applications
  • Robotic Mechanisms and Dynamics
  • Engineering Technology and Methodologies
  • Software Engineering Research
  • Humor Studies and Applications
  • Discourse Analysis and Cultural Communication
  • Video Analysis and Summarization
  • Anomaly Detection Techniques and Applications
  • Social Media and Politics
  • Language, Communication, and Linguistic Studies
  • Handwritten Text Recognition Techniques
  • Opinion Dynamics and Social Influence
  • Ethics and Social Impacts of AI
  • Advanced Graph Neural Networks
  • Economic and Technological Systems Analysis
  • Domain Adaptation and Few-Shot Learning
  • Names, Identity, and Discrimination Research
  • Machine Learning in Healthcare

Ural State University of Economics
2024

Bauman Moscow State Technical University
2013-2023

Huawei Technologies (China)
2023

Institute of Machines Science
2021

Microsoft Research (United Kingdom)
2021

University of Southern California
2020

University of Massachusetts Lowell
2015-2019

Saarland University
2018

Georgia Institute of Technology
2018

Universitat Politècnica de València
2018

Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1445 article EN cc-by 2019-01-01

State-of-the-art models using deep neural networks have become very good at learning an accurate mapping from inputs to outputs. However, they still lack generalization capabilities in conditions that differ from the ones encountered during training. This is even more challenging in specialized, knowledge-intensive domains, where training data is limited. To address this gap, we introduce MedNLI - a dataset annotated by doctors, performing a natural language inference task (NLI), grounded in the medical history...

10.18653/v1/d18-1187 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

This paper demonstrates the effectiveness of a Long Short-Term Memory language model in our initial efforts to generate unconstrained rap lyrics. The goal of this model is to generate lyrics that are similar in style to that of a given rapper, but not identical to existing lyrics: this is the task of ghostwriting. Unlike previous work, which defines explicit templates for lyric generation, our model defines its own rhyme scheme, line length, and verse length. Our experiments show that a Long Short-Term Memory language model produces better "ghostwritten" lyrics than a baseline model.

10.18653/v1/d15-1221 article EN cc-by Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing 2015-01-01

We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes on people's lives. We analyze the potential allocation harms that can result from semantic representation bias. To do so, we study the impact on occupation classification of including explicit gender indicators (such as first names and pronouns) in different semantic representations of online biographies. Additionally, we quantify the bias that remains when these indicators are "scrubbed," and describe proxy behavior...

10.1145/3287560.3287572 preprint EN 2019-01-09

This paper describes a new shared task for humor understanding that attempts to eschew the ubiquitous binary approach to humor detection and focus on comparative humor ranking instead. The task is based on a new dataset of funny tweets posted in response to shared hashtags, collected from the 'Hashtag Wars' segment of the TV show @midnight. The results are evaluated in two subtasks that require the participants to generate either the correct pairwise comparisons of tweets (subtask A), or the correct ranking of the tweets (subtask B) in terms of how funny they are. 7 teams participated in subtask A, and 5 teams participated in subtask B. The best accuracy in subtask A was 0.675....

10.18653/v1/s17-2004 article EN cc-by Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) 2017-01-01

In this paper, we propose to use a set of simple, uniform in architecture LSTM-based models to recover different kinds of temporal relations from text. Using the shortest dependency path between entities as input, the same architecture is used to extract intra-sentence, cross-sentence, and document creation time relations. A "double-checking" technique reverses entity pairs in classification, boosting the recall of positive cases and reducing misclassifications between opposite classes. An efficient pruning algorithm resolves conflicts...

10.18653/v1/d17-1092 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01

In order to determine argument structure in text, one must understand how individual components of the overall argument are linked. This work presents the first neural network-based approach to link extraction in argument mining. Specifically, we propose a novel architecture that applies Pointer Network sequence-to-sequence attention modeling to structural prediction in discourse parsing tasks. We then develop a joint model that extends this architecture to simultaneously address the link extraction task and the classification of argument components. The proposed joint model achieves...

10.18653/v1/d17-1143 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01

Alexey Romanov, Anna Rumshisky, Anna Rogers, David Donahue. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1088 article EN 2019-01-01

There is a growing body of work that proposes methods for mitigating bias in machine learning systems. These methods typically rely on access to protected attributes such as race, gender, or age. However, this raises two significant challenges: (1) protected attributes may not be available or it may not be legal to use them, and (2) it is often desirable to simultaneously consider multiple protected attributes, as well as their intersections. In the context of mitigating bias in occupation classification, we propose a method for discouraging correlation between the predicted probability of an...

10.48550/arxiv.1904.05233 preprint EN other-oa arXiv (Cornell University) 2019-01-01

The paper proposes a novel machine learning-based approach to the pathfinding problem on extremely large graphs. This method leverages diffusion distance estimation via a neural network and uses beam search for pathfinding. We demonstrate its efficiency by finding solutions for 4x4x4 and 5x5x5 Rubik's cubes with unprecedentedly short solution lengths, outperforming all available solvers and introducing the first machine learning solver beyond the 3x3x3 case. In particular, it surpasses every single case of the combined best...

10.48550/arxiv.2502.13266 preprint EN arXiv (Cornell University) 2025-02-18

Alexey Romanov, Maria De-Arteaga, Hanna Wallach, Jennifer Chayes, Christian Borgs, Alexandra Chouldechova, Sahin Geyik, Krishnaram Kenthapadi, Anna Rumshisky, Adam Kalai. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1424 article EN 2019-01-01

This paper describes the winning system for SemEval-2017 Task 6: #HashtagWars: Learning a Sense of Humor. Humor detection has up until now been predominantly addressed using feature-based approaches. Our system utilizes recurrent deep learning methods with dense embeddings to predict humorous tweets from the @midnight show #HashtagWars. In order to include both meaning and sound in the analysis, GloVe embeddings are combined with a novel phonetic representation to serve as input to an LSTM component. The output is combined with a character-based...

10.18653/v1/s17-2010 article EN cc-by Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) 2017-01-01

This paper addresses the problem of representation learning. Using an autoencoder framework, we propose and evaluate several loss functions that can be used as an alternative to the commonly used cross-entropy reconstruction loss. The proposed loss functions use similarities between words in the embedding space, and can be used to train any neural model for text generation. We show that the introduced loss functions amplify semantic diversity of reconstructed sentences, while preserving the original meaning of the input. We test the derived autoencoder-generated representations on...

10.18653/v1/d18-1525 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

In this work, we present a new dataset for computational humor, specifically comparative humor ranking, which attempts to eschew the ubiquitous binary approach to humor detection. The dataset consists of tweets that are humorous responses to a given hashtag. We describe the motivation for this new dataset, as well as the collection process, which includes a description of our semi-automated system for data collection. We also present initial experiments for this dataset using both unsupervised and supervised approaches. Our best supervised system achieved 63.7% accuracy, suggesting that this task is much...

10.48550/arxiv.1612.03216 preprint EN other-oa arXiv (Cornell University) 2016-01-01

This article deals with the composite threat-practices that change the interlocutors' dispositions of emotional states within the communicative performative construct of threat and may extend the space of this construct. The construct allows us to study and organize the I-speaker's and I-hearer's possible emotional states generated by threats in combination with additional elements of the menasive space. The number of these elements points to the cognitive complexity of threat-acts. These elements can soften the effect of threats on the emotional state or strengthen their complex pragmatic effect.

10.1016/j.sbspro.2015.10.030 article EN Procedia - Social and Behavioral Sciences 2015-10-01

This paper describes the SimiHawk system submission from UMass Lowell for the core Semantic Textual Similarity task at SemEval-2016. We built four systems: a small feature-based system that leverages word alignment and machine translation quality evaluation metrics, two end-to-end LSTM-based systems, and an ensemble system. The LSTM-based systems used either a simple LSTM architecture or a Tree-LSTM structure. We found that of the three base systems, the feature-based model obtained the best results, outperforming each LSTM-based model's correlation...

10.18653/v1/s16-1115 article EN cc-by Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016) 2016-01-01

Language generation tasks that seek to mimic the human ability to use language creatively are difficult to evaluate, since one must consider creativity, style, and other non-trivial aspects of the generated text. The goal of this paper is to develop evaluation methods for one such task, ghostwriting of rap lyrics, and to provide an explicit, quantifiable foundation for the goals and future directions of this task. Ghostwriting must produce text that is similar in style to the emulated artist, yet distinct in content. We develop a novel evaluation methodology that addresses...

10.18653/v1/w18-1604 article EN cc-by 2018-01-01