- Topic Modeling
- Natural Language Processing Techniques
- Multimodal Machine Learning Applications
- Text Readability and Simplification
- Speech and dialogue systems
- Shape Memory Alloy Transformations
- Advanced Text Analysis Techniques
- Speech Recognition and Synthesis
- Magnetic Properties of Alloys
- Explainable Artificial Intelligence (XAI)
- Magnetic Properties and Applications
- Domain Adaptation and Few-Shot Learning
- Magnetic and transport properties of perovskites and related materials
- Sentiment Analysis and Opinion Mining
- Biomedical Text Mining and Ontologies
- Heusler alloys: electronic and magnetic properties
- Advanced Graph Neural Networks
- Magnetic properties of thin films
- Adversarial Robustness in Machine Learning
- Machine Learning in Healthcare
- Text and Document Classification Technologies
- Machine Learning and Algorithms
- Software Engineering Research
- Neural Networks and Applications
- Bayesian Modeling and Causal Inference
University of Luxembourg
2016-2024
University of Edinburgh
2017-2023
University of Amsterdam
2014-2023
IT University of Copenhagen
2023
Ludwig-Maximilians-Universität München
2023
Munich Center for Machine Learning
2023
Poltava V.G. Korolenko National Pedagogical University
2022
Innopolis University
2022
Language Science (South Korea)
2019-2021
Heidelberg University
2020
Semantic role labeling (SRL) is the task of identifying the predicate-argument structure of a sentence. It is typically regarded as an important step in the standard NLP pipeline. As semantic representations are closely related to syntactic ones, we exploit syntactic information in our model. We propose a version of graph convolutional networks (GCNs), a recent class of neural networks operating on graphs, suited to model syntactic dependency graphs. GCNs over syntactic dependency trees are used as sentence encoders, producing latent feature representations of words. We observe that GCN layers...
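The core operation described above — each word aggregating the states of its syntactic neighbours — can be sketched in a few lines of numpy. This is a minimal illustration of a generic GCN layer, not the paper's implementation (which, among other things, uses edge-label-specific parameters and gating); all names and the toy graph are illustrative.

```python
import numpy as np

def gcn_layer(H, A, W, b):
    """One graph-convolutional layer: each word aggregates the transformed
    states of its neighbours in the dependency graph, then applies a ReLU.
    H: (n_words, d_in) node states, A: (n_words, n_words) adjacency with
    self-loops, W: (d_in, d_out), b: (d_out,)."""
    return np.maximum(0.0, A @ H @ W + b)

# Toy dependency graph for a 3-word sentence: 0 <- 1 -> 2, plus self-loops.
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)
A = A / A.sum(axis=1, keepdims=True)   # row-normalise neighbour sums

rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))            # initial word features
W = rng.normal(size=(4, 4))
b = np.zeros(4)

H1 = gcn_layer(H, A, W, b)             # word states after one GCN layer
print(H1.shape)                        # (3, 4)
```

Stacking several such layers lets information flow along longer dependency paths, which is what makes the encoder sensitive to predicate-argument structure.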
Multi-head self-attention is a key component of the Transformer, a state-of-the-art architecture for neural machine translation. In this work we evaluate the contribution made by individual attention heads to the overall performance of the model and analyze the roles played by them in the encoder. We find that the most important and confident heads play consistent and often linguistically-interpretable roles. When pruning heads using a method based on stochastic gates and a differentiable relaxation of the L0 penalty, we observe that specialized heads are the last to be pruned....
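The pruning method mentioned above relies on stochastic gates sampled from a hard-concrete distribution, whose expected number of open gates gives a differentiable surrogate for the L0 penalty. Below is a minimal numpy sketch of such a gate, assuming the standard hard-concrete parameterisation (gamma, zeta, temperature beta); the per-head logits and threshold values are illustrative, not the paper's.

```python
import numpy as np

def hard_concrete_gate(log_alpha, beta=0.5, gamma=-0.1, zeta=1.1, rng=None):
    """Sample a stochastic gate in [0, 1] from the hard-concrete distribution,
    a differentiable relaxation of a Bernoulli used for L0-style pruning.
    log_alpha is the learnable per-head logit; gates that collapse to 0
    correspond to pruned heads."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(1e-6, 1 - 1e-6, size=np.shape(log_alpha))
    s = 1 / (1 + np.exp(-(np.log(u) - np.log(1 - u) + log_alpha) / beta))
    return np.clip(s * (zeta - gamma) + gamma, 0.0, 1.0)

def expected_l0(log_alpha, beta=0.5, gamma=-0.1, zeta=1.1):
    """Probability that each gate is non-zero: the differentiable L0 penalty."""
    return 1 / (1 + np.exp(-(log_alpha - beta * np.log(-gamma / zeta))))

# Eight attention heads: strongly negative logits give gates near 0 (pruned).
log_alpha = np.array([4.0, 4.0, -6.0, -6.0, -6.0, -6.0, -6.0, -6.0])
gates = hard_concrete_gate(log_alpha, rng=np.random.default_rng(0))
kept = expected_l0(log_alpha).sum()    # roughly 2 heads kept on average
```

During training, the expected-L0 term is added to the task loss, so heads whose gates can be closed without hurting translation quality are driven toward zero.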
In this paper we present a novel framework for extracting the ratable aspects of objects from online user reviews. Extracting such aspects is an important challenge in automatically mining product opinions from the web and generating opinion-based summaries of user reviews [18, 19, 7, 12, 27, 36, 21]. Our models are based on extensions to standard topic modeling methods such as LDA and PLSA to induce multi-grain topics. We argue that multi-grain models are more appropriate for our task since standard models tend to produce topics that correspond to global properties (e.g., the brand...
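As a point of reference for the extension described above, standard LDA can be run on a handful of review snippets with scikit-learn. This sketches the baseline the multi-grain models build on, not the paper's models; the reviews and topic count are toy data chosen for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# A few toy review snippets (illustrative data, not from the paper).
reviews = [
    "great battery life and bright screen",
    "battery drains fast but screen is sharp",
    "friendly staff and fast room service",
    "room was clean and staff were helpful",
]

X = CountVectorizer(stop_words="english").fit_transform(reviews)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
doc_topics = lda.transform(X)   # per-review topic mixture, rows sum to 1
print(doc_topics.shape)         # (4, 2)
```

Topics inferred this way tend to capture document-global properties (hotel vs. electronics here), which is exactly the behaviour the multi-grain extension is designed to refine into local, ratable aspects.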
We present a simple and effective approach to incorporating syntactic structure into neural attention-based encoder-decoder models for machine translation. We rely on graph-convolutional networks (GCNs), a recent class of neural networks developed for modeling graph-structured data. Our GCNs use predicted dependency trees of source sentences to produce representations of words (i.e. hidden states of the encoder) that are sensitive to their syntactic neighborhoods. GCNs take word representations as input and produce word representations as output, so they can easily be incorporated as layers into standard...
Humans use rich natural language to describe and communicate visual perceptions. In order to provide such descriptions for visual content, this paper combines two important ingredients. First, we generate a semantic representation of the visual content including e.g. object and activity labels. To predict this representation we learn a CRF model over the relationships between the different components of the visual input. And second, we propose to formulate sentence generation as a machine translation problem, using the semantic representation as the source and the generated sentences as the target language. For this we exploit the power of parallel...
Standard machine translation systems process sentences in isolation and hence ignore extra-sentential information, even though extended context can both prevent mistakes in ambiguous cases and improve translation coherence. We introduce a context-aware neural model designed in such a way that the flow of information from the extended context to the translation model can be controlled and analyzed. We experiment with an English-Russian subtitles dataset and observe that much of what is captured by our model deals with improving pronoun translation. We measure correspondences between induced...
Nicola De Cao, Wilker Aziz, Ivan Titov. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.
Diego Marcheggiani, Jasmijn Bastings, Ivan Titov. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). 2018.
Entity linking involves aligning textual mentions of named entities to their corresponding entries in a knowledge base. Entity-linking systems often exploit relations between mentions in a document (e.g., coreference) to decide if the linking decisions are compatible. Unlike previous approaches, which relied on supervised systems or heuristics to predict these relations, we treat relations as latent variables in our neural entity-linking model. We induce the relations without any supervision while optimizing the entity-linking system in an end-to-end fashion. Our multi-relational model...
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations. In this paper, we explore ways to improve them. We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and overcome this bottleneck via language-specific components and deepening NMT architectures. We identify the off-target translation issue (i.e. translating into a wrong target language) as...
The success of neural networks comes hand in hand with a desire for more interpretability. We focus on text classifiers and make them more interpretable by having them provide a justification–a rationale–for their predictions. We approach this problem by jointly training two neural network models: a latent model that selects a rationale (i.e. a short and informative part of the input text), and a classifier that learns from the selected words alone. Previous work proposed to assign binary latent masks to input positions and to promote short selections via sparsity-inducing penalties...
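The two-player setup described above can be sketched with a per-word selector, a sampled binary mask, and a classifier that only sees the masked input, with a sparsity penalty on the fraction of words kept. This is a minimal numpy illustration of the generic rationale-extraction scheme, not the paper's differentiable-binary-variable model; all function names and the penalty weight are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def rationale_forward(E, w_sel, w_clf, lam=0.1, rng=None):
    """Selector scores each word and samples a binary mask (the rationale);
    the classifier pools only the selected word embeddings. The loss would
    add lam * (fraction of words kept) to keep rationales short.
    E: (n_words, d) word embeddings."""
    rng = rng or np.random.default_rng()
    p = sigmoid(E @ w_sel)                             # per-word keep probs
    z = (rng.uniform(size=p.shape) < p).astype(float)  # binary rationale mask
    pooled = (E * z[:, None]).mean(axis=0)             # classifier input
    logit = pooled @ w_clf                             # classifier score
    sparsity = lam * z.mean()                          # sparsity penalty term
    return z, logit, sparsity

rng = np.random.default_rng(0)
E = rng.normal(size=(6, 8))
z, logit, sparsity = rationale_forward(E, rng.normal(size=8),
                                       rng.normal(size=8), rng=rng)
print(z.shape)   # (6,) -- one 0/1 decision per input word
```

The hard sampling step is what makes training difficult; the paper's contribution is replacing it with differentiable binary variables so gradients flow through the selector.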
Though machine translation errors caused by the lack of context beyond one sentence have long been acknowledged, the development of context-aware NMT systems is hampered by several problems. Firstly, standard metrics are not sensitive to improvements in the consistency of document-level translations. Secondly, previous work on context-aware translation assumed that the sentence-aligned parallel data consisted of complete documents, while in most practical scenarios such data constitutes only a fraction of the available parallel data. To address the first issue, we...
To measure how well pretrained representations encode some linguistic property, it is common to use the accuracy of a probe, i.e. a classifier trained to predict the property from the representations. Despite the widespread adoption of probes, differences in their accuracy fail to adequately reflect differences in representations. For example, they do not substantially favour pretrained representations over randomly initialized ones. Analogously, their accuracy can be similar when probing for genuine linguistic labels and for random synthetic tasks. To see reasonable differences with respect to these baselines, previous work...
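The comparison described above — a linear probe trained on "pretrained" versus randomly initialized representations — is easy to simulate. This is a synthetic illustration of the probing methodology, not the paper's experiments; the feature construction (a label-dependent shift standing in for pretrained encodings) is an assumption made for the demo.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d = 400, 16
labels = rng.integers(0, 2, size=n)

# Stand-ins: "pretrained" features weakly encode the label; a randomly
# initialised encoder produces label-independent features.
pretrained = rng.normal(size=(n, d)) + 1.5 * labels[:, None]
random_init = rng.normal(size=(n, d))

def probe_accuracy(X, y):
    """Accuracy of a linear probe trained on half the data, tested on the rest."""
    clf = LogisticRegression(max_iter=1000).fit(X[:200], y[:200])
    return clf.score(X[200:], y[200:])

acc_pre = probe_accuracy(pretrained, labels)
acc_rand = probe_accuracy(random_init, labels)
print(acc_pre > acc_rand)   # the gap to the baseline, not raw accuracy, matters
```

The paper's point is that in practice this accuracy gap is often too small to be informative, which motivates replacing raw probe accuracy with a more discriminative measure.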
Learning to communicate through interaction, rather than relying on explicit supervision, is often considered a prerequisite for developing general AI. We study a setting where two agents engage in playing a referential game and, from scratch, develop the communication protocol necessary to succeed at this game. Unlike previous work, we require that messages they exchange, both at train and test time, are in the form of a language (i.e. sequences of discrete symbols). We compare a reinforcement learning approach and one...
Abstract meaning representations (AMRs) are broad-coverage sentence-level semantic representations. AMRs represent sentences as rooted labeled directed acyclic graphs. AMR parsing is challenging partly due to the lack of annotated alignments between nodes in the graphs and words in the corresponding sentences. We introduce a neural parser which treats alignments as latent variables within a joint probabilistic model of concepts, relations and alignments. As exact inference requires marginalizing over alignments and is infeasible, we use variational...
Elena Voita, Rico Sennrich, Ivan Titov. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
We introduce a simple and accurate neural model for dependency-based semantic role labeling. Our model predicts predicate-argument dependencies relying on the states of a bidirectional LSTM encoder. The labeler achieves competitive performance on English, even without any kind of syntactic information and only using local inference. However, when automatically predicted part-of-speech tags are provided as input, it substantially outperforms all previous local models and approaches the best reported results on English...
Opinion summarization is the task of automatically creating summaries that reflect subjective information expressed in multiple documents, such as product reviews. While the majority of previous work has focused on the extractive setting, i.e., selecting fragments from input reviews to produce a summary, we let the model generate novel sentences and hence produce abstractive summaries. Recent progress has seen the development of supervised models which rely on large quantities of document-summary pairs. Since the training data...
In this paper we conceptualize single-document extractive summarization as a tree induction problem. In contrast to previous approaches (Marcu, 1999; Yoshida et al., 2014) which have relied on linguistically motivated document representations to generate summaries, our model induces a multi-root dependency tree while predicting the output summary. Each root node in the tree is a summary sentence, and the subtrees attached to it are sentences whose content relates to or explains the summary sentence. We design a new iterative refinement...
Knowledge graphs enable a wide variety of applications, including question answering and information retrieval. Despite the great effort invested in their creation and maintenance, even the largest knowledge graphs (e.g., Yago, DBPedia or Wikidata) remain incomplete. We introduce Relational Graph Convolutional Networks (R-GCNs) and apply them to two standard knowledge base completion tasks: link prediction (recovery of missing facts, i.e. subject-predicate-object triples) and entity classification (recovery of missing entity attributes). R-GCNs are...
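The distinguishing feature of the R-GCN layer is a separate transformation per relation type, summed over relations alongside a self-connection. Below is a minimal numpy sketch of that message-passing rule under a simple degree normalisation; the paper's version additionally uses basis or block-diagonal decompositions of the per-relation weights, which are omitted here, and the toy graph is illustrative.

```python
import numpy as np

def rgcn_layer(H, A_r, W_r, W_self):
    """One R-GCN layer: neighbour messages are transformed with a separate
    weight matrix per relation type, summed over relations, plus a
    self-connection, followed by a ReLU.  H: (n, d_in) node states,
    A_r: list of (n, n) adjacency matrices (one per relation),
    W_r: list of (d_in, d_out) matrices, W_self: (d_in, d_out)."""
    out = H @ W_self                           # self-connection
    for A, W in zip(A_r, W_r):
        deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
        out += (A / deg) @ H @ W               # normalised per-relation messages
    return np.maximum(0.0, out)

rng = np.random.default_rng(0)
n, d = 4, 5
H = rng.normal(size=(n, d))                    # initial entity embeddings
A_r = [rng.integers(0, 2, size=(n, n)).astype(float) for _ in range(2)]
W_r = [rng.normal(size=(d, d)) for _ in range(2)]
out = rgcn_layer(H, A_r, W_r, rng.normal(size=(d, d)))
print(out.shape)                               # (4, 5)
```

For link prediction, entity embeddings produced by such layers are typically scored against relation-specific factorisation models; for entity classification, a softmax layer is placed on top of the final node states.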