- Natural Language Processing Techniques
- Topic Modeling
- Multimodal Machine Learning Applications
- Speech Recognition and Synthesis
- Language and Cultural Evolution
- Semantic Web and Ontologies
- Bayesian Modeling and Causal Inference
- Domain Adaptation and Few-Shot Learning
- Speech and Dialogue Systems
- Biomedical Text Mining and Ontologies
- Computational and Text Analysis Methods
- Text and Document Classification Technologies
- Linguistic Research and Analysis
- Syntax, Semantics, Linguistic Variation
- Multi-Agent Systems and Negotiation
- Machine Learning in Healthcare
- Categorization, Perception, and Language
- Advanced Graph Neural Networks
- Advanced Image and Video Retrieval Techniques
- Human Pose and Action Recognition
- Sentiment Analysis and Opinion Mining
- Linguistics and Language Evolution
- Philosophy and History of Science
- Language, Metaphor, and Cognition
- Advanced Text Analysis Techniques
IT University of Copenhagen (2023)
Tokyo Institute of Technology (2023)
Administration for Community Living (2023)
American Jewish Committee (2023)
University of Cambridge (2015-2023)
Institute of Technology of Cambodia (2023)
PRG S&Tech, South Korea (2019-2022)
University College London (2022)
Bar-Ilan University (2021)
University of Helsinki (2021)
Spatial relations are a basic part of human cognition. However, they are expressed in natural language in a variety of ways, and previous work has suggested that current vision-and-language models (VLMs) struggle to capture relational information. In this paper, we present Visual Spatial Reasoning (VSR), a dataset containing more than 10k text-image pairs with 66 types of spatial relations in English (e.g., under, in front of, facing). While using a seemingly simple annotation format, we show how the dataset includes challenging...
Distributional semantic models have become a mainstay in NLP, providing useful features for downstream tasks. However, assessing long-term progress requires explicit long-term goals. In this paper, I take a broad linguistic perspective, looking at how well current models can deal with various semantic challenges. Given the stark differences between approaches proposed in different subfields, a broad perspective is needed to see how we could integrate them. I conclude that, while linguistic insights can guide the design of model architectures, future progress will require...
Vector space models have become popular in distributional semantics, despite the challenges they face in capturing various semantic phenomena. We propose a novel probabilistic framework which draws on both formal semantics and recent advances in machine learning. In particular, we separate predicates from the entities they refer to, allowing us to perform Bayesian inference based on logical forms. We describe an implementation of this framework using a combination of Restricted Boltzmann Machines and feedforward neural...
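A minimal sketch of the core idea: a predicate is a classifier over entities rather than a vector. The sigmoid classifier, the 2-D entity features, and the "red" weights below are illustrative assumptions; the paper's actual implementation uses Restricted Boltzmann Machines and feedforward networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy entities as feature vectors (assumed 2-D here). In the paper,
# entities are drawn from a Restricted Boltzmann Machine; a fixed
# Gaussian sample stands in for that here.
entities = rng.normal(size=(5, 2))

def semantic_function(weights, bias):
    """A predicate as a binary classifier over entities, not a vector."""
    def truth_probability(x):
        return 1.0 / (1.0 + np.exp(-(x @ weights + bias)))
    return truth_probability

# Hypothetical predicate "red": true for entities with a large first feature.
red = semantic_function(np.array([3.0, 0.0]), bias=-0.5)

for x in entities:
    print(x, f"P(red(x) = true) = {red(x):.2f}")
```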
A broad-coverage corpus such as the Human Language Project envisioned by Abney and Bird (2010) would be a powerful resource for the study of endangered languages. Existing corpora are limited in the range of languages covered, in standardisation, or in machine-readability. In this paper we present SeedLing, a seed corpus for the Human Language Project. We first survey existing efforts to compile cross-linguistic resources, then describe our own approach. To build the foundation text for a Universal Corpus, we crawl and clean texts from several web sources...
Many approaches to sentiment analysis rely on a lexicon that labels words with a prior polarity. This is particularly true for languages other than English, where labelled training data is not easily available. Efforts to produce such lexicons exist, and to avoid duplicated effort, a principled way to combine multiple resources is required. In this paper, we introduce a Bayesian probabilistic model which can simultaneously combine polarity scores from several data sources and estimate the quality of each source. We apply...
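A rough sketch of the combination step under one simplifying assumption: if each lexicon's score is the true polarity plus Gaussian noise with a source-specific variance, the combined estimate is a precision-weighted average. The scores and variances below are invented for illustration; the paper's model also estimates source quality from the data rather than fixing it by hand.

```python
import numpy as np

# Hypothetical polarity scores for one word from three lexicons,
# with per-source noise variances (fixed by hand here).
scores = np.array([0.8, 0.5, 0.9])
variances = np.array([0.1, 0.4, 0.2])

# Under a Gaussian noise model, the posterior mean of the true polarity
# is the precision-weighted average of the observed scores.
precisions = 1.0 / variances
combined = (precisions * scores).sum() / precisions.sum()
print(f"combined polarity: {combined:.3f}")
```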
We propose a method for natural language generation, choosing the most representative output rather than the most likely output. By viewing the generation process from a voting theory perspective, we define representativeness using range voting and a similarity measure. The proposed method can be applied when generating from any probabilistic model, including n-gram models and neural network models. We evaluate different similarity measures on an image captioning task and a machine translation task, and show that our method generates longer and more diverse...
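A minimal sketch of the voting view, assuming a toy unigram-overlap similarity in place of the measures evaluated in the paper: every sampled output scores every other candidate, and the candidate with the highest total score wins.

```python
def unigram_similarity(a, b):
    """Crude unigram-overlap similarity standing in for the paper's measures."""
    a_set, b_set = set(a.split()), set(b.split())
    return len(a_set & b_set) / max(len(a_set | b_set), 1)

def most_representative(candidates):
    """Range voting: each candidate 'votes' for every candidate with a
    score given by the similarity measure; the highest total wins."""
    return max(
        candidates,
        key=lambda c: sum(unigram_similarity(c, other) for other in candidates),
    )

# Hypothetical samples from a captioning model.
samples = [
    "a dog runs on the beach",
    "a dog is running on a beach",
    "a cat sits on a mat",
]
print(most_representative(samples))
```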
Distributional Semantic Models (DSMs) construct vector representations of word meanings based on their contexts. Typically, the contexts of a word are defined as its closest neighbours, but they can also be retrieved from syntactic dependency relations. In this work, we propose a new dependency-based DSM. The novelty of our model lies in associating an independent meaning representation, a matrix, with each dependency label. This allows it to capture the specifics of the relations between words and contexts, leading...
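A small illustrative sketch of the model's central move, with random vectors and matrices assumed in place of trained ones: each dependency label owns a matrix, so the same word yields different context representations under different relations.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# Word vectors for a hypothetical vocabulary.
word_vectors = {w: rng.normal(size=dim) for w in ["dog", "bark", "loud"]}

# One matrix per dependency label, as in the abstract: the label itself
# carries an independent meaning representation.
label_matrices = {lbl: rng.normal(size=(dim, dim)) for lbl in ["nsubj", "amod"]}

def context_representation(label, word):
    """A context is a (label, word) pair; its representation is the
    label's matrix applied to the word's vector."""
    return label_matrices[label] @ word_vectors[word]

# The same word gives different context representations under
# different dependency relations.
print(context_representation("nsubj", "dog"))
print(context_representation("amod", "dog"))
```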
Semantic composition remains an open problem for vector space models of semantics. In this paper, we explain how the probabilistic graphical model used in the framework of Functional Distributional Semantics can be interpreted as a probabilistic version of model theory. Building on this, we show how various semantic phenomena can be recast in terms of conditional probabilities in the model. This connection between formal semantics and machine learning is helpful in both directions: it gives us an explicit mechanism for modelling context-dependent meanings (a...
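A toy numeric sketch of the model-theoretic reading, with an invented entity distribution and sigmoid predicates standing in for a trained model: truth values are probabilities, and a context-dependent meaning can be read off as a conditional probability over entities.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy "model structure": a finite sample of entities with features.
entities = rng.normal(size=(1000, 2))

def prob_true(weights, bias, x):
    """Truth of a predicate for an entity, as a probability."""
    return 1.0 / (1.0 + np.exp(-(x @ weights + bias)))

# Two hypothetical predicates over the same entity space.
p_red   = prob_true(np.array([3.0, 0.0]), -0.5, entities)
p_round = prob_true(np.array([0.0, 3.0]), -0.5, entities)

# A context-dependent meaning as a conditional probability:
# P(round(x) = true | red(x) = true), estimated by Monte Carlo
# over the entity distribution.
joint = (p_red * p_round).mean()
marginal = p_red.mean()
print(f"P(round | red) = {joint / marginal:.3f}")
```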
Functional Distributional Semantics provides a linguistically interpretable framework for distributional semantics, by representing the meaning of a word as a function (a binary classifier), instead of a vector. However, the large number of latent variables means that inference is computationally expensive, and training a model is therefore slow to converge. In this paper, I introduce the Pixie Autoencoder, which augments the generative model with a graph-convolutional neural network to perform amortised variational inference...
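A minimal sketch of amortised inference with one graph-convolution layer, with all weights random and untrained: the encoder maps an observed dependency graph directly to per-node latent representations (which would parameterise the variational distribution), rather than optimising latent variables separately for each sentence.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical weights for a single graph-convolution layer.
w_self = rng.normal(size=(dim, dim))
w_neighbour = rng.normal(size=(dim, dim))

def encode(node_features, adjacency):
    """Amortised inference sketch: one graph-convolution layer maps the
    observed graph straight to per-node representations in a single
    forward pass."""
    messages = adjacency @ node_features @ w_neighbour
    return relu(node_features @ w_self + messages)

# A 3-node dependency graph (e.g. subject - verb - object).
features = rng.normal(size=(3, dim))
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
print(encode(features, adjacency))
```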
Functional Distributional Semantics is a framework that aims to learn, from text, semantic representations which can be interpreted in terms of truth. Here we make two contributions to this framework. The first is to show how a type of logical inference can be performed by evaluating conditional probabilities. The second is to make these calculations tractable by means of a variational approximation. This approximation also enables faster convergence during training, allowing us to close the gap with state-of-the-art vector space models...
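A toy sketch of the first contribution, using plain Monte Carlo over an invented entity distribution where the paper uses a variational approximation: one predicate (softly) entails another to the degree that the conditional probability of the second given the first approaches 1.

```python
import numpy as np

rng = np.random.default_rng(1)

# Entity features sampled from a toy distribution.
entities = rng.normal(size=(100_000, 2))

def prob_true(weights, bias):
    return lambda x: 1.0 / (1.0 + np.exp(-(x @ weights + bias)))

# Hypothetical predicates where "dog" picks out a (soft) subset of "animal".
p_dog    = prob_true(np.array([4.0, 4.0]), -6.0)(entities)
p_animal = prob_true(np.array([2.0, 2.0]), -1.0)(entities)

# Logical inference by conditional probability: "dog(x) entails animal(x)"
# to the degree that P(animal(x) | dog(x)) is close to 1.
p_animal_given_dog = (p_dog * p_animal).mean() / p_dog.mean()
print(f"P(animal | dog) = {p_animal_given_dog:.3f}")
print(f"P(dog | animal) = {(p_dog * p_animal).mean() / p_animal.mean():.3f}")
```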
Across languages, multiple consecutive adjectives modifying a noun (e.g. “the big red dog”) follow certain unmarked ordering rules. While explanatory accounts have been put forward, much of the work done in this area has relied primarily on the intuitive judgments of native speakers, rather than on corpus data. We present the first purely corpus-driven model of multi-lingual adjective ordering, in the form of a latent-variable model that can accurately order adjectives across 24 different languages, even when the training and testing languages are different...
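A minimal sketch of one way such a latent-variable model can order adjectives, assuming each adjective carries a learned scalar position; the values below are set by hand purely for illustration.

```python
# Hypothetical per-adjective scalar positions: lower values sit further
# from the noun. In a latent-variable model these would be learned from
# corpora; here they are fixed for illustration.
position = {"big": 0.2, "red": 0.7, "wooden": 0.9}

def order_adjectives(adjectives):
    """Predict an unmarked order by sorting on the latent scalar."""
    return sorted(adjectives, key=lambda a: position[a])

print(order_adjectives(["red", "big"]))          # ['big', 'red']
print(order_adjectives(["wooden", "big", "red"]))
```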
The performance of natural language generation systems has improved substantially with modern neural networks. At test time they typically employ beam search to avoid locally optimal but globally suboptimal predictions. However, due to model errors, a larger beam size can lead to deteriorating performance according to the evaluation metric. For this reason, it is common to rerank the output of beam search, but this relies on beam search to produce a good set of hypotheses, which limits the potential gains. Other alternatives require changes to the training of the model,...
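A self-contained toy beam search over a hand-written next-token table, to make the beam-size knob concrete. The probabilities are invented; with a real, imperfect model, a wider beam can lower the evaluation metric even while it finds higher-probability strings, which is the failure mode the abstract describes.

```python
import heapq
import math

# Toy next-token distribution (hypothetical): maps a prefix tuple to
# candidate continuations with probabilities. "</s>" ends a hypothesis.
model = {
    (): [("the", 0.6), ("a", 0.4)],
    ("the",): [("dog", 0.5), ("</s>", 0.5)],
    ("a",): [("dog", 0.9), ("cat", 0.1)],
    ("the", "dog"): [("</s>", 1.0)],
    ("a", "dog"): [("</s>", 1.0)],
    ("a", "cat"): [("</s>", 1.0)],
}

def beam_search(beam_size, max_len=3):
    beams = [(0.0, ())]  # (negative log-probability, prefix)
    finished = []
    for _ in range(max_len):
        candidates = []
        for nll, prefix in beams:
            for token, p in model.get(prefix, []):
                cand = (nll - math.log(p), prefix + (token,))
                (finished if token == "</s>" else candidates).append(cand)
        beams = heapq.nsmallest(beam_size, candidates)
    return min(finished) if finished else None

# A wider beam explores more hypotheses and here finds a
# higher-probability string than the greedy beam.
print(beam_search(beam_size=1))
print(beam_search(beam_size=3))
```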
Word embeddings are an essential component in a wide range of natural language processing applications. However, distributional semantic models are known to struggle when only a small number of context sentences are available. Several methods have been proposed to obtain higher-quality vectors for such words, leveraging both this contextual information and sometimes the word forms themselves through a hybrid approach. We show that the current tasks do not suffice to evaluate the use of word-form information, as such models can easily...
Functional Distributional Semantics is a recently proposed framework for learning distributional semantics that provides linguistic interpretability. It models the meaning of a word as a binary classifier rather than a numerical vector. In this work, we propose a method to train a model in this framework with grounded visual data. We train it on the Visual Genome dataset, which is closer to the kind of data encountered in human language acquisition than a large text corpus. On four external evaluation datasets, our model outperforms previous work trained on Visual Genome.
Functional Distributional Semantics provides a computationally tractable framework for learning truth-conditional semantics from a corpus. Previous work in this framework has provided a probabilistic version of first-order logic, recasting quantification as Bayesian inference. In this paper, I show how the previous formulation gives trivial truth values when a precise quantifier is used with vague predicates. I propose an improved account, avoiding this problem by treating a predicate as a distribution over precise predicates, and connect this account to...
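A toy numeric sketch of the proposed fix, with invented height and threshold distributions: treating the vague predicate “tall” as a distribution over precise threshold predicates yields graded rather than trivial truth values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Heights (in metres) of a hypothetical population of entities.
heights = rng.normal(loc=1.75, scale=0.1, size=2_000)

# The vague predicate "tall" as a distribution over precise thresholds:
# each sampled theta defines a crisp predicate height > theta.
thresholds = rng.normal(loc=1.85, scale=0.05, size=1_000)

def p_tall(height):
    """Probability that a sampled precise predicate calls this entity tall."""
    return (height > thresholds).mean()

for h in [1.70, 1.85, 2.00]:
    print(f"P(tall | height={h:.2f}) = {p_tall(h):.2f}")

# Quantified statements then get graded rather than trivial truth values,
# e.g. the proportion of the population counted as tall under the mixture:
print(f"P(tall person) = {np.mean([p_tall(h) for h in heights]):.2f}")
```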