Trevor Cohn

ORCID: 0000-0003-4363-1673
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Multimodal Machine Learning Applications
  • Speech Recognition and Synthesis
  • Advanced Text Analysis Techniques
  • Text Readability and Simplification
  • Speech and dialogue systems
  • Adversarial Robustness in Machine Learning
  • Biomedical Text Mining and Ontologies
  • Algorithms and Data Compression
  • Hate Speech and Cyberbullying Detection
  • Semantic Web and Ontologies
  • Authorship Attribution and Profiling
  • Sentiment Analysis and Opinion Mining
  • Computational and Text Analysis Methods
  • Domain Adaptation and Few-Shot Learning
  • Ethics and Social Impacts of AI
  • Gaussian Processes and Bayesian Inference
  • Misinformation and Its Impacts
  • Complex Network Analysis Techniques
  • Human Mobility and Location-Based Analysis
  • Machine Learning and Data Classification
  • Geographic Information Systems Studies
  • Text and Document Classification Technologies
  • Software Engineering Research

The University of Melbourne
2015-2024

IT University of Copenhagen
2023

Tokyo Institute of Technology
2023

Australian National University
2023

Iran University of Science and Technology
2022

Information Technology University
2021

Amazon (Germany)
2021

Twitter (United States)
2021

We describe DyNet, a toolkit for implementing neural network models based on dynamic declaration of network structure. In the static declaration strategy used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives. In DyNet's dynamic declaration strategy, computation graph construction is mostly transparent, being implicitly constructed by executing procedural code that computes the network outputs, leaving the user free to use different...

10.48550/arxiv.1701.03980 preprint EN other-oa arXiv (Cornell University) 2017-01-01
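
To make the dynamic-declaration idea concrete, here is a minimal sketch in the spirit of DyNet's Python API: a fresh graph is built for every example simply by running ordinary procedural code. The names used (ParameterCollection, renew_cg, inputVector, pickneglogsoftmax) follow my recollection of the toolkit and may differ between versions; treat this as an illustration, not canonical DyNet code.

```python
# Minimal sketch of dynamic declaration: build a new computation graph per
# example by executing procedural code. API names may vary across DyNet versions.
import dynet as dy

model = dy.ParameterCollection()
p_W = model.add_parameters((8, 4))        # hidden_dim x input_dim
p_b = model.add_parameters(8)
trainer = dy.SimpleSGDTrainer(model)

data = [([1.0, 2.0, 3.0, 4.0], 0), ([0.5, 0.1, 0.2, 0.9], 3)]
for features, label in data:
    dy.renew_cg()                         # a fresh computation graph per example
    W, b = dy.parameter(p_W), dy.parameter(p_b)
    h = dy.tanh(W * dy.inputVector(features) + b)
    loss = dy.pickneglogsoftmax(h, label) # negative log-likelihood of the label
    loss.value()                          # forward pass
    loss.backward()                       # derivatives of the graph just built
    trainer.update()
```

Because the graph is rebuilt by ordinary Python control flow, each example can in principle use a different network structure.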

Many NLP applications can be framed as a graph-to-sequence learning problem. Previous work proposing neural architectures for this setting obtained promising results compared to grammar-based approaches, but still relies on linearisation heuristics and/or standard recurrent networks to achieve the best performance. In this work we propose a new model that encodes the full structural information contained in the graph. Our architecture couples the recently proposed Gated Graph Neural Networks with an input transformation that allows nodes...

10.18653/v1/p18-1026 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01
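
For readers unfamiliar with Gated Graph Neural Networks, the following NumPy sketch shows a single propagation step: messages are aggregated over the adjacency matrix and each node state is updated with a GRU-style gate. The dimensions, random weights, and random graph are placeholders, not the paper's encoder.

```python
# Illustrative sketch of one GGNN propagation step (message passing + GRU-style
# update). Shapes and weights are arbitrary placeholders.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_step(H, A, Wz, Uz, Wr, Ur, Wh, Uh):
    """H: (nodes, d) node states; A: (nodes, nodes) adjacency matrix."""
    M = A @ H                                   # aggregate neighbour messages
    z = sigmoid(M @ Wz + H @ Uz)                # update gate
    r = sigmoid(M @ Wr + H @ Ur)                # reset gate
    H_tilde = np.tanh(M @ Wh + (r * H) @ Uh)    # candidate state
    return (1 - z) * H + z * H_tilde            # gated interpolation

rng = np.random.default_rng(0)
n, d = 5, 16
H = rng.normal(size=(n, d))
A = (rng.random((n, n)) < 0.3).astype(float)    # a small random graph
params = [rng.normal(scale=0.1, size=(d, d)) for _ in range(6)]
H = ggnn_step(H, A, *params)
print(H.shape)  # (5, 16)
```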

Long Duong, Trevor Cohn, Steven Bird, Paul Cook. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2015.

10.3115/v1/p15-2139 article EN 2015-01-01

Active learning aims to select a small subset of data for annotation such that a classifier learned on the data is highly accurate. This is usually done using heuristic selection methods, however the effectiveness of such methods is limited and, moreover, the performance of heuristics varies between datasets. To address these shortcomings, we introduce a novel formulation by reframing active learning as a reinforcement learning problem and explicitly learning a data selection policy, where the policy takes the role of the heuristic. Importantly, our method allows the policy learned by simulation on one language to be...

10.18653/v1/d17-1063 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01
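
A hedged sketch of the sequential framing is below: a policy inspects each unlabelled example and decides whether to request its label, and the change in held-out accuracy serves as the reward. The features, the linear policy, and the commented-out update_policy() step are hypothetical placeholders, not the paper's formulation.

```python
# Toy sketch of active learning as a sequential decision process; the policy,
# its features, and update_policy() are hypothetical placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def accuracy(model, X, y):
    return float((model.predict(X) == y).mean())

def episode(X_pool, y_pool, X_dev, y_dev, policy_w, budget=20):
    seed = [int(np.where(y_pool == c)[0][0]) for c in np.unique(y_pool)]
    X_lab = [X_pool[i] for i in seed]
    y_lab = [y_pool[i] for i in seed]
    clf = LogisticRegression(max_iter=500).fit(X_lab, y_lab)
    prev_acc = accuracy(clf, X_dev, y_dev)
    for i, (x, y) in enumerate(zip(X_pool, y_pool)):
        if budget == 0:
            break
        if i in seed:
            continue
        probs = clf.predict_proba([x])[0]
        entropy = -np.sum(probs * np.log(probs + 1e-12))  # uncertainty feature
        state = np.array([entropy, 1.0])
        if state @ policy_w > 0:                          # policy: annotate or skip
            X_lab.append(x); y_lab.append(y); budget -= 1
            clf = LogisticRegression(max_iter=500).fit(X_lab, y_lab)
            acc = accuracy(clf, X_dev, y_dev)
            reward = acc - prev_acc                       # reward signal
            prev_acc = acc
            # update_policy(policy_w, state, reward)      # hypothetical RL update
    return prev_acc

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
print(round(episode(X[:200], y[:200], X[200:], y[200:], np.array([1.0, -0.5])), 3))
```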

We present iterative back-translation, a method for generating increasingly better synthetic parallel data from monolingual data to train neural machine translation systems. Our proposed method is very simple yet effective and highly applicable in practice. We demonstrate improvements in translation quality in both high and low resourced scenarios, including the best reported BLEU scores for the WMT 2017 German↔English tasks.

10.18653/v1/w18-2703 article EN cc-by 2018-01-01
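
The loop structure is simple enough to sketch. In the snippet below, train_nmt() and translate() are hypothetical toy stand-ins for a real NMT toolkit; only the schedule, re-translating the monolingual data with the latest models and retraining on real plus synthetic pairs each round, reflects the method.

```python
# Sketch of the iterative back-translation loop; train_nmt()/translate() are
# toy stand-ins for a real NMT system.

def train_nmt(pairs):
    """Toy stand-in: a real system would train an NMT model on these pairs."""
    return dict(pairs)

def translate(model, sentence):
    """Toy stand-in: look the sentence up, otherwise echo it back."""
    return model.get(sentence, sentence)

def iterative_back_translation(parallel, mono_src, mono_tgt, rounds=3):
    fwd = train_nmt(parallel)                                   # source -> target
    bwd = train_nmt([(t, s) for s, t in parallel])              # target -> source
    for _ in range(rounds):
        synth_fwd = [(translate(bwd, t), t) for t in mono_tgt]  # synthetic source, real target
        synth_bwd = [(translate(fwd, s), s) for s in mono_src]  # synthetic target, real source
        fwd = train_nmt(parallel + synth_fwd)                   # retrain on real + synthetic
        bwd = train_nmt([(t, s) for s, t in parallel] + synth_bwd)
    return fwd, bwd

fwd, bwd = iterative_back_translation(
    parallel=[("guten tag", "good day")],
    mono_src=["guten morgen"], mono_tgt=["good night"], rounds=2)
print(translate(fwd, "guten tag"))   # 'good day'
```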

In cross-lingual transfer, NLP models trained over one or more source languages are applied to a low-resource target language. While most prior work has used a single source model or a few carefully selected models, here we consider a "massive" setting with many such models. This setting raises the problem of poor transfer, particularly from distant languages. We propose two techniques for modulating the transfer, suitable for zero-shot and few-shot learning, respectively. Evaluating on named entity recognition, we show that our techniques are much more effective than strong...

10.18653/v1/p19-1015 article EN cc-by 2019-01-01
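
The sketch below illustrates the basic idea of combining many source-language models' predictions for a target-language input. The uniform ensemble and the few-shot accuracy weighting are simplifications for illustration, not the paper's specific zero-shot and few-shot techniques.

```python
# Illustrative sketch: weighted combination of label distributions from many
# source-language models. Weighting schemes here are simplifications.
import numpy as np

def combine(predictions, weights=None):
    """predictions: (n_models, n_labels) label distributions for one input."""
    P = np.asarray(predictions)
    if weights is None:                       # zero-shot style: uniform ensemble
        weights = np.ones(len(P)) / len(P)
    weights = np.asarray(weights) / np.sum(weights)
    return weights @ P                        # weighted average distribution

def few_shot_weights(dev_accuracies, temperature=0.1):
    """Upweight source models that do well on a few labelled target examples."""
    a = np.asarray(dev_accuracies)
    w = np.exp(a / temperature)
    return w / w.sum()

preds = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.6, 0.3, 0.1]]
print(combine(preds))                                     # uniform ensemble
print(combine(preds, few_shot_weights([0.4, 0.9, 0.5])))  # accuracy-weighted
```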

Neural encoder-decoder models of machine translation have achieved impressive results, rivalling traditional translation models. However their modelling formulation is overly simplistic, and omits several key inductive biases built into traditional models. In this paper we extend the attentional neural translation model to include structural biases from word-based alignment models, including positional bias, Markov conditioning, fertility and agreement over translation directions. We show improvements over a baseline attentional model and a standard phrase-based model across several language pairs,...

10.18653/v1/n16-1102 article EN Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2016-01-01
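
As an illustration of one such bias, the sketch below adds a positional term to raw attention scores so that target positions prefer roughly diagonal source positions. The Gaussian-like form and its strength are assumptions for illustration, not the paper's parametrisation.

```python
# Illustrative sketch of a positional bias added to attention logits before the
# softmax; the quadratic penalty and its strength are assumptions.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def biased_attention(scores, i, src_len, tgt_len, strength=2.0):
    """scores: (src_len,) raw attention logits for target position i."""
    positions = np.arange(src_len) / max(src_len - 1, 1)
    centre = i / max(tgt_len - 1, 1)                 # expect roughly diagonal alignment
    bias = -strength * (positions - centre) ** 2     # penalise far-off source words
    return softmax(scores + bias)

scores = np.array([0.2, 1.0, 0.3, 0.1, 0.4])
print(biased_attention(scores, i=1, src_len=5, tgt_len=4))
```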

Automatic metrics are fundamental for the development and evaluation of machine translation systems. Judging whether, and to what extent, automatic metrics concur with the gold standard of human evaluation is not a straightforward problem. We show that current methods for judging metrics are highly sensitive to the translations used for assessment, particularly the presence of outliers, which often leads to falsely confident conclusions about a metric's efficacy. Finally, we turn to pairwise system ranking, developing a method for thresholding performance improvement...

10.18653/v1/2020.acl-main.448 article EN cc-by 2020-01-01
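
The sketch below shows how a single outlier system can inflate metric-human correlation: with the outlier the Pearson correlation looks excellent, without it the picture is far weaker. The MAD-based outlier rule is an assumption for illustration, not necessarily the criterion used in the paper.

```python
# Toy demonstration of outlier sensitivity in metric-human correlation.
import numpy as np
from scipy.stats import pearsonr

human  = np.array([0.1, 0.15, 0.2, 0.22, 0.25, -2.0])   # last system is an outlier
metric = np.array([30.0, 29.5, 30.5, 30.2, 29.8, 10.0])

def drop_outliers(h, m, k=2.5):
    med = np.median(h)
    mad = np.median(np.abs(h - med))
    keep = np.abs(h - med) <= k * 1.4826 * mad           # MAD-based rule (assumed)
    return h[keep], m[keep]

print("with outlier:    r = %.2f" % pearsonr(human, metric)[0])
h, m = drop_outliers(human, metric)
print("without outlier: r = %.2f" % pearsonr(h, m)[0])
```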

Social media user geolocation is vital to many applications such as event detection. In this paper, we propose GCN, a multiview geolocation model based on Graph Convolutional Networks, that uses both text and network context. We compare GCN to the state-of-the-art, and to two baselines we propose, and show that our model achieves or is competitive with the state-of-the-art over three benchmark geolocation datasets when sufficient supervision is available. We also evaluate GCN under a minimal supervision scenario, and show that it outperforms the baselines. We find that highway gates are essential...

10.18653/v1/p18-1187 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01
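
To show the mechanism the abstract highlights, here is a sketch of a graph convolution layer combined with a highway gate that mixes the new neighbourhood-aggregated representation with the old one. The mean normalisation and weight shapes are simplified, not the paper's exact layer.

```python
# Sketch of a graph convolution layer with a highway gate controlling how much
# neighbourhood information flows through. Normalisation and shapes simplified.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gcn_highway_layer(H, A, W, W_gate, b_gate):
    """H: (n, d) node features; A: (n, n) adjacency with self-loops."""
    deg = A.sum(axis=1, keepdims=True)
    H_conv = np.tanh((A / deg) @ H @ W)          # mean-aggregated graph convolution
    gate = sigmoid(H @ W_gate + b_gate)          # highway gate per node/dimension
    return gate * H_conv + (1.0 - gate) * H      # mix new and old representations

rng = np.random.default_rng(1)
n, d = 6, 8
A = np.eye(n) + (rng.random((n, n)) < 0.3)       # self-loops plus random edges
H = rng.normal(size=(n, d))
H = gcn_highway_layer(H, A, rng.normal(scale=0.3, size=(d, d)),
                      rng.normal(scale=0.3, size=(d, d)), np.zeros(d))
print(H.shape)  # (6, 8)
```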

In this paper we generalise the sentence compression task. Rather than simply shortening a sentence by deleting words or constituents, as in previous work, we rewrite it using additional operations such as substitution, reordering, and insertion. We present a new corpus that is suited to our task and a discriminative tree-to-tree transduction model that can naturally account for structural and lexical mismatches. The model incorporates a novel grammar extraction method, uses a language model for coherent output, and can be easily tuned to a wide range of...

10.3115/1599081.1599099 article EN 2008-01-01

Long Duong, Antonios Anastasopoulos, David Chiang, Steven Bird, Trevor Cohn. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016.

10.18653/v1/n16-1109 article EN cc-by Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2016-01-01

Written text often provides sufficient clues to identify the author, their gender, age, and other important attributes. Consequently, the authorship of training and evaluation corpora can have unforeseen impacts, including differing model performance for different user groups, as well as privacy implications. In this paper, we propose an approach to explicitly obscure important author characteristics at training time, such that the representations learned are invariant to these attributes. Evaluating on two tasks, we show that this leads to increased privacy in...

10.18653/v1/p18-2005 article EN cc-by 2018-01-01
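
A common way to make representations invariant to an attribute is adversarial training with gradient reversal, sketched below in PyTorch: the adversary learns to predict the protected attribute, while the reversed gradient pushes the encoder to hide it. The architecture sizes are arbitrary and this is a generic sketch, not the paper's exact training setup.

```python
# Generic sketch of adversarial attribute removal via gradient reversal.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None   # flip the gradient sign

encoder = nn.Sequential(nn.Linear(100, 64), nn.ReLU())
task_head = nn.Linear(64, 2)        # main task head (e.g. sentiment)
adv_head = nn.Linear(64, 2)         # adversary predicting the protected attribute

x = torch.randn(32, 100)
y_task = torch.randint(0, 2, (32,))
y_attr = torch.randint(0, 2, (32,))

h = encoder(x)
loss = nn.functional.cross_entropy(task_head(h), y_task) \
     + nn.functional.cross_entropy(adv_head(GradReverse.apply(h, 1.0)), y_attr)
loss.backward()                      # encoder gradients oppose the adversary
```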

Oliver Adams, Adam Makarucha, Graham Neubig, Steven Bird, Trevor Cohn. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. 2017.

10.18653/v1/e17-1088 article EN cc-by 2017-01-01

Michal Lukasik, P. K. Srijith, Duy Vu, Kalina Bontcheva, Arkaitz Zubiaga, Trevor Cohn. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2016.

10.18653/v1/p16-2064 article EN cc-by 2016-01-01

Recent work has shown that simple vector subtraction over word embeddings is surprisingly effective at capturing different lexical relations, despite lacking explicit supervision. Prior work evaluated this intriguing result using a word analogy prediction formulation and hand-selected relations, but the generality of the finding over a broader range of relation types and learning settings has not been evaluated. In this paper, we carry out such an evaluation in two settings: (1) spectral clustering, to induce word relations, and (2) supervised learning, to classify...

10.18653/v1/p16-1158 article EN cc-by Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016-01-01
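
The vector-offset idea is easy to reproduce: represent each word pair by the difference of its embeddings, then cluster (or classify) the offsets. The gensim model name below is an assumption for convenience; any pretrained embeddings would do, and the clustering setup is illustrative rather than the paper's evaluation protocol.

```python
# Sketch of the vector-offset method: pair representations via embedding
# subtraction, then unsupervised clustering of the offsets.
import numpy as np
import gensim.downloader as api
from sklearn.cluster import KMeans

wv = api.load("glove-wiki-gigaword-50")          # assumed available via gensim
pairs = [("man", "woman"), ("king", "queen"),    # gender-type pairs
         ("walk", "walked"), ("take", "took")]   # past-tense pairs

offsets = np.array([wv[b] - wv[a] for a, b in pairs])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(offsets)
print(dict(zip(pairs, labels)))                  # ideally groups the two relation types
```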

This paper presents an approach to geolocating users of online social networks, based solely on their 'friendship' connections. We observe that users interact more regularly with those closer to themselves, and hypothesise that, in many cases, a person's social network is sufficient to reveal their location.

10.1145/2481492.2481494 article EN 2013-05-01
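
The underlying intuition can be sketched in a few lines: estimate a user's coordinates from the known locations of their friends. Using the coordinate-wise median is a simplification chosen for robustness; the paper's actual estimator may differ.

```python
# Toy sketch of network-based geolocation: a user's location is estimated from
# the known coordinates of their friends (coordinate-wise median here).
import numpy as np

def geolocate(user, friends, known_coords):
    """friends: dict user -> list of friends; known_coords: user -> (lat, lon)."""
    pts = [known_coords[f] for f in friends.get(user, []) if f in known_coords]
    if not pts:
        return None                                  # no located friends to draw on
    return tuple(np.median(np.array(pts), axis=0))   # robust central point

friends = {"u1": ["u2", "u3", "u4"]}
known = {"u2": (51.5, -0.12), "u3": (51.6, -0.10), "u4": (48.9, 2.35)}
print(geolocate("u1", friends, known))               # ~ (51.5, -0.10)
```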

Crosslingual word embeddings represent lexical items from different languages in the same vector space, enabling transfer of NLP tools. However, previous attempts had expensive resource requirements, difficulty incorporating monolingual data, or were unable to handle polysemy. We address these drawbacks in our method, which takes advantage of a high coverage dictionary in an EM style training algorithm over monolingual corpora in two languages. Our model achieves state-of-the-art performance on bilingual lexicon...

10.18653/v1/d16-1136 article EN Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing 2016-01-01

Cross-lingual model transfer is a compelling and popular method for predicting annotations in a low-resource language, whereby parallel corpora provide a bridge to a high-resource language and its associated annotated corpora. However, parallel data is not readily available for many languages, limiting the applicability of these approaches. We address these drawbacks in our framework, which takes advantage of cross-lingual word embeddings trained solely on a high coverage dictionary. We propose a novel neural network model with joint training from both...

10.18653/v1/p17-2093 preprint EN cc-by 2017-01-01

The emergence of online social networks (OSNs) and the accompanying availability of large amounts of data pose a number of new natural language processing (NLP) and computational challenges. Data from OSNs is different to data from traditional sources (e.g. newswire): the texts are short, noisy and conversational. Another important issue is that data occurs in a real-time stream, needing immediate analysis that is grounded in time and context. In this paper we describe an open-source framework for efficient text processing of streaming OSN data (available at...

10.1609/icwsm.v6i3.14348 article EN Proceedings of the International AAAI Conference on Web and Social Media 2021-08-03

In this paper we present a novel approach for inducing word alignments from sentence-aligned data. We use a Conditional Random Field (CRF), a discriminative model, which is estimated on a small supervised training set. The CRF is conditioned on both the source and target texts, and thus allows for the use of arbitrary and overlapping features over these data. Moreover, it has efficient training and decoding processes which both find globally optimal solutions. We apply this alignment model to the French-English and Romanian-English language pairs. We show how a large number...

10.3115/1220175.1220184 article EN 2006-01-01
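
As a much-simplified illustration of feature-based alignment scoring, the sketch below combines overlapping features for each (source, target) word pair with a weight vector and aligns each source word greedily. The real model is a CRF with sequential dependencies and exact decoding; the features and weights here are illustrative only.

```python
# Greatly simplified sketch of feature-based word alignment scoring (not the
# full CRF): overlapping features, a linear score, and greedy per-word linking.
import numpy as np

def features(src_word, tgt_word, i, j, src_len, tgt_len):
    prefix_match = float(src_word[:3].lower() == tgt_word[:3].lower())
    rel_distance = abs(i / src_len - j / tgt_len)      # positions roughly monotone
    return np.array([prefix_match, -rel_distance, 1.0])

def greedy_align(src, tgt, w):
    links = []
    for i, s in enumerate(src):
        scores = [w @ features(s, t, i, j, len(src), len(tgt))
                  for j, t in enumerate(tgt)]
        links.append((i, int(np.argmax(scores))))
    return links

w = np.array([2.0, 1.0, 0.0])                          # illustrative weights
print(greedy_align(["la", "nation", "européenne"],
                   ["the", "european", "nation"], w))  # [(0, 0), (1, 2), (2, 1)]
```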