Shay B. Cohen

ORCID: 0000-0003-4753-8353
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Advanced Text Analysis Techniques
  • Text Readability and Simplification
  • Speech Recognition and Synthesis
  • Speech and Dialogue Systems
  • Machine Learning and Algorithms
  • Algorithms and Data Compression
  • Semantic Web and Ontologies
  • Neural Networks and Applications
  • Adversarial Robustness in Machine Learning
  • Multimodal Machine Learning Applications
  • Explainable Artificial Intelligence (XAI)
  • Biomedical Text Mining and Ontologies
  • Semigroups and Automata Theory
  • Text and Document Classification Technologies
  • Advanced Graph Neural Networks
  • Complex Network Analysis Techniques
  • Software Engineering Research
  • Machine Learning in Bioinformatics
  • Multi-Agent Systems and Negotiation
  • Aerospace Engineering and Control Systems
  • Domain Adaptation and Few-Shot Learning
  • Language and Cultural Evolution
  • DNA and Biological Computing

University of Edinburgh
2016-2025

Edinburgh College
2020-2023

IT University of Copenhagen
2023

Tokyo Institute of Technology
2023

Administration for Community Living
2023

American Jewish Committee
2023

Tel Aviv University
2005-2021

Israel Institute for Biological Research
2018-2021

Language Science (South Korea)
2008-2021

Bar-Ilan University
2021

We introduce “extreme summarization”, a new single-document summarization task which does not favor extractive strategies and calls for an abstractive modeling approach. The idea is to create a short, one-sentence news summary answering the question “What is the article about?”. We collect a real-world, large-scale dataset for this task by harvesting online articles from the British Broadcasting Corporation (BBC). We propose a novel abstractive model which is conditioned on the article’s topics and based entirely on convolutional neural networks....

10.18653/v1/d18-1206 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

Shashi Narayan, Shay B. Cohen, Mirella Lapata. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.

10.18653/v1/n18-1158 article EN cc-by 2018-01-01

Stock movement prediction is a challenging problem: the market is highly stochastic, and we make temporally-dependent predictions from chaotic data. We treat these three complexities and present a novel deep generative model jointly exploiting text and price signals for this task. Unlike the case with discriminative or topic modeling, our model introduces recurrent, continuous latent variables for a better treatment of stochasticity, and uses neural variational inference to address the intractable posterior inference. We also...

10.18653/v1/p18-1183 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01

Abstract Meaning Representation (AMR) is a semantic representation for natural language that embeds annotations related to traditional tasks such as named entity recognition, semantic role labeling, word sense disambiguation and co-reference resolution. We describe a transition-based parser for AMR that parses sentences left-to-right, in linear time. We further propose a test-suite that assesses specific subtasks that are helpful in comparing AMR parsers, and show that our parser is competitive with the state of the art on the LDC2015E86 dataset and that it outperforms...

10.18653/v1/e17-1051 article EN cc-by 2017-01-01

This article deals with adversarial attacks towards deep learning systems for Natural Language Processing (NLP), in the context of privacy protection. We study a specific type of attack: an attacker eavesdrops on the hidden representations of a neural text classifier and tries to recover information about the input text. Such a scenario may arise in situations when the computation of a neural network is shared across multiple devices, e.g. some hidden representation is computed by a user's device and sent to a cloud-based model. We measure the ability...

10.18653/v1/d18-1001 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

Marco Damonte, Shay B. Cohen. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1366 article EN 2019-01-01

It is well-known that abstractive summaries are subject to hallucination (including material that is not supported by the original text). While summaries can be made hallucination-free by limiting them to general phrases, such summaries would fail to be very informative. Alternatively, one can try to avoid hallucinations by verifying that any specific entities in the summary appear in the original text in a similar context. This is the approach taken by our system, Herman. The system learns to recognize and verify quantity entities (dates, numbers, sums of money, etc.) in a beam-worth of abstractive summaries produced...

10.18653/v1/2020.findings-emnlp.203 article EN 2020-01-01
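
The verification idea in the abstract above can be illustrated with a deliberately naive sketch. It is not the Herman system, which learns to recognize and verify quantity entities; here a regular expression stands in for the recognizer, exact string matching stands in for the verifier, and the names `extract_quantities`, `unsupported_quantities`, and `rerank` are hypothetical.

```python
import re

# Naive pattern for numeric quantities (integers, comma-grouped numbers, decimals).
# Assumption for illustration only: the abstract describes a learned recognizer.
QUANTITY_RE = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def extract_quantities(text):
    """Return the set of numeric strings found in text, with commas removed."""
    return {m.group(0).replace(",", "") for m in QUANTITY_RE.finditer(text)}


def unsupported_quantities(source, summary):
    """Quantities that appear in the summary but nowhere in the source text."""
    return extract_quantities(summary) - extract_quantities(source)


def rerank(source, candidates):
    """Order candidate summaries so those with fewer unsupported quantities come first.

    sorted() is stable, so candidates with equal counts keep their original order.
    """
    return sorted(candidates, key=lambda s: len(unsupported_quantities(source, s)))


# Hypothetical example data.
source = "The company earned 1,200 dollars in 2019."
candidates = [
    "The company earned 5,000 dollars in 2019.",
    "The company earned 1,200 dollars in 2019.",
]
best = rerank(source, candidates)[0]
```

Exact matching ignores context and paraphrase ("1.2 thousand" versus "1,200"), which is one reason a learned verifier would be preferable in practice.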

We present and study the contribution-selection algorithm (CSA), a novel algorithm for feature selection. The algorithm is based on the multiperturbation Shapley analysis (MSA), a framework that relies on game theory to estimate usefulness. The algorithm iteratively estimates the usefulness of features and selects them accordingly, using either forward selection or backward elimination. It can optimize various performance measures over unseen data such as accuracy, balanced error rate, and area under receiver-operator-characteristic curve....

10.1162/neco.2007.19.7.1939 article EN Neural Computation 2007-05-24
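
The loop described in the abstract above (estimate feature usefulness, select, repeat) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the published CSA: contributions are estimated by averaging marginal gains over random orderings of all remaining features, the stopping rule is a plain positivity check, and `value`, `estimate_contributions`, `forward_select`, and the toy weights are all hypothetical.

```python
import random


def estimate_contributions(candidates, selected, value, n_perms=200, seed=0):
    """Monte Carlo estimate of each candidate's average marginal contribution.

    For each random ordering, candidates are added one by one on top of the
    already selected features, and the change in value() is credited to the
    feature that was just added.
    """
    rng = random.Random(seed)
    totals = {f: 0.0 for f in candidates}
    for _ in range(n_perms):
        order = list(candidates)
        rng.shuffle(order)
        coalition = set(selected)
        prev = value(frozenset(coalition))
        for f in order:
            coalition.add(f)
            cur = value(frozenset(coalition))
            totals[f] += cur - prev
            prev = cur
    return {f: t / n_perms for f, t in totals.items()}


def forward_select(features, value, k, **kwargs):
    """Greedy forward selection: repeatedly add the highest-contribution feature."""
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        contrib = estimate_contributions(remaining, selected, value, **kwargs)
        best = max(remaining, key=lambda f: contrib[f])
        if contrib[best] <= 0:  # nothing left that helps
            break
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy payoff (an assumption): each feature adds a fixed amount to the score.
# In practice value() would be a performance measure on held-out data.
WEIGHTS = {"a": 0.5, "b": 0.3, "c": 0.0, "d": 0.1}


def toy_value(subset):
    return sum(WEIGHTS[f] for f in subset)


chosen = forward_select(["a", "b", "c", "d"], toy_value, k=2)
```

Backward elimination would follow the same pattern in reverse, repeatedly dropping the feature with the lowest estimated contribution.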

We present a family of priors over probabilistic grammar weights, called the shared logistic normal distribution. This family extends the partitioned logistic normal distribution, enabling factored covariance between the probabilities of different derivation events in the probabilistic grammar, providing a new way to encode prior knowledge about an unknown grammar. We describe a variational EM algorithm for learning a probabilistic grammar based on this family of priors. We then experiment with unsupervised dependency grammar induction and show significant improvements using our model for both...

10.3115/1620754.1620766 article EN 2009-01-01
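
For readers unfamiliar with the building block named in the abstract above, the following sketch draws one probability vector from a plain logistic normal distribution: sample a Gaussian vector with a chosen covariance, then apply softmax. It does not implement the shared or partitioned variants from the paper, and the function names and example covariance are assumptions made for illustration.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L @ L^T == cov (cov must be positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def sample_logistic_normal(mean, cov, rng):
    """Draw eta ~ Normal(mean, cov), then map it to the simplex with softmax."""
    n = len(mean)
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    eta = [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    m = max(eta)  # subtract the max for numerical stability
    exps = [math.exp(e - m) for e in eta]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: the first two events are positively correlated a priori.
mean = [0.0, 0.0, 0.0]
cov = [[1.0, 0.8, 0.0],
       [0.8, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
theta = sample_logistic_normal(mean, cov, random.Random(0))
```

The off-diagonal covariance entries are what let such a prior tie together the probabilities of different events, which a Dirichlet prior cannot express.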

We propose a new sentence simplification task (Split-and-Rephrase) where the aim is to split a complex sentence into a meaning preserving sequence of shorter sentences. Like sentence simplification, splitting-and-rephrasing has the potential of benefiting both natural language processing and societal applications. Because shorter sentences are generally better processed by NLP systems, it could be used as a preprocessing step which facilitates and improves the performance of parsers, semantic role labellers and machine translation systems. It...

10.18653/v1/d17-1064 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01

Marco Damonte, Shay B. Cohen. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.

10.18653/v1/n18-1104 article EN cc-by 2018-01-01

Most extractive summarization methods focus on the main body of the document from which sentences need to be extracted. However, the gist of the document may lie in side information, such as the title and image captions which are often available for newswire articles. We propose to explore side information in the context of single-document extractive summarization. We develop a framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor with attention over side information. We evaluate our model on a large scale news dataset. We show that extractive summarization with side information consistently outperforms...

10.48550/arxiv.1704.04530 preprint EN other-oa arXiv (Cornell University) 2017-01-01

The goal of medical relation extraction is to detect relations among entities, such as genes, mutations and drugs in medical texts. Dependency tree structures have been proven useful for this task. Existing approaches to such relation extraction leverage off-the-shelf dependency parsers to obtain a syntactic tree or forest for the text. However, for the medical domain, low parsing accuracy may lead to error propagation downstream the relation extraction pipeline. In this work, we propose a novel model which treats the dependency structure as a latent variable and induces it from the unstructured text in an end-to-end...

10.24963/ijcai.2020/505 article EN 2020-07-01

Afonso Mendes, Shashi Narayan, Sebastião Miranda, Zita Marinho, André F. T. Martins, Shay B. Cohen. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1397 article EN 2019-01-01

We introduce an open-domain neural semantic parser which generates formal meaning representations in the style of Discourse Representation Theory (DRT; Kamp and Reyle 1993). We propose a method which transforms Discourse Representation Structures (DRSs) to trees and develop a structure-aware model which decomposes the decoding process into three stages: basic DRS structure prediction, condition prediction (i.e., predicates and relations), and referent prediction (i.e., variables). Experimental results on the Groningen Meaning Bank (GMB) show that our model outperforms...

10.18653/v1/p18-1040 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01