Mike Lewis

ORCID: 0000-0003-0679-6612
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Multimodal Machine Learning Applications
  • Speech and Dialogue Systems
  • Speech Recognition and Synthesis
  • Text Readability and Simplification
  • Multi-Agent Systems and Negotiation
  • Modular Robots and Swarm Intelligence
  • Distributed Control Multi-Agent Systems
  • Semantic Web and Ontologies
  • Domain Adaptation and Few-Shot Learning
  • Robotic Path Planning Algorithms
  • AI-based Problem Solving and Planning
  • Robotics and Sensor-Based Localization
  • Artificial Intelligence in Games
  • Robotics and Automated Systems
  • Reinforcement Learning in Robotics
  • Advanced Text Analysis Techniques
  • Robot Manipulation and Learning
  • Evacuation and Crowd Dynamics
  • Human Pose and Action Recognition
  • Machine Learning and Algorithms
  • Educational Assessment and Pedagogy
  • Biomedical Text Mining and Ontologies
  • Artificial Intelligence in Law

IT University of Copenhagen
2023

Tokyo Institute of Technology
2023

Meta (Israel)
2017-2022

University of Southern California
2022

University of Washington
2015-2022

University of California, Irvine
2022

Allen Institute for Artificial Intelligence
2022

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes, and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and training data size. We find that BERT was significantly undertrained, and can match or exceed the performance of every model published...

10.48550/arxiv.1907.11692 preprint EN other-oa arXiv (Cornell University) 2019-01-01
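
The excerpt above does not spell out the training changes, but one well-documented difference in this replication study is dynamic masking: re-sampling the masked positions on every pass over the data instead of fixing them once in preprocessing. A minimal sketch of that idea follows; the toy vocabulary and function names are illustrative, not the paper's code.

```python
import random

MASK = "[MASK]"
VOCAB = ["the", "cat", "sat", "on", "mat"]  # toy vocabulary for random replacement

def dynamic_mask(tokens, mask_prob=0.15, seed=None):
    """Re-sample masked positions on every epoch (dynamic masking),
    instead of fixing them once during preprocessing (static masking)."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)                    # predict the original token
            r = rng.random()
            if r < 0.8:
                inputs.append(MASK)               # 80%: replace with [MASK]
            elif r < 0.9:
                inputs.append(rng.choice(VOCAB))  # 10%: random token
            else:
                inputs.append(tok)                # 10%: keep unchanged
        else:
            inputs.append(tok)
            labels.append(None)                   # not a prediction target
    return inputs, labels

# Each epoch sees a different masking of the same sentence.
for epoch in range(2):
    print(dynamic_mask("the cat sat on the mat".split(), seed=epoch))
```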

Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, Luke Zettlemoyer. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.

10.18653/v1/2020.acl-main.703 article EN cc-by 2020-01-01

Large pre-trained language models have been shown to store factual knowledge in their parameters, and achieve state-of-the-art results when fine-tuned on downstream NLP tasks. However, their ability to access and precisely manipulate knowledge is still limited, and hence on knowledge-intensive tasks, their performance lags behind task-specific architectures. Additionally, providing provenance for their decisions and updating their world knowledge remain open research problems. Pre-trained models with a differentiable access mechanism to explicit non-parametric memory...

10.48550/arxiv.2005.11401 preprint EN other-oa arXiv (Cornell University) 2020-01-01
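
As a rough illustration of the "explicit non-parametric memory" the abstract alludes to, the sketch below wires a toy dense retriever into a stand-in generator. `embed`, the corpus, and the prompt format are hypothetical placeholders under the retrieve-then-generate pattern, not the paper's architecture.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in dense encoder; a real system would use a trained bi-encoder."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.standard_normal(64)

def retrieve(query: str, corpus: list, k: int = 2) -> list:
    """Non-parametric memory: score passages by inner product, take top-k."""
    q = embed(query)
    scores = [q @ embed(p) for p in corpus]
    top = np.argsort(scores)[::-1][:k]
    return [corpus[i] for i in top]

def rag_answer(query: str, corpus: list) -> str:
    """Condition a (here, trivial) generator on the retrieved evidence."""
    evidence = retrieve(query, corpus)
    prompt = " ".join(evidence) + " [SEP] " + query
    return f"generated answer conditioned on: {prompt!r}"

corpus = ["Paris is the capital of France.",
          "BART is a denoising autoencoder.",
          "The Seine flows through Paris."]
print(rag_answer("What is the capital of France?", corpus))
```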

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text. We gain further improvements with a novel form of model fusion that improves the relevance of the story to the prompt, and adding a new gated multi-scale self-attention mechanism to model long-range context. Experiments show...

10.18653/v1/p18-1082 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01
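
The hierarchical generation described above is a two-stage pipeline: generate a premise, then condition the story on it. A schematic sketch, where `lm_generate` is a hypothetical stand-in for a trained conditional language model:

```python
def lm_generate(prompt: str, max_words: int) -> str:
    """Placeholder for a conditional language model (e.g. a seq2seq decoder)."""
    return f"<{max_words}-word continuation of {prompt!r}>"

def hierarchical_story(writing_prompt: str) -> str:
    # Stage 1: generate a short premise conditioned on the prompt.
    premise = lm_generate(f"Premise for: {writing_prompt}", max_words=30)
    # Stage 2: expand the premise into the full passage, so global
    # structure is decided before surface realization.
    return lm_generate(f"Story for premise: {premise}", max_words=500)

print(hierarchical_story("A lighthouse keeper finds a message in a bottle."))
```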

We introduce the first end-to-end coreference resolution model, and show that it significantly outperforms all previous work without using a syntactic parser or hand-engineered mention detector. The key idea is to directly consider all spans in a document as potential mentions and learn distributions over possible antecedents for each. The model computes span embeddings that combine context-dependent boundary representations with a head-finding attention mechanism. It is trained to maximize the marginal likelihood of gold...

10.18653/v1/d17-1018 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01
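
A simplified sketch of the span representation and antecedent scoring the abstract describes: boundary vectors plus an attention-weighted head, with a dot-product scorer standing in for the paper's learned feed-forward scorer (the embeddings here are random toys, not trained weights).

```python
import numpy as np

def span_embedding(token_vecs: np.ndarray, start: int, end: int) -> np.ndarray:
    """Boundary representations plus a soft head: a simplified version of
    the span representation described in the abstract."""
    attn = np.exp(token_vecs[start:end + 1].sum(axis=1))
    attn /= attn.sum()                       # head-finding attention weights
    head = attn @ token_vecs[start:end + 1]  # attention-weighted head vector
    return np.concatenate([token_vecs[start], token_vecs[end], head])

def antecedent_scores(spans: list, i: int) -> np.ndarray:
    """Score all earlier spans as antecedents of span i with a dot product
    (the paper uses a learned feed-forward scorer instead)."""
    return np.array([spans[i] @ spans[j] for j in range(i)])

rng = np.random.default_rng(0)
tokens = rng.standard_normal((6, 4))                 # toy contextual embeddings
spans = [span_embedding(tokens, s, e) for s, e in [(0, 1), (2, 2), (4, 5)]]
print(antecedent_scores(spans, 2))                   # scores for spans 0 and 1
```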

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks. We present mBART, a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective (Lewis et al., 2019). mBART is the first method for pre-training a complete sequence-to-sequence model by denoising full texts in multiple languages, whereas previous approaches have focused only on the encoder, decoder, or reconstructing parts of the text...

10.1162/tacl_a_00343 article EN cc-by Transactions of the Association for Computational Linguistics 2020-11-25
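
"Denoising" here means corrupting text and training the model to reconstruct the original. A toy version of two BART-style corruptions, sentence permutation and span infilling, follows; the span lengths and mask token are illustrative, not the exact noising scheme.

```python
import random

def add_noise(sentences: list, seed: int = 0) -> str:
    """BART-style noising, simplified: shuffle sentence order and replace
    one random word span per sentence with a single mask token."""
    rng = random.Random(seed)
    noised = []
    for sent in rng.sample(sentences, len(sentences)):  # sentence permutation
        words = sent.split()
        i = rng.randrange(len(words))
        span = rng.randint(1, min(3, len(words) - i))   # span of 1-3 words
        words[i:i + span] = ["<mask>"]                  # text infilling
        noised.append(" ".join(words))
    return " ".join(noised)

doc = ["The cat sat on the mat.", "It was a sunny day."]
print(add_noise(doc))  # the model is trained to reconstruct the original doc
```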

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks. We present mBART, a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective. mBART is one of the first methods for pre-training a complete sequence-to-sequence model by denoising full texts in multiple languages, while previous approaches have focused only on the encoder, decoder, or reconstructing parts of the text. Pre-training allows it...

10.48550/arxiv.2001.08210 preprint EN other-oa arXiv (Cornell University) 2020-01-01

We introduce a new deep learning model for semantic role labeling (SRL) that significantly improves the state of the art, along with detailed analyses to reveal its strengths and limitations. We use a deep highway BiLSTM architecture with constrained decoding, while observing a number of recent best practices for initialization and regularization. Our 8-layer ensemble model achieves 83.2 F1 on the CoNLL 2005 test set and 83.4 F1 on CoNLL 2012, roughly a 10% relative error reduction over the previous state of the art. Extensive empirical analysis of these gains...

10.18653/v1/p17-1044 article EN cc-by Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2017-01-01
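
The "constrained decoding" mentioned above enforces that predicted tag sequences form valid BIO spans. A minimal Viterbi sketch with that constraint, using toy per-token scores in place of the BiLSTM's outputs:

```python
import numpy as np

TAGS = ["O", "B-ARG", "I-ARG"]

def valid(prev: str, cur: str) -> bool:
    """BIO constraint: I-X may only follow B-X or I-X of the same type."""
    return not (cur.startswith("I-") and prev[2:] != cur[2:])

def constrained_viterbi(scores: np.ndarray) -> list:
    """Highest-scoring tag sequence subject to the BIO constraints.
    scores[t, j] is a per-token score for tag j (toy values here)."""
    n, k = scores.shape
    start_ok = np.array([valid("O", tag) for tag in TAGS])
    best = np.where(start_ok, scores[0], -np.inf)   # I-X cannot start a span
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        new = np.full(k, -np.inf)
        for j in range(k):
            cands = [best[i] if valid(TAGS[i], TAGS[j]) else -np.inf
                     for i in range(k)]
            back[t, j] = int(np.argmax(cands))
            new[j] = cands[back[t, j]] + scores[t, j]
        best = new
    path = [int(np.argmax(best))]
    for t in range(n - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [TAGS[i] for i in reversed(path)]

toy = np.array([[0.1, 0.8, 0.9],   # "I-ARG" scores highest here, but an
                [0.2, 0.3, 0.9],   # argument span cannot start with I-ARG
                [0.9, 0.1, 0.2]])
print(constrained_viterbi(toy))    # ['B-ARG', 'I-ARG', 'O']
```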

This paper presents USARSim, an open source high fidelity robot simulator that can be used both for research and education. USARSim offers many characteristics that differentiate it from most existing simulators. Most notably, it constitutes the simulation engine used to run the virtual robots competition within the RoboCup initiative. We describe its general architecture, describe examples of utilization, and provide a comprehensive overview for those interested in robot simulations for education, research and competitions.

10.1109/robot.2007.363180 article EN Proceedings - IEEE International Conference on Robotics and Automation 2007-04-01

Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions. Negotiations require complex communication and reasoning skills, but success is easy to measure, making this an interesting task for AI. We gather a large dataset of human-human negotiations on a multi-issue bargaining task, where agents who cannot observe each other's reward functions must reach an agreement (or a deal) via natural language dialogue. For the first time, we show it is possible...

10.18653/v1/d17-1259 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01
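
In the multi-issue bargaining setup described above, each agent has a private value for every item and is scored by the total value of the items it receives. A toy version of that scoring (the item counts and values are invented for illustration):

```python
# Hypothetical item pool and per-agent values; in the task, each agent
# sees only its own value function, never the other's.
items = {"book": 1, "hat": 2, "ball": 3}         # counts available
alice_values = {"book": 6, "hat": 2, "ball": 0}  # private to Alice
bob_values = {"book": 0, "hat": 3, "ball": 2}    # private to Bob

def reward(values: dict, allocation: dict) -> int:
    """An agent's score is the total value of the items it receives."""
    return sum(values[i] * n for i, n in allocation.items())

# A proposed deal: Alice takes the book, Bob takes the hats and balls.
alice_share = {"book": 1, "hat": 0, "ball": 0}
bob_share = {i: items[i] - alice_share[i] for i in items}
print(reward(alice_values, alice_share), reward(bob_values, bob_share))
```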

Practical applications of abstractive summarization models are limited by frequent factual inconsistencies with respect to their input. Existing automatic evaluation metrics for summarization are largely insensitive to such errors. We propose QAGS (pronounced “kags”), an automatic evaluation protocol that is designed to identify factual inconsistencies in a generated summary. QAGS is based on the intuition that if we ask questions about a summary and its source, we will receive similar answers if the summary is factually consistent with the source. To evaluate QAGS, we collect human judgments of factual consistency...

10.18653/v1/2020.acl-main.450 preprint EN cc-by 2020-01-01
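
The QAGS intuition translates directly into a pipeline: generate questions from the summary, answer them against both the summary and the source, and average the answer similarity. A schematic sketch, where `qg`, `qa`, and `similarity` are stand-ins for the trained question-generation and QA models and the answer-overlap metric:

```python
def qags_score(summary: str, source: str, qg, qa, similarity) -> float:
    """QAGS-style consistency check, schematically: ask questions about the
    summary, answer them from both texts, and compare the answers."""
    questions = qg(summary)
    sims = [similarity(qa(q, summary), qa(q, source)) for q in questions]
    return sum(sims) / len(sims) if sims else 0.0

# Toy stand-ins so the sketch runs end to end.
qg = lambda text: [f"What does this say about {w}?" for w in text.split()[:3]]
qa = lambda q, text: text.split()[hash(q) % len(text.split())]
similarity = lambda a, b: float(a == b)   # real QAGS uses token-level F1
print(qags_score("the cat sat", "the cat sat on the mat", qg, qa, similarity))
```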

Sebastian Schuster, Sonal Gupta, Rushin Shah, Mike Lewis. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1380 article EN 2019-01-01

This paper introduces the task of question-answer driven semantic role labeling (QA-SRL), where question-answer pairs are used to represent predicate-argument structure. For example, the verb "introduce" in the previous sentence would be labeled with the questions "What is introduced?" and "What introduces something?", each paired with the phrase from the sentence that gives the correct answer. Posing the problem this way allows the questions themselves to define the set of possible roles, without the need for predefined frame or thematic role ontologies. It also allows for scalable data...

10.18653/v1/d15-1076 article EN cc-by Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing 2015-01-01
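
The abstract's own example can be written down as a simple data structure, which makes the point concrete: questions replace role labels, so no frame ontology is needed.

```python
# The QA-SRL representation from the abstract: each predicate maps to
# question/answer pairs instead of predefined role labels.
qa_srl = {
    "introduces": [
        ("What is introduced?",
         "the task of question-answer driven semantic role labeling"),
        ("What introduces something?", "This paper"),
    ],
}

for predicate, pairs in qa_srl.items():
    for question, answer in pairs:
        print(f"{predicate}: {question} -> {answer}")
```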

We present a new visual reasoning language dataset, containing 92,244 pairs of examples of natural statements grounded in synthetic images with 3,962 unique sentences. We describe a method of crowdsourcing linguistically-diverse data, and present an analysis of our data. The data demonstrates a broad set of linguistic phenomena, requiring visual and set-theoretic reasoning. We experiment with various models, and show the data presents a strong challenge for future research.

10.18653/v1/p17-2034 article EN cc-by 2017-01-01

Writers often rely on plans or sketches to write long stories, but most current language models generate word by word from left to right. We explore coarse-to-fine models for creating narrative texts of several hundred words, and introduce new models which decompose stories by abstracting over actions and entities. The model first generates the predicate-argument structure of the text, where different mentions of the same entity are marked with placeholder tokens. It then generates a surface realization of the structure, and finally replaces the entity placeholders...

10.18653/v1/p19-1254 preprint EN 2019-01-01

Task oriented dialog systems typically first parse user utterances to semantic frames comprised of intents and slots. Previous work on task oriented intent and slot-filling has been restricted to one intent per query and one slot label per token, and thus cannot model complex compositional requests. Alternative semantic parsing systems have represented queries as logical forms, but these are challenging to annotate and parse. We propose a hierarchical annotation scheme for semantic parsing that allows the representation of compositional queries, and can be efficiently and accurately parsed by...

10.18653/v1/d18-1300 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01
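
A compositional query under a hierarchical intent/slot scheme looks like a tree whose slots can contain nested intents. The sketch below is in the spirit of that annotation; the query and label names are illustrative, not drawn from the dataset:

```python
# "Driving directions to the concert my sister is going to": the destination
# slot is itself filled by a nested GET_EVENT intent.
tree = ("IN:GET_DIRECTIONS", [
    ("SL:DESTINATION", [
        ("IN:GET_EVENT", [                 # nested intent inside a slot
            ("SL:CATEGORY_EVENT", "the concert"),
            ("SL:ATTENDEE", "my sister"),
        ]),
    ]),
])

def show(node, depth=0):
    label, children = node
    if isinstance(children, list):
        print("  " * depth + label)
        for child in children:
            show(child, depth + 1)
    else:
        print("  " * depth + f"{label} = {children!r}")

show(tree)
```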

Despite much progress in training artificial intelligence (AI) systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge. We introduce Cicero, the first AI agent to achieve human-level performance in Diplomacy, a strategy game involving both cooperation and competition that emphasizes natural language negotiation and tactical coordination between seven players. Cicero integrates a language model with planning and reinforcement learning...

10.1126/science.ade9097 article EN Science 2022-11-22

Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.

10.18653/v1/2022.naacl-main.201 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01

Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, Hannaneh Hajishirzi. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.

10.18653/v1/2023.emnlp-main.741 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01

We introduce a new approach to semantics which combines the benefits of distributional and formal logical semantics. Distributional models have been successful in modelling the meanings of content words, but logical semantics is necessary to adequately represent many function words. We follow formal semantics in mapping language to logical representations, but differ in that the relational constants used are induced by offline clustering at the level of predicate-argument structure. Our approach is highly scalable, allowing us to run on corpora the size of Gigaword. Different...

10.1162/tacl_a_00219 article EN cc-by Transactions of the Association for Computational Linguistics 2013-12-01

We explore story generation: creative systems that can build coherent and fluent passages of text about a topic. We collect a large dataset of 300K human-written stories paired with writing prompts from an online forum. Our dataset enables hierarchical story generation, where the model first generates a premise, and then transforms it into a passage of text. We gain further improvements with a novel form of model fusion that improves the relevance of the story to the prompt, and adding a new gated multi-scale self-attention mechanism to model long-range context. Experiments show...

10.48550/arxiv.1805.04833 preprint EN other-oa arXiv (Cornell University) 2018-01-01

We introduce a new CCG parsing model which is factored on lexical category assignments. Parsing is then simply a deterministic search for the most probable category sequence that supports a CCG derivation. The parser is extremely simple, with a tiny feature set, no POS tagger, and no statistical model of the derivation or dependencies. Formulating the model in this way allows a highly effective heuristic for A* parsing, which makes parsing extremely fast. Compared to the standard C&C parser, our model is more accurate out-of-domain, is four times faster, has higher coverage, and is greatly...

10.3115/v1/d14-1107 article EN cc-by 2014-01-01
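
Because the model is factored on category assignments, A* search with an admissible per-word heuristic can return the most probable supertag sequence while expanding very few states. A toy sketch of that search follows; it omits the check that the sequence supports a CCG derivation, and the tag distributions are invented:

```python
import heapq
import math

# Toy per-word supertag distributions (a real supertagger provides these).
tag_probs = [
    {"NP": 0.7, "N": 0.3},
    {"(S\\NP)/NP": 0.8, "S\\NP": 0.2},
    {"NP": 0.6, "N": 0.4},
]

# Admissible heuristic: best possible log-prob for the remaining words.
best_rest = [0.0] * (len(tag_probs) + 1)
for i in range(len(tag_probs) - 1, -1, -1):
    best_rest[i] = best_rest[i + 1] + math.log(max(tag_probs[i].values()))

def astar_supertags():
    """A* over category sequences: since the model scores only the tags,
    the first complete sequence popped is the most probable one."""
    # Items are (-(g + h), position, tags-so-far); heapq pops the smallest
    # key, i.e. the sequence with the highest optimistic total log-prob.
    frontier = [(-best_rest[0], 0, ())]
    while frontier:
        neg_f, i, tags = heapq.heappop(frontier)
        if i == len(tag_probs):
            return list(tags)
        g = -neg_f - best_rest[i]          # recover log-prob so far
        for tag, p in tag_probs[i].items():
            g2 = g + math.log(p)
            heapq.heappush(frontier,
                           (-(g2 + best_rest[i + 1]), i + 1, tags + (tag,)))

print(astar_supertags())   # ['NP', '(S\\NP)/NP', 'NP']
```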