Aaron Steven White

ORCID: 0000-0003-0057-9246
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Semantic Web and Ontologies
  • Text Readability and Simplification
  • Syntax, Semantics, Linguistic Variation
  • Speech and Dialogue Systems
  • Advanced Text Analysis Techniques
  • Language, Discourse, Communication Strategies
  • Neurobiology of Language and Bilingualism
  • Language and Cultural Evolution
  • Multimodal Machine Learning Applications
  • Reading and Literacy Development
  • Speech Recognition and Synthesis
  • Language Development and Disorders
  • Logic, Reasoning, and Knowledge
  • Sentiment Analysis and Opinion Mining
  • EEG and Brain-Computer Interfaces
  • Neuroscience and Music Perception
  • Domain Adaptation and Few-Shot Learning
  • Biomedical Text Mining and Ontologies
  • Neural Dynamics and Brain Function
  • Authorship Attribution and Profiling
  • Wikis in Education and Collaboration
  • Neural Networks and Applications
  • Data Quality and Management

University of Rochester
2018-2025

Mississippi State University
2024

Johns Hopkins University
2016-2021

University of Maryland, College Park
2015

Aaron Steven White, Drew Reisinger, Keisuke Sakaguchi, Tim Vieira, Sheng Zhang, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016.

10.18653/v1/d16-1177 article EN cc-by Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing 2016-01-01

Adam Poliak, Aparajita Haldar, Rachel Rudinger, J. Edward Hu, Ellie Pavlick, Aaron Steven White, Benjamin Van Durme. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.

10.18653/v1/d18-1007 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

We develop a probabilistic model of S(emantic)-selection that encodes both the notion of systematic mappings from semantic type signature to syntactic distribution—i.e., projection rules—and the notion of selectional noise—e.g., C(ategory)-selection, L(exical)-selection, and/or other independent processes. We train this model on data from a large-scale judgment study assessing the acceptability of 1,000 English clause-taking verbs in 50 distinct syntactic frames, finding that the model infers coherent semantic type signatures. We focus on the signatures relevant to interrogative...

10.3765/salt.v26i0.3819 article EN Proceedings from Semantics and Linguistic Theory 2016-10-15
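A minimal sketch of the S-selection idea from the abstract above: a verb's observed frame distribution arises from projection rules applied to its semantic type signature, plus independent selectional noise. The signatures, frames, and probabilities below are illustrative placeholders, not the paper's model or data.

```python
# Hypothetical projection rules: P(frame | type signature).
projection = {
    "question": {"_ whether S": 0.6, "_ that S": 0.3, "_ NP": 0.1},
    "proposition": {"_ whether S": 0.05, "_ that S": 0.8, "_ NP": 0.15},
}

def frame_distribution(signature_weights, noise=0.05):
    """Mix projection rules by a verb's signature weights, then add
    uniform selectional noise and renormalize."""
    frames = {"_ whether S": 0.0, "_ that S": 0.0, "_ NP": 0.0}
    for sig, weight in signature_weights.items():
        for frame, prob in projection[sig].items():
            frames[frame] += weight * prob
    frames = {f: (1 - noise) * p + noise / len(frames) for f, p in frames.items()}
    total = sum(frames.values())
    return {f: p / total for f, p in frames.items()}

# A verb compatible with both signatures (weights are made up).
dist = frame_distribution({"question": 0.5, "proposition": 0.5})
```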

Rachel Rudinger, Aaron Steven White, Benjamin Van Durme. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.

10.18653/v1/n18-1067 article EN cc-by 2018-01-01

Propositional attitude verbs, such as think and want, have long held interest for both theoretical linguists and language acquisitionists because their syntactic, semantic, and pragmatic properties display complex interactions that have proven difficult to fully capture from either perspective. This paper explores the granularity with which these verbs' semantic properties are recoverable from their syntactic distributions, using three behavioral experiments aimed at explicitly quantifying the relationship between the two...

10.1111/cogs.12512 article EN publisher-specific-oa Cognitive Science 2017-10-19

We ask whether text understanding has progressed to the point where we may extract event information through incremental refinement of bleached statements derived from annotation manuals. Such a capability would allow for the trivial construction and extension of an extraction framework by intended end-users through declarations such as, "Some person was born in some location at some time." We introduce an example model that employs such statements, with experiments illustrating that it can extract events under closed ontologies and generalize...

10.18653/v1/2020.spnlp-1.9 article EN cc-by 2020-01-01

We present a novel semantic framework for modeling temporal relations and event durations that maps pairs of events to real-valued scales. We use this framework to construct the largest temporal relations dataset to date, covering the entirety of the Universal Dependencies English Web Treebank. We train models for jointly predicting fine-grained temporal relations and event durations. We report strong results on our data and show the efficacy of a transfer-learning approach for predicting categorical relations.

10.18653/v1/p19-1280 article EN cc-by 2019-01-01
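A minimal sketch of the real-valued event representation described in the abstract above: events as (start, end) points on a shared timeline, from which durations and coarse categorical relations can be read off. The relation labels and thresholds are illustrative, not the paper's scheme.

```python
def duration(event):
    """Duration of an event represented as a (start, end) pair."""
    start, end = event
    return end - start

def categorical_relation(e1, e2):
    """Derive a coarse categorical relation from real-valued endpoints."""
    s1, t1 = e1
    s2, t2 = e2
    if t1 <= s2:
        return "before"
    if t2 <= s1:
        return "after"
    if s1 <= s2 and t2 <= t1:
        return "contains"
    if s2 <= s1 and t1 <= t2:
        return "contained_by"
    return "overlaps"
```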

Adam Poliak, Aparajita Haldar, Rachel Rudinger, J. Edward Hu, Ellie Pavlick, Aaron Steven White, Benjamin Van Durme. Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. 2018.

10.18653/v1/w18-5441 preprint EN cc-by 2018-01-01

Theories of clause selection that aim to explain the distribution of interrogative and declarative complement clauses often take as a starting point that predicates like think, believe, hope, and fear are incompatible with interrogative complements. After discussing experimental evidence against generalizations on which these theories rest, I give corpus evidence that even the core data are faulty: these predicates are in fact compatible with interrogative complements, suggesting that any theory predicting they should not be must be jettisoned.

10.3765/sp.14.6 article EN cc-by Semantics and Pragmatics 2021-06-04

We investigate which patterns of lexically triggered doxastic, bouletic, neg(ation)-raising, and veridicality inferences are (un)attested across clause-embedding verbs in English. To carry out this investigation, we use a multiview mixed effects mixture model to discover the patterns of inference captured in three lexicon-scale inference judgment datasets: two existing datasets, MegaVeridicality and MegaNegRaising, which capture veridicality and neg-raising inferences across a wide swath of the English clause-embedding lexicon, and a new dataset, MegaIntensionality, which similarly captures...

10.3765/salt.v31i0.5137 article EN Proceedings from Semantics and Linguistic Theory 2022-01-05
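A minimal sketch of the mixture-model idea in the abstract above: each verb's judgments receive soft posterior responsibilities over latent inference patterns. The two patterns, their emission probabilities, and the judgments are made-up placeholders, not the paper's multiview model.

```python
# Hypothetical P(judgment = "yes" | pattern) for one inference type.
patterns = {"veridical": 0.9, "non_veridical": 0.1}
prior = {"veridical": 0.5, "non_veridical": 0.5}

def responsibilities(judgments):
    """Posterior over latent patterns given binary yes/no judgments."""
    post = {}
    for name, p_yes in patterns.items():
        likelihood = prior[name]
        for j in judgments:
            likelihood *= p_yes if j == "yes" else (1 - p_yes)
        post[name] = likelihood
    total = sum(post.values())
    return {name: v / total for name, v in post.items()}

# A verb judged "yes" most of the time looks veridical.
post = responsibilities(["yes", "yes", "yes", "no"])
```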

We investigate neural models' ability to capture lexicosyntactic inferences: inferences triggered by the interaction of lexical and syntactic information. We take the task of event factuality prediction as a case study and build a factuality judgment dataset for all English clause-embedding verbs in various syntactic contexts. We use this dataset, which we make publicly available, to probe the behavior of current state-of-the-art neural systems, showing that these systems make certain systematic errors that are clearly visible through the lens of factuality prediction.

10.18653/v1/d18-1501 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

We introduce five new natural language inference (NLI) datasets focused on temporal reasoning. We recast four existing datasets annotated for event duration—how long an event lasts—and event ordering—how events are temporally arranged—into more than one million NLI examples. We use these datasets to investigate how well neural models trained on a popular NLI corpus capture these forms of temporal reasoning.

10.18653/v1/2020.findings-emnlp.363 article EN cc-by 2020-01-01

We investigate the relationship between the frequency with which verbs are found in particular subcategorization frames and the acceptability of those verbs in those frames, focusing on subordinate clause-taking verbs, such as think, want, and tell. We show that verbs' frame distributions are poor predictors of their acceptability in those frames—explaining, at best, less than ⅓ of the total information about acceptability across the lexicon—and, further, that common matrix factorization techniques used to model verb learning fare only marginally better.

10.5334/gjgl.1001 article EN cc-by Glossa a journal of general linguistics 2020-11-04
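A minimal sketch of the matrix factorization setup mentioned in the abstract above: a verb-by-frame matrix approximated at low rank, with reconstruction error measuring how much acceptability structure the factorization captures. The toy matrix and its values are invented for illustration, not drawn from the paper's dataset.

```python
import numpy as np

# Hypothetical verb-by-frame acceptability matrix (rows: verbs,
# columns: frames); values are illustrative only.
verbs = ["think", "want", "tell"]
frames = ["_ that S", "_ to VP", "_ NP that S"]
A = np.array([[0.9, 0.1, 0.2],
              [0.2, 0.9, 0.1],
              [0.7, 0.3, 0.9]])

# Rank-2 approximation via truncated SVD: the factorization view of
# predicting acceptability from latent verb and frame factors.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Frobenius reconstruction error of the low-rank model.
err = np.linalg.norm(A - A_hat)
```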

Patrick Xia, Guanghui Qin, Siddharth Vashishtha, Yunmo Chen, Tongfei Chen, Chandler May, Craig Harman, Kyle Rawlins, Aaron Steven White, Benjamin Van Durme. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations. 2021.

10.18653/v1/2021.eacl-demos.19 article EN cc-by 2021-01-01

Mahsa Yarmohammadi, Shijie Wu, Marc Marone, Haoran Xu, Seth Ebner, Guanghui Qin, Yunmo Chen, Jialiang Guo, Craig Harman, Kenton Murray, Aaron Steven White, Mark Dredze, Benjamin Van Durme. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021.

10.18653/v1/2021.emnlp-main.149 article EN cc-by Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 2021-01-01

We show that when analyzing data from inference judgment tasks, it can be important to incorporate into one's analysis regime an explicit representation of the semantics of the natural language prompt used to guide participants on the task. To demonstrate this, we conduct two experiments within an existing experimental paradigm focused on measuring factive inferences, while manipulating the prompts participants receive in small but semantically potent ways. In statistical model comparisons couched within a framework of probabilistic dynamic...

10.3765/elm.3.5857 article EN cc-by Experiments in Linguistic Meaning 2025-01-24

In recent years, it has become clear that EEG indexes the comprehension of natural, narrative speech. One particularly compelling demonstration of this fact can be seen by regressing EEG responses to speech against measures of how individual words in that speech linguistically relate to their preceding context. This approach produces a so-called temporal response function that displays a centro-parietal negativity reminiscent of the classic N400 component of the event-related potential. A shortcoming of previous implementations of this approach is that they have...

10.1371/journal.pcbi.1013006 article EN cc-by PLoS Computational Biology 2025-04-28
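A minimal sketch of temporal response function (TRF) estimation as described in the abstract above: EEG regressed against lagged copies of a word-level stimulus feature via ridge regression. The stimulus, "EEG", true TRF, and regularization strength are all synthetic placeholders, not the paper's data or pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
stim = rng.normal(size=n)              # synthetic word-feature time series
true_trf = np.array([0.0, 1.0, 0.5])   # assumed lagged response (invented)

# Synthetic "EEG": stimulus convolved with the TRF plus noise.
eeg = np.convolve(stim, true_trf, mode="full")[:n] + 0.1 * rng.normal(size=n)

# Lagged design matrix: column j holds stim delayed by j samples.
lags = 3
X = np.column_stack(
    [np.concatenate([np.zeros(j), stim[: n - j]]) for j in range(lags)]
)

# Ridge solution: (X'X + lam I)^{-1} X'y estimates the TRF weights.
lam = 1.0
trf_hat = np.linalg.solve(X.T @ X + lam * np.eye(lags), X.T @ eeg)
```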

Fine-tuning is known to improve NLP models by adapting an initial model trained on more plentiful but less domain-salient examples to data in a target domain. Such domain adaptation is typically done using one stage of fine-tuning. We demonstrate that gradually fine-tuning in a multi-stage process can yield substantial further gains and can be applied without modifying the model or learning objective.

10.48550/arxiv.2103.02205 preprint EN other-oa arXiv (Cornell University) 2021-01-01
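A minimal sketch of the multi-stage ("gradual") fine-tuning recipe from the abstract above: train through successively more target-salient data mixtures without changing the model or objective. The `train` step, the model, and the stage mixtures are stand-ins, not the paper's actual setup.

```python
def train(model, data):
    # Stand-in for one fine-tuning stage: just record what was seen.
    model["history"].append(sorted(set(data)))
    return model

def gradual_fine_tune(model, stages):
    """Fine-tune through a sequence of increasingly in-domain mixtures."""
    for stage_data in stages:
        model = train(model, stage_data)
    return model

# Stages move from broad out-of-domain data toward the target domain.
stages = [
    ["general"] * 3 + ["target"] * 1,   # mostly out-of-domain
    ["general"] * 1 + ["target"] * 3,   # mixed
    ["target"] * 4,                     # target domain only
]
model = gradual_fine_tune({"history": []}, stages)
```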

We present a novel semantic framework for modeling linguistic expressions of generalization—generic, habitual, and episodic statements—as combinations of simple, real-valued referential properties of predicates and their arguments. We use this framework to construct a dataset covering the entirety of the Universal Dependencies English Web Treebank. We use this dataset to probe the efficacy of type-level and token-level information—including hand-engineered features and static (GloVe) and contextual (ELMo) word embeddings—for predicting expressions of generalization.

10.1162/tacl_a_00285 article EN cc-by Transactions of the Association for Computational Linguistics 2019-09-11

We introduce a transductive model for parsing into Universal Decompositional Semantics (UDS) representations, which jointly learns to map natural language utterances into UDS graph structures and annotate the graph with decompositional semantic attribute scores. We also introduce a strong pipeline model for parsing into the UDS graph structure, and show that our transductive parser performs comparably to this pipeline while additionally performing attribute prediction. By analyzing the attribute prediction errors, we find that the model captures natural relationships between attribute groups.

10.18653/v1/2020.acl-main.746 article EN 2020-01-01

We propose the semantic proto-role linking model, which jointly induces both predicate-specific semantic roles and predicate-general semantic proto-roles based on semantic property likelihood judgments. We use this model to empirically evaluate Dowty's thematic proto-role linking theory.

10.18653/v1/e17-2015 article EN cc-by 2017-01-01

We present a novel iterative extraction model, IterX, for extracting complex relations, or templates, i.e., N-tuples representing a mapping from named slots to spans of text within a document. Documents may feature zero or more instances of a template of any given type, and the task of template extraction entails identifying the templates in a document and extracting each template's slot values. Our imitation learning approach casts the problem as a Markov decision process (MDP), and relieves the need to use predefined template orders to train an extractor. It leads...

10.18653/v1/2023.eacl-main.136 article EN cc-by 2023-01-01
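A minimal sketch of the iterative, order-free extraction loop described in the abstract above: templates are emitted one at a time until a policy chooses to stop, with no predefined template order. The greedy policy, slot names, and document spans are toy placeholders, not the IterX model.

```python
def extract_templates(spans, policy):
    """Emit filled templates one at a time until the policy stops."""
    templates, remaining = [], list(spans)
    while remaining:
        action = policy(remaining, templates)
        if action == "stop":
            break
        templates.append(action)  # action is a filled template (slot -> span)
        remaining = [s for s in remaining if s not in action.values()]
    return templates

def greedy_policy(remaining, templates):
    # Toy policy: pair consecutive spans into (slot_a, slot_b) templates.
    if len(remaining) < 2:
        return "stop"
    return {"slot_a": remaining[0], "slot_b": remaining[1]}

spans = ["Acme Corp", "Boston", "Widget Inc", "Denver"]
result = extract_templates(spans, greedy_policy)
```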