Byron Wallace

ORCID: 0000-0003-2409-7735
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Mental Health Research Topics
  • Health, Environment, Cognitive Aging
  • Biomedical Text Mining and Ontologies
  • Machine Learning in Healthcare
  • Meta-analysis and systematic reviews
  • Explainable Artificial Intelligence (XAI)
  • Advanced Text Analysis Techniques
  • Text Readability and Simplification
  • Machine Learning and Data Classification
  • Machine Learning and Algorithms
  • Artificial Intelligence in Healthcare and Education
  • Computational and Text Analysis Methods
  • Text and Document Classification Technologies
  • Data Quality and Management
  • Scientific Computing and Data Management
  • Semantic Web and Ontologies
  • Mobile Crowdsensing and Crowdsourcing
  • Sentiment Analysis and Opinion Mining
  • Imbalanced Data Classification Techniques
  • Data Stream Mining Techniques
  • Artificial Intelligence in Healthcare
  • Software Engineering Research
  • Mental Health via Writing

Northeastern University
2015-2024

Universidad del Noreste
2016-2023

IT University of Copenhagen
2023

Tokyo Institute of Technology
2023

Administration for Community Living
2023

American Jewish Committee
2023

John Brown University
2013-2023

Carnegie Mellon University
2023

University of Massachusetts Amherst
2023

Accenture (Switzerland)
2023

The R environment provides a natural platform for developing new statistical methods due to the mathematical expressiveness of the language, the large number of existing libraries, and the active developer community. One drawback to R, however, is the learning curve; programming is a deterrent to non-technical users, who typically prefer graphical user interfaces (GUIs) to command line environments. Thus, while statisticians develop new methods in R, practitioners are often behind in terms of the statistical techniques they use as they rely on GUI applications....

10.18637/jss.v049.i05 article EN cc-by Journal of Statistical Software 2012-01-01

Convolutional Neural Networks (CNNs) have recently achieved remarkably strong performance on the practically important task of sentence classification (kim 2014, kalchbrenner 2014, johnson 2014). However, these models require practitioners to specify an exact model architecture and set accompanying hyperparameters, including the filter region size, regularization parameters, and so on. It is currently unknown how sensitive model performance is to changes in these configurations for the task of sentence classification. We thus conduct a sensitivity...

10.48550/arxiv.1510.03820 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Meta-analysis is increasingly used as a key source of evidence synthesis to inform clinical practice. The theory and statistical foundations of meta-analysis continually evolve, providing solutions to many new and challenging problems. In practice, most meta-analyses are performed in general statistical packages or dedicated meta-analysis programs. Herein, we introduce Meta-Analyst, a novel, powerful, intuitive, and free meta-analysis program for the meta-analysis of a variety of problems. Meta-Analyst is implemented in C# atop the Microsoft .NET framework, and features a graphical user...

10.1186/1471-2288-9-80 article EN cc-by BMC Medical Research Methodology 2009-12-01

Sarthak Jain, Byron C. Wallace. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1357 article EN 2019-01-01

Attention mechanisms have seen wide adoption in neural NLP models. In addition to improving predictive performance, these are often touted as affording transparency: models equipped with attention provide a distribution over attended-to input units, and this is often presented (at least implicitly) as communicating the relative importance of inputs. However, it is unclear what relationship exists between attention weights and model outputs. In this work, we perform extensive experiments across a variety of NLP tasks that aim to assess...

10.48550/arxiv.1902.10186 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Medical researchers looking for evidence pertinent to a specific clinical question must navigate an increasingly voluminous corpus of published literature. This data deluge has motivated the development of machine learning and data mining technologies to facilitate efficient biomedical research. Despite the obvious labor-saving potential of these technologies and the concomitant academic interest therein, however, adoption of machine learning techniques by medical researchers has been relatively sluggish. One explanation for this is that while many machine learning methods have...

10.1145/2110363.2110464 article EN 2012-01-28

Jay DeYoung, Sarthak Jain, Nazneen Fatema Rajani, Eric Lehman, Caiming Xiong, Richard Socher, Byron C. Wallace. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.

10.18653/v1/2020.acl-main.408 article EN cc-by 2020-01-01

Summary: Meta‐analysis and meta‐regression are statistical methods for synthesizing and modelling the results of different studies, and are critical research synthesis tools in ecology and evolutionary biology (E&E). However, many E&E researchers carry out meta‐analyses using software that is limited in its functionality and is not easily updatable. It is likely that these software limitations have slowed the uptake of new methods in E&E and limited the scope and quality of inferences from research syntheses. We developed OpenMEE: Open Meta‐analyst for Ecology and Evolution to address...

10.1111/2041-210x.12708 article EN publisher-specific-oa Methods in Ecology and Evolution 2016-11-29

Machine learning (ML) algorithms have proven highly accurate for identifying Randomized Controlled Trials (RCTs) but are not used much in practice, in part because the best way to make use of the technology in a typical workflow is unclear. In this work, we evaluate ML models for RCT classification (support vector machines, convolutional neural networks, and ensemble approaches). We trained and optimized support vector machine and convolutional neural network models on the titles and abstracts of the Cochrane Crowd RCT set. We evaluated the models on an external dataset (Clinical...

10.1002/jrsm.1287 article EN cc-by Research Synthesis Methods 2018-01-09

Systematic reviews address a specific clinical question by unbiasedly assessing and analyzing the pertinent literature. Citation screening is a time-consuming and critical step in systematic reviews. Typically, reviewers must evaluate thousands of citations to identify articles eligible for a given review. We explore the application of machine learning techniques to semi-automate citation screening, thereby reducing the reviewers' workload. We present a novel online classification strategy to automatically discriminate...

10.1186/1471-2105-11-55 article EN cc-by BMC Bioinformatics 2010-01-26
James Thomas Anna H Noel-Storr Iain Marshall Byron Wallace Steve McDonald and 95 more Chris Mavergames Paul Glasziou Ian Shemilt Anneliese Synnot Tari Turner Julian Elliott Thomas Agoritsas John Hilton Caroline Perron Elie A. Akl Rebecca K Hodder Charlotte Pestridge Lauren Albrecht Tanya Horsley Joanne Platt Rebecca Armstrong Phi Hùng Nguyễn Robert M. Plovnick Anneliese Arno Noah Ivers Gail Quinn Agnes Au Renea V Johnston Gabriel Rada Matthew K. Bagg Arwel W. Jones Philippe Ravaud Catherine Boden Lara A Kahale Bernt Richter Isabelle Boisvert Homa Keshavarz Rebecca Ryan Linn Brandt Stephanie A. Kolakowsky‐Hayner Dina H. Salama Alexandra Bražinová Sumanth Kumbargere Nagraj Georgia Salanti Rachelle Buchbinder Toby J Lasserson Lina Santaguida Chris Champion Rebecca Lawrence Nancy Santesso Jackie Chandler Zbigniew Leś Holger J. Schünemann Andreas Charidimou Stefan Leucht Ian Shemilt Roger Chou Nicola Low Diana Sherifali Rachel Churchill Andrew I.R. Maas Reed Siemieniuk Maryse C. Cnossen Harriet MacLehose Mark Simmonds Marie-Joëlle Cossi Malcolm Macleod Nicole Skoetz Michel Jacques Counotte Iain Marshall Karla Soares‐Weiser Samantha Craigie Iain Marshall Velandai Srikanth Philipp Dahm Nicole Martin Katrina Sullivan Alanna Danilkewich Laura Martínez García Anneliese Synnot Kristen Danko Chris Mavergames Mark J. Taylor Emma Donoghue Lara Maxwell Kris Thayer Corinna Dressler James H. McAuley James Thomas Cathy Egan Steve McDonald Roger Tritton Julian Elliott Joanne E. McKenzie Guy Tsafnat Sarah A. Elliott Joerg J Meerpohl Peter Tugwell Itziar Etxeandia‐Ikobaltzeta Bronwen Merner

10.1016/j.jclinepi.2017.08.011 article EN cc-by-nc-nd Journal of Clinical Epidemiology 2017-09-11

We introduce a deep neural network for automated sarcasm detection. Recent work has emphasized the need for models to capitalize on contextual features, beyond lexical and syntactic cues present in utterances. For example, different speakers will tend to employ sarcasm regarding different subjects and, thus, sarcasm detection models ought to encode such speaker information. Current methods have achieved this by way of laborious feature engineering. By contrast, we propose to automatically learn and then exploit user embeddings, to be used in concert...

10.18653/v1/k16-1017 article EN cc-by 2016-01-01

We present a new Convolutional Neural Network (CNN) model for text classification that jointly exploits labels on documents and their constituent sentences. Specifically, we consider scenarios in which annotators explicitly mark sentences (or snippets) that support their overall document categorization, i.e., they provide rationales. Our model exploits such supervision via a hierarchical approach in which each document is represented by a linear combination of the vector representations of its component sentences. We propose a sentence-level...

10.18653/v1/d16-1076 preprint EN cc-by Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing 2016-01-01
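
The abstract above describes representing each document as a linear combination of the vector representations of its sentences. The following is only a minimal sketch of that one idea, not the authors' implementation: the function name `document_vector`, the toy vectors, and the fixed weights are all hypothetical, and the truncated abstract does not say how the weights are obtained.

```python
from typing import List, Sequence


def document_vector(sentence_vectors: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Return the weighted sum of sentence vectors (one weight per sentence)."""
    if not sentence_vectors:
        raise ValueError("need at least one sentence vector")
    if len(sentence_vectors) != len(weights):
        raise ValueError("need exactly one weight per sentence")
    dim = len(sentence_vectors[0])
    doc = [0.0] * dim
    for vec, w in zip(sentence_vectors, weights):
        if len(vec) != dim:
            raise ValueError("all sentence vectors must share one dimension")
        # Accumulate this sentence's contribution, scaled by its weight.
        for i, x in enumerate(vec):
            doc[i] += w * x
    return doc


# Hypothetical toy input: three 2-dimensional sentence vectors.
sentences = [[1.0, 0.0], [0.0, 2.0], [4.0, 4.0]]
# Hypothetical fixed weights; a trained model would supply these instead.
weights = [0.5, 0.25, 0.25]
doc = document_vector(sentences, weights)
```

With these toy values the first sentence contributes half of the document representation and the other two a quarter each.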

Abstract. Objective: To develop and evaluate RobotReviewer, a machine learning (ML) system that automatically assesses bias in clinical trials. From a (PDF-formatted) trial report, the system should determine risks of bias for the domains defined by the Cochrane Risk of Bias (RoB) tool, and extract supporting text for these judgments. Methods: We algorithmically annotated 12,808 trial PDFs using data from the Cochrane Database of Systematic Reviews (CDSR). Trials were labeled as being at low or high/unclear risk of bias for each domain, and sentences were labeled as informative...

10.1093/jamia/ocv044 article EN cc-by-nc Journal of the American Medical Informatics Association 2015-06-22

We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials. Annotations include demarcations of text spans that describe the Patient population enrolled, the Interventions studied and to what they were Compared, and the Outcomes measured (the 'PICO' elements). These spans are further annotated at a more granular level, e.g., individual interventions within them are marked and mapped onto a structured medical vocabulary. We acquired annotations from a diverse set of workers with varying...

10.18653/v1/p18-1019 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01

In many settings it is important for one to be able to understand why a model made a particular prediction. In NLP this often entails extracting snippets of an input text ‘responsible for’ the corresponding model output; when such a snippet comprises tokens that indeed informed the model’s prediction, it is a faithful explanation. In some settings, faithfulness may be critical to ensure transparency. Lei et al. (2016) proposed a model to produce faithful rationales for neural text classification by defining independent snippet extraction and prediction modules....

10.18653/v1/2020.acl-main.409 article EN 2020-01-01

Relation extraction (RE) is the core NLP task of inferring semantic relationships between entities from text. Standard supervised RE techniques entail training modules to tag tokens comprising entity spans and then predict the relationship between them. Recent work has instead treated the problem as a sequence-to-sequence task, linearizing relations between entities as target strings to be generated conditioned on the input. Here we push the limits of this approach, using larger language models (GPT-3 and Flan-T5 large) than considered in prior...

10.18653/v1/2023.acl-long.868 article EN cc-by 2023-01-01

Class imbalance (i.e., scenarios in which classes are unequally represented in the training data) occurs in many real-world learning tasks. Yet despite its practical importance, there is no established theory of class imbalance, and existing methods for handling it are therefore not well motivated. In this work, we approach the problem of imbalance from a probabilistic perspective, and from this vantage identify dataset characteristics (such as dimensionality, sparsity, etc.) that exacerbate the problem. Motivated by this theory, we advocate...

10.1109/icdm.2011.33 article EN 2011-12-01

Identifying all published reports of randomized controlled trials (RCTs) is an important aim, but it requires extensive manual effort to separate RCTs from non-RCTs, even using current machine learning (ML) approaches. We aimed to make this process more efficient via a hybrid approach using both crowdsourcing and ML. We trained a classifier to discriminate between citations that describe RCTs and those that do not. We then adopted a simple strategy of automatically excluding citations deemed very unlikely to be RCTs by the classifier and deferring to crowdworkers...

10.1093/jamia/ocx053 article EN cc-by-nc Journal of the American Medical Informatics Association 2017-05-18
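
The hybrid strategy in the abstract above (automatically exclude citations the classifier deems very unlikely to be RCTs, defer the rest to crowdworkers) can be illustrated with a short sketch. This is an assumption-laden toy, not the published pipeline: the function name `triage`, the citation IDs, the scores, and the 0.05 cut-off are all hypothetical, and the truncated abstract does not state how the real threshold was chosen.

```python
from typing import Dict, List, Tuple


def triage(scores: Dict[str, float],
           threshold: float) -> Tuple[List[str], List[str]]:
    """Split citations into (auto-excluded, deferred-to-crowd) lists.

    `scores` maps a citation ID to the classifier's estimated probability
    that the citation describes an RCT.
    """
    excluded: List[str] = []
    deferred: List[str] = []
    for citation_id, p_rct in scores.items():
        if p_rct < threshold:
            # Very unlikely to be an RCT: exclude without human review.
            excluded.append(citation_id)
        else:
            # Anything else still goes to human (crowd) screening.
            deferred.append(citation_id)
    return excluded, deferred


# Hypothetical classifier outputs for four citations.
scores = {"c1": 0.01, "c2": 0.40, "c3": 0.97, "c4": 0.03}
excluded, deferred = triage(scores, threshold=0.05)
```

A lower threshold excludes fewer citations automatically, trading saved effort for a smaller risk of discarding a true RCT.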

Automatically detecting verbal irony (roughly, sarcasm) is a challenging task because ironists say something other than, and often opposite to, what they actually mean. Discerning ironic intent exclusively from the words and syntax comprising texts (e.g., tweets, forum posts) is therefore not always possible: additional contextual information about the speaker and/or the topic at hand is often necessary. We introduce a new corpus that provides empirical evidence for this claim. We show that annotators frequently require...

10.3115/v1/p14-2084 article EN 2014-01-01

A recent "third wave" of neural network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing. Because these modern NNs often comprise multiple interconnected layers, work in this area is often referred to as deep learning. Recent years have witnessed an explosive growth of research into NN-based approaches to information retrieval (IR). A significant body of work has thus been created. In this paper, we survey the current...

10.1007/s10791-017-9321-y article EN cc-by Information Retrieval 2017-11-10

We propose a new active learning (AL) method for text classification with convolutional neural networks (CNNs). In AL, one selects the instances to be manually labeled with the aim of maximizing model performance with minimal effort. Neural models capitalize on word embeddings as representations (features), tuning these to the task at hand. We argue that AL strategies for multi-layered neural models should focus on selecting instances that most affect the embedding space (i.e., induce discriminative word representations). This is in contrast to traditional...

10.1609/aaai.v31i1.10962 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2017-02-12