NFDI4DS | UHH-SEMS - Publication Details

Byron Wallace

ORCID: 0000-0003-2409-7735

About

Contact & Profiles

A5036790226

Research Areas

Topic Modeling
Natural Language Processing Techniques
Mental Health Research Topics
Health, Environment, Cognitive Aging
Biomedical Text Mining and Ontologies
Machine Learning in Healthcare
Meta-analysis and systematic reviews
Explainable Artificial Intelligence (XAI)
Advanced Text Analysis Techniques
Text Readability and Simplification
Machine Learning and Data Classification
Machine Learning and Algorithms
Artificial Intelligence in Healthcare and Education
Computational and Text Analysis Methods
Text and Document Classification Technologies
Data Quality and Management
Scientific Computing and Data Management
Semantic Web and Ontologies
Mobile Crowdsensing and Crowdsourcing
Sentiment Analysis and Opinion Mining
Imbalanced Data Classification Techniques
Data Stream Mining Techniques
Artificial Intelligence in Healthcare
Software Engineering Research
Mental Health via Writing

Northeastern University
2015-2024

Universidad del Noreste
2016-2023

IT University of Copenhagen
2023

Tokyo Institute of Technology
2023

Administration for Community Living
2023

American Jewish Committee
2023

John Brown University
2013-2023

Carnegie Mellon University
2023

University of Massachusetts Amherst
2023

Accenture (Switzerland)
2023

Closing the Gap between Methodologists and End-Users: R as a Computational Back-End

OPENALEX - Publications

Byron Wallace Issa J Dahabreh Thomas A Trikalinos Joseph Lau Paul Trow and 1 more

The R environment provides a natural platform for developing new statistical methods due to the mathematical expressiveness of the language, the large number of existing libraries, and the active developer community. One drawback to R, however, is the learning curve; programming is a deterrent to non-technical users, who typically prefer graphical user interfaces (GUIs) to command line environments. Thus, while statisticians develop new methods in R, practitioners are often behind in terms of the statistical techniques they use as they rely on GUI applications....

10.18637/jss.v049.i05 article EN cc-by Journal of Statistical Software 2012-01-01

A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification

OPENALEX - Publications

Ye Zhang Byron Wallace

Convolutional Neural Networks (CNNs) have recently achieved remarkably strong performance on the practically important task of sentence classification (Kim, 2014; Kalchbrenner et al., 2014; Johnson and Zhang, 2014). However, these models require practitioners to specify an exact model architecture and set accompanying hyperparameters, including the filter region size, regularization parameters, and so on. It is currently unknown how sensitive model performance is to changes in these configurations for the task of sentence classification. We thus conduct a sensitivity...

10.48550/arxiv.1510.03820 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Meta-Analyst: software for meta-analysis of binary, continuous and diagnostic data

OPENALEX - Publications

Byron Wallace Christopher H. Schmid Joseph Lau Thomas A Trikalinos

Meta-analysis is increasingly used as a key source of evidence synthesis to inform clinical practice. The theory and statistical foundations of meta-analysis continually evolve, providing solutions to many new and challenging problems. In practice, most meta-analyses are performed in general statistical packages or dedicated meta-analysis programs. Herein, we introduce Meta-Analyst, a novel, powerful, intuitive, and free program for the meta-analysis of a variety of problems. Meta-Analyst is implemented in C# atop the Microsoft .NET framework, and features a graphical user...

10.1186/1471-2288-9-80 article EN cc-by BMC Medical Research Methodology 2009-12-01

OPENALEX - Publications

Sarthak Jain Byron Wallace

Sarthak Jain, Byron C. Wallace. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1357 article EN 2019-01-01

Attention is not Explanation

OPENALEX - Publications

Sarthak Jain Byron Wallace

Attention mechanisms have seen wide adoption in neural NLP models. In addition to improving predictive performance, these are often touted as affording transparency: models equipped with attention provide a distribution over attended-to input units, and this is often presented (at least implicitly) as communicating the relative importance of inputs. However, it is unclear what relationship exists between attention weights and model outputs. In this work, we perform extensive experiments across a variety of NLP tasks that aim to assess...

10.48550/arxiv.1902.10186 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Deploying an interactive machine learning system in an evidence-based practice center

OPENALEX - Publications

Byron Wallace Kevin Small Carla E. Brodley Joseph Lau Thomas A Trikalinos

Medical researchers looking for evidence pertinent to a specific clinical question must navigate an increasingly voluminous corpus of published literature. This data deluge has motivated the development of machine learning and data mining technologies to facilitate efficient biomedical research. Despite the obvious labor-saving potential of these technologies and the concomitant academic interest therein, however, adoption of machine learning techniques by medical researchers has been relatively sluggish. One explanation for this is that while many machine learning methods have...

10.1145/2110363.2110464 article EN 2012-01-28

ERASER: A Benchmark to Evaluate Rationalized NLP Models

OPENALEX - Publications

Jay DeYoung Sarthak Jain Nazneen Fatema Rajani Eric Lehman Caiming Xiong and 2 more

Jay DeYoung, Sarthak Jain, Nazneen Fatema Rajani, Eric Lehman, Caiming Xiong, Richard Socher, Byron C. Wallace. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.

10.18653/v1/2020.acl-main.408 article EN cc-by 2020-01-01

OpenMEE: Intuitive, open‐source software for meta‐analysis in ecology and evolutionary biology

OPENALEX - Publications

Byron Wallace Marc J. Lajeunesse George Dietz Issa J Dahabreh Thomas A Trikalinos and 2 more

Summary: Meta‐analysis and meta‐regression are statistical methods for synthesizing and modelling the results of different studies, and are critical research synthesis tools in ecology and evolutionary biology (E&E). However, many E&E researchers carry out meta‐analyses using software that is limited in its functionality and is not easily updatable. It is likely that these limitations have slowed the uptake of new methods and limited the scope and quality of inferences from research syntheses. We developed OpenMEE: Open Meta‐analyst for Ecology and Evolution to address...

10.1111/2041-210x.12708 article EN publisher-specific-oa Methods in Ecology and Evolution 2016-11-29

Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide

OPENALEX - Publications

Iain Marshall Anna H Noel-Storr Joël Kuiper James Thomas Byron Wallace

Machine learning (ML) algorithms have proven highly accurate for identifying Randomized Controlled Trials (RCTs) but are not used much in practice, in part because the best way to make use of the technology in a typical workflow is unclear. In this work, we evaluate ML models for RCT classification (support vector machines, convolutional neural networks, and ensemble approaches). We trained and optimized support vector machine and convolutional neural network models on the titles and abstracts of the Cochrane Crowd RCT set. We evaluated the models on an external dataset (Clinical...

10.1002/jrsm.1287 article EN cc-by Research Synthesis Methods 2018-01-09

Semi-automated screening of biomedical citations for systematic reviews

OPENALEX - Publications

Byron Wallace Thomas A Trikalinos Joseph Lau Carla E. Brodley Christopher H. Schmid

Systematic reviews address a specific clinical question by unbiasedly assessing and analyzing the pertinent literature. Citation screening is a time-consuming and critical step in systematic reviews. Typically, reviewers must evaluate thousands of citations to identify articles eligible for a given review. We explore the application of machine learning techniques to semi-automate citation screening, thereby reducing the reviewers' workload. We present a novel online classification strategy to automatically discriminate...

10.1186/1471-2105-11-55 article EN cc-by BMC Bioinformatics 2010-01-26

Living systematic reviews: 2. Combining human and machine effort

OPENALEX - Publications

James Thomas Anna H Noel-Storr Iain Marshall Byron Wallace Steve McDonald and 95 more

10.1016/j.jclinepi.2017.08.011 article EN cc-by-nc-nd Journal of Clinical Epidemiology 2017-09-11

Modelling Context with User Embeddings for Sarcasm Detection in Social Media

OPENALEX - Publications

Silvio Amir Byron Wallace Hao Lyu Paula Carvalho Mário J. Silva

We introduce a deep neural network for automated sarcasm detection. Recent work has emphasized the need for models to capitalize on contextual features, beyond lexical and syntactic cues present in utterances. For example, different speakers will tend to employ sarcasm regarding different subjects and, thus, sarcasm detection models ought to encode such speaker information. Current methods have achieved this by way of laborious feature engineering. By contrast, we propose to automatically learn and then exploit user embeddings, to be used in concert...

10.18653/v1/k16-1017 article EN cc-by 2016-01-01

Rationale-Augmented Convolutional Neural Networks for Text Classification

OPENALEX - Publications

Ye Zhang Iain Marshall Byron Wallace

We present a new Convolutional Neural Network (CNN) model for text classification that jointly exploits labels on documents and their constituent sentences. Specifically, we consider scenarios in which annotators explicitly mark sentences (or snippets) that support their overall document categorization, i.e., they provide rationales. Our model exploits such supervision via a hierarchical approach in which each document is represented by a linear combination of the vector representations of its component sentences. We propose a sentence-level...

10.18653/v1/d16-1076 preprint EN cc-by Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing 2016-01-01
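
The abstract above describes representing each document as a linear combination of its sentence vectors. The following is only a minimal sketch of that one idea, not the authors' model: the function name `document_vector`, the use of plain Python lists, and the choice to normalize the weights so they sum to 1 are all assumptions made for illustration. In the paper's setting the weights would come from a learned sentence-level component, which is not reproduced here.

```python
from typing import List, Sequence


def document_vector(sentence_vectors: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Combine sentence vectors into one document vector.

    Hypothetical helper: each sentence vector is scaled by its weight
    (normalized to sum to 1, an assumption of this sketch) and the
    scaled vectors are added together.
    """
    if not sentence_vectors or len(sentence_vectors) != len(weights):
        raise ValueError("need one weight per sentence vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")

    dim = len(sentence_vectors[0])
    doc = [0.0] * dim
    for vec, w in zip(sentence_vectors, weights):
        if len(vec) != dim:
            raise ValueError("all sentence vectors must share one dimension")
        for i, x in enumerate(vec):
            doc[i] += (w / total) * x
    return doc


# Two toy sentence vectors; the first sentence is weighted three times as much.
example = document_vector([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
```

Sentences with larger weights (for example, likely rationales) contribute more to the document representation that a downstream classifier would see.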

RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials

OPENALEX - Publications

Iain Marshall Joël Kuiper Byron Wallace

Abstract. Objective: To develop and evaluate RobotReviewer, a machine learning (ML) system that automatically assesses bias in clinical trials. From a (PDF-formatted) trial report, the system should determine risks of bias for the domains defined by the Cochrane Risk of Bias (RoB) tool, and extract supporting text for these judgments. Methods: We algorithmically annotated 12,808 trial PDFs using data from the Cochrane Database of Systematic Reviews (CDSR). Trials were labeled as being at low or high/unclear risk of bias for each domain, and sentences were labeled as informative...

10.1093/jamia/ocv044 article EN cc-by-nc Journal of the American Medical Informatics Association 2015-06-22

A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature

OPENALEX - Publications

Benjamin D. Nye Junyi Jessy Li Roma Patel Yinfei Yang Iain Marshall and 2 more

We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials. Annotations include demarcations of text spans that describe the Patient population enrolled, the Interventions studied and to what they were Compared, and the Outcomes measured (the 'PICO' elements). These spans are further annotated at a more granular level, e.g., individual interventions within them are marked and mapped onto a structured medical vocabulary. We acquired annotations from a diverse set of workers with varying...

10.18653/v1/p18-1019 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01

Learning to Faithfully Rationalize by Construction

OPENALEX - Publications

Sarthak Jain Sarah Wiegreffe Yuval Pinter Byron Wallace

In many settings it is important for one to be able to understand why a model made a particular prediction. In NLP this often entails extracting snippets of an input text ‘responsible for’ the corresponding model output; when such a snippet comprises tokens that indeed informed the model’s prediction, it is a faithful explanation. In some settings, faithfulness may be critical to ensure transparency. Lei et al. (2016) proposed a model to produce faithful rationales for neural text classification by defining independent snippet extraction and prediction modules....

10.18653/v1/2020.acl-main.409 article EN 2020-01-01

Revisiting Relation Extraction in the era of Large Language Models

OPENALEX - Publications

Somin Wadhwa Silvio Amir Byron Wallace

Relation extraction (RE) is the core NLP task of inferring semantic relationships between entities from text. Standard supervised RE techniques entail training modules to tag tokens comprising entity spans and then predict the relationship between them. Recent work has instead treated the problem as a sequence-to-sequence task, linearizing relations between entities as target strings to be generated conditioned on the input. Here we push the limits of this approach, using larger language models (GPT-3 and Flan-T5 large) than considered in prior...

10.18653/v1/2023.acl-long.868 article EN cc-by 2023-01-01

Leveraging generative AI for clinical evidence synthesis needs to ensure trustworthiness

OPENALEX - Publications

Gongbo Zhang Qiao Jin Denis Jered McInerney Yong Chen Fei Wang and 9 more

10.1016/j.jbi.2024.104640 article EN Journal of Biomedical Informatics 2024-04-10

Class Imbalance, Redux

OPENALEX - Publications

Byron Wallace Kevin Small Carla E. Brodley Thomas A Trikalinos

Class imbalance (i.e., scenarios in which classes are unequally represented in the training data) occurs in many real-world learning tasks. Yet despite its practical importance, there is no established theory of class imbalance, and existing methods for handling it are therefore not well motivated. In this work, we approach the problem of imbalance from a probabilistic perspective, and from this vantage identify dataset characteristics (such as dimensionality, sparsity, etc.) that exacerbate the problem. Motivated by this theory, we advocate...

10.1109/icdm.2011.33 article EN 2011-12-01

Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach

OPENALEX - Publications

Byron Wallace Anna H Noel-Storr Iain Marshall Aaron Cohen Neil R. Smalheiser and 1 more

Identifying all published reports of randomized controlled trials (RCTs) is an important aim, but it requires extensive manual effort to separate RCTs from non-RCTs, even using current machine learning (ML) approaches. We aimed to make this process more efficient via a hybrid approach using both crowdsourcing and ML. We trained a classifier to discriminate between citations that describe RCTs and those that do not. We then adopted a simple strategy of automatically excluding citations deemed very unlikely to be RCTs by the classifier and deferring to crowdworkers...

10.1093/jamia/ocx053 article EN cc-by-nc Journal of the American Medical Informatics Association 2017-05-18
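
The hybrid strategy described in the abstract above (automatically exclude citations the classifier considers very unlikely to be RCTs, defer the rest to crowdworkers) can be sketched roughly as follows. This is an illustrative assumption-laden sketch, not the published system: the function name `triage`, the score dictionary, and the threshold value are hypothetical, and the classifier that produces the scores is not shown.

```python
from typing import Dict, List, Tuple


def triage(scores: Dict[str, float],
           threshold: float) -> Tuple[List[str], List[str]]:
    """Split citations into auto-excluded and crowd-deferred groups.

    `scores` maps a citation ID to a (hypothetical) classifier estimate of
    the probability that the citation describes an RCT. Citations scoring
    below `threshold` are excluded automatically; all others are deferred
    to human crowdworkers.
    """
    excluded: List[str] = []
    deferred: List[str] = []
    for citation_id, probability in scores.items():
        if probability < threshold:
            excluded.append(citation_id)
        else:
            deferred.append(citation_id)
    return excluded, deferred


# Toy scores; the 0.1 threshold is an arbitrary value chosen for illustration.
auto_excluded, for_crowd = triage({"a": 0.01, "b": 0.6, "c": 0.2}, threshold=0.1)
```

A lower threshold sends more citations to the crowd and reduces the chance of wrongly discarding an RCT; a higher one saves more human effort at the cost of recall.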

Humans Require Context to Infer Ironic Intent (so Computers Probably do, too)

OPENALEX - Publications

Byron Wallace Do Kook Choe Laura Kertz Eugene Charniak

Automatically detecting verbal irony (roughly, sarcasm) is a challenging task because ironists say something other than, and often opposite to, what they actually mean. Discerning ironic intent exclusively from the words and syntax comprising texts (e.g., tweets, forum posts) is therefore not always possible: additional contextual information about the speaker and/or the topic at hand is often necessary. We introduce a new corpus that provides empirical evidence for this claim. We show that annotators frequently require...

10.3115/v1/p14-2084 article EN 2014-01-01

Neural information retrieval: at the end of the early years

OPENALEX - Publications

Kezban Dilek Onal Ye Zhang İsmail Sengör Altıngövde M.M. Rahman Pınar Karagöz and 14 more

A recent "third wave" of neural network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing. Because these modern NNs often comprise multiple interconnected layers, work in this area is often referred to as deep learning. Recent years have witnessed an explosive growth of research into NN-based approaches to information retrieval (IR). A significant body of work has thus been created. In this paper, we survey the current...

10.1007/s10791-017-9321-y article EN cc-by Information Retrieval 2017-11-10

Active Discriminative Text Representation Learning

OPENALEX - Publications

Ye Zhang Matthew Lease Byron Wallace

We propose a new active learning (AL) method for text classification with convolutional neural networks (CNNs). In AL, one selects the instances to be manually labeled with the aim of maximizing model performance with minimal effort. Neural models capitalize on word embeddings as representations (features), tuning these to the task at hand. We argue that AL strategies for multi-layered neural models should focus on selecting instances that most affect the embedding space (i.e., induce discriminative word representations). This is in contrast to traditional...

10.1609/aaai.v31i1.10962 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2017-02-12
