NFDI4DS | UHH-SEMS - Publication Details

Limitations of Transformers on Clinical Text Classification

OPENALEX - Publications

Shang Gao Mohammed Alawad M. Todd Young John Gounley Noah Schaefferkoetter and 7 more

Bidirectional Encoder Representations from Transformers (BERT) and BERT-based approaches are the current state-of-the-art in many natural language processing (NLP) tasks; however, their application to document classification on long clinical texts is limited. In this work, we introduce four methods to scale BERT, which by default can only handle input sequences up to approximately 400 words long, to perform document classification on clinical texts several thousand words long. We compare these methods against two much simpler architectures - a word-level...

10.1109/jbhi.2021.3062322 article EN cc-by IEEE Journal of Biomedical and Health Informatics 2021-02-26

Hierarchical attention networks for information extraction from cancer pathology reports

Shang Gao M. Todd Young John X. Qiu Hong‐Jun Yoon James Blair Christian and 3 more

We explored how a deep learning (DL) approach based on hierarchical attention networks (HANs) can improve model performance for multiple information extraction tasks from unstructured cancer pathology reports compared to conventional methods that do not sufficiently capture syntactic and semantic contexts from free-text documents. Data for our analyses were obtained from 942 deidentified pathology reports collected by the National Cancer Institute Surveillance, Epidemiology, and End Results program. The HAN was implemented for 2...

10.1093/jamia/ocx131 article EN cc-by-nc Journal of the American Medical Informatics Association 2017-10-17

Deep clustering of protein folding simulations

Debsindhu Bhowmik Shang Gao M. Todd Young Arvind Ramanathan

We examine the problem of clustering biomolecular simulations using deep learning techniques. Since simulation datasets are inherently high dimensional, it is often necessary to build low dimensional representations that can be used to extract quantitative insights into the atomistic mechanisms that underlie complex biological processes. We use a convolutional variational autoencoder (CVAE) to learn biophysically relevant latent features from long time-scale protein folding simulations in an unsupervised manner....

10.1186/s12859-018-2507-5 article EN cc-by BMC Bioinformatics 2018-12-01

Challenges in Markov Chain Monte Carlo for Bayesian Neural Networks

Theodore Papamarkou Jacob Hinkle M. Todd Young David E. Womble

Markov chain Monte Carlo (MCMC) methods have not been broadly adopted in Bayesian neural networks (BNNs). This paper initially reviews the main challenges in sampling from the parameter posterior of a neural network via MCMC. Such challenges culminate in a lack of convergence to the parameter posterior. Nevertheless, this paper shows that a nonconverged Markov chain, generated via MCMC sampling from the parameter space of a neural network, can yield via Bayesian marginalization a valuable predictive distribution of the output of the neural network. Classification examples based on multilayer perceptrons showcase highly accurate...

10.1214/21-sts840 article EN Statistical Science 2022-06-22

Distributed Bayesian optimization of deep reinforcement learning algorithms

M. Todd Young Jacob Hinkle Ramakrishnan Kannan Arvind Ramanathan

Significant strides have been made in supervised learning settings thanks to the successful application of deep learning. Now, recent work has brought the techniques of deep learning to bear on sequential decision processes in the area of deep reinforcement learning (DRL). Currently, little is known regarding hyperparameter optimization for DRL algorithms. Given that DRL algorithms are computationally intensive to train, and are known to be sample inefficient, optimizing model hyperparameters for DRL presents significant challenges to established techniques. We provide...

10.1016/j.jpdc.2019.07.008 article EN cc-by-nc-nd Journal of Parallel and Distributed Computing 2020-02-04

Exascale Deep Learning for Scientific Inverse Problems

Nouamane Laanait Joshua Romero Junqi Yin M. Todd Young Sean Treichler and 4 more

We introduce novel communication strategies in synchronous distributed Deep Learning consisting of decentralized gradient reduction orchestration and computational graph-aware grouping of gradient tensors. These new techniques produce an optimal overlap between computation and communication and result in near-linear scaling (0.93) of distributed training up to 27,600 NVIDIA V100 GPUs on the Summit Supercomputer. We demonstrate our techniques in the context of training a Fully Convolutional Neural Network to approximate the solution of a longstanding scientific inverse problem in materials...

10.48550/arxiv.1909.11150 preprint EN other-oa arXiv (Cornell University) 2019-01-01

HyperSpace: Distributed Bayesian Hyperparameter Optimization

M. Todd Young Jacob Hinkle Arvind Ramanathan Ramakrishnan Kannan

As machine learning models continue to increase in complexity, so does the potential number of free model parameters commonly known as hyperparameters. While there has been considerable progress toward finding optimal configurations of these hyperparameters, many optimization procedures are treated as black boxes. We believe optimization methods should not only return a set of optimized hyperparameters but also give insight into the effects of model hyperparameter settings. To this end, we present HyperSpace, a parallel implementation of Bayesian...

10.1109/cahpc.2018.8645954 article EN 2018-09-01

Information Extraction from Cancer Pathology Reports with Graph Convolution Networks for Natural Language Texts

Hong-Jun Yoon John Gounley M. Todd Young Georgia D. Tourassi

Graph-of-words is a flexible and efficient text representation which addresses well-known challenges, such as word ordering and variation of expressions, to natural language processing. In this paper, we consider the latest graph-based convolutional neural network technique, the Text Graph Convolutional Network (Text GCN), in the context of performing classification tasks on free-form natural language texts. To do this, we designed a study of multi-task information extraction from medical text documents. We implemented multi-task learning in the Text GCN,...

10.1109/bigdata47090.2019.9006270 article EN 2019 IEEE International Conference on Big Data (Big Data) 2019-12-01

Distinct Structural Flexibility within SARS-CoV-2 Spike Protein Reveals Potential Therapeutic Targets

Serena H. Chen M. Todd Young John Gounley Christopher B. Stanley Debsindhu Bhowmik

The emergence and rapid worldwide spread of the novel coronavirus disease, COVID-19, has prompted concerted efforts to find successful treatments. The causative virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), uses its spike (S) protein to gain entry into host cells. Therefore, the S protein presents a viable target to develop a directed therapy. Here, we deployed an integrated artificial intelligence with molecular dynamics simulation approach to provide new details of the S protein structure. Based on comprehensive...

10.1101/2020.04.17.047548 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2020-04-18

Deep Generative Model Driven Protein Folding Simulation

Heng Ma Debsindhu Bhowmik Hyungro Lee Matteo Turilli M. Todd Young and 2 more

Significant progress in computer hardware and software has enabled molecular dynamics (MD) simulations to model complex biological phenomena such as protein folding. However, enabling MD simulations to access biologically relevant timescales (e.g., beyond milliseconds) still remains challenging. These limitations include (1) quantifying which set of states have already been (sufficiently) sampled in an ensemble of MD runs, and (2) identifying novel states from which simulations can be initiated to sample rare events (e.g., sampling folding events). With the...

10.48550/arxiv.1908.00496 preprint EN cc-by-nc-sa arXiv (Cornell University) 2019-01-01

How Distinct Structural Flexibility within SARS-CoV-2 Spike Protein Reveals Potential Therapeutic Targets

Serena H. Chen M. Todd Young John Gounley Christopher B. Stanley Debsindhu Bhowmik

The emergence and rapid worldwide spread of the novel coronavirus disease, COVID-19, has prompted concerted efforts to find successful treatments. The causative virus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), uses its spike (S) protein to gain entry into host cells. Therefore, the S protein presents a viable target to develop a directed therapy. Here, we deployed an integrated artificial intelligence with all-atom molecular dynamics simulation approach to provide new details of the S protein structure. Based on comprehensive...

10.1109/bigdata52589.2021.9671323 article EN 2021 IEEE International Conference on Big Data (Big Data) 2021-12-15

Deep clustering of protein folding simulations

Debsindhu Bhowmik Shang Gao M. Todd Young Arvind Ramanathan

We examine the problem of clustering biomolecular simulations using deep learning techniques. Since simulation datasets are inherently high dimensional, it is often necessary to build low dimensional representations that can be used to extract quantitative insights into the atomistic mechanisms that underlie complex biological processes. In this paper, we use a convolutional variational autoencoder (CVAE) to learn biophysically relevant latent features from long time-scale protein folding simulations in an...

10.1101/339879 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2018-06-12

A Statewide Program to Evaluate the Quality of Care Provided to Persons with HIV Infection

Bruce Agins M. Todd Young W. C. Ellis Gary R. Burke Frances F. Rotunno

10.1016/s1070-3241(16)30171-7 article EN The Joint Commission Journal on Quality Improvement 1995-09-01

Predicting respiratory decompensation in mechanically ventilated adult ICU patients

Yvette Tan M. Todd Young Akanksha Girish Beini Hu Zina Kurian and 5 more

Introduction: Mechanical ventilation is a life-saving treatment in the Intensive Care Unit (ICU), but often causes patients to be at risk of further respiratory complication. We created a statistical model utilizing electronic health record and physiologic vitals data to predict the Center for Disease Control and Prevention (CDC) defined Ventilator Associated Complications (VACs). Further, we evaluated the effect of data temporal resolution and feature generation method choice on the accuracy of such a constructed model....

10.3389/fphys.2023.1125991 article EN cc-by Frontiers in Physiology 2023-04-14

Computer-aided Abnormality Detection in Chest Radiographs in a Clinical Setting via Domain-adaptation

Abhishek Dubey M. Todd Young Christopher B. Stanley Dalton Lunga Jacob Hinkle

Deep learning (DL) models are being deployed at medical centers to aid radiologists for diagnosis of lung conditions from chest radiographs. Such models are often trained on a large volume of publicly available labeled radiographs. These pre-trained DL models' ability to generalize in clinical settings is poor because of the changes in data distributions between publicly available and privately held radiographs. In chest radiographs, the heterogeneity arises from the diverse X-ray equipment and their configurations used for generating the images. In the machine learning community, the challenges posed by...

10.5220/0010302500002865 article EN cc-by-nc-nd Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies 2021-01-01

Computer-aided Abnormality Detection in Chest Radiographs in a Clinical Setting via Domain-adaptation

Abhishek Dubey M. Todd Young Christopher B. Stanley Dalton Lunga Jacob Hinkle

Deep learning (DL) models are being deployed at medical centers to aid radiologists for diagnosis of lung conditions from chest radiographs. Such models are often trained on a large volume of publicly available labeled radiographs. These pre-trained DL models' ability to generalize in clinical settings is poor because of the changes in data distributions between publicly available and privately held radiographs. In chest radiographs, the heterogeneity arises from the diverse X-ray equipment and their configurations used for generating the images. In the machine learning community, the challenges posed by...

10.5220/0010302500650072 article EN cc-by-nc-nd Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies 2021-01-01

Towards the Development of Entropy-Based Anomaly Detection in an Astrophysics Simulation

Drew Schmidt O. E. Bronson Messer M. Todd Young Michael A. Matheson

The use of AI and ML for scientific applications is currently a very exciting and dynamic field. Much of this excitement for HPC has focused on ML applications whose analysis and classification generate large numbers of flops. Others seek to replace scientific simulations with data-driven surrogate models. But another important use case lies in the combination application of ML to improve simulation accuracy. To that end, we present an anomaly problem which arises from a core-collapse supernovae simulation. We discuss strategies and early successes...

10.48550/arxiv.2009.02430 preprint EN other-oa arXiv (Cornell University) 2020-01-01

HyperSpace

Arvind Ramanathan Jacob Hinkle Ramakrishnan Kannan M. Todd Young

10.5281/zenodo.1401479 article OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information) 2019-01-25

Computer-aided abnormality detection in chest radiographs in a clinical setting via domain-adaptation

Abhishek Dubey M. Todd Young Christopher P. Stanley Dalton Lunga Jacob Hinkle

Deep learning (DL) models are being deployed at medical centers to aid radiologists for diagnosis of lung conditions from chest radiographs. Such models are often trained on a large volume of publicly available labeled radiographs. These pre-trained DL models' ability to generalize in clinical settings is poor because of the changes in data distributions between publicly available and privately held radiographs. In chest radiographs, the heterogeneity arises from the diverse X-ray equipment and their configurations used for generating the images. In the machine learning community, the challenges posed by...

10.48550/arxiv.2012.10564 preprint EN cc-by arXiv (Cornell University) 2020-01-01

Royal academy of medicine in Ireland section of bioengineering

P. J. Prendergast Ian Callanan Ciaran Simms Christopher Lyons C. Brady and 95 more

10.1007/bf02937950 article EN Irish Journal of Medical Science (1971 -) 1998-04-01