NFDI4DS | UHH-SEMS - Publication Details

Andrew Thrasher

ORCID: 0000-0003-0139-4059

About

Contact & Profiles

A5008754414

Research Areas

Cancer Genomics and Diagnostics
Genomics and Rare Diseases
RNA modifications and cancer
Genetic factors in colorectal cancer
Epigenetics and DNA Methylation
Distributed and Parallel Computing Systems
Scientific Computing and Data Management
Gene expression and cancer classification
Genetics, Bioinformatics, and Biomedical Research
Childhood Cancer Survivors' Quality of Life
Cancer-related molecular mechanisms research
Molecular Biology Techniques and Applications
Prenatal Screening and Diagnostics
Advanced Data Storage Technologies
Genomics and Phylogenetic Studies
Neuroblastoma Research and Treatments
Acute Lymphoblastic Leukemia research
Single-cell and spatial transcriptomics
Iron Metabolism and Disorders
Lung Cancer Research Studies
DNA Repair Mechanisms
Hemoglobinopathies and Related Disorders
Gene Regulatory Network Analysis
Bioinformatics and Genomic Networks
Cloud Computing and Resource Management

St. Jude Children's Research Hospital
2016-2025

Juno Therapeutics (Germany)
2018

Pfizer (United Kingdom)
2018

Gilead Sciences (Germany)
2018

Medtronic (United States)
2018

Cytokinetics (United States)
2018

Incyte (United States)
2018

Alpine Immune Sciences (United States)
2018

University of Notre Dame
2010-2014

Clinical cancer genomic profiling by three-platform sequencing of whole genome, whole exome and transcriptome

OPENALEX - Publications

Michael Rusch, Joy Nakitandwe, Sheila Shurtleff, Scott Newman, Zhaojie Zhang, and 28 more

To evaluate the potential of an integrated clinical test to detect diverse classes of somatic and germline mutations relevant to pediatric oncology, we performed three-platform whole-genome (WGS), whole-exome (WES), and transcriptome (RNA-Seq) sequencing of tumors and normal tissue from 78 cancer patients in a CLIA-certified, CAP-accredited laboratory. Our analysis pipeline achieves high accuracy by cross-validating variants between sequencing types, thereby removing the need for confirmatory testing, and facilitates...

10.1038/s41467-018-06485-7 article EN cc-by Nature Communications 2018-09-21

St. Jude Cloud: A Pediatric Cancer Genomic Data-Sharing Ecosystem

Clay McLeod, Alexander M. Gout, Xin Zhou, Andrew Thrasher, Delaram Rahbarinia, and 68 more

Effective data sharing is key to accelerating research to improve diagnostic precision, treatment efficacy, and long-term survival in pediatric cancer and other childhood catastrophic diseases. We present St. Jude Cloud (https://www.stjude.cloud), a cloud-based data-sharing ecosystem for accessing, analyzing, and visualizing genomic data from >10,000 pediatric patients with cancer and long-term survivors, and >800 pediatric sickle cell patients. Harmonized genomic data totaling 1.25 petabytes are freely available, including 12,104 whole...

10.1158/2159-8290.cd-20-1230 article EN Cancer Discovery 2021-01-06

Genetic Risk for Subsequent Neoplasms Among Long-Term Survivors of Childhood Cancer

Zhaoming Wang, Carmen L. Wilson, John Easton, Andrew Thrasher, Heather L. Mulder, and 44 more

Purpose: Childhood cancer survivors are at increased risk of subsequent neoplasms (SNs), but the germline genetic contribution is largely unknown. We assessed the contribution of pathogenic/likely pathogenic (P/LP) mutations in cancer predisposition genes to their SN risk. Patients and Methods: Whole-genome sequencing (30-fold) was performed on samples from survivors of childhood cancer who were ≥ 5 years since initial diagnosis and participants in the St Jude Lifetime Cohort Study, a retrospective hospital-based study with prospective clinical...

10.1200/jco.2018.77.8589 article EN Journal of Clinical Oncology 2018-05-30

CICERO: a versatile method for detecting complex and diverse driver fusions using cancer RNA sequencing data

Liqing Tian, Yongjin Li, Michael N. Edmonson, Xin Zhou, Scott Newman, and 16 more

To discover driver fusions beyond canonical exon-to-exon chimeric transcripts, we develop CICERO, a local assembly-based algorithm that integrates RNA-seq read support with extensive annotation for candidate ranking. CICERO outperforms commonly used methods, achieving a 95% detection rate for 184 independently validated driver fusions, including internal tandem duplications and other non-canonical events, in 170 pediatric cancer transcriptomes. Re-analysis of TCGA glioblastoma RNA-seq unveils previously...

10.1186/s13059-020-02043-x article EN cc-by Genome Biology 2020-05-28

Genomes for Kids: The Scope of Pathogenic Mutations in Pediatric Cancer Revealed by Comprehensive DNA and RNA Sequencing

Scott Newman, Joy Nakitandwe, Chimene Kesserwan, Elizabeth M. Azzato, David A. Wheeler, and 52 more

Genomic studies of pediatric cancer have primarily focused on specific tumor types or high-risk disease. Here, we used a three-platform sequencing approach, including whole-genome sequencing (WGS), whole-exome sequencing (WES), and RNA sequencing (RNA-seq), to examine tumor and germline genomes from 309 prospectively identified children with newly diagnosed (85%) or relapsed/refractory (15%) cancers, unselected for tumor type. Eighty-six percent of patients harbored diagnostic (53%), prognostic (57%), therapeutically relevant (25%), and/or...

10.1158/2159-8290.cd-20-1631 article EN cc-by-nc-nd Cancer Discovery 2021-07-23

NetBID2 provides comprehensive hidden driver analysis

Xinran Dong, Liang Ding, Andrew Thrasher, Xinge Wang, Jingjing Liu, and 14 more

Many signaling and other genes known as "hidden" drivers may not be genetically or epigenetically altered or differentially expressed at the mRNA or protein levels, but, rather, drive a phenotype such as tumorigenesis via post-translational modification mechanisms. However, conventional approaches based on genomics or differential expression are limited in exposing hidden drivers. Here, we present a comprehensive algorithm and toolkit, NetBID2 (data-driven network-based Bayesian inference of drivers, version...

10.1038/s41467-023-38335-6 article EN cc-by Nature Communications 2023-05-04

Cancer germline predisposing variants and late mortality from subsequent malignant neoplasms among long-term childhood cancer survivors: a report from the St Jude Lifetime Cohort and the Childhood Cancer Survivor Study

Cheng Chen, Na Qin, Mingjuan Wang, Qian Dong, Saima Sultana Tithi, and 28 more

10.1016/s1470-2045(23)00403-5 article EN publisher-specific-oa The Lancet Oncology 2023-10-01

Abstract 1089: Developing subtype-specific pediatric cancer molecular targets by aggregating diverse genomic data resources to the pediatric cancer (PeCan) knowledge base portal

David Finkelstein, Delaram Rahbarinia, Ramzi Alsallaq, Stephanie R. Sandor, Michael Macias, and 14 more

Knowledge about molecular targets for pediatric cancer has accelerated exponentially in recent years thanks to the increased application of multi-omics profiling in both research and clinical settings. The efficacy of genomic-based interventions may depend on whether the observed genomic abnormalities are fundamental to the pathogenesis of a specific subtype. At present, such information is limited due to the rapid evolution of subtype discovery and classification as well as disease heterogeneity owing to the presence of many...

10.1158/1538-7445.am2025-1089 article EN Cancer Research 2025-04-21

Harnessing parallelism in multicore clusters with the All-Pairs, Wavefront, and Makeflow abstractions

Li Yu, Christopher Moretti, Andrew Thrasher, Scott Emrich, Kenneth L. Judd, and 1 more

10.1007/s10586-010-0134-7 article EN Cluster Computing 2010-04-22

A polygenic score for acute vaso-occlusive pain in pediatric sickle cell disease

Evadnie Rampersaud, Guolian Kang, Lance E. Palmer, Sara R. Rashkin, Shuoguo Wang, and 38 more

Individuals with monogenic disorders can experience variable phenotypes that are influenced by genetic variation. To investigate this in sickle cell disease (SCD), we performed whole-genome sequencing (WGS) of 722 individuals with hemoglobin HbSS or HbSβ0-thalassemia from Baylor College of Medicine and the St. Jude Children's Research Hospital Sickle Cell Clinical Research and Intervention Program (SCCRIP) longitudinal cohort study. We developed pipelines to identify variants that modulate sickle hemoglobin polymerization in red blood...

10.1182/bloodadvances.2021004634 article EN cc-by-nc-nd Blood Advances 2021-07-20

A Framework for Scalable Genome Assembly on Clusters, Clouds, and Grids

Christopher Moretti, Andrew Thrasher, Li Yu, Mark A. Olson, Scott Emrich, and 1 more

Bioinformatics researchers need efficient means to process large collections of genomic sequence data. One application of interest, genome assembly, has great potential for parallelization; however, most previous attempts at parallelization require uncommon high-end hardware. This paper introduces the Scalable Assembler at Notre Dame (SAND) framework, which can achieve significant speedup using large numbers of commodity machines harnessed from clusters, clouds, and grids. SAND interfaces with Celera...

10.1109/tpds.2012.80 article EN IEEE Transactions on Parallel and Distributed Systems 2012-03-06

Taming complex bioinformatics workflows with weaver, makeflow, and starch

Andrew Thrasher, Rory Carmichael, Peter Bui, Li Yu, Douglas Thain, and 1 more

In this paper we discuss the challenges of common bioinformatics applications when deployed outside their initial development environments. We propose a three-tiered approach to mitigate some of these issues by leveraging an encapsulation tool, a high-level workflow language, and a portable intermediary. As a case study, we apply this approach to refactor a custom EST analysis pipeline. The Starch tool encapsulates program dependencies to simplify task specification and deployment. The Weaver language provides abstractions for...

10.1109/works.2010.5671858 article EN 2010-11-01

St. Jude Cloud—a Pediatric Cancer Genomic Data Sharing Ecosystem

Clay McLeod, Alexander M. Gout, Xin Zhou, Delaram Rahbarinia, Andrew Thrasher, and 63 more

Effective data sharing is key to accelerating research that will improve the precision of diagnoses, efficacy of treatments and long-term survival of pediatric cancer and other childhood catastrophic diseases. We present St. Jude Cloud (https://www.stjude.cloud), a cloud-based data sharing ecosystem developed via collaboration between St. Jude Children’s Research Hospital, DNAnexus, and Microsoft, for accessing, analyzing and visualizing genomic data from >10,000 pediatric cancer patients, long-term survivors, and >800 sickle cell patients....

10.1101/2020.08.24.264614 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2020-08-24

Scaling up genome annotation using MAKER and work queue

Andrew Thrasher, Zachary Musgrave, Brian Kachmarck, Douglas Thain, Scott Emrich

Next generation sequencing technologies have enabled the sequencing of many genomes. Because of the overall increasing demand and the inherent parallelism available in the required analyses, these bioinformatics applications should ideally run on clusters, clouds and/or grids. We present a modified annotation framework that achieves a speed-up of 45x using 50 workers on a Caenorhabditis japonica test case. We also evaluate these modifications within the Amazon EC2 cloud framework. The underlying genome annotation framework (MAKER) is parallelised as an MPI...

10.1504/ijbra.2014.062994 article EN International Journal of Bioinformatics Research and Applications 2014-01-01

MethylationToActivity: a deep-learning framework that reveals promoter activity landscapes from DNA methylomes in individual tumors

Justin Williams, Beisi Xu, Daniel K. Putnam, Andrew Thrasher, Chun‐Liang Li, and 2 more

Although genome-wide DNA methylomes have demonstrated their clinical value as reliable biomarkers for tumor detection, subtyping, and classification, their direct biological impacts at the individual gene level remain elusive. Here we present MethylationToActivity (M2A), a machine learning framework that uses convolutional neural networks to infer promoter activities, based on H3K4me3 and H3K27ac enrichment, from DNA methylation patterns for individual genes. Using publicly available datasets in real-world test...

10.1186/s13059-020-02220-y article EN cc-by Genome Biology 2021-01-19

Case Studies in Designing Elastic Applications

Deepu Rajan, Andrew Thrasher, Badi’ Abdul-Wahid, Jesús A. Izaguirre, Scott Emrich, and 1 more

Clusters, clouds, and grids offer access to large scale computational resources at low cost. This is especially appealing to scientific applications that require a very large scale to compete in the research space. However, the resources available across these platforms differ significantly in their availability, hardware, environment, performance, cost of use, and more. This requires the use of elastic applications that can adapt at run-time, transparently handling heterogeneity and failures. In this paper, we present case studies of several elastic applications built using Work Queue...

10.1109/ccgrid.2013.46 article EN 2013-05-01

Scripting distributed scientific workflows using Weaver

Peter Bui, Li Yu, Andrew Thrasher, Rory Carmichael, Irena Lanc, and 2 more

Summary: Weaver is a high‐level distributed computing framework that enables researchers to construct scalable scientific data‐processing workflows. Instead of developing a new workflow language, we introduce a domain‐specific language built on top of Python called Weaver, which takes advantage of users' familiarity with the programming language, minimizes barriers to adoption, and allows for integration with a rich ecosystem of existing software. In this paper, we provide an overview of Weaver's programming model, which enables users to organize and specify...

10.1002/cpe.1871 article EN Concurrency and Computation: Practice and Experience 2011-11-09

Shifting the bioinformatics computing paradigm: A case study in parallelizing genome annotation using MAKER and Work Queue

Andrew Thrasher, Douglas Thain, Scott Emrich, Zachary Musgrave

Next generation sequencing technologies have enabled various entities, ranging from large centers to individual laboratories, to sequence organisms of choice and analyze them on demand. Sequencing analysis, however, is only part of the equation: to learn about a certain organism, scientists need to annotate it. Each of these problems is highly parallel at a basic level of computation; few applications support single parallelization frameworks such as MPI. Because of the overall increasing demand for computational...

10.1109/iccabs.2012.6182647 article EN 2012-02-01

Precision Medicine for Sickle Cell Disease through Whole Genome Sequencing

Evadnie Rampersaud, Lance E. Palmer, Jane S. Hankins, Vivien A. Sheehan, Wenjian Bi, and 29 more

10.1182/blood-2018-99-117606 article EN Blood 2018-11-29

XenoCP: Cloud-based BAM cleansing tool for RNA and DNA from Xenograft

Michael Rusch, Liang Ding, Sasi Arunachalam, Andrew Thrasher, Hongjian Jin, and 6 more

Summary: Xenografts are important models for cancer research, and the presence of mouse reads in xenograft next generation sequencing data can potentially confound interpretation of experimental results. We present an efficient, cloud-based BAM-to-BAM cleaning tool called XenoCP to remove mouse reads from xenograft BAM files. We show the application of XenoCP in obtaining accurate gene expression quantification in RNA-seq and tumor heterogeneity in WGS of xenografts derived from brain and solid tumors. Availability and Implementation: St. Jude Cloud (...

10.1101/843250 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2019-11-15

Data from Genomes for Kids: The Scope of Pathogenic Mutations in Pediatric Cancer Revealed by Comprehensive DNA and RNA Sequencing

Scott Newman, Joy Nakitandwe, Chimene Kesserwan, Elizabeth M. Azzato, David A. Wheeler, and 52 more

Genomic studies of pediatric cancer have primarily focused on specific tumor types or high-risk disease. Here, we used a three-platform sequencing approach, including whole-genome sequencing (WGS), whole-exome sequencing (WES), and RNA sequencing (RNA-seq), to examine tumor and germline genomes from 309 prospectively identified children with newly diagnosed (85%) or relapsed/refractory (15%) cancers, unselected for tumor type. Eighty-six percent of patients harbored diagnostic (53%), prognostic (57%), therapeutically...

10.1158/2159-8290.c.6549437 preprint EN 2023-04-03

Abstract 6252: Driver indel discovery and allelic imbalance in >9,000 tumor RNA-Seq samples from the Cancer Genome Atlas (TCGA)

Kohei Hagiwara, Andrew Thrasher, Jinghui Zhang

High-throughput DNA sequencing technologies have enabled unbiased screening of genomic alterations such as single nucleotide variants (SNVs) and small insertions/deletions (indels). However, variant analysis using transcriptomic (RNA-seq) data has not become standard due to challenges in distinguishing true variants, in particular indels, from artifacts that can arise in RNA-seq mapping and library preparation. We previously developed a tool, RNAIndel, which classifies indels as somatic,...

10.1158/1538-7445.am2024-6252 article EN Cancer Research 2024-03-22

Abstract 3001: Germline mutations in cancer predisposition genes and risk for subsequent neoplasms among long-term survivors of childhood cancer in the St. Jude Lifetime Cohort

Zhaoming Wang, Carmen L. Wilson, John Easton, Dale J. Hedges, Qi Liu, and 37 more

Childhood cancer survivors are at increased risk of subsequent neoplasms (SN), largely considered to be therapy-related. Studies of cancer predisposition genes (CPGs) and SN risk among long-term survivors are lacking. We characterized germline mutations in CPGs among childhood cancer survivors to determine their contribution to SN risk. Whole genome (30x) and exome (100x) sequencing was performed for 2988 5+ year survivors (1629 leukemia/lymphoma, 332 CNS, 1027 other solid tumors, 53% male, median follow-up 28 [range 6-55] years). Survivors underwent a...

10.1158/1538-7445.am2017-3001 article EN Cancer Research 2017-07-01

Forecasting a Volatility Tsunami

Andrew Thrasher

The empirical aim of this paper is motivated by the anecdotal belief among the professional and non-professional investment community that a “low” reading in the CBOE Volatility Index (VIX) or a large decline alone are ample reasons to believe that volatility will spike in the near future. While the Volatility Index can be a useful tool for investors and traders, it is often misinterpreted and poorly used. This paper demonstrates that dispersion of the VIX acts as a better predictor of its future spikes.

10.2139/ssrn.2949847 article EN SSRN Electronic Journal 2017-01-01

Abstract 5478: CICERO: An accurate method for detecting complex and diverse driver fusions using cancer transcriptome sequencing (RNA-seq) data

Liqing Tian, Yongjin Li, Michael N. Edmonson, Xin Zhou, Scott Newman, and 14 more

Gene fusions are important biomarkers for cancer diagnosis, subtype classification and therapeutic decision-making. While fusion detection using RNA-seq data has become a standard practice, existing computational methods primarily focus on identifying canonical exon-to-exon fusions. However, more complex events such as multi-partner fusions, truncations, enhancer hijacking and internal tandem duplications (ITD) can also lead to abnormal function or aberrant transcription of driver...

10.1158/1538-7445.am2020-5478 article EN Cancer Research 2020-08-15
