- Distributed and Parallel Computing Systems
- Scientific Computing and Data Management
- Advanced Data Storage Technologies
- Cancer Genomics and Diagnostics
- Genomic variations and chromosomal abnormalities
- Parallel Computing and Optimization Techniques
- Research Data Management Practices
- Computational Physics and Python Applications
- Prenatal Screening and Diagnostics
- Genomics and Phylogenetic Studies
- Cloud Computing and Resource Management
- Advanced MRI Techniques and Applications
- Molecular Biology Techniques and Applications
- Viral-associated cancers and disorders
- Medical Imaging Techniques and Applications
- Plasma and Flow Control in Aerodynamics
- Combustion and Detonation Processes
- Gene expression and cancer classification
- Evolution and Genetic Dynamics
- Cell Image Analysis Techniques
- Acute Lymphoblastic Leukemia research
- Political and Economic history of UK and US
- Genomics and Rare Diseases
- Regulation of Appetite and Obesity
- Aquatic Ecosystems and Phytoplankton Dynamics
University of Chicago
2007-2024
BioNano Genomics (United States)
2021-2024
Cambridge Quantum Computing (United Kingdom)
2023
Southern Maine Community College
2021
University of Illinois Chicago
2019
University College London
2013
Argonne National Laboratory
2007-2009
University of Southern California
2005
University of Oklahoma
2004
We present Swift, a system that combines a novel scripting language called SwiftScript with a powerful runtime system based on CoG Karajan, Falkon, and Globus to allow for the concise specification, and reliable and efficient execution, of large loosely coupled computations. Swift adopts and adapts ideas first explored in the GriPhyN virtual data system, improving on that system in many regards. We describe the SwiftScript language and its use of XDTM to describe the logical structure of complex file system structures. We also present the Swift runtime system and its use of CoG Karajan, Falkon, and Globus services to dispatch and manage the execution of many tasks in parallel and grid environments....
High-level programming languages such as Python are increasingly used to provide intuitive interfaces to libraries written in lower-level languages and for assembling applications from various components. This migration towards orchestration rather than implementation, coupled with the growing need for parallel computing (e.g., due to big data and the end of Moore's law), necessitates rethinking how parallelism is expressed in programs. Here, we present Parsl, a parallel scripting library that augments Python with simple, scalable, and flexible...
The first Provenance Challenge was set up in order to provide a forum for the community to understand the capabilities of different provenance systems and the expressiveness of their provenance representations. To this end, a functional magnetic resonance imaging workflow was defined, which participants had to either simulate or run in order to produce some provenance representation, from which a set of identified queries had to be implemented and executed. Sixteen teams responded to the challenge, and submitted their inputs. In this paper, we present the challenge workflow and queries, and summarize...
The Grid2003 Project has deployed a multi-virtual organization, application-driven grid laboratory (Grid3) that has sustained for several months the production-level services required by physics experiments of the Large Hadron Collider at CERN (ATLAS and CMS), the Sloan Digital Sky Survey project, the gravitational wave search experiment LIGO, the BTeV experiment at Fermilab, as well as applications in molecular structure analysis and genome analysis, and computer science research projects in such areas as job and data scheduling. The infrastructure...
Scripting accelerates and simplifies the composition of existing codes to form more powerful applications. Parallel scripting extends this technique to allow for the rapid development of highly parallel applications that can run efficiently on platforms ranging from multicore workstations to petascale supercomputers.
We have extended the Falkon lightweight task execution framework to make loosely coupled programming on petascale systems a practical and useful programming model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new, and potentially far larger, class of applications to leverage petascale systems, such as the IBM Blue Gene/P...
Genomic structural variants comprise a significant fraction of somatic mutations driving cancer onset and progression. However, such variants are not readily revealed by standard next-generation sequencing. Optical genome mapping (OGM) surpasses short-read sequencing in detecting large (>500 bp) and complex structural variants (SVs) but requires isolation of ultra-high-molecular-weight DNA from the tissue of interest. We have successfully applied a protocol involving a paramagnetic nanobind disc to a wide range of solid tumors. Using as...
The virtual data model allows data sets to be described prior to, and separately from, their physical materialization. We have implemented this model in a Virtual Data Language (VDL) and associated supporting tools, which provide for both the storage, query, and retrieval of virtual data set descriptions, and the automated, on-demand materialization of virtual data sets. We use a standardized provenance challenge exercise to illustrate the powerful queries that can be performed on the data maintained by these tools, which for a single virtual data set can include three elements: the computational...
Structural variations (SVs) play a key role in the pathogenicity of hematological malignancies. Standard-of-care (SOC) methods such as karyotyping and fluorescence in situ hybridization (FISH), which have been employed globally for the past three decades, have significant limitations in terms of resolution and the number of recurrent aberrations that can be simultaneously assessed, respectively. Next-generation sequencing (NGS)-based technologies are now widely used to detect clinically significant sequence variants but are limited...
The recommended practice for individuals suspected of a genetic etiology for disorders including unexplained developmental delay/intellectual disability (DD/ID), autism spectrum disorders (ASD), and multiple congenital anomalies (MCA) involves a genetic testing workflow including chromosomal microarray (CMA), Fragile-X testing, karyotype analysis, and/or sequencing-based gene panels. Since genomic imbalances are often found to be causative, CMA is recommended as first tier testing for many indications. Optical genome mapping (OGM) is an emerging next...
Parallel scripting is a loosely-coupled programming model in which applications are composed of highly parallel scripts of program invocations that process and exchange data via files. We characterize here the applications that can benefit from parallel scripting on petascale-class machines, describe the mechanisms that make this feasible on such systems, and present results achieved with parallel scripts on currently available petascale computers.
Python is increasingly the lingua franca of scientific computing. It is used as a higher level language to wrap lower-level libraries and to compose scripts from various independent components. However, scaling and moving Python programs from laptops to supercomputers remains a challenge. Here we present Parsl, a parallel scripting library for Python. Parsl makes it straightforward for developers to implement parallelism in Python by annotating functions that can be executed asynchronously and in parallel, and to scale analyses from a laptop to thousands...
The SREB (Super-conserved Receptors Expressed in Brain) family of G protein-coupled receptors is highly conserved across vertebrates and consists of three members: SREB1 (orphan receptor GPR27), SREB2 (GPR85), and SREB3 (GPR173). Ligands for these receptors are largely unknown or only recently identified, and the functions of all three are still beginning to be understood, including roles in glucose homeostasis, neurogenesis, and hypothalamic control of reproduction. In addition to the brain, the receptors are expressed in gonads, but relatively few studies have...
Large-scale HPC workflows are increasingly implemented in dynamic languages such as Python, which allow for more rapid development than traditional techniques. However, the cost of executing Python applications at scale is often dominated by the distribution of common datasets and complex software dependencies. As the application scales up, data distribution becomes a limiting factor that prevents scaling beyond a few hundred nodes. To address this problem, we present the integration of Parsl (a Python-native parallel...