- Scientific Computing and Data Management
- Research Data Management Practices
- Cancer Genomics and Diagnostics
- Genetics, Bioinformatics, and Biomedical Research
- Genomics and Phylogenetic Studies
- Distributed and Parallel Computing Systems
- Cell Image Analysis Techniques
- Bacteriophages and microbial interactions
- Bioinformatics and Genomic Networks
- RNA and protein synthesis mechanisms
- Computational Drug Discovery Methods
- Evolution and Genetic Dynamics
- Protein Structure and Dynamics
- SARS-CoV-2 detection and testing
- Single-cell and spatial transcriptomics
- Viral Infections and Outbreaks Research
- Advanced Numerical Analysis Techniques
- RNA Research and Splicing
- Vaccines and immunoinformatics approaches
- Genetic diversity and population structure
- Image Processing and 3D Reconstruction
- Machine Learning in Bioinformatics
- Manufacturing Process and Optimization
- COVID-19 epidemiological studies
- Gene expression and cancer classification
Pennsylvania State University
2014-2023
Consumer Healthcare Products Association
2021
Galaxy (homepage: https://galaxyproject.org, main public server: https://usegalaxy.org) is a web-based scientific analysis platform used by tens of thousands of scientists across the world to analyze large biomedical datasets such as those found in genomics, proteomics, metabolomics and imaging. Started in 2005, Galaxy continues to focus on three key challenges of data-driven science: making analyses accessible to all researchers, ensuring analyses are completely reproducible, and making it simple to communicate analyses so that they can be...
High-throughput data production technologies, particularly 'next-generation' DNA sequencing, have ushered in widespread and disruptive changes to biomedical research. Making sense of the large datasets produced by these technologies requires sophisticated statistical and computational methods, as well as substantial computational power. This has led to an acute crisis in the life sciences, as researchers without informatics training attempt to perform computation-dependent analyses. Since 2005, the Galaxy project has worked to address...
Galaxy is a mature, browser-accessible workbench for scientific computing. It enables scientists to share, analyze and visualize their own data, with minimal technical impediments. A thriving global community continues to use, maintain and contribute to the project, with support from multiple national infrastructure providers that enable freely accessible analysis and training services. The Galaxy Training Network supports free, self-directed, virtual training with >230 integrated tutorials. Project engagement metrics have continued to grow...
HYpothesis testing using PHYlogenies (HyPhy) is a scriptable, open-source package for fitting a broad range of evolutionary models to multiple sequence alignments, and for conducting subsequent parameter estimation and hypothesis testing, primarily in the maximum likelihood statistical framework. It has become a popular choice for characterizing various aspects of the evolutionary process: natural selection, evolutionary rates, recombination, and coevolution. The 2.5 release (available from www.hyphy.org) includes completely...
The proliferation of web-based integrative analysis frameworks has enabled users to perform complex analyses directly through the web. Unfortunately, it has also revoked the freedom to easily select the most appropriate tools. To address this, we have developed the Galaxy ToolShed.
We present Bioconda (https://bioconda.github.io), a distribution of bioinformatics software for the lightweight, multiplatform and language-agnostic package manager Conda. Currently, Bioconda offers a collection of over 3000 packages, which is continuously maintained, updated, and extended by a growing global community of more than 200 contributors. Bioconda improves analysis reproducibility by allowing users to define isolated environments with defined software versions, all of which are easily installed and managed without...
Motivation: RNAs fold into complex structures that are integral to the diverse mechanisms underlying RNA regulation of gene expression. Recent development of transcriptome-wide RNA structure profiling, through the application of structure-probing enzymes or chemicals combined with high-throughput sequencing, has opened a new field that greatly expands the amount of in vitro and in vivo RNA structural information available. The resultant datasets provide the opportunity to investigate RNA structural information on a global scale. However, analysis of the data...
The current state of much of the Wuhan pneumonia virus (severe acute respiratory syndrome coronavirus 2 [SARS-CoV-2]) research shows a regrettable lack of data sharing and considerable analytical obfuscation. This impedes global cooperation, which is essential for tackling public health emergencies and requires unimpeded access to data, analysis tools, and computational infrastructure. Here, we show that community efforts in developing open software tools over the past 10 years, combined with national...
An important unmet need revealed by the COVID-19 pandemic is the near-real-time identification of potentially fitness-altering mutations within rapidly growing SARS-CoV-2 lineages. Although powerful molecular sequence analysis methods are available to detect and characterize patterns of natural selection within modestly sized gene-sequence datasets, the computational complexity of these methods and their sensitivity to sequencing errors render them effectively inapplicable in large-scale genomic surveillance contexts....
The COVID-19 pandemic is the first global health crisis to occur in the age of big genomic data. Although data generation capacity is well established and sufficiently standardized, analytical capacity is not. To establish analytical capacity, it is necessary to pull together computational resources and deliver the best open source tools and analysis workflows within a ready-to-use, universally accessible resource. Such a resource should not be controlled by a single research group, institution, or country. Instead, it should be maintained by a community of users and developers who...
Modern biology continues to become increasingly computational. Datasets are becoming progressively larger, more complex, and more abundant. The computational savviness necessary to analyze these data creates an ongoing obstacle for experimental biologists. Galaxy (galaxyproject.org) provides access to tools in a web-based interface. It also provides access to major public biological data repositories, allowing private data to be combined with public datasets. Galaxy is hosted on high-capacity servers worldwide and is accessible for free, with an option to be installed...
An important component of efforts to manage the ongoing COVID-19 pandemic is the Rapid Assessment of how natural selection contributes to the emergence and proliferation of potentially dangerous SARS-CoV-2 lineages and CLades (RASCL). The RASCL pipeline enables continuous comparative phylogenetics-based analyses of rapidly growing clade-focused genome surveillance datasets, such as those produced following the initial detection of variants. From these datasets, RASCL automatically generates down-sampled codon alignments of individual...
Summary: Properly and effectively managing reference datasets is an important task for many bioinformatics analyses. Refgenie is a reference asset management system that allows users to easily organize, retrieve, and share such datasets. Here, we describe the integration of refgenie into the Galaxy platform. Server administrators are able to configure Galaxy to make use of reference datasets made available on a refgenie instance. Additionally, a Galaxy Data Manager tool has been developed to provide a graphical interface to refgenie's remote reference retrieval functionality. A...
Properly and effectively managing reference datasets is an important task for many bioinformatics analyses. Refgenie is a reference asset management system that allows users to easily organize, retrieve and share such datasets. Here, we describe the integration of refgenie into the Galaxy platform. Server administrators are able to configure Galaxy to make use of reference datasets made available on a refgenie instance. In addition, a Galaxy Data Manager tool has been developed to provide a graphical interface to refgenie's remote reference retrieval functionality. A large collection...
Background: Protein–protein interactions play a crucial role in almost all cellular processes. Identifying interacting proteins reveals insight into living organisms and yields novel drug targets for disease treatment. Here, we present a publicly available, automated pipeline to predict genome-wide protein–protein interactions and produce high-quality multimeric structural models. Results: Application of our method to the Human and Yeast genomes yields protein–protein interaction networks similar in quality to common experimental...
Protein-protein interactions play a crucial role in almost all cellular processes. Identifying interacting proteins reveals insight into living organisms and yields novel drug targets for disease treatment. Here, we present a publicly available, automated pipeline to predict genome-wide protein-protein interactions and produce high-quality multimeric structural models. Application of our method to the Human and Yeast genomes yields protein-protein interaction networks similar in quality to common experimental methods. We...