- Scientific Computing and Data Management
- Genomics and Phylogenetic Studies
- Distributed and Parallel Computing Systems
- Research Data Management Practices
- Plant nutrient uptake and metabolism
- Photosynthetic Processes and Mechanisms
- Metabolomics and Mass Spectrometry Studies
- Gamma-ray bursts and supernovae
- Plant and animal studies
- Data Visualization and Analytics
- Advanced Proteomics Techniques and Applications
- Pediatric Hepatobiliary Diseases and Treatments
- Reinforcement Learning in Robotics
- Mass Spectrometry Techniques and Applications
- RNA modifications and cancer
- Plant-Microbe Interactions and Immunity
- Parasitic Infections and Diagnostics
- Molecular Biology Techniques and Applications
- Protein Structure and Dynamics
- Cancer-related gene regulation
- Animal Genetics and Reproduction
- RNA and protein synthesis mechanisms
- Advanced Data Storage Technologies
- Amoebic Infections and Treatments
- Multimedia Communication and Technology
- Texas Advanced Computing Center (2016-2023)
- J. Craig Venter Institute (2008-2017)
- National Human Genome Research Institute (1999)
- National Institutes of Health (1999)
A big challenge in current systems biology research arises when different types of data must be accessed from separate sources and visualized using separate tools. The high cognitive load required to navigate such a workflow is detrimental to hypothesis generation. Accordingly, there is a need for a robust platform that incorporates all data and provides integrated search, analysis, and visualization features through a single portal. Here, we present ePlant (http://bar.utoronto.ca/eplant), a visual analytic tool for exploring...
The Arabidopsis Information Portal (https://www.araport.org) is a new online resource for plant biology research. It houses the Arabidopsis thaliana genome sequence and associated annotation. It was conceived as a framework that allows the research community to develop and release 'modules' that integrate, analyze and visualize Arabidopsis data that may reside at remote sites. The current implementation provides an indexed database of core genomic information. These data are made available through feature-rich web applications that provide search, data mining,...
Mass spectrometry (MS) based label-free protein quantitation has mainly focused on analysis of ion peak heights and peptide spectral counts. Most analyses of tandem mass spectrometry (MS/MS) data begin with an enzymatic digestion of a complex protein mixture to generate smaller peptides that can be separated and identified by an MS/MS instrument. Peptide spectral counting techniques attempt to quantify protein abundance by counting the number of detected tryptic peptides and their corresponding MS spectra. However, spectral counting is confounded by the fact that peptide physicochemical properties severely...
ThaleMine (https://apps.araport.org/thalemine/) is a comprehensive data warehouse that integrates a wide array of genomic information on the model plant Arabidopsis thaliana. The data collection currently includes the latest structural and functional annotation from the Araport11 update, the Col-0 genome sequence, RNA-seq expression, co-expression, protein interactions, homologs, pathways, publications, alleles, germplasm and phenotypes. The data are collected from a variety of public resources. Users can browse gene-specific data through...
The outcome of an Entamoeba histolytica infection is variable and can result in either asymptomatic carriage, immediate or latent disease (diarrhea/dysentery/amebic liver abscess). An E. histolytica multilocus genotyping system based on tRNA gene-linked arrays has shown that genetic differences exist among parasites isolated from patients with different symptoms; however, the tRNA gene-linked arrays cannot be located in the current assembly of the E. histolytica Reference genome (strain HM-1:IMSS) and are highly variable. To probe the population structure and identify...
Summary: Araport is an open‐source, online community resource for research on the Arabidopsis thaliana genome and related data. Araport is developed through a partnership between the J. Craig Venter Institute, the Texas Advanced Computing Center at The University of Texas at Austin, and the University of Cambridge. Part of the open architecture of Araport is the Science Applications Workspace. Taking an ‘app store’ approach, users can choose applications developed both by the Araport team and by community developers to create a customized environment for their work. Araport also provides tooling and support for developing...
The Homeodomain Resource is a comprehensive collection of sequence, structure and genomic information on the homeodomain protein family. Available through the Resource are both full-length and domain-only sequence data, as well as X-ray and NMR structural data for proteins and protein-DNA complexes. Also available is information on human genetic diseases and disorders in which proteins from the homeodomain family play an important role; the genomic information includes relevant gene symbols, cytogenetic map locations, and specific mutation data. Search engines are provided to allow users to easily...
Beginning with the initial release of the DesignSafe JupyterHub in late 2015, TACC has been building and maintaining custom JupyterHub clusters for research groups across different domains of science and engineering. Today, TACC maintains five production JupyterHub systems utilizing over half a terabyte of memory and hundreds of CPU cores, supporting nearly 1,600 unique users combined. In this paper, we describe our approach to these cyberinfrastructure projects and our collaborative approach to integrating Jupyter into research communities. For two such groups,...
Containers are becoming essential to support the diversity of scientific computing workloads at academic centers. Here, we offer perspectives and experiences from the Texas Advanced Computing Center on: installation and configuration of select containerization platforms; incorporation of containers into the module system to improve their discoverability and usability; facilitation of advanced use cases including MPI containers, GPU containers, and containers for multiple instruction set architectures; and finally on best practices for end users...
WebBLAST is a suite of programs intended to assist in organizing sequencing data and to provide first-pass sequence analysis in an automated fashion. Data processing is fully automated, with end-users being presented both graphical and tabular summaries of data that can be viewed using any Web browser. The program is free and available at http://genome.nhgri.nih.gov/webblast.
The Histone Sequence Database is an annotated and searchable collection of all available histone and histone fold sequences and structures. Particular emphasis has been placed on documenting conflicts between similar sequence entries from a number of source databases, conflicts that are not necessarily documented in the source databases themselves. New additions to the database include compilations of post-translational modifications for each of the core and linker histones, as well as genomic information in the form of map loci for the human histone gene complement, with...
The Arabidopsis Information Portal (AIP) is an open-access online community resource for research on the Arabidopsis thaliana genome and related data. AIP is developed through a partnership between the J. Craig Venter Institute, the Texas Advanced Computing Center at the University of Texas at Austin, and the University of Cambridge. Part of the open architecture of AIP is the science applications workspace. Researchers can select applications developed both by the AIP team and by the community from the "app store" to construct a customized environment for their work. AIP provides tooling and support for developing applications, including application...
Virtual screening is a key step of the drug discovery process which utilizes computational resources to simulate the behavior of small molecules in the binding site of a target protein [13]. Researchers often test millions of molecules when searching for an early hit compound, requiring significant CPU hours. An accessible, convenient, fast, and computationally efficient means of virtual screening is desirable in order for researchers to conserve resources in the early phase of drug discovery. We developed an application programming interface (API) and integrated workflow that...
The standard process for software development has changed dramatically in the past decade. What was once a large effort of installing the same software across different systems has become much more streamlined with the rapid emergence and wide-scale adoption of Docker as the de facto container management ecosystem. Coincidentally, this has had an impact on the HPC and scientific computing community, allowing system maintainers to maintain and install software packages with less effort [12]. This can be seen through the adoption of containers on many large-scale systems,...