- Genomics and Phylogenetic Studies
- Machine Learning in Bioinformatics
- Minerals Flotation and Separation Techniques
- Genetics, Bioinformatics, and Biomedical Research
- Bacteriophages and microbial interactions
- Neural dynamics and brain function
- Microbial infections and disease research
- Text and Document Classification Technologies
- RNA and protein synthesis mechanisms
- Topic Modeling
- Algorithms and Data Compression
- Time Series Analysis and Forecasting
- Gene expression and cancer classification
- Functional Brain Connectivity Studies
- Natural Language Processing Techniques
- Speech Recognition and Synthesis
- Scientific Computing and Data Management
- Advanced Proteomics Techniques and Applications
- Chemical and Environmental Engineering Research
- Particle accelerators and beam dynamics
- EEG and Brain-Computer Interfaces
University of Arizona
2024
University of Montana
2021
"Fast is fine, but accuracy is final." -- Wyatt Earp
Bacteriophages are viruses that infect bacteria. Many bacteriophages integrate their genomes into the bacterial chromosome and become prophages. Prophages may substantially burden or benefit host bacteria fitness, acting in some cases as parasites and in others as mutualists. Some prophages have been demonstrated to increase virulence. The increasing ease of genome sequencing provides an opportunity to deeply explore prophage prevalence and insertion sites. Here we present VIBES (Viral Integrations in Bacterial...
Annotation of a biological sequence is usually performed by aligning that sequence to a database of known elements. When that database contains elements that are highly similar to each other, the proper annotation may be ambiguous, because several entries in the database produce high-scoring alignments. Typical methods work by assigning a label based on the candidate with the highest alignment score; this can overstate certainty, mislabel boundaries, and fail to identify large-scale rearrangements or insertions within the annotated sequence. Here,...
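The ambiguity described above can be illustrated with a minimal sketch. It is not the method the abstract goes on to present. It only shows the typical highest-score labeling and one simple way to notice when a runner-up is nearly as good. The `Hit` record, the family names, the scores, and the 5% margin are all hypothetical, and scores are assumed to be positive.

```python
from typing import NamedTuple, Sequence


class Hit(NamedTuple):
    """One candidate alignment of a query region to a database element."""
    label: str    # hypothetical database element name
    start: int    # alignment start on the query
    end: int      # alignment end on the query
    score: float  # alignment score, assumed positive


def best_label(hits: Sequence[Hit]) -> Hit:
    """Typical approach: keep only the highest-scoring candidate."""
    return max(hits, key=lambda h: h.score)


def is_ambiguous(hits: Sequence[Hit], margin: float = 0.05) -> bool:
    """Return True when the runner-up scores within `margin`
    (a fraction of the best score) of the top candidate."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    if len(ranked) < 2:
        return False
    best, second = ranked[0], ranked[1]
    return (best.score - second.score) <= margin * best.score


# Made-up hits: two near-identical families compete for the same region.
hits = [
    Hit("famA", 100, 400, 250.0),
    Hit("famB", 102, 398, 245.0),
    Hit("famC", 90, 300, 80.0),
]

winner = best_label(hits)
print(winner.label, "ambiguous" if is_ambiguous(hits) else "clear")
```

Reporting only `winner.label` hides the fact that `famB` is an almost equally good explanation of the region, which is the overstated certainty the abstract refers to.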
Protein language models (PLMs) have recently demonstrated potential to supplant classical protein database search methods based on sequence alignment, but are slower than common alignment-based tools and appear to be prone to a high rate of false labeling. Here, we present NEAR, a method based on neural representation learning that is designed to improve both speed and accuracy in searching for likely homologs in a large database. NEAR's ResNet embedding model is trained using contrastive learning guided by trusted alignments. It computes...
We present SODA, a lightweight and open-source visualization library for biological sequence annotations that enables straightforward development of flexible, dynamic, and interactive web graphics. SODA is implemented in TypeScript and can be used as a library within JavaScript.