- Distributed systems and fault tolerance
- Ionosphere and magnetosphere dynamics
- Geophysics and Gravity Measurements
- Astrophysics and Cosmic Phenomena
- Genomics and Phylogenetic Studies
- Optimization and Search Problems
- Advanced Data Storage Technologies
- Bioinformatics and Genomic Networks
- Microbial Metabolic Engineering and Bioproduction
- Microbial Community Ecology and Physiology
- Gene expression and cancer classification
- Caching and Content Delivery
- Algorithms and Data Compression
- Complexity and Algorithms in Graphs
- Distributed and Parallel Computing Systems
- Cloud Computing and Resource Management
- Parallel Computing and Optimization Techniques
- Blockchain Technology Applications and Security
- Interconnection Networks and Systems
- Advanced Memory and Neural Computing
- Machine Learning in Bioinformatics
- Molecular Biology Techniques and Applications
- Methane Hydrates and Related Phenomena
- Cooperative Communication and Network Coding
- Cryptography and Data Security
Massachusetts Institute of Technology
2016-2024
Broad Institute
2020-2024
FIT Consulting (Italy)
2024
Harvard University
2022
IIT@MIT
2016-2021
Moscow Institute of Thermal Technology
2018
University of British Columbia
2011-2017
University of Northern British Columbia
2014
University of Connecticut
2002-2012
Bipar
2012
We present a programmable droplet-based microfluidic device that combines the reconfigurable flow-routing capabilities of integrated microvalve technology with the sample compartmentalization and dispersion-free transport that is inherent to droplets. The device allows for the execution of user-defined multistep reaction protocols in 95 individually addressable nanoliter-volume storage chambers by consecutively merging sequences of picoliter-volume droplets containing reagents or cells. This functionality enabled...
Oil in subsurface reservoirs is biodegraded by resident microbial communities. Water-mediated, anaerobic conversion of hydrocarbons to methane and CO2, catalyzed by syntrophic bacteria and methanogenic archaea, is thought to be one of the dominant processes. We compared 160 community compositions in ten hydrocarbon resource environments (HREs) and sequenced twelve metagenomes to characterize their metabolic potential. Although communities were common, cores from oil sands and coal beds had unexpectedly high proportions...
Despite recent advances in metagenomic and single-cell genomic sequencing to investigate uncultivated microbial diversity and metabolic potential, fundamental questions related to the population structure, interactions, and biogeochemical roles of candidate divisions remain. Numerous molecular surveys suggest that stratified ecosystems manifesting anoxic, sulfidic, and/or methane-rich conditions are enriched in these enigmatic microbes. Here we describe the diversity, abundance, and cooccurrence patterns of communities...
Background: A central challenge to understanding the ecological and biogeochemical roles of microorganisms in natural and human engineered ecosystems is the reconstruction of metabolic interaction networks from environmental sequence information. The dominant paradigm is to assign functional annotations using BLAST. Functional annotations are then projected onto symbolic representations of metabolism in the form of KEGG pathways or SEED subsystems. Results: Here we present MetaPathways, an open source pipeline for pathway...
Marine Group A (MGA) is a deeply branching and uncultivated phylum of bacteria. Although their functional roles remain elusive, MGA subgroups are particularly abundant and diverse in oxygen minimum zones and permanent or seasonally stratified anoxic basins, suggesting metabolic adaptation to oxygen-deficiency. Here, we expand a previous survey of MGA diversity in O2-deficient waters of the Northeast subarctic Pacific Ocean (NESAP) to include Saanich Inlet (SI), an anoxic fjord with seasonal O2 gradients and periodic sulfide...
Recent evidence suggests an immunomodulatory role for commensal fungi (mycobiota) in the gut, yet little is known about the composition and dynamics of early-life gut fungal communities. In this work, we show for the first time that the mycobiota of Canadian infants changes dramatically over the course of the first year of life, is associated with environmental factors such as geographical location, diet, and season of birth, and can be used in conjunction with knowledge of a small number of key factors to predict inhalant atopy status at age 5 years.
A fuzzy knowledge-based network is developed based on the linguistic rules extracted from a decision tree. A scheme for automatic discretization of continuous attributes, based on quantiles, is formulated. A novel concept for measuring the goodness of a decision tree, in terms of its compactness (size) and efficient performance, is introduced. Linguistic rules are quantitatively evaluated using new indices. The rules are mapped to the network, incorporating the frequency of samples and the depth of the attributes in the tree. New fuzziness measures, in terms of class memberships, are used at the node level...
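The quantile-based discretization mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' scheme: the three labels (`low`, `medium`, `high`), the function names, and the crisp (non-fuzzy) bin assignment are all assumptions made for illustration.

```python
from statistics import quantiles

def quantile_cut_points(values, n_bins=3):
    """Return the n_bins - 1 quantile boundaries of a continuous attribute."""
    return quantiles(values, n=n_bins, method="inclusive")

def discretize(value, cut_points, labels=("low", "medium", "high")):
    """Map a value to the label of the first bin whose upper boundary it does not exceed.

    The labels are hypothetical; a fuzzy scheme would assign graded
    memberships instead of a single crisp label.
    """
    for boundary, label in zip(cut_points, labels):
        if value <= boundary:
            return label
    return labels[len(cut_points)]

# Usage: derive boundaries from the data, then label individual values.
cuts = quantile_cut_points([1, 2, 3, 4, 5, 6, 7, 8, 9])
print([discretize(v, cuts) for v in (2, 5, 9)])
```

Using quantiles rather than equal-width intervals keeps the number of samples per bin roughly balanced, which is one common reason to prefer them for skewed attributes.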
Marine Group A (MGA) is a candidate phylum of Bacteria that is ubiquitous and abundant in the ocean. Despite being prevalent, the structural and functional properties of MGA populations remain poorly constrained. Here, we quantified MGA diversity and population structure in relation to nutrients and O2 concentrations in the oxygen minimum zone (OMZ) of the Northeast subarctic Pacific Ocean using a combination of catalyzed reporter deposition fluorescence in situ hybridization (CARD-FISH) and 16S small subunit ribosomal RNA (16S rRNA)...
A convergence of high-throughput sequencing and computational power is transforming biology into information science. Despite these technological advances, converting bits and bytes of sequence information into meaningful insights remains a challenging enterprise. Biological systems operate on multiple hierarchical levels from genomes to biomes. Holistic understanding of biological systems requires agile software tools that permit comparative analyses across multiple information levels (DNA, RNA, protein, and metabolites) to identify emergent properties,...
Summary: Next-generation sequencing is producing vast amounts of sequence information from natural and engineered ecosystems. Although this data deluge has an enormous potential to transform our lives, knowledge creation and translation need software applications that scale with increasing data processing and analysis requirements. Here, we present improvements to MetaPathways, an annotation and analysis pipeline for environmental sequence information that expedites this transformation. We specifically address pathway prediction hazards through...
The reconstruction of complete microbial metabolic pathways using 'omics data from environmental samples remains challenging. Computational pipelines for pathway reconstruction that utilize machine learning methods to predict the presence or absence of KEGG modules in incomplete genomes are lacking. Here, we present MetaPathPredict, a software tool that incorporates machine learning models to predict the presence of complete KEGG modules within bacterial genomic datasets. Using gene annotation data and information from the KEGG module database, MetaPathPredict employs deep learning models to predict the presence of KEGG modules in a genome. MetaPathPredict can be used as...
Immunotherapy often relies on biologics with systemic administration, which can lead to adverse side effects. An innovative approach utilizes a wearable device to deliver alternating magnetic fields (AMFs) to induce localized immune responses in tumors. Here, we present results from a preclinical evaluation of Asha™ Therapy, a proprietary, low-intensity (1–3 mT), 50 kHz AMF application, developed to locally induce cancer immunity. To explore the potential for Asha therapy as a treatment for solid tumors, an...
Pairwise comparison of time series data for both local and time-lagged relationships is a computationally challenging problem relevant to many fields of inquiry. The Local Similarity Analysis (LSA) statistic identifies the existence of local and lagged relationships, but determining significance through a p-value has been algorithmically cumbersome due to an intensive permutation test, shuffling rows and columns and repeatedly calculating the statistic. Furthermore, this p-value is calculated with the assumption of normality -- a statistical...
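The permutation test this abstract calls cumbersome can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes z-score normalization, a dynamic-programming local similarity score restricted to a maximum delay, and a test that shuffles one series only. All function names and defaults are hypothetical.

```python
import random
from statistics import mean, pstdev

def zscore(series):
    """Standardize a non-constant series to zero mean and unit variance."""
    mu, sigma = mean(series), pstdev(series)
    return [(v - mu) / sigma for v in series]

def local_similarity(x, y, max_delay=1):
    """Signed local similarity score between two equal-length series.

    Accumulates products along diagonals with |i - j| <= max_delay,
    resetting to zero whenever a running sum turns negative.
    """
    n = len(x)
    x, y = zscore(x), zscore(y)
    pos = [[0.0] * (n + 1) for _ in range(n + 1)]
    neg = [[0.0] * (n + 1) for _ in range(n + 1)]
    best_pos = best_neg = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - max_delay), min(n, i + max_delay) + 1):
            prod = x[i - 1] * y[j - 1]
            pos[i][j] = max(0.0, pos[i - 1][j - 1] + prod)
            neg[i][j] = max(0.0, neg[i - 1][j - 1] - prod)
            best_pos = max(best_pos, pos[i][j])
            best_neg = max(best_neg, neg[i][j])
    return (best_pos if best_pos >= best_neg else -best_neg) / n

def permutation_p_value(x, y, max_delay=1, n_perm=200, seed=0):
    """Estimate significance by shuffling x and recomputing the score each time."""
    rng = random.Random(seed)
    observed = abs(local_similarity(x, y, max_delay))
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(local_similarity(shuffled, y, max_delay)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Each permutation repeats the full score computation, so the cost grows linearly with `n_perm` for every pair of series, which is the expense the abstract refers to.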
CA-Polar codes have been selected for all control channel communications in 5G NR, but accurate, computationally feasible decoders are still subject to development. Here we report the performance of a recently proposed class of optimally precise Maximum Likelihood (ML) decoders, GRAND, that can be used with any block-code. As published theoretical results indicate that GRAND is efficient for short-length, high-rate codes and CA-Polar codes are in that class, here we consider GRAND's utility for decoding them. Simulation by simple soft...
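The idea behind GRAND, guessing noise patterns from most to least likely and stopping at the first guess that yields a codeword, can be sketched in a hard-decision setting. The sketch assumes a binary symmetric channel with bit-flip probability below 0.5, so that lower-weight patterns are more likely, and it uses the (7,4) Hamming code purely as a small stand-in. It is not a CA-Polar decoder and not the soft variant; all names are hypothetical.

```python
from itertools import combinations

# Parity-check matrix of the (7,4) Hamming code, used only as a small stand-in code.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def is_codeword(word, parity_check):
    """A word is a codeword when every parity check sums to zero modulo 2."""
    return all(sum(h * w for h, w in zip(row, word)) % 2 == 0 for row in parity_check)

def noise_patterns(n, max_weight):
    """Yield binary noise patterns of length n in order of increasing Hamming weight."""
    for weight in range(max_weight + 1):
        for flipped in combinations(range(n), weight):
            pattern = [0] * n
            for pos in flipped:
                pattern[pos] = 1
            yield pattern

def grand_decode(received, parity_check, max_weight=3):
    """Return (codeword, number_of_guesses), or (None, None) if guessing is abandoned."""
    n = len(received)
    for guesses, noise in enumerate(noise_patterns(n, max_weight), start=1):
        candidate = [r ^ e for r, e in zip(received, noise)]
        if is_codeword(candidate, parity_check):
            return candidate, guesses
    return None, None

# Usage: flip one bit of a codeword and recover it.
sent = [1, 1, 1, 0, 0, 0, 0]
received = [1, 1, 1, 0, 1, 0, 0]
print(grand_decode(received, H))
```

Because the decoder only needs a codebook membership test, the same loop works with any parity-check matrix, which reflects the abstract's point that GRAND can be used with any block code.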
Motivation: A perennial problem in the analysis of environmental sequence information is the assignment of reads or assembled sequences, e.g. contigs or scaffolds, to discrete taxonomic bins. In the absence of reference genomes for most microorganisms, the use of intrinsic nucleotide patterns and phylogenetic anchors can improve the assembly-dependent binning needed for more accurate functional annotation of communities and assist in identifying mobile genetic elements and lateral gene transfer events. Results: Here, we present a...
Summary: DNA-BAR is a software package for selecting DNA probes (henceforth referred to as distinguishers) that can be used in genomic-based identification of microorganisms. Given the genomic sequences of the microorganisms, DNA-BAR finds a near-minimum number of distinguishers yielding a distinct hybridization pattern for each microorganism. Selected distinguishers satisfy user specified bounds on length, melting temperature and GC content, as well as redundancy and cross-hybridization constraints.
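The per-probe bounds named in this summary (length, melting temperature, GC content) can be illustrated with a minimal filter. This is not DNA-BAR's code: the melting temperature uses the Wallace rule, a rough estimate for short oligonucleotides chosen here as an assumption, and the default bounds and function names are hypothetical. Redundancy and cross-hybridization constraints are not modeled.

```python
def gc_content(probe):
    """Fraction of G and C bases in the probe."""
    probe = probe.upper()
    return (probe.count("G") + probe.count("C")) / len(probe)

def wallace_tm(probe):
    """Wallace-rule melting temperature in degrees Celsius: 2*(A+T) + 4*(G+C)."""
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2 * at + 4 * gc

def within_bounds(probe, length=(15, 25), tm=(40, 70), gc=(0.4, 0.6)):
    """Check a candidate probe against illustrative (hypothetical) bounds."""
    return (
        length[0] <= len(probe) <= length[1]
        and tm[0] <= wallace_tm(probe) <= tm[1]
        and gc[0] <= gc_content(probe) <= gc[1]
    )

# Usage: keep only the candidates that satisfy every bound.
candidates = ["ACGTACGTACGTACGT", "AAAAAAAAAAAAAAAA"]
print([p for p in candidates if within_bounds(p)])
```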
A fundamental step in the analysis of environmental sequence information is the prediction of potential genes or open reading frames (ORFs) encoding the metabolic potential of individual cells and entire microbial communities. FragGeneScan, a software designed to predict intact and incomplete ORFs on short sequencing reads, combines codon usage bias, error models and start/stop patterns in a hidden Markov model to find the most likely path of states from a given input sequence, and provides a promising route for gene recovery in datasets with...
Vineyards in wine regions around the world are reservoirs of yeast with oenological potential. Saccharomyces cerevisiae ferments grape sugars to ethanol and generates flavor and aroma compounds in wine. Wineries place a high value on identifying yeast native to their region to develop a region-specific wine program. Commercial wine strains are genetically very similar due to a population bottleneck and in-breeding compared to the diversity of S. cerevisiae from the wild and other industrial processes. We have isolated and microsatellite-typed hundreds spontaneous...
In the era of large data, the cloud is increasingly used as a computing environment, necessitating the development of cloud-compatible pipelines that can provide uniform analysis across disparate biological datasets. The WDL Analysis Research Pipelines (WARP) repository is a GitHub repository of open-source, cloud-optimized workflows for data processing that are semantically versioned, tested, and documented. A companion repository, WARP-Tools, hosts Docker containers and custom tools used in WARP workflows.
Accurate and fast image segmentation algorithms are of paramount importance for a wide range of medical imaging applications. Level set algorithms based on narrow band implementation have been among the most widely used segmentation algorithms. However, the accuracy of standard level set algorithms is compromised by the fact that their evolution schemes deteriorate the signed distance functions required for accurate computation of normals and curvatures. The common remedy is to use an ad-hoc reinitialization step to rebuild the signed distance function frequently. Meanwhile, complex...
The development of high-throughput sequencing technologies over the past decade has generated a tidal wave of environmental sequence information from a variety of natural and human engineered ecosystems. The resulting flood of information into public databases and archived sequencing projects has exponentially expanded computational resource requirements, rendering most local homology-based search methods inefficient. We recently introduced MetaPathways v1.0, a modular annotation and analysis pipeline for constructing Pathway/Genome Databases...
Despite the hype about blockchains and distributed ledgers, formal abstractions of these objects are scarce. To face this issue, in this paper we provide a proper formulation of a distributed ledger object. In brief, we define a ledger object as a sequence of records, and we provide the operations and the properties that such an object should support. Implementation of a ledger object on top of multiple (possibly geographically dispersed) computing devices gives rise to the distributed ledger object. In contrast to the centralized object, distribution allows operations to be applied concurrently on the ledger, introducing challenges...
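The notion of a ledger object as a sequence of records with a small set of operations can be sketched for the centralized case. The operation names `append` and `get`, and the use of a lock to serialize concurrent calls, are assumptions made for illustration; the abstract does not name the operations, and a distributed implementation would need a replication and agreement layer that this sketch omits.

```python
import threading

class Ledger:
    """A centralized ledger: an append-only sequence of records."""

    def __init__(self):
        self._records = []
        self._lock = threading.Lock()

    def append(self, record):
        """Atomically add a record at the end and return its position."""
        with self._lock:
            self._records.append(record)
            return len(self._records) - 1

    def get(self):
        """Return an immutable snapshot of the whole sequence."""
        with self._lock:
            return tuple(self._records)

# Usage: several threads append concurrently; every record appears exactly once.
ledger = Ledger()
workers = [
    threading.Thread(target=lambda k=k: [ledger.append((k, i)) for i in range(100)])
    for k in range(4)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(len(ledger.get()))
```

The lock gives every operation a single point at which it takes effect, so all readers observe one consistent order of records. Preserving such an order without a single lock is one of the challenges that distribution introduces.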
The reconstruction of complete microbial metabolic pathways using ‘omics data from environmental samples remains challenging. Computational pipelines for pathway reconstruction that utilize machine learning methods to predict the presence or absence of KEGG modules in incomplete genomes are lacking. Here, we present MetaPathPredict, a software tool that incorporates machine learning models to predict the presence of complete KEGG modules within bacterial genomic datasets. Using gene annotation data and information from the KEGG module databases, MetaPathPredict employs neural network...