- Genomics and Phylogenetic Studies
- Bioinformatics and Genomic Networks
- Complex Network Analysis Techniques
- Topological and Geometric Data Analysis
- Data Visualization and Analytics
- Graph Theory and Algorithms
- Machine Learning in Bioinformatics
- Algorithms and Data Compression
- Chromosomal and Genetic Variations
- Cell Image Analysis Techniques
- Gene expression and cancer classification
- Cloud Computing and Resource Management
- Peer-to-Peer Network Technologies
- Interconnection Networks and Systems
- Opinion Dynamics and Social Influence
- Parallel Computing and Optimization Techniques
- Plant and animal studies
- Advanced Graph Neural Networks
- Caching and Content Delivery
- Hydrology and Watershed Management Studies
- RNA and protein synthesis mechanisms
- CRISPR and Genetic Engineering
- Advanced Graph Theory Research
- Genetics, Bioinformatics, and Biomedical Research
- Artificial Intelligence in Healthcare
Washington State University
2016-2025
Iowa State University
2003-2023
Washington State University Spokane
2018
University of Iowa
2009
Engineering Arts (United States)
2009
Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real-world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics...
In most real-world networks, the nodes/vertices tend to be organized into tightly-knit modules known as communities or clusters, such that nodes within a community are more likely to be "related" to one another than they are to the rest of the network. The goodness of a partitioning is typically measured using a well-known measure called modularity. However, modularity optimization is an NP-complete problem. In 2008, Blondel et al. introduced a multi-phase, iterative heuristic for modularity optimization, called the Louvain method. Owing to its speed and...
Snow Water-Equivalent (SWE), the amount of water available if snowpack is melted, is a key decision variable used by water management agencies to make irrigation, flood control, power generation, and drought management decisions. SWE values vary spatiotemporally, affected by weather, topography, and other environmental factors. While daily SWE can be measured by Snow Telemetry (SNOTEL) stations with requisite instrumentation, such stations are spatially sparse, requiring interpolation techniques to create spatiotemporally complete data. recent...
Transformer models have become widely popular in numerous applications, and especially for building foundation large language models (LLMs). Recently, there has been a surge in the exploration of transformer-based architectures in non-LLM applications. In particular, the self-attention mechanism within the transformer architecture offers a way to exploit any hidden relations within data, making it applicable to a variety of spatio-temporal tasks in scientific computing domains (e.g., weather, traffic, agriculture). Most of these...
Clustering expressed sequence tags (ESTs) is a powerful strategy for gene identification, gene expression studies, and identifying important genetic variations such as single nucleotide polymorphisms. To enable fast clustering of large-scale EST data, we developed PaCE (for Parallel Clustering of ESTs), a software program for EST clustering on parallel computers. In this paper, we report on the design and development of PaCE and its evaluation using Arabidopsis ESTs. The novel features of our approach include: (i) memory efficient algorithms to reduce...
Most of our understanding of plant genome structure and evolution has come from the careful annotation of small (e.g., 100 kb) sequenced genomic regions or from automated annotation of complete genome sequences. Here, we carefully annotated a contiguous 22 Mb region of maize chromosome 4 using an improved pseudomolecule for annotation. The sequence segment was comprehensively ordered, oriented, and confirmed using an optical map. Nearly 84% of the sequence is composed of transposable elements (TEs) that are mostly nested within each other, of which most...
As managers of agricultural and natural resources are confronted with uncertainties in global change impacts, the complexities associated with the interconnected cycling of nitrogen, carbon, and water present daunting management challenges. Existing models provide detailed information on specific sub-systems (e.g., land, air, water, economics). An increasing awareness of the unintended consequences of management decisions resulting from the interconnectedness of these sub-systems, however, necessitates coupled regional earth system...
Virulence acquisition and loss is a dynamic adaptation of pathogens to thrive in changing milieus. We investigated the mechanisms of virulence loss at the whole genome level using Babesia bovis as a model apicomplexan in which genetically related attenuated parasites can be reliably derived from virulent parental strains in the natural host. We expected virulence loss to be accompanied by consistent changes at the gene level, and that such changes would be shared among attenuated parasites of diverse geographic and genetic background. Surprisingly, while single nucleotide polymorphisms...
Graph colouring is used to identify subsets of independent tasks in parallel scientific computing applications. Traditional colouring heuristics aim to reduce the number of colours used, as that number also corresponds to the number of parallel steps in the application. However, if the colour classes produced have a skew in their sizes, utilization of hardware resources becomes inefficient, especially for the smaller colour classes. Equitable colouring is a theoretical formulation that guarantees a perfect balance among colour classes, and its practical relaxation is referred to as balanced colouring. In this...
In an era when power constraints and data movement are proving to be significant barriers for the application of high-end computing, the Tilera many-core architecture offers a low-power platform exhibiting many important characteristics of future systems, including a large number of simple cores, a sophisticated network-on-chip, and fine-grained control over memory and caching policies. While this emerging architecture has been previously studied for structured compute-intensive kernels, benchmarking data-bound, irregular...
Methods to efficiently uncover and extract community structures are required in a number of biological applications where networked data and their interactions can be modeled as graphs, and observing tightly-knit groups of vertices ("communities") can offer insights into the structural and functional building blocks of the underlying network. Classical community detection methods have largely focused on unipartite networks, i.e., graphs built out of a single type of objects. However, due to increased availability of data from various sources, there is...
Graph coloring, in a generic sense, is used to identify subsets of independent tasks in parallel scientific computing applications. Traditional coloring heuristics aim to reduce the number of colors used, as that number also corresponds to the number of parallel steps in the application. However, if the color classes produced have a skew in their sizes, utilization of hardware resources becomes inefficient, especially for the smaller color classes. Equitable coloring is a theoretical formulation that guarantees a perfect balance among color classes, and its practical relaxation is referred...
Traditional implementations of parallel graph operations on distributed memory platforms are written using Message Passing Interface (MPI) point-to-point communication primitives such as Send-Recv (blocking and nonblocking). Apart from this classical model, the MPI community has over the years added other communication models; however, their suitability for handling the irregular traffic workloads typical of graph operations remains comparatively less explored. Our aim in this paper is to study these relatively underutilized models...
Community detection is a discovery tool used by network scientists to analyze the structure of real-world networks. It seeks to identify natural divisions that may exist in the input networks and that partition the vertices into coherent modules (or communities). While this problem space is rich with efficient algorithms and software, most of the literature caters to the static use-case where the underlying network does not change. However, many emerging use-cases give rise to a need to incorporate dynamic graphs as inputs. In this paper, we present...
Sequence homology detection is central to a number of bioinformatics applications including genome sequencing and protein family characterization. Given millions of sequences, the goal is to identify all pairs of sequences that are highly similar (or "homologous") on the basis of alignment criteria. While there are optimal alignment algorithms to compute pairwise homology, their large-scale deployment is currently not feasible; instead, heuristic methods are used at the expense of quality. Here, we present the design and evaluation of a parallel...
Graph algorithms on parallel architectures present an interesting case study for irregular applications. In this paper, we address one such irregular application: that of clustering real-world graphs constructed out of biological data using parallel computers. We present the design and evaluation of two different parallel implementations of a serial graph clustering heuristic called the Shingling heuristic, which was developed by Gibson et al. In our OpenMP shared memory implementation, pClust-sm, we were able to improve both asymptotic runtime complexities...
Maximum Parsimony phylogenetic tree reconstruction is based on finding the breakpoint median, given a set of species, and is represented by a bounded edge-weight graph model. This reduces the breakpoint median problem to one of solving multiple instances of the Traveling Salesman Problem (TSP), which is a classical NP-complete problem in graph theory. Exponential time algorithms that apply efficient runtime heuristics, such as branch-and-bound, to dynamically prune the search space are used to solve the TSP. In this paper, we present the design and performance...
Methods to uncover and extract community structures are required in a number of biological applications where networked data and their interactions can be modeled as graphs, and observing tightly-knit groups of vertices ("communities") can offer insights into the structural and functional building blocks of the underlying network. While classical community detection methods have focused largely on detecting molecular complexes from protein-protein networks and other similar graphs, there is an increasing need for extending the community detection operation to work...
Predicting the spatiotemporal variation in streamflow along with uncertainty quantification enables decision-making for sustainable management of scarce water resources. Process-based hydrological models (aka physics-based models) are based on physical laws, but use simplifying assumptions which can lead to poor accuracy. Data-driven approaches offer a powerful alternative, but they require a large amount of training data and tend to produce predictions that are inconsistent with physical laws. This paper studies...
The emergence of 2.5D chiplet platforms provides a new avenue for compact scale-out implementations of deep learning (DL) workloads (WLs). Integrating multiple small chiplets using a network-on-interposer (NoI) offers not only significant cost reduction and higher manufacturing yield than 2-D ICs but also better energy efficiency and performance. However, defects in chiplets may compromise performance since they restrict the computing capability. Therefore, carefully designed NoI link placement, task mapping...
Phenomics is an emerging branch of modern biology that uses high throughput phenotyping tools to capture multiple environmental and phenotypic traits, often at massive spatial and temporal scales. The resulting high dimensional data represent a treasure trove of information for providing an in-depth understanding of how multiple factors interact and contribute to the overall growth and behavior of different genotypes. However, computational tools that can parse through such complex data and aid in extracting plausible hypotheses are currently lacking....
Graph algorithms on parallel architectures present an interesting case study for irregular applications. Among the graph algorithms popular in scientific computing, graph clustering or community detection has numerous applications in computational biology. However, this operation also poses serious computational challenges because of irregular memory access patterns, large memory requirements, and their dependence on other auxiliary (also irregular) data structures to supplement processing. In this paper, we address the problem of graph clustering on shared...
Graph clustering, popularly known as community detection, is a fundamental graph operation used in many applications related to network analysis and cybersecurity. The goal of community detection is to partition a network into “communities” such that each community consists of a tightly-knit group of nodes with relatively sparser connections to the rest of the network. To compute clustering on large-scale networks, efficient parallel algorithms capable of fully exploiting features of modern architectures are needed. However, due to their irregular...