- Genomics and Phylogenetic Studies
- Algorithms and Data Compression
- Genome Rearrangement Algorithms
- Blockchain Technology Applications and Security
- Caching and Content Delivery
- Data Mining Algorithms and Applications
- Anomaly Detection Techniques and Applications
- Graph Theory and Algorithms
- Cloud Computing and Resource Management
- Chromosomal and Genetic Variations
- Topological and Geometric Data Analysis
- Crime Patterns and Interventions
- Data Visualization and Analytics
- Topic Modeling
- Peer-to-Peer Network Technologies
- Machine Learning in Bioinformatics
- Advanced Graph Neural Networks
- Complex Network Analysis Techniques
- Distributed Systems and Fault Tolerance
Hangzhou Institute of Applied Acoustics
2020
Georgia Institute of Technology
2013-2015
Peking University
2010
Graph algorithms are becoming increasingly important for analyzing large datasets in many fields. Real-world graph data follows a pattern of sparsity that is not uniform but highly skewed towards a few items. Implementing graph traversal, statistics and machine learning algorithms on such data in a scalable manner is quite challenging. As a result, several graph analytics frameworks (GraphLab, CombBLAS, Giraph, SociaLite and Galois, among others) have been developed, each offering a solution with different programming models and targeted at...
The Viterbi algorithm is the compute-intensive kernel in Hidden Markov Model (HMM) based sequence alignment applications. In this paper, we investigate extending several parallel methods, such as the wave-front and streaming methods for the Smith-Waterman algorithm, to achieve a significant speed-up on a GPU. The wave-front method can take advantage of the computing power of the GPU, but it cannot handle long sequences because of the physical GPU memory limit. On the other hand, the streaming method can process long sequences, but with increased overhead due to data transmission between...
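For readers unfamiliar with the kernel named in this abstract, the following is a minimal sequential sketch of the Viterbi recurrence in log space. It is not the paper's GPU implementation: the function name `viterbi`, the dictionary-based model layout, and the small two-state example model are all assumptions chosen for illustration.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for an observation sequence."""
    # score[s] = best log-probability of any state path ending in s
    score = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []  # one dict per step: state -> best predecessor state
    for symbol in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + log_trans[p][s])
            score[s] = prev[best] + log_trans[best][s] + log_emit[s][symbol]
            pointers[s] = best
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path


def logs(table):
    """Convert a dict of probabilities (possibly nested) to log space."""
    return {k: logs(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}


# Hypothetical toy model, used only to exercise the function.
STATES = ["Healthy", "Fever"]
START = logs({"Healthy": 0.6, "Fever": 0.4})
TRANS = logs({"Healthy": {"Healthy": 0.7, "Fever": 0.3},
              "Fever": {"Healthy": 0.4, "Fever": 0.6}})
EMIT = logs({"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
             "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}})

log_prob, path = viterbi(["normal", "cold", "dizzy"], STATES, START, TRANS, EMIT)
```

Each step depends only on the scores from the previous step. That dependency is what the wave-front and streaming strategies mentioned in the abstract have to respect when they parallelize the computation.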
The accuracy of Conditional Random Fields (CRF) is achieved at the cost of a huge amount of computation to train the model. In this paper, we designed a parallelized algorithm for Gradient Ascent based CRF training methods for biological sequence alignment. Our contribution is mainly in two aspects: 1) We flexibly parallelized the different iterative patterns, and the corresponding optimization methods are presented. 2) As for the Gibbs Sampling based method, we designed a way to automatically predict the iteration round, so that the parallel algorithm could be run in a more efficient manner....
Near repeat (NR) is a well-known phenomenon in crime analysis, assuming that crime events exhibit correlations within a given time and space frame. Traditional NR calculation generates event pairs whenever two events happened within a given space and time limit. When the number of events is large, however, NR calculation is time consuming, and how these pairs are organized has not yet been explored. In this paper, we designed a new approach to calculate clusters of NR events efficiently. To begin with, an R-tree is utilized to index crime events; a single event is represented by a vertex, whereas edges are constructed by range querying the R-tree,...
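The pair-and-graph idea in this abstract can be illustrated with a small sketch. Two things here are assumptions rather than the paper's method: the brute-force pairwise scan stands in for the R-tree range query, and treating a cluster as a connected component of the resulting graph is one plausible reading of the truncated text. The function names, the `(x, y, day)` event layout, and the thresholds are hypothetical.

```python
import math
from itertools import combinations


def near_repeat_pairs(events, max_dist, max_days):
    """Return index pairs (i, j), i < j, of events close in space and time.

    Each event is an (x, y, day) tuple. This O(n^2) scan is a stand-in for
    the R-tree range query described in the abstract.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(events), 2):
        close_in_space = math.hypot(a[0] - b[0], a[1] - b[1]) <= max_dist
        close_in_time = abs(a[2] - b[2]) <= max_days
        if close_in_space and close_in_time:
            pairs.append((i, j))
    return pairs


def clusters(n, pairs):
    """Group vertices 0..n-1 into connected components using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in pairs:
        parent[find(i)] = find(j)

    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())


# Hypothetical events: three nearby in space and time, one far away,
# and one nearby in space but much later in time.
events = [(0, 0, 0), (1, 0, 2), (1.5, 0, 3), (10, 10, 1), (0.5, 0, 30)]
pairs = near_repeat_pairs(events, max_dist=1.0, max_days=7)
groups = clusters(len(events), pairs)
```

Events 0 and 2 are not a near-repeat pair themselves, yet they end up in the same group through event 1. Chains like this are what a graph view captures and a flat list of pairs does not.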
The problem of finding the median of three genomes is the key process in building the most parsimonious phylogenetic trees from genome rearrangement data. The median problem using the Double-Cut-and-Join (DCJ) distance is NP-hard, and the best exact algorithm is based on a branch-and-bound best-first search strategy to explore sub-graph patterns in the Multiple BreakPoint Graph (MBG). In this paper, by taking advantage of the "streaming" property of the MBG, we introduce the "footprint-based" data structure to reduce the space requirement of single search nodes from O(v^2) to O(v);...
To achieve high throughput in POW based blockchain systems, researchers have proposed a series of methods, and DAG is one of the most active and promising fields. We designed and implemented StreamNet, aiming to engineer a scalable and endurable DAG system. When attaching a new block to the DAG, only two tips are selected. One is the parent tip, whose definition is the same as in Conflux [1]; another is selected using the Markov Chain Monte Carlo (MCMC) technique, by which the definition is the same as in IOTA [2]. We infer a pivotal chain along the path of each epoch in the graph, and a total order of the graph could be...