- Statistical Methods and Inference
- Bayesian Methods and Mixture Models
- Computational Drug Discovery Methods
- Complex Network Analysis Techniques
- Markov Chains and Monte Carlo Methods
- Bioinformatics and Genomic Networks
- Advanced Clustering Algorithms Research
- Sparse and Compressive Sensing Techniques
- Machine Learning and Algorithms
- Metabolomics and Mass Spectrometry Studies
- Gene expression and cancer classification
- Face and Expression Recognition
- Gaussian Processes and Bayesian Inference
- Data Stream Mining Techniques
- Advanced Causal Inference Techniques
- Machine Learning and Data Classification
- Imbalanced Data Classification Techniques
- Pharmacogenetics and Drug Metabolism
- Statistical Methods and Bayesian Inference
- Domain Adaptation and Few-Shot Learning
- Topological and Geometric Data Analysis
- Coal Properties and Utilization
- Data-Driven Disease Surveillance
- vaccines and immunoinformatics approaches
- Machine Learning and ELM
Fudan University
2022-2025
Citadel
2023
Heilongjiang Provincial Academy of Agricultural Sciences
2023
Jianghan University
2023
Beijing Proteome Research Center
2022
Beijing Radiation Center
2021-2022
State Key Laboratory of Genetic Engineering
2022
The University of Texas at Austin
2016-2018
Peking University
2012
Combination therapy has shown obvious efficacy on complex diseases and can greatly reduce the development of drug resistance. However, even with high-throughput screens, experimental methods are insufficient to explore novel drug combinations. In order to reduce the search space of drug combinations, there is an urgent need to develop more efficient computational methods to predict novel drug combinations. In recent decades, machine learning (ML) algorithms have been applied to improve predictive performance. The object of this study is to introduce and discuss...
The toxic effects of compounds on the environment, humans, and other organisms have been a major focus of many research areas, including drug discovery and ecological research. Identifying the potential toxicity in the early stage of compound/drug discovery is critical. The rapid development of computational methods for evaluating various categories of toxicity has increased the need for a comprehensive and system-level collection of toxicological data, associated attributes, and benchmarks. To contribute toward this goal, we proposed TOXRIC...
This paper introduces a novel framework, HodgeRank on Random Graphs, based on paired comparison, for subjective video quality assessment. Two types of random graph models are studied, i.e., Erdős-Rényi random graphs and random regular graphs. Hodge decomposition of paired comparison data may derive, from incomplete and imbalanced data, quality scores of videos and the inconsistency of participants' judgments. We demonstrate the effectiveness of the proposed framework on the LIVE video database. Both of the two random designs are promising sampling methods without jeopardizing...
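The core step behind the abstract above, recovering global scores from incomplete paired comparisons, can be sketched as a least-squares problem on the comparison graph. This is a minimal illustration and not the paper's implementation: the function name `hodgerank_scores`, the `(i, j, margin)` input format, and the unweighted formulation are all assumptions made here.

```python
import numpy as np

def hodgerank_scores(n_items, comparisons):
    """Least-squares global scores from pairwise comparisons (illustrative sketch).

    comparisons: list of (i, j, margin) tuples, where margin is how much
    item j was preferred over item i (margin = s_j - s_i without noise).
    """
    # Gradient (edge-node incidence) operator: one row per comparison.
    D = np.zeros((len(comparisons), n_items))
    y = np.zeros(len(comparisons))
    for row, (i, j, margin) in enumerate(comparisons):
        D[row, i] = -1.0
        D[row, j] = 1.0
        y[row] = margin
    # Minimum-norm least-squares solution; on a connected graph the
    # scores sum to zero.
    s, *_ = np.linalg.lstsq(D, y, rcond=None)
    # Residual: the part of the data that no global ranking explains.
    residual = y - D @ s
    return s, residual

# Consistent comparisons generated from scores 0, 1, 3 (up to a shift).
scores, resid = hodgerank_scores(3, [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 3.0)])

# A purely cyclic preference (0 < 1 < 2 < 0) has no consistent ranking.
cyc_scores, cyc_resid = hodgerank_scores(3, [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)])
```

In the consistent case the residual vanishes; in the cyclic case the scores are all zero and the whole signal ends up in the residual, which is the inconsistency component.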
Combination therapy has shown an obvious curative effect on complex diseases, whereas the search space of drug combinations is too large to be validated experimentally even with high-throughput screens. With the increase of the number of drugs, artificial intelligence techniques, especially machine learning methods, have become applicable for the discovery of synergistic drug combinations to significantly reduce the experimental workload. In this study, in order to predict novel synergistic drug combinations on various cancer cell lines, the cancer cell line-specific...
Amino acid metabolism is an important factor in regulating nitrogen source assimilation and source/sink transport in soybean. Melatonin can improve plant stress resistance, but whether it affects amino acid metabolism is not known. Therefore, this study investigated whether exogenous melatonin had an effect on amino acid metabolism of soybean under drought conditions and explored its relationship with yield. The treatments were a normal water supply treatment (WW), a drought treatment (D), and a drought and melatonin treatment group (D + M), sprayed with 100 μmol/L melatonin. The effects on grain filling...
Background: The accumulation of various multi-omics data and computational approaches for data integration can accelerate the development of precision medicine. However, the algorithm development for multi-omics data integration remains a pressing challenge. Results: Here, we propose a multi-omics data integration algorithm based on random walk with restart (RWR) on a multiplex network. We call the resulting methodology Random Walk with Restart for multi-dimensional data Fusion (RWRF). RWRF uses the similarity network of samples as the basis for integration. It constructs the similarity network for each data type and then connects corresponding samples of multiple...
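As a rough illustration of the random walk with restart (RWR) building block named in the abstract above, the sketch below runs RWR on a single small network. It does not reproduce RWRF itself: the multiplex construction is not shown, and the function name, the restart probability of 0.7, and the toy path graph are assumptions chosen for the example.

```python
import numpy as np

def random_walk_with_restart(W, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Stationary RWR distribution for one seed node (illustrative sketch).

    W: symmetric non-negative adjacency/similarity matrix.
    seed: index of the node the walker restarts from.
    """
    W = np.asarray(W, dtype=float)
    col_sums = W.sum(axis=0)
    if np.any(col_sums == 0):
        raise ValueError("every node needs at least one edge")
    P = W / col_sums  # column-stochastic transition matrix
    p0 = np.zeros(W.shape[0])
    p0[seed] = 1.0
    p = p0.copy()
    for _ in range(max_iter):
        # Follow an edge with prob. (1 - restart), jump back to the seed otherwise.
        p_next = (1.0 - restart) * (P @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy path graph 0 - 1 - 2, restarting at node 0.
W_path = np.array([[0, 1, 0],
                   [1, 0, 1],
                   [0, 1, 0]])
p_rwr = random_walk_with_restart(W_path, seed=0)
```

The resulting vector is a probability distribution that decays with distance from the seed, which is why RWR scores are commonly read as a network-aware similarity to the seed node.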
In the process of drug discovery, drug-induced liver injury (DILI) is still an active research field and is one of the most common and important issues in toxicity evaluation research. It directly leads to high drug attrition. At present, there are a variety of computer algorithms based on molecular representations to predict DILI. It is found that a single molecular representation method is insufficient to complete the task of toxicity prediction, and multiple molecular fingerprint fusion methods have been used as model input. In order to solve the problem of high dimensional...
We construct Bayesian and frequentist finite-sample goodness-of-fit tests for three different variants of the stochastic blockmodel for network data. Since all of the variants are log-linear in form when block assignments are known, the tests for the latent block model versions combine a block membership estimator with the algebraic statistics machinery for testing goodness-of-fit in log-linear models. We describe Markov bases and marginal polytopes of the variants, and discuss how both facilitate the development of goodness-of-fit tests and the understanding of model behaviour. The general methodology developed here extends to any finite...
Community detection is a fundamental unsupervised learning problem for unlabeled networks which has a broad range of applications. Many community detection algorithms assume that the number of clusters $r$ is known a priori. In this paper, we propose an approach based on semi-definite relaxations, which does not require prior knowledge of model parameters like many existing convex relaxation methods and recovers the number of clusters and the clustering matrix exactly under a broad parameter regime, with probability tending to one. On a variety of simulated and real...
Complex performance measures, beyond the popular measure of accuracy, are increasingly being used in the context of binary classification. These complex performance measures are typically not even decomposable, that is, the loss evaluated on a batch of samples cannot typically be expressed as a sum or average of losses evaluated at individual samples, which in turn requires new theoretical and methodological developments beyond standard treatments of supervised learning. In this paper, we advance this understanding of binary classification for complex performance measures by identifying two key...
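The non-decomposability described in the abstract above can be seen in a small worked example. The F1 score is used here as one familiar complex measure; it is an assumption of this sketch and not necessarily the measure the paper studies. The helper names and the two toy batches are likewise hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of correct predictions; an average of per-sample 0/1 scores."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), computed from batch-level counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Two equally sized toy batches (labels first, predictions second).
true_a, pred_a = [1, 1, 1, 1], [1, 1, 1, 1]
true_b, pred_b = [1, 0, 0, 0], [0, 1, 0, 0]

# Accuracy decomposes: the pooled value equals the mean of the batch values.
acc_mean = (accuracy(true_a, pred_a) + accuracy(true_b, pred_b)) / 2
acc_pooled = accuracy(true_a + true_b, pred_a + pred_b)

# F1 does not: the pooled value differs from the mean of the batch values.
f1_mean = (f1_score(true_a, pred_a) + f1_score(true_b, pred_b)) / 2
f1_pooled = f1_score(true_a + true_b, pred_a + pred_b)
```

Here accuracy is 0.75 both ways, while the mean of batch F1 scores is 0.5 and the pooled F1 is 0.8, so F1 cannot be written as an average of per-sample or per-batch terms.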
Clustering is one of the most important unsupervised problems in machine learning and statistics. Among many existing algorithms, kernel k-means has drawn much research attention due to its ability to find non-linear cluster boundaries and its inherent simplicity. There are two main approaches for kernel k-means: SVD of the kernel matrix and convex relaxations. Despite the attention kernel clustering has received both from theoretical and applied quarters, not much is known about the robustness of the methods. In this paper we first introduce a semidefinite programming...
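For readers unfamiliar with kernel k-means, the sketch below shows the plain Lloyd-style iteration carried out entirely through a kernel matrix. It is neither the SVD approach nor the semidefinite relaxation discussed in the abstract above; the RBF kernel, the `gamma` value, the function names, and the toy data are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix K_ij = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_kmeans(K, init_labels, n_clusters, max_iter=100):
    """Lloyd-style kernel k-means from a given initial labelling (sketch)."""
    labels = np.asarray(init_labels).copy()
    n = K.shape[0]
    diag = np.diag(K)
    for _ in range(max_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            size = members.sum()
            if size == 0:
                continue  # an empty cluster stays at infinite distance
            # ||phi(x_i) - mu_c||^2 expanded using only kernel entries.
            dist[:, c] = (diag
                          - 2.0 * K[:, members].sum(axis=1) / size
                          + K[np.ix_(members, members)].sum() / size ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Two tight, well-separated groups; the third point starts mislabelled.
X_toy = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels_toy = kernel_kmeans(rbf_kernel(X_toy), [0, 0, 1, 1, 1, 1], n_clusters=2)
```

Like ordinary k-means, this iteration only finds a local optimum and depends on the starting labels, which is part of the motivation for spectral and convex-relaxation alternatives.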
In this paper, we investigate community detection in networks in the presence of node covariates. In many instances, covariates and networks individually only give a partial view of the cluster structure. One needs to jointly infer the full cluster structure by considering both. In statistics, an emerging body of work has been focused on combining information from both the edges in the network and the node covariates to infer community memberships. However, so far the theoretical guarantees have been established in the dense regime, where the network can lead to perfect clustering under a broad parameter regime, and hence the role...
Extensive amounts of multi-omics data and multiple cancer subtyping methods have been developed rapidly, and these generate discrepant clustering results, which poses challenges for cancer molecular subtype research. Thus, the development of methods for the identification of cancer consensus molecular subtypes is essential. The lack of intuitive and easy-to-use analytical tools has posed a barrier. Here, we report on the COnsensus Molecular SUbtype of Cancer (COMSUC) web server. With COMSUC, users can explore consensus molecular subtypes of more than 30 cancers based on eight clustering methods, five...