- Genomics and Phylogenetic Studies
- Genetic diversity and population structure
- Computational Drug Discovery Methods
- RNA and protein synthesis mechanisms
- Evolution and Paleontology Studies
- Bioinformatics and Genomic Networks
- Gene expression and cancer classification
- Identification and Quantification in Food
- Receptor Mechanisms and Signaling
- Amino Acid Enzymes and Metabolism
- RNA modifications and cancer
- Molecular Biology Techniques and Applications
- Machine Learning in Materials Science
- Prostate Cancer Treatment and Research
- Crystallization and Solubility Studies
- Cancer Genomics and Diagnostics
- Statistical Methods in Clinical Trials
- Influenza Virus Research Studies
- Protein Degradation and Inhibitors
- Radiopharmaceutical Chemistry and Applications
- Analytical Chemistry and Chromatography
- Genetics and Plant Breeding
- Multiple Myeloma Research and Treatments
- Aquatic Invertebrate Ecology and Behavior
- Algorithms and Data Compression
National University of Civil Engineering
2023-2024
Vietnam National University, Hanoi
2010-2023
Vietnam National University Ho Chi Minh City
2023
Thai Binh University of Medicine and Pharmacy
2020
Inserm
2015-2019
Centre de Recherche en Cancérologie de Marseille
2015-2018
Institut Paoli-Calmettes
2015-2017
Aix-Marseille Université
2015-2017
Cancer Research Center
2016-2017
VNU University of Science
2017
The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca’s large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and the results of a DREAM Challenge to evaluate computational strategies for...
Most protein substitution models use a single amino acid replacement matrix summarizing the biochemical properties of amino acids. However, site evolution is highly heterogeneous and depends on many factors that influence the substitution patterns. In this paper, we investigate the use of different substitution matrices for different site evolutionary rates. Indeed, the variability of evolutionary rates corresponds to one of the most apparent heterogeneity factors among sites, and there is no reason to assume that the substitution patterns remain identical regardless of the evolutionary rate. We first introduce LG4M, which is composed of four...
Amino acid substitution models play a crucial role in phylogenetic analyses. Maximum likelihood (ML) methods have been proposed to estimate amino acid substitution models; however, they are typically complicated and slow. In this article, we propose QMaker, a new ML method to estimate a general time-reversible $Q$ matrix from a large protein data set consisting of multiple sequence alignments. QMaker combines an efficient ML tree search algorithm, a model selection for handling the model heterogeneity among alignments,...
Amino acid substitution models are a key component in phylogenetic analyses of protein sequences. All commonly used amino acid models available to date are time-reversible, an assumption designed for computational convenience but not for biological reality. Another significant downside of time-reversible models is that they do not allow inference of rooted trees without outgroups. In this article, we introduce a maximum likelihood approach, nQMaker, an extension of the recently published QMaker method, that allows the estimation of time...
Background: The amino acid substitution model is the core component of many protein analysis systems such as sequence similarity search, sequence alignment, and phylogenetic inference. Although several general amino acid substitution models have been estimated from large and diverse protein databases, they remain inappropriate for analyzing specific species, e.g., viruses. Emerging epidemics of influenza viruses raise the need for comprehensive studies of these dangerous viruses. We propose an influenza-specific amino acid substitution model to enhance the understanding of the evolution...
Amino acid substitution models play an essential role in inferring phylogenies from mitochondrial protein data. However, only a few empirical models have been estimated from restricted mitochondrial protein data of a hundred species. The existing models are unlikely to represent appropriately the amino acid substitutions of thousands of metazoan mitochondrial protein sequences. We selected 125,935 mitochondrial protein sequences from 34,448 species of the metazoan kingdom to estimate new amino acid substitution models targeting metazoa, vertebrates and invertebrate groups. The new models help to find significantly better likelihood phylogenies in comparison with the existing models. We...
Cancer drug therapies are only effective in a small proportion of patients. To make things worse, our ability to identify these responsive patients before administering a treatment is generally very limited. The recent arrival of large-scale pharmacogenomic data sets, which measure the sensitivity of molecularly profiled cancer cell lines to a panel of drugs, has boosted research on the discovery of drug sensitivity markers. However, no systematic comparison of widely-used single-gene markers with multi-gene machine-learning...
Computational methods for Target Fishing (TF), also known as Target Prediction or Polypharmacology Prediction, can be used to discover new targets for small-molecule drugs. This may result in repositioning the drug in a new indication or improving our current understanding of its efficacy and side effects. While there is a substantial body of research on TF methods, there is still a need to improve their validation, which is often limited to a small part of the available targets and not easily interpretable by the user. Here we discuss how target-centric TF methods are...
Background: Selected gene mutations are routinely used to guide the selection of cancer drugs for a given patient tumour. Large pharmacogenomic data sets were introduced to discover more of these single-gene markers of drug sensitivity. Very recently, machine learning regression has been used to investigate how well cancer cell line sensitivity to drugs is predicted depending on the type of molecular profile. The latter has revealed that gene expression data is the most predictive profile in the pan-cancer setting. However, no study to date has exploited GDSC...
Background: Selected gene mutations are routinely used to guide the selection of cancer drugs for a given patient tumour. Large pharmacogenomic data sets, such as those by the Genomics of Drug Sensitivity in Cancer (GDSC) consortium, were introduced to discover more of these single-gene markers of drug sensitivity. Very recently, machine learning regression has been used to investigate how well cancer cell line sensitivity to drugs is predicted depending on the type of molecular profile. The latter has revealed...
Amino acid replacement rate matrices are a crucial component of many protein analysis systems such as sequence similarity search, sequence alignment, and phylogenetic inference. Ideally, the rate matrix reflects the mutational behavior of the actual data under study; however, estimating amino acid replacement rate matrices requires large protein alignments and is computationally expensive and complex. As a compromise, sub-optimal pre-calculated generic matrices are typically used for protein-based phylogeny. Sequence availability has now grown to a point where problem-specific...
Summary: Amino acid replacement rate matrices are an essential basis of protein studies (e.g. in phylogenetics and alignment). A number of general purpose matrices have been proposed (e.g. JTT, WAG, LG) since the seminal work of Margaret Dayhoff and co-workers. However, it has been shown that matrices specific to certain protein groups (e.g. mitochondrial) or life domains (e.g. viruses) differ significantly from general average matrices, and thus perform better when applied to the data to which they are dedicated. This Web server implements the maximum-likelihood...
Meristems are central to plant growth and development, yet evidence of directly manipulating this control to improve crop yield is scarce. Kernel row number (KRN) is an important agronomic trait that can affect maize (Zea mays L.) yield. However, this trait is difficult to select by phenotyping, since it is highly variable in the mixed genetic backgrounds of early selfing generations. This study sought to improve this trait by marker-assisted backcrossing (MABC) of a weak allele of FASCIATED EAR 2, known to affect inflorescence meristem size, but the effect of which...
The single-matrix amino acid (AA) substitution models are widely used in phylogenetic analyses; however, they are unable to properly model the heterogeneity of AA substitution rates among sites. The multi-matrix mixture models can handle the site rate heterogeneity and outperform the single-matrix models. Estimating multi-matrix mixture models is a complex process and no computer program is available for this task. In this study, we implemented a computer program, the so-called QMix, based on the algorithm of LG4X and LG4M with several enhancements to automatically estimate multi-matrix mixture models from large datasets. QMix employs QMaker instead of XRATE...
Amino acid substitution models play a crucial role in phylogenetic analyses. Maximum likelihood (ML) methods have been proposed to estimate amino acid substitution models; however, they are typically complicated and slow. In this paper, we propose QMaker, a new ML method to estimate a general time-reversible Q matrix from a large protein dataset consisting of multiple sequence alignments. QMaker combines an efficient ML tree search algorithm, a model selection for handling the model heterogeneity among alignments, and the consideration...
Amino acid substitution models represent the substitution rates among amino acids during the evolution of protein sequences. The models are a prerequisite for maximum likelihood or Bayesian methods to analyse the phylogenetic relationships among species based on their protein sequences. Estimating amino acid substitution models requires large protein datasets and intensive computation. In this paper, we presented the estimation of both a time‐reversible model (Q.met) and a time non‐reversible model (NQ.met) for multicellular animals (Metazoa). Analyses showed that the Q.met and NQ.met models were significantly...
Estimating parameters of amino acid substitution models is a crucial task in bioinformatics. The maximum likelihood (ML) approach has been proposed to estimate amino acid substitution models from large datasets. The quality of newly estimated models is normally assessed by comparing them with the existing models in building ML trees. Two important questions that remain are the correlation of the estimated models with the true models and the required size of the training datasets to estimate reliable models. In this article, we performed a simulation study to answer these two questions based on simulated data. We simulated genome...
Oncology drugs are only effective in a small proportion of cancer patients. Our current ability to identify these responsive patients before treatment is still poor in most cases. Thus, there is a pressing need to discover response markers for marketed and research oncology drugs. Screening these drugs against a large panel of cancer cell lines has led to the discovery of new genomic markers of in vitro drug response. However, while the identification of such markers among thousands of candidate drug-gene associations in the data is error-prone, an appraisal of the effectiveness...
Selected gene mutations are routinely used to guide the selection of cancer drugs for a given patient tumour. Large pharmacogenomic data sets were introduced to discover more of these single-gene markers of drug sensitivity. Very recently, machine learning regression has been used to investigate how well cancer cell line sensitivity to drugs is predicted depending on the type of molecular profile. The latter has revealed that gene expression data is the most predictive profile in the pan-cancer setting. However, no study to date has exploited GDSC...
Computational methods for Target Fishing (TF), also known as Target Prediction or Polypharmacology Prediction, can be used to discover new targets in small-molecule drugs. This may result in repositioning the drug in a new indication or improving our current understanding of its efficacy and side effects. While there is a substantial body of research on TF methods, there is still a need to improve their validation, which is often limited to a small part of the available targets and not easily interpretable by the user. Here we discuss how target-centric TF methods are...
Reconstructing phylogenetic trees from protein sequences normally requires empirical amino acid substitution models to calculate the likelihood of trees or genetic distances between species. The tree of life is classified into three domains: Eukaryotes, Archaea, and Bacteria. The amino acid substitution models have been intensively studied for decades, but few are related to Bacteria. Rooting the bacterial tree remains a challenging problem in phylogenetic analysis due to the long branch separating Bacteria from the other domains. The two main objectives of this paper are estimating Q.bac and NQ.bac...
Estimating amino acid substitution models is a crucial task in bioinformatics. The maximum likelihood (ML) approach has been proposed to estimate amino acid substitution models from large datasets. The quality of newly estimated models is normally assessed by comparing them with the existing models in building ML trees. Two important questions that remain are the correlation of the estimated models with the true models and the required size of the training datasets to estimate reliable models. In this paper, we performed a simulation study to answer these two questions based on simulated data. We simulated genome datasets with different number...
A phylogenetic tree is a diagram that illustrates the relationships between species or organisms across time. Building phylogenetic trees is a crucial task in bioinformatics. Various approaches to perform this task, such as parsimony methods and distance-based methods, have been proposed, especially maximum likelihood methods. Normally, maximum likelihood methods, which are based on time reversible substitution models, can construct unrooted trees. To construct rooted trees, we might use an outgroup or infer trees with time non-reversible models. Recently, amino acid time non-reversible models have been estimated...