NFDI4DS | UHH-SEMS - Publication Details

Cuong Cao Dang

ORCID: 0000-0002-4307-5972

About

Contact & Profiles

OpenAlex ID: A5014650392

Research Areas

Genomics and Phylogenetic Studies
Genetic diversity and population structure
Computational Drug Discovery Methods
RNA and protein synthesis mechanisms
Evolution and Paleontology Studies
Bioinformatics and Genomic Networks
Gene expression and cancer classification
Identification and Quantification in Food
Receptor Mechanisms and Signaling
Amino Acid Enzymes and Metabolism
RNA modifications and cancer
Molecular Biology Techniques and Applications
Machine Learning in Materials Science
Prostate Cancer Treatment and Research
Crystallization and Solubility Studies
Cancer Genomics and Diagnostics
Statistical Methods in Clinical Trials
Influenza Virus Research Studies
Protein Degradation and Inhibitors
Radiopharmaceutical Chemistry and Applications
Analytical Chemistry and Chromatography
Genetics and Plant Breeding
Multiple Myeloma Research and Treatments
Aquatic Invertebrate Ecology and Behavior
Algorithms and Data Compression

National University of Civil Engineering
2023-2024

Vietnam National University, Hanoi
2010-2023

Vietnam National University Ho Chi Minh City
2023

Thai Binh University of Medicine and Pharmacy
2020

Inserm
2015-2019

Centre de Recherche en Cancérologie de Marseille
2015-2018

Institut Paoli-Calmettes
2015-2017

Aix-Marseille Université
2015-2017

Cancer Research Center
2016-2017

VNU University of Science
2017

Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen

OPENALEX - Publications

Michael P. Menden Dennis Wang Mike J. Mason Bence Szalai Krishna C. Bulusu and 95 more

The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca’s large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and report the results of a DREAM Challenge to evaluate computational strategies for...

10.1038/s41467-019-09799-2 article EN cc-by Nature Communications 2019-06-17

Modeling Protein Evolution with Several Amino Acid Replacement Matrices Depending on Site Rates

Si Quang Le Cuong Cao Dang Olivier Gascuel

Most protein substitution models use a single amino acid replacement matrix summarizing the biochemical properties of amino acids. However, site evolution is highly heterogeneous and depends on many factors that influence the substitution patterns. In this paper, we investigate the use of different replacement matrices for different site evolutionary rates. Indeed, the variability of evolutionary rates corresponds to one of the most apparent heterogeneity factors among sites, and there is no reason to assume that the substitution patterns remain identical regardless of the evolutionary rate. We first introduce LG4M, which is composed of four...

10.1093/molbev/mss112 article EN Molecular Biology and Evolution 2012-04-06

Prediction of overall survival for patients with metastatic castration-resistant prostate cancer: development of a prognostic model through a crowdsourced challenge with open clinical trial data

Justin Guinney Tao Wang Teemu D. Laajala Kimberly Kanigel Winner J Christopher Bare and 95 more

10.1016/s1470-2045(16)30560-5 article EN The Lancet Oncology 2016-11-16

QMaker: Fast and Accurate Method to Estimate Empirical Models of Protein Evolution

Bùi Quang Minh Cuong Cao Dang Lê Sỹ Vinh Robert Lanfear

Amino acid substitution models play a crucial role in phylogenetic analyses. Maximum likelihood (ML) methods have been proposed to estimate amino acid substitution models; however, they are typically complicated and slow. In this article, we propose QMaker, a new ML method to estimate a general time-reversible $Q$ matrix from a large protein data set consisting of multiple sequence alignments. QMaker combines an efficient ML tree search algorithm, a model selection for handling the model heterogeneity among alignments,...

10.1093/sysbio/syab010 article EN cc-by-nc Systematic Biology 2021-02-17

nQMaker: Estimating Time Nonreversible Amino Acid Substitution Models

Cuong Cao Dang Bùi Quang Minh Hanon McShea Joanna Masel Jennifer James and 2 more

Amino acid substitution models are a key component in phylogenetic analyses of protein sequences. All commonly used amino acid substitution models available to date are time-reversible, an assumption designed for computational convenience but not for biological reality. Another significant downside of time-reversible models is that they do not allow inference of rooted trees without outgroups. In this article, we introduce a maximum likelihood approach nQMaker, an extension of the recently published QMaker method, that allows the estimation of time...

10.1093/sysbio/syac007 article EN cc-by Systematic Biology 2022-02-02

nT4X and nT4M: Novel Time Non-reversible Mixture Amino Acid Substitution Models

Nguyen Huy Tinh Cuong Cao Dang Lê Sỹ Vinh

10.1007/s00239-024-10230-8 article EN other-oa Journal of Molecular Evolution 2025-01-20

FLU, an amino acid substitution model for influenza proteins

Cuong Cao Dang Si Quang Le Olivier Gascuel Lê Sỹ Vinh

Background: The amino acid substitution model is the core component of many protein analysis systems such as sequence similarity search, sequence alignment, and phylogenetic inference. Although several general amino acid substitution models have been estimated from large and diverse protein databases, they remain inappropriate for analyzing specific species, e.g., viruses. Emerging epidemics of influenza viruses raise the need for comprehensive studies of these dangerous viruses. We propose an influenza-specific amino acid substitution model to enhance the understanding of the evolution...

10.1186/1471-2148-10-99 article EN cc-by BMC Evolutionary Biology 2010-04-12

Improved mitochondrial amino acid substitution models for metazoan evolutionary studies

Lê Sỹ Vinh Cuong Cao Dang Si Quang Le

Amino acid substitution models play an essential role in inferring phylogenies from mitochondrial protein data. However, only few empirical models have been estimated from restricted mitochondrial protein data of a hundred species. The existing models are unlikely to represent appropriately the amino acid substitutions from thousands of metazoan mitochondrial protein sequences. We selected 125,935 mitochondrial protein sequences from 34,448 species in the metazoan kingdom to estimate new amino acid substitution models targeting metazoa, vertebrates and invertebrate groups. The new models help to find significantly better likelihood phylogenies in comparison with the existing models. We...

10.1186/s12862-017-0987-y article EN cc-by BMC Evolutionary Biology 2017-06-12

Precision and recall oncology: combining multiple gene mutations for improved identification of drug-sensitive tumours

Stefan Naulaerts Cuong Cao Dang Pedro J. Ballester

Cancer drug therapies are only effective in a small proportion of patients. To make things worse, our ability to identify these responsive patients before administering a treatment is generally very limited. The recent arrival of large-scale pharmacogenomic data sets, which measure the sensitivity of molecularly profiled cancer cell lines to a panel of drugs, has boosted research on the discovery of drug sensitivity markers. However, no systematic comparison of widely-used single-gene markers with multi-gene machine-learning...

10.18632/oncotarget.20923 article EN Oncotarget 2017-09-15

How Reliable Are Ligand-Centric Methods for Target Fishing?

Antonio Peón Cuong Cao Dang Pedro J. Ballester

Computational methods for Target Fishing (TF), also known as Target Prediction or Polypharmacology Prediction, can be used to discover new targets for small-molecule drugs. This may result in repositioning the drug in a new indication or improving our current understanding of its efficacy and side effects. While there is a substantial body of research on TF methods, there is still a need to improve their validation, which is often limited to a small part of the available targets and not easily interpretable by the user. Here we discuss how target-centric TF methods are...

10.3389/fchem.2016.00015 article EN cc-by Frontiers in Chemistry 2016-04-13

Systematic assessment of multi-gene predictors of pan-cancer cell line sensitivity to drugs exploiting gene expression data

L. Nguyen Cuong Cao Dang Pedro J. Ballester

Background: Selected gene mutations are routinely used to guide the selection of cancer drugs for a given patient tumour. Large pharmacogenomic data sets were introduced to discover more of these single-gene markers of drug sensitivity. Very recently, machine learning regression has been used to investigate how well cancer cell line sensitivity to drugs is predicted depending on the type of molecular profile. The latter has revealed that gene expression data is the most predictive profile in the pan-cancer setting. However, no study to date has exploited GDSC...

10.12688/f1000research.10529.1 preprint EN cc-by F1000Research 2016-12-28

Systematic assessment of multi-gene predictors of pan-cancer cell line sensitivity to drugs exploiting gene expression data

L. Nguyen Cuong Cao Dang Pedro J. Ballester

Background: Selected gene mutations are routinely used to guide the selection of cancer drugs for a given patient tumour. Large pharmacogenomic data sets, such as those by the Genomics of Drug Sensitivity in Cancer (GDSC) consortium, were introduced to discover more of these single-gene markers of drug sensitivity. Very recently, machine learning regression has been used to investigate how well cancer cell line sensitivity to drugs is predicted depending on the type of molecular profile. The latter has revealed...

10.12688/f1000research.10529.2 preprint EN cc-by F1000Research 2017-03-14

FastMG: a simple, fast, and accurate maximum likelihood procedure to estimate amino acid replacement rate matrices from large data sets

Cuong Cao Dang Lê Sỹ Vinh Olivier Gascuel Bart Hazes Si Quang Le

Amino acid replacement rate matrices are a crucial component of many protein analysis systems such as sequence similarity search, sequence alignment, and phylogenetic inference. Ideally, the rate matrix reflects the mutational behavior of the actual data under study; however, estimating amino acid replacement rate matrices requires large protein alignments and is computationally expensive and complex. As a compromise, sub-optimal pre-calculated generic matrices are typically used for protein-based phylogeny. Sequence availability has now grown to a point where problem-specific...

10.1186/1471-2105-15-341 article EN cc-by BMC Bioinformatics 2014-10-24

ReplacementMatrix: a web server for maximum-likelihood estimation of amino acid replacement rate matrices

Cuong Cao Dang Vincent Lefort Lê Sỹ Vinh Si Quang Le Olivier Gascuel

Summary: Amino acid replacement rate matrices are an essential basis of protein studies (e.g. in phylogenetics and alignment). A number of general purpose matrices have been proposed (e.g. JTT, WAG, LG) since the seminal work of Margaret Dayhoff and co-workers. However, it has been shown that matrices specific to certain protein groups (e.g. mitochondrial) or life domains (e.g. viruses) differ significantly from general average matrices, and thus perform better when applied to the data to which they are dedicated. This Web server implements maximum-likelihood...

10.1093/bioinformatics/btr435 article EN Bioinformatics 2011-07-26

A Weak Allele of FASCIATED EAR 2 (FEA2) Increases Maize Kernel Row Number (KRN) and Yield in Elite Maize Hybrids

Khuất Hữu Trung Quan Hong Tran Ngoc H. Bui Thuy Thi Thu Tran Kong Quy Luu and 11 more

Meristems are central to plant growth and development, yet evidence of directly manipulating this control to improve crop yield is scarce. Kernel row number (KRN) is an important agronomic trait that can affect maize (Zea mays L.) yield. However, KRN is difficult to select by phenotyping, since it is highly variable in the mixed genetic backgrounds in early selfing generations. This study sought marker-assisted backcrossing (MABC) of a weak allele of FASCIATED EAR 2 known to control inflorescence meristem size, but the effect of which...

10.3390/agronomy10111774 article EN cc-by Agronomy 2020-11-13

QMix: An Efficient Program to Automatically Estimate Multi-Matrix Mixture Models for Amino Acid Substitution Process

Nguyen Huy Tinh Cuong Cao Dang Lê Sỹ Vinh

The single-matrix amino acid (AA) substitution models are widely used in phylogenetic analyses; however, they are unable to properly model the heterogeneity of AA substitution rates among sites. The multi-matrix mixture models can handle the site rate heterogeneity and outperform the single-matrix models. Estimating multi-matrix mixture models is a complex process and no computer program is available for this task. In this study, we implemented a computer program, the so-called QMix, based on the algorithm of LG4X and LG4M with several enhancements to automatically estimate multi-matrix mixture models from large datasets. QMix employs QMaker instead of XRATE...

10.1089/cmb.2023.0403 article EN Journal of Computational Biology 2024-06-11

QMaker: Fast and accurate method to estimate empirical models of protein evolution

Bùi Quang Minh Cuong Cao Dang Lê Sỹ Vinh Robert Lanfear

Amino acid substitution models play a crucial role in phylogenetic analyses. Maximum likelihood (ML) methods have been proposed to estimate amino acid substitution models; however, they are typically complicated and slow. In this paper, we propose QMaker, a new ML method to estimate a general time-reversible Q matrix from a large protein dataset consisting of multiple sequence alignments. QMaker combines an efficient ML tree search algorithm, a model selection for handling the model heterogeneity among alignments, and the consideration...

10.1101/2020.02.20.958819 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2020-02-23

Estimating amino acid substitution models for metazoan evolutionary studies

Cuong Cao Dang Lê Sỹ Vinh

Amino acid substitution models represent the substitution rates among amino acids during the evolution of protein sequences. The models are a prerequisite for maximum likelihood or Bayesian methods to analyse the phylogenetic relationships among species based on their protein sequences. Estimating amino acid substitution models requires large datasets and intensive computation. In this paper, we presented the estimation of both the time-reversible model (Q.met) and the time non-reversible model (NQ.met) for multicellular animals (Metazoa). Analyses showed that Q.met and NQ.met were significantly...

10.1111/jeb.14147 article EN Journal of Evolutionary Biology 2023-01-04

Estimating amino acid substitution models from genome datasets: a simulation study on the performance of estimated models

Nguyen Huy Tinh Cuong Cao Dang Lê Sỹ Vinh

Estimating parameters of amino acid substitution models is a crucial task in bioinformatics. The maximum likelihood (ML) approach has been proposed to estimate amino acid substitution models from large datasets. The quality of newly estimated models is normally assessed by comparing with the existing models in building ML trees. Two important questions that remain are the correlation of the estimated models with the true models and the required size of the training datasets to estimate reliable models. In this article, we performed a simulation study to answer these two questions based on simulated data. We simulated genome...

10.1093/jeb/voad017 article EN Journal of Evolutionary Biology 2023-12-12

Unearthing new genomic markers of drug response by improved measurement of discriminative power

Cuong Cao Dang Antonio Peón Pedro J. Ballester

Oncology drugs are only effective in a small proportion of cancer patients. Our current ability to identify these responsive patients before treatment is still poor in most cases. Thus, there is a pressing need to discover response markers for marketed and research oncology drugs. Screening these drugs against a large panel of cancer cell lines has led to the discovery of new genomic markers of in vitro drug response. However, while the identification of such markers among thousands of candidate drug-gene associations in the data is error-prone, an appraisal of the effectiveness...

10.1186/s12920-018-0336-z article EN cc-by BMC Medical Genomics 2018-02-06

Systematic assessment of multi-gene predictors of pan-cancer cell line sensitivity to drugs exploiting gene expression data

L. Nguyen Cuong Cao Dang Pedro J. Ballester

Selected gene mutations are routinely used to guide the selection of cancer drugs for a given patient tumour. Large pharmacogenomic data sets were introduced to discover more of these single-gene markers of drug sensitivity. Very recently, machine learning regression has been used to investigate how well cancer cell line sensitivity to drugs is predicted depending on the type of molecular profile. The latter has revealed that gene expression data is the most predictive profile in the pan-cancer setting. However, no study to date has exploited GDSC...

10.1101/095224 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2016-12-18

How reliable are ligand-centric methods for Target Fishing?

Antonio Peón Cuong Cao Dang Pedro J. Ballester

Computational methods for Target Fishing (TF), also known as Target Prediction or Polypharmacology Prediction, can be used to discover new targets in small-molecule drugs. This may result in repositioning the drug in a new indication or improving our current understanding of its efficacy and side effects. While there is a substantial body of research on TF methods, there is still a need to improve their validation, which is often limited to a small part of the available targets and not easily interpretable by the user. Here we discuss how target-centric TF methods are...

10.1101/032946 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2015-11-26

ESTIMATING AMINO ACID SUBSTITUTION MODELS AND ROOTING BACTERIAL TREES

Cuong Cao Dang Lê Sỹ Vinh

Reconstructing phylogenetic trees from protein sequences normally requires empirical amino acid substitution models to calculate the likelihood of trees or genetic distances between species. The tree of life is classified into three domains: Eukaryotes, Archaea, and Bacteria. The models have been intensively studied for decades, but few are related to Bacteria. Rooting bacterial trees remains a challenging problem in phylogenetic analysis due to the long branch separating Bacteria and other domains. The two main objectives of this paper are estimating Q.bac and NQ.bac...

10.15625/1813-9663/19324 article EN Journal of Computer Science and Cybernetics 2024-03-19

Estimating amino acid substitution models from genome datasets: A simulation study on the performance of estimated models

Tinh Nguyen Huy Cuong Cao Dang Lê Sỹ Vinh

Estimating amino acid substitution models is a crucial task in bioinformatics. The maximum likelihood (ML) approach has been proposed to estimate amino acid substitution models from large datasets. The quality of newly estimated models is normally assessed by comparing with the existing models in building ML trees. Two important questions that remain are the correlation of the estimated models with the true models and the required size of the training datasets to estimate reliable models. In this paper, we performed a simulation study to answer these two questions based on simulated data. We simulated genome datasets with different number...

10.1101/2023.04.09.536188 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2023-04-10

Rooting Phylogenetic Trees from Protein Alignments

Tinh H. Nguyen Cuong Cao Dang Lê Sỹ Vinh

A phylogenetic tree is a diagram that illustrates the relationships between species or organisms across time. Building phylogenetic trees is a crucial task in bioinformatics. Various approaches to perform this task, such as parsimony methods and distance-based methods, have been proposed, especially maximum likelihood methods. Normally, methods based on time reversible substitution models can construct unrooted trees. To construct rooted trees, methods might use an outgroup or infer trees with non-reversible models. Recently, amino acid substitution models have been estimated...

10.1109/kse59128.2023.10299425 article EN 2023-10-18
