- Chromosomal and Genetic Variations
- Gene expression and cancer classification
- Genomic variations and chromosomal abnormalities
- Bioinformatics and Genomic Networks
- Genetic Associations and Epidemiology
- Computational Drug Discovery Methods
- CRISPR and Genetic Engineering
- Genetic Mapping and Diversity in Plants and Animals
- Genetic and phenotypic traits in livestock
- Machine Learning in Materials Science
- Protein Structure and Dynamics
- Genetics, Bioinformatics, and Biomedical Research
- Machine Learning in Healthcare
- Global Cancer Incidence and Screening
- Machine Learning in Bioinformatics
- Biomedical Text Mining and Ontologies
- Health Systems, Economic Evaluations, Quality of Life
- Metabolomics and Mass Spectrometry Studies
- Statistical Methods and Inference
- Data Quality and Management
- Cell Image Analysis Techniques
- Molecular Biology Techniques and Applications
- Vaccines and immunoinformatics approaches
- Single-cell and spatial transcriptomics
- RNA modifications and cancer
Université Paris Sciences et Lettres
2015-2024
Institut Curie
2015-2024
Inserm
2015-2024
ParisTech
2018-2024
École Nationale Supérieure des Mines de Paris
2015-2023
Cancer et génome: Bioinformatique, biostatistiques et épidémiologie des systèmes complexes
2017-2023
Génomique Bioinformatique et Applications
2017-2023
Centre de Biologie du Développement
2017
Max Planck Institute for Intelligent Systems
2012-2015
Max Planck Institute for Developmental Biology
2015
Prioritizing missense variants for further experimental investigation is a key challenge in current sequencing studies exploring complex and Mendelian diseases. A large number of in silico tools have been employed for the task of pathogenicity prediction, including PolyPhen-2, SIFT, FatHMM, MutationTaster-2, MutationAssessor, Combined Annotation Dependent Depletion, LRT, phyloP, and GERP++, as well as optimized methods of combining tool scores, such as Condel and Logit. Due to the wealth of these methods, an important...
Abstract Background: Variability in datasets is not only the product of biological processes: it is also the product of technical biases. ComBat and ComBat-Seq are among the most widely used tools for correcting those technical biases, called batch effects, in, respectively, microarray and RNA-Seq expression data. Results: In this note, we present a new Python implementation of ComBat and ComBat-Seq. While the mathematical framework is strictly the same, we show here that our implementations: (i) have similar results in terms of batch effects correction; (ii) are as...
Being able to predict the course of arbitrary chemical reactions is essential to the theory and applications of organic chemistry. Approaches to reaction prediction problems can be organized around three poles corresponding to: (1) physical laws; (2) rule-based expert systems; and (3) inductive machine learning. Previous approaches at these poles, respectively, are not high throughput, are not generalizable or scalable, or lack sufficient data and structure to be implemented. We propose a new approach to reaction prediction utilizing elements from...
If the effects of harmful substances on humans could be predicted computationally, it would be easier to address the shortcomings of existing chemical safety testing. In this paper, we report the outcomes of a community-based DREAM challenge to predict the toxicity of environmental compounds, considering their likelihood of having adverse health effects in human populations. Our research quantified cytotoxicity levels of 156 compounds across 884 lymphoblastoid cell lines...
Between 30% and 70% of patients with breast cancer have pre-existing chronic conditions, and more than half are on long-term non-cancer medication at the time of diagnosis. Preliminary epidemiological evidence suggests that some non-cancer medications may affect breast cancer risk, recurrence, and survival. In this nationwide cohort study, we assessed the association between medication use at breast cancer diagnosis and survival. We included 235,368 French women with newly diagnosed non-metastatic breast cancer. In analyses of 288 medications, we identified eight positively associated with either...
Abstract Motivation: The performance of classifiers is often assessed using Receiver Operating Characteristic (ROC) [or accumulation curve (AC) or enrichment curve] curves and the corresponding areas under the curves (AUCs). However, in many fundamental problems ranging from information retrieval to drug discovery, only the very top of the ranked list of predictions is of any interest, and ROCs and AUCs are not very useful. New metrics, visualizations and optimization tools are needed to address this ‘early retrieval’ problem. Results: To address the early...
Rheumatoid arthritis (RA) affects millions world-wide. While anti-TNF treatment is widely used to reduce disease progression, treatment fails in ∼one-third of patients. No biomarker currently exists that identifies non-responders before treatment. A rigorous community-based assessment of the utility of SNP data for predicting anti-TNF treatment efficacy in RA patients was performed in the context of a DREAM Challenge (http://www.synapse.org/RA_Challenge). An open challenge framework enabled the comparative evaluation of predictions developed by...
Abstract Motivation: Finding non-linear relationships between biomolecules and a biological outcome is computationally expensive and statistically challenging. Existing methods have important drawbacks, including among others lack of parsimony, non-convexity and computational overhead. Here we propose block HSIC Lasso, a non-linear feature selector that does not present the previous drawbacks. Results: We compare block HSIC Lasso to other state-of-the-art feature selection techniques in both synthetic and real data, including experiments over...
Abstract Motivation: As an increasing number of genome-wide association studies reveal the limitations of the attempt to explain phenotypic heritability by single genetic loci, there is a recent focus on associating complex phenotypes with sets of genetic loci. Although several methods for multi-locus mapping have been proposed, it is often unclear how to relate the detected loci to the growing knowledge about gene pathways and networks. The few methods that take biological pathways or networks into account are either restricted to investigating...
Many chemoinformatics applications, including high-throughput virtual screening, benefit from being able to rapidly predict the physical, chemical, and biological properties of small molecules to screen large repositories and identify suitable candidates. When training sets are available, machine learning methods provide an effective alternative to ab initio methods for these predictions. Here, we leverage rich molecular representations including 1D SMILES strings, 2D graphs of bonds, and 3D coordinates to derive efficient...
Given activity training data from high-throughput screening (HTS) experiments, virtual high-throughput screening (vHTS) methods aim to predict in silico the activity of untested chemicals. We present a novel method, the Influence Relevance Voter (IRV), specifically tailored for the vHTS task. The IRV is a low-parameter neural network which refines a k-nearest neighbor classifier by nonlinearly combining the influences of a chemical's neighbors in the training set. Influences are decomposed, also nonlinearly, into a relevance component and a vote component....
<p>Fig S9. Age has a minor effect on L1PA DNA methylation patterns and is not a confounding factor in this study</p>
<p>Fig S5. Comparison of tumor and plasma paired samples</p>
<p>Fig S8. DIAMOND profiles and performances in the validation versus discovery cohorts</p>
<p>Fig S11. Two-step models integrating CNA signal extracted from DIAMOND data</p>
<p>Fig S6. Classifier performances: feature types, calculation parameters, cancer subtypes and stages</p>
<p>Fig S10. Comparison of multiple classifiers (expert, all, stack and blind models) and prognostic value of L1PA hypomethylation</p>
<p>Fig S3. Preparation of L1PA targeted bisulfite sequencing libraries and analysis workflow</p>
<p>Fig S1. cfDNA extraction methods did not impact the L1PA methylation patterns</p>
<p>Fig S4. DIAMOND features: CpG calling and contribution of CG positions or haplotypes</p>