- Genomics and Rare Diseases
- Genetic Associations and Epidemiology
- Genetics, Bioinformatics, and Biomedical Research
- Genomic variations and chromosomal abnormalities
- Genetics and Neurodevelopmental Disorders
- Machine Learning in Bioinformatics
- Cardiovascular Function and Risk Factors
- Liver Disease Diagnosis and Treatment
- Bioinformatics and Genomic Networks
- Artificial Intelligence in Healthcare
- Lipid metabolism and disorders
- Cardiac Imaging and Diagnostics
- Machine Learning in Healthcare
- Metabolomics and Mass Spectrometry Studies
- BRCA gene mutations in cancer
- Liver Disease and Transplantation
- Cardiomyopathy and Myosin Studies
- Heart Failure Treatment and Management
- Liver Diseases and Immunity
- Renal Diseases and Glomerulopathies
- Health, Environment, Cognitive Aging
- Hepatitis C virus research
- Chronic Disease Management Strategies
- Healthcare Systems and Public Health
- Peptidase Inhibition and Analysis
Icahn School of Medicine at Mount Sinai
2020-2025
Leiden University Medical Center
2023
Massachusetts General Hospital
2023
Boston University
2023
Twitter (United States)
2022
Institut Pasteur de Montevideo
2021-2022
Population-based assessment of disease risk associated with gene variants informs clinical decisions and stratification approaches. To evaluate the population-based disease risk associated with variants in known predisposition genes. This cohort study included 72 434 individuals 37 780 who were enrolled in the BioMe Biobank from 2007 onwards with follow-up until December 2020 and the UK Biobank from 2006 to 2010 with follow-up until June 2020. Participants had linked exome and electronic health record data, were older than 20 years, and had diverse ancestral backgrounds. Variants previously reported...
Background: Missing data is a common issue in different fields, such as electronics, image processing, medical records and genomics. They can limit or even bias the posterior analysis. The collection process leads to the distribution, frequency, and structure of missing points. They can be classified into four categories: Structurally Missing Data (SMD), Missing Completely At Random (MCAR), Missing At Random (MAR) and Missing Not At Random (MNAR). For the latter three, in the context of genomic data (especially non-coding data), we will discuss six imputation approaches using...
Systemic autoimmune rheumatic diseases (SARDs) can lead to irreversible damage if left untreated, yet these patients often endure long diagnostic journeys before being diagnosed and treated. Machine learning may help overcome the challenges of diagnosing SARDs and inform clinical decision-making. Here, we developed and tested a machine learning model to identify who should receive rheumatological evaluation for SARDs using longitudinal electronic health records of 161,584 individuals from two institutions. The...
Venous thromboembolism (VTE) is a major cause of morbidity and mortality worldwide. Current risk assessment tools, such as the Caprini and Padua scores and Wells criteria, have limitations in their applicability and accuracy. This study aimed to develop machine learning models using structured electronic health record data to predict diagnosis and 1-year risk of VTE.
Background: Liver fibrosis is a critical public health concern, necessitating early detection to prevent progression. This study evaluates the recently developed LiverRisk score and Steatosis-Associated Fibrosis Estimator (SAFE) against established indices for prognostication and/or prediction in 4 diverse cohorts, including participants with metabolic dysfunction–associated steatotic liver disease (MASLD). Methods: We used data from the Mount Sinai Data Warehouse (32,828 without diagnoses),...
Diet is a key modifiable risk factor of coronary artery disease (CAD). However, the causal effects of specific dietary traits on CAD remain unclear. With the expansion of data in population biobanks, Mendelian randomization (MR) could help enable efficient estimation of causality in diet-disease associations. The primary goal was to test causality for 13 common dietary traits using a systematic two-sample MR framework. A secondary goal was to identify plasma metabolites mediating diet-CAD associations suspected to be causal. Cross-sectional genetic...
Identifying genetic drivers of chronic diseases is necessary for drug discovery. Here, we develop a machine learning-assisted genetic priority score, which we call ML-GPS, that incorporates genetic associations with predicted disease phenotypes to enhance target discovery. First, we construct gradient boosting models to predict 112 phecodes in the UK Biobank and analyze associations of the predicted and observed phenotypes with common, rare, and ultra-rare variants to model allelic series. We integrate these with existing evidence using continuous feature encoding, training it on indications...
Causality between plasma triglyceride (TG) levels and atherosclerotic cardiovascular disease (ASCVD) risk remains controversial despite more than four decades of study and two recent landmark trials, STRENGTH and REDUCE-IT. Further unclear is the association between TG and non-atherosclerotic diseases across organ systems.
The prediction of pathogenic human missense variants has improved in recent years, but a more granular level of variant characterization is required. Further axes of information need to be incorporated in order to advance the genotype-to-phenotype map. Recent efforts have developed mode of inheritance prediction tools; however, these lack robust validation and their discrimination performance does not support clinical utility, with evidence of them being fundamentally insensitive to recessive acting diseases....
Whole genome sequencing has become a widespread diagnostic tool for rare disease patients. This broadens analyses to non-coding regions of the genome, showing strong evidence of clinical significance in human Mendelian diseases. Notwithstanding its importance, current in-silico prediction tools are restricted to coding sequences, which limits their applicability. Additionally, they lack power in discriminating variants of uncertain significance (VUS), which limits their clinical utility. Here we present PANCO, a genome-wide pathogenicity prediction tool aiming at...
Background: Causality between plasma triglyceride (TG) levels and atherosclerotic cardiovascular disease (ASCVD) risk remains controversial despite more than four decades of study and two recent landmark trials, STRENGTH and REDUCE-IT. Further unclear is the association between TG and non-atherosclerotic diseases across organ systems. Methods: Here, we conducted a phenome-wide, two-sample Mendelian randomization (MR) analysis using inverse-variance weighted (IVW) regression to systematically infer causal...
A recent study has shown that an in silico quantitative score of coronary artery disease (CAD), termed ISCAD, can be built using machine learning and clinical data from electronic health records. ISCAD was shown to be associated with CAD risk, atherosclerosis, sequelae, and all-cause mortality. We developed a CAD-predictive learning model using only metabolic data from 93,642 individuals in the UK Biobank (median [IQR] age, 57 [14] years; 39,796 [42%] male; 5640 [6%] diagnosed with CAD), and assessed its probabilities as a risk score for CAD (M-CAD; range 0 [lowest probability] to 1 [highest...
Metabolic dysfunction-associated steatotic liver disease (MASLD) affects 30% of the global population but is often underdiagnosed. To fill this diagnostic gap, we developed a digital score reflecting the presence and severity of MASLD. We fitted a machine learning model to electronic health records from 37,212 UK Biobank participants with proton density fat fraction measurements and/or a MASLD diagnosis to generate a “MASLD score”. In holdout testing, our model achieved areas under the receiver-operating...