Andrew Terpolovsky

ORCID: 0000-0001-7368-2055
Research Areas
  • Genetic Associations and Epidemiology
  • Gene expression and cancer classification
  • Bioinformatics and Genomic Networks
  • Genomics and Rare Diseases
  • Machine Learning in Bioinformatics
  • Artificial Intelligence in Healthcare
  • MicroRNA in disease regulation
  • Genetic and phenotypic traits in livestock
  • Forensic and Genetic Research
  • Error Correcting Code Techniques
  • Algorithms and Data Compression
  • Genomics and Chromatin Dynamics

Whole-genome data has become significantly more accessible over the last two decades. This can largely be attributed to both reduced sequencing costs and imputation models, which make it possible to obtain nearly whole-genome data from less expensive genotyping methods, such as microarray chips. Although there are many different approaches to imputation, the Hidden Markov Model (HMM) remains the most widely used. In this study, we compared the latest versions of popular HMM-based tools for phasing and imputation:...

10.1371/journal.pone.0260177 article EN cc-by PLoS ONE 2022-10-19

Generating polygenic risk scores for diseases and complex traits requires high-quality GWAS summary statistic files. Often, these files can be difficult to acquire as a result of either unshared or incomplete data. To date, bioinformatics tools that focus on restoring missing columns containing identification and association data are limited, even though restoring them has the potential to increase the number of usable summary statistics files. SumStatsRehab was able to restore rsID, effect/other alleles, chromosome, base pair position,...

10.1186/s12859-022-04920-7 article EN cc-by BMC Bioinformatics 2022-10-25

Abstract Whole-genome data has become significantly more accessible over the last two decades. This can largely be attributed to both reduced sequencing costs and imputation models, which make it possible to obtain nearly whole-genome data from less expensive genotyping methods, such as microarray chips. Although there are many different approaches to imputation, the Hidden Markov Model (HMM) remains the most widely used. In this study, we compared the latest versions of popular HMM-based tools for phasing and imputation: Beagle...

10.1101/2021.11.04.467340 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-11-04

Abstract Background: Polygenic risk scores (PRS) have ushered in a new era of genetic epidemiology, offering insights into individual predispositions to a wide range of diseases. However, despite recent marked enhancements in their predictive power, there are still challenges that need to be overcome before PRS-based models can be broadly applied in the clinic, including sufficient accuracy, easy interpretability and portability across diverse populations. Methods: Leveraging trans-ancestry genome-wide...

10.1101/2024.04.17.24305723 preprint EN cc-by-nc-nd medRxiv (Cold Spring Harbor Laboratory) 2024-04-19

Abstract In an increasingly diverse world, including admixed individuals in genomic studies is imperative for equity and portability. A crucial first step is precise local ancestry inference (LAI). We have developed Orchestra, a LAI model with unprecedented accuracy, trained on over 10,000 single-origin individuals from 35 worldwide populations. We employed Orchestra to delve into genetic relationships and demographic histories, with a focus on Latin Americans, a prime example of admixture, and the Ashkenazi Jewish, whose...

10.1101/2023.09.11.557177 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2023-09-13

Abstract Genotype imputation, crucial in genomics research, often faces accuracy limitations, notably for rarer variants. Leveraging data from the 1000 Genomes Project, TOPMed and UK Biobank, we demonstrate that Selphi, our novel imputation method, significantly outperforms Beagle5.4, Minimac4 and IMPUTE5 across various metrics (12.5%-26.5% as measured by error count) and allele frequencies (13.0%-27.1% for low-frequency variants). This improvement boosts variant discovery in GWAS and improves polygenic risk scores.

10.1101/2023.12.18.23300143 preprint EN cc-by-nc-nd medRxiv (Cold Spring Harbor Laboratory) 2023-12-19

Abstract Background: Generating polygenic risk scores for diseases and complex traits requires high-quality GWAS summary statistic files. Often, these files can be difficult to acquire as a result of either unshared or incomplete data. To date, bioinformatics tools that focus on restoring missing columns containing identification and association data are limited, even though restoring them has the potential to increase the number of usable summary statistics files. Results: SumStatsRehab was able to restore rsID, effect/other alleles, chromosome,...

10.21203/rs.3.rs-1359902/v1 preprint EN cc-by Research Square (Research Square) 2022-03-02