Alejandro Ochoa

ORCID: 0000-0003-4928-3403
Research Areas
  • Genetic Associations and Epidemiology
  • Genetic Mapping and Diversity in Plants and Animals
  • Genetic and phenotypic traits in livestock
  • Renal Diseases and Glomerulopathies
  • Machine Learning in Bioinformatics
  • RNA and protein synthesis mechanisms
  • Protein Structure and Dynamics
  • Genomics and Phylogenetic Studies
  • Forensic and Genetic Research
  • Genomics and Chromatin Dynamics
  • Ion Transport and Channel Regulation
  • Amyloidosis: Diagnosis, Treatment, Outcomes
  • Genetic diversity and population structure
  • Glaucoma and retinal disorders
  • Celiac Disease Research and Management
  • Data Analysis with R
  • Retinal Imaging and Analysis
  • Vasculitis and related conditions
  • Advanced Glycation End Products research
  • Electrolyte and hormonal disorders
  • Garlic and Onion Studies
  • Studies on Chitinases and Chitosanases
  • Family Business Performance and Succession
  • Text and Document Classification Technologies
  • Antimicrobial Peptides and Activities

Duke University
2016-2024

Duke Medical Center
2022

University of California, Riverside
2021

University of California, Los Angeles
2015-2017

Princeton University
2007-2017

Instituto Venezolano de Investigaciones Científicas
2017

Institute for Integrative and Experimental Genomics
2016

Tulane University
1992

Herpes simplex virus 1 (HSV-1) causes a chronic, lifelong infection in >60% of adults. Multiple recent vaccine trials have failed, with viral diversity likely contributing to these failures. To understand HSV-1 better, we comprehensively compared 20 newly sequenced genomes from China, Japan, Kenya, and South Korea six previously the United States, Europe, Japan. In this diverse collection passaged strains, found that one-fifth members share gene deletion one-third exhibit...

10.1128/jvi.01987-13 article EN Journal of Virology 2013-11-14

FST and kinship are key parameters often estimated in modern population genetics studies order to quantitatively characterize structure relatedness. Kinship matrices have also become a fundamental quantity used genome-wide association heritability estimation. The most frequently-used estimators of method-of-moments whose accuracies depend strongly on the existence simple underlying forms structure, such as independent subpopulations model non-overlapping, independently evolving...

10.1371/journal.pgen.1009241 article EN cc-by PLoS Genetics 2021-01-19

Lysine acetylation is a ubiquitous post-translational modification in many organisms including the malaria parasite Plasmodium falciparum, yet full extent of across proteome remains unresolved. Moreover, functional significance or how specific acetyl-lysine sites are regulated largely unknown. Here we report seven-fold expansion known 'acetylome', characterizing 2,876 on 1,146 proteins. We observe that lysine targets diverse range protein complexes and particularly enriched within...

10.1038/srep19722 article EN cc-by Scientific Reports 2016-01-27

Principal Component Analysis (PCA) and the Linear Mixed-effects Model (LMM), sometimes in combination, are most common genetic association models. Previous PCA-LMM comparisons give mixed results, unclear guidance, have several limitations, including not varying number of principal components (PCs), simulating simple population structures, inconsistent use real data power evaluations. We evaluate PCA LMM both PCs realistic genotype complex trait simulations admixed families, subpopulation...

10.7554/elife.79238 article EN cc-by eLife 2023-05-04

To assess patient and provider perspectives on the potential value use of a bilingual portal in large safety-net health system serving predominantly Spanish-speaking patients. We captured through administration surveys to Internet access, barriers, facilitators adoption, along with preferences. We report these survey results using descriptive comparative statistics. Four hundred patients (82% response rate) 59 providers (80% participated study. Although 73% believed that would increase...

10.1093/jamia/ocx040 article EN Journal of the American Medical Informatics Association 2017-04-07

Identifying domains in protein sequences is an important step structural and functional annotation. Existing domain recognition methods typically evaluate each prediction independently of the rest. However, majority proteins are multidomain, pairwise co-occurrences highly specific non-transitive. Here, we demonstrate how to exploit co-occurrence boost weak predictions that appear previously observed combinations, while penalizing higher confidence if such combinations have never been...

10.1186/1471-2105-12-90 article EN cc-by BMC Bioinformatics 2011-03-31

E-values have been the dominant statistic for protein sequence analysis past two decades: from identifying statistically significant local alignments to evaluating matches hidden Markov models describing domain families. Here we formally show that "stratified" multiple hypothesis testing problems, controlling local False Discovery Rate (lFDR) per stratum, or partition, yields most predictions across data at any given threshold on FDR E-value over all strata combined. For important problem of...

10.1371/journal.pcbi.1004509 article EN cc-by PLoS Computational Biology 2015-11-17

We performed next-generation sequencing in patients with familial steroid-sensitive nephrotic syndrome (SSNS) and identified a homozygous segregating variant (p.H310Y) the gene encoding clavesin-1 (CLVS1) consanguineous family 3 affected individuals. Knockdown of clavesin zebrafish (clvs2) produced edema phenotypes due to disruption podocyte structure loss glomerular filtration barrier integrity that could be rescued by WT CLVS1 but not p.H310Y variant. Analysis cultured human podocytes...

10.1172/jci.insight.152102 article EN cc-by JCI Insight 2021-12-07

FST is a fundamental measure of genetic differentiation and population structure, currently defined for subdivided populations. in practice typically assumes independent, non-overlapping subpopulations, which all split simultaneously from their last common ancestral so that drift each subpopulation probabilistically independent the other subpopulations. We introduce generalized definition arbitrary structures, where individuals may be related ways, allowing probabilistic...

10.1101/083915 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2016-10-27

High-throughput reporter assays such as self-transcribing active regulatory region sequencing (STARR-seq) have made it possible to measure element activity across the entire human genome at once. The resulting data, however, present substantial analytical challenges. Here, we identify technical biases that explain most of variance in STARR-seq data. We then develop a statistical model correct those and improve detection elements. This approach substantially improves precision recall over...

10.1101/gr.269209.120 article EN cc-by-nc Genome Research 2021-03-15

The rotamer approximation states that protein side‐chain conformations can be described well using a finite set of rotational isomers. This is often applied in the context computational design and structure prediction to reduce complexity structural sampling. It an effective way reducing space most relevant conformations. However, appropriateness rotamers for sampling does not imply rotamer‐based energy landscape preserves any properties true continuous landscape. Specifically,...

10.1002/prot.21470 article EN Proteins Structure Function and Bioinformatics 2007-06-06

Common genetic association models for structured populations, including principal component analysis (PCA) and linear mixed-effects (LMMs), model the correlation structure between individuals using population kinship matrices, also known as relatedness matrices. However, most common estimators can have severe biases that were only recently determined. Here we characterize effect of these on association. We employ a large simulated admixed family genotypes from 1000 Genomes Project, both with...

10.1093/genetics/iyad030 article EN Genetics 2023-02-27

FST and kinship are key parameters often estimated in modern population genetics studies order to quantitatively characterize structure relatedness. Kinship matrices have also become a fundamental quantity used genome-wide association heritability estimation. The most frequently estimators of method-of-moments whose accuracies depend strongly on the existence simple underlying forms structure, such as independent subpopulations model non-overlapping, independently evolving...

10.1101/083923 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2016-10-27

Kinship coefficients and FST, which measure genetic relatedness the overall population structure, respectively, have important biomedical applications. However, existing estimators are only accurate under restrictive conditions that most natural structures do not satisfy. We recently derived new kinship for arbitrary [1, 2]. Our estimates on human datasets reveal a complex structure driven by founder effects due to dispersal from Africa admixture. Notably, our approach larger values of 26%...

10.1101/653279 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2019-05-30

Precis: This is the first exploratory study demonstrating promising potential of app-based visual fields testing in a low-resource health fair setting for community screening high-risk Latino adults. Purpose: To compare “Visual Fields Easy” (VFE) iPad application against Humphrey Frequency Doubling Technology (FDT) N-30-5 detecting abnormal setting. Methods: Latinos aged 40 to 80 years were recruited at Los Angeles, California, November 2017. Both eyes tested using VFE and FDT. account...

10.1097/ijg.0000000000001902 article EN Journal of Glaucoma 2021-06-24

Principal Component Analysis (PCA) and the Linear Mixed-effects Model (LMM), sometimes in combination, are most common genetic association models. Previous PCA-LMM comparisons give mixed results, unclear guidance, have several limitations, including not varying number of principal components (PCs), simulating simple population structures, inconsistent use real data power evaluations. We evaluate PCA LMM both PCs realistic genotype complex trait simulations admixed families,...

10.1101/2022.03.25.485885 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2022-03-27

The incidence of melanoma and other skin cancers has risen drastically in the United States. As with most types cancer, prognosis survival rates are significantly improved early diagnosis, but dismal for patients who present advanced disease. It remains a fact that although is common Caucasian populations, ethnic minorities have worse prognosis. Our hypothesis this dermatologic health literacy study was before necessary education, required fund knowledge respect to cancer risk lacking...

10.5070/d32111029308 article EN cc-by-nc-nd Dermatology Online Journal 2015-01-01

Protein domain prediction is one of the most powerful approaches for sequence-based function prediction. Although instances are typically predicted independently each other, newer have demonstrated improved performance by rewarding pairs that frequently co-occur within sequences. However, these ignored order in which domains preferentially and also not modeled co-occurrence probabilistically. We introduce a probabilistic approach models 'directional' context. Our method first to score all...

10.1093/bioinformatics/btx221 article EN cc-by Bioinformatics 2017-04-11

Recurrent focal segmental glomerulosclerosis (FSGS) after kidney transplantation accounts for the majority of allograft failures in children with primary FSGS. Although current research focuses on FSGS pathophysiology, a common etiology and mechanisms disease recurrence remain elusive. We performed retrospective review Scientific Registry Transplant Recipients to determine association specific HLA Kidney transplants recipients under age 19 who were diagnosed FSGS, transplanted January 1,...

10.1097/txd.0000000000001201 article EN cc-by-nc-nd Transplantation Direct 2021-08-26

Modern genetic association studies require modeling population structure and family relatedness in order to calculate correct statistics. Principal Components Analysis (PCA) is one of the most common approaches for this structure, but nowadays Linear Mixed-Effects Model (LMM) believed by many be a superior model. Remarkably, previous comparisons have been limited testing PCA without varying number principal components (PCs), simulating unrealistically simple structures, not always...

10.1101/858399 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2019-11-29

The etiology of most cases nephrotic syndrome (NS) remains unknown, therefore patients are phenotypically categorized based on response to corticosteroid therapy as steroid sensitive NS (SSNS), or resistant (SRNS). Genetic risk factors have been identified for SSNS from unbiased genome-wide association studies (GWAS), however it is unclear if these loci disease in other forms such SRNS. Additionally, unknown associated with therapy. Thus, we investigated the between and a large, multi-race...

10.3389/fped.2023.1248733 article EN cc-by Frontiers in Pediatrics 2023-10-06

Motivation: Protein domain prediction is one of the most powerful approaches for sequence-based function prediction. While instances are typically predicted independently each other, newer have demonstrated improved performance by rewarding pairs that frequently co-occur within sequences. However, these ignored order in which domains preferentially and also not modeled co-occurrence probabilistically. Results: We introduce a probabilistic approach models “directional” context. Our...

10.1101/094284 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2016-12-14