NFDI4DS | UHH-SEMS - Publication Details

Wyatt T. Clark

ORCID: 0000-0002-5041-3669

OpenAlex author ID: A5091494022

Research Areas

Bioinformatics and Genomic Networks
Lysosomal Storage Disorders Research
Biomedical Text Mining and Ontologies
RNA regulation and disease
Cytomegalovirus and herpesvirus research
Machine Learning in Bioinformatics
Genomics and Rare Diseases
Computational Drug Discovery Methods
Cellular transport and secretion
Genomics and Phylogenetic Studies
Genetics and Neurodevelopmental Disorders
RNA modifications and cancer
Trypanosoma species research and implications
Genomic variations and chromosomal abnormalities
Cancer Genomics and Diagnostics
Algorithms and Data Compression
Biochemical and Molecular Research
Genomics and Chromatin Dynamics
Advanced Proteomics Techniques and Applications
Microbial Metabolic Engineering and Bioproduction
Glycosylation and Glycoproteins Research
Fractal and DNA sequence analysis
Computability, Logic, AI Algorithms
Carbohydrate Chemistry and Synthesis
Chronic Lymphocytic Leukemia Research

BioMarin (United States)
2016-2023

Yale University
2014-2016

Whitney Museum of American Art
2015

Mayo Clinic
2015

Indiana University Bloomington
2008-2014

Miami University
2014

Indiana University
2008

A large-scale evaluation of computational protein function prediction

OPENALEX - Publications

Predrag Radivojac Wyatt T. Clark Tal Oron Alexandra M. Schnoes Tobias Wittkop and 95 more

Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand...

10.1038/nmeth.2340 article EN cc-by-nc-sa Nature Methods 2013-01-27

An expanded evaluation of protein function prediction methods shows an improvement in accuracy

Yuxiang Jiang Tal Oron Wyatt T. Clark Asma Bankapur Daniel D’Andrea and 95 more

A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function....

10.1186/s13059-016-1037-6 article EN cc-by Genome Biology 2016-09-07

Testing the Ortholog Conjecture with Comparative Functional Genomic Data from Mammals

Nathan L Nehrt Wyatt T. Clark Predrag Radivojac Matthew W. Hahn

A common assumption in comparative genomics is that orthologous genes share greater functional similarity than do paralogous genes (the "ortholog conjecture"). Many methods used to computationally predict protein function are based on this assumption, even though it is largely untested. Here we present the first large-scale test of the ortholog conjecture using comparative functional genomic data from human and mouse. We use the experimentally derived functions of more than 8,900 genes, as well as an independent microarray dataset, to directly...

10.1371/journal.pcbi.1002073 article EN cc-by PLoS Computational Biology 2011-06-09

An integrated approach to inferring gene–disease associations in humans

Predrag Radivojac Kang Peng Wyatt T. Clark Brandon Peters Amrita Mohan and 2 more

One of the most important tasks of modern bioinformatics is the development of computational tools that can be used to understand and treat human disease. To date, a variety of methods have been explored and algorithms for candidate gene prioritization are gaining in their usefulness. Here, we propose an algorithm for detecting gene–disease associations based on the human protein–protein interaction network, known gene–disease associations, protein sequence, and protein functional information at the molecular level. Our method, PhenoPred,...

10.1002/prot.21989 article EN Proteins Structure Function and Bioinformatics 2008-02-25

Analysis of protein function and its prediction from amino acid sequence

Wyatt T. Clark Predrag Radivojac

Understanding protein function is one of the keys to understanding life at the molecular level. It is also important in the context of human disease because many conditions arise as a consequence of alterations of protein function. The recent availability of relatively inexpensive sequencing technology has resulted in thousands of complete or partially sequenced genomes with millions of functionally uncharacterized proteins. Such a large volume of data, combined with the lack of high-throughput experimental assays to functionally annotate proteins, attributes...

10.1002/prot.23029 article EN Proteins Structure Function and Bioinformatics 2011-03-21

Information-theoretic evaluation of predicted ontological annotations

Wyatt T. Clark Predrag Radivojac

Motivation: The development of effective methods for the prediction of ontological annotations is an important goal in computational biology, with protein function prediction and disease gene prioritization gaining wide recognition. Although various algorithms have been proposed for these tasks, evaluating their performance is difficult owing to problems caused both by the structure of biomedical ontologies and biased or incomplete experimental annotations of genes and gene products. Results: We propose an information-theoretic framework...

10.1093/bioinformatics/btt228 article EN cc-by-nc Bioinformatics 2013-06-19

Analysis of deletion breakpoints from 1,092 humans reveals details of mutation mechanisms

Alexej Abyzov Shantao Li Daniel Kim Marghoob Mohiyuddin Adrian M. Stütz and 9 more

Investigating genomic structural variants at basepair resolution is crucial for understanding their formation mechanisms. We identify and analyse 8,943 deletion breakpoints in 1,092 samples from the 1000 Genomes Project. We find breakpoints have more nearby SNPs and indels than the genomic average, likely a consequence of relaxed selection. By investigating the correlation of breakpoints with DNA methylation, Hi–C interactions, and histone marks and the substitution patterns of nucleotides near them, we find that the signature of non-allelic homologous recombination...

10.1038/ncomms8256 article EN cc-by-nc-nd Nature Communications 2015-06-01

Comparative analysis of pseudogenes across three phyla

Cristina Sisu Baikang Pei Jing Leng Adam Frankish Yan Zhang and 10 more

Significance: Pseudogenes have long been considered nonfunctional elements. However, recent studies have shown they can potentially regulate the expression of protein-coding genes. Capitalizing on available functional-genomics data and the finished annotation of human, worm, and fly, we compared the pseudogene complements across the three phyla. We found that in contrast to protein-coding genes, pseudogenes are highly lineage specific, reflecting genome history more so than the conservation of essential biological functions....

10.1073/pnas.1407293111 article EN Proceedings of the National Academy of Sciences 2014-08-25

Predicting disease severity in metachromatic leukodystrophy using protein activity and a patient phenotype matrix

Marena Trinidad Xinying Hong Steven Froelich Jessica Daiker James Sacco and 7 more

Background: Metachromatic leukodystrophy (MLD) is a lysosomal storage disorder caused by mutations in the arylsulfatase A gene (ARSA) and categorized into three subtypes according to age of onset. The functional effect of most ARSA mutants remains unknown; better understanding of the genotype–phenotype relationship is required to support newborn screening (NBS) and guide treatment. Results: We collected a patient data set from the literature that relates disease severity to ARSA genotype in 489 individuals with MLD....

10.1186/s13059-023-03001-z article EN cc-by Genome Biology 2023-07-21

The impact of incomplete knowledge on the evaluation of protein function prediction: a structured-output learning perspective

Yuxiang Jiang Wyatt T. Clark Iddo Friedberg Predrag Radivojac

Motivation: The automated functional annotation of biological macromolecules is a problem of computational assignment of biological concepts or ontological terms to genes and gene products. A number of methods have been developed to computationally annotate genes using standardized nomenclature such as Gene Ontology (GO). However, questions remain about the possibility for development of accurate methods that can integrate disparate molecular data as well as an unbiased evaluation of these methods. One important concern...

10.1093/bioinformatics/btu472 article EN cc-by-nc Bioinformatics 2014-08-22

Evaluation of enzyme activity predictions for variants of unknown significance in Arylsulfatase A

Shantanu Jain Marena Trinidad Thanh Nguyen Kaiya Jones Santiago Diaz Neto and 64 more

10.1007/s00439-025-02731-3 article EN Human Genetics 2025-03-08

Evaluation of enzyme activity predictions for variants of unknown significance in Arylsulfatase A

Shantanu Jain Marena Trinidad Thanh Nguyen Kaiya Jones Santiago Diaz Neto and 64 more

Continued advances in variant effect prediction are necessary to demonstrate the ability of machine learning methods to accurately determine the clinical impact of variants of unknown significance (VUS). Towards this goal, the ARSA Critical Assessment of Genome Interpretation (CAGI) challenge was designed to characterize progress by utilizing 219 experimentally assayed missense VUS

10.1101/2024.05.16.594558 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2024-05-19

Characterization of glycan substrates accumulating in GM1 Gangliosidosis

Roger Lawrence Jeremy L. Van Vleet Linley Mangini Adam Harris Nathan Martin and 7 more

GM1 gangliosidosis is a rare autosomal recessive genetic disorder caused by the disruption of the GLB1 gene that encodes β-galactosidase, a lysosomal hydrolase that removes β-linked galactose from the non-reducing end of glycans. Deficiency of this catabolic enzyme leads to the accumulation of GM1 and its asialo derivative GA1 in β-galactosidase deficient patients and animal models. In addition to GM1 and GA1, there are other glycoconjugates that contain β-linked galactose whose metabolites are substrates for β-galactosidase. For example, a number of N-linked glycan...

10.1016/j.ymgmr.2019.100524 article EN cc-by-nc-nd Molecular Genetics and Metabolism Reports 2019-11-03

Identifying therapeutic drug targets using bidirectional effect genes

Karol Estrada Steven Froelich Arthur Wüster Christopher R. Bauer Teague Sterling and 10 more

Prioritizing genes for translation to therapeutics for common diseases has been challenging. Here, we propose an approach to identify drug targets with high probability of success by focusing on genes with both gain of function (GoF) and loss of function (LoF) mutations associated with opposing effects on phenotype (Bidirectional Effect Selected Targets, BEST). We find 98 BEST genes for a variety of indications. Drugs targeting those genes are 3.8-fold more likely to be approved than non-BEST genes. We focus on five genes (IGF1R, NPPC, NPR2, FGFR3, and SHOX...

10.1038/s41467-021-21843-8 article EN cc-by Nature Communications 2021-04-13

Using Compression to Identify Classes of Inauthentic Texts

Mehmet Dalkılıç Wyatt T. Clark James C. Costello Predrag Radivojac

Proceedings of the 2006 SIAM International Conference on Data Mining (SDM), pp. 604-608. Recent events have made it clear that some kinds of technical texts,...

10.1137/1.9781611972764.69 article EN 2006-04-20

Prediction of functional regulatory SNPs in monogenic and complex disease

Yiqiang Zhao Wyatt T. Clark Matthew Mort D.N. Cooper Predrag Radivojac and 1 more

Next-generation sequencing (NGS) technologies are yielding ever higher volumes of human genome sequence data. Given this large amount of data, it has become both a possibility and a priority to determine how disease-causing single nucleotide polymorphisms (SNPs) detected within gene regulatory regions (rSNPs) exert their effects on gene expression. Recently, several studies have explored whether disease-causing polymorphisms have attributes that can distinguish them from those that are neutral, attaining moderate success at discriminating...

10.1002/humu.21559 article EN Human Mutation 2011-07-27

Assessment of predicted enzymatic activity of α-N-acetylglucosaminidase variants of unknown significance for CAGI 2016

Wyatt T. Clark Laura Kasak Constantina Bakolitsa Zhiqiang Hu Gaia Andreoletti and 28 more

The NAGLU challenge of the fourth edition of the Critical Assessment of Genome Interpretation experiment (CAGI4) in 2016 invited participants to predict the impact of variants of unknown significance (VUS) on the enzymatic activity of the lysosomal hydrolase α-N-acetylglucosaminidase (NAGLU). Deficiencies in NAGLU activity lead to a rare, monogenic, recessive lysosomal storage disorder, Sanfilippo syndrome type B (MPS IIIB). This challenge attracted 17 submissions from 10 groups. We observed that top models were able to predict the impact of missense mutations on enzymatic activity with Pearson's...

10.1002/humu.23875 article EN Human Mutation 2019-07-25

Utilizing ExAC to assess the hidden contribution of variants of unknown significance to Sanfilippo Type B incidence

Wyatt T. Clark Guoying Karen Yu Mika Aoyagi-Scharber Jonathan H. LeBowitz

Given the large and expanding quantity of publicly available sequencing data, it should be possible to extract incidence information for monogenic diseases from allele frequencies, provided one knows which mutations are causal. We tested this idea on a rare, monogenic, lysosomal storage disorder, Sanfilippo Type B (Mucopolysaccharidosis type IIIB). Sanfilippo Type B is caused by mutations in the gene encoding α-N-acetylglucosaminidase (NAGLU). There were 189 NAGLU missense variants found in the ExAC dataset that comprises roughly...

10.1371/journal.pone.0200008 article EN cc-by PLoS ONE 2018-07-06

VECTOR QUANTIZATION KERNELS FOR THE CLASSIFICATION OF PROTEIN SEQUENCES AND STRUCTURES

Wyatt T. Clark Predrag Radivojac

10.1142/9789814583220_0031 article EN Biocomputing 2013-11-01

Identifying therapeutic drug targets for rare and common forms of short stature

Karol Estrada Steven Froelich Arthur Wüster Christopher R. Bauer Teague Sterling and 10 more

While GWAS of common diseases has delivered thousands of novel genetic findings, prioritizing genes for translation to therapeutics has been challenging. Here, we propose an approach to resolve that issue by identifying genes that have both gain of function (GoF) and loss of function (LoF) mutations associated with opposing effects on phenotype (Bidirectional Effect Selected Targets, BEST). Bidirectionality is a desirable feature of the best targets because it implies both a causal role in the phenotype in one direction and that modulating the target...

10.1101/2020.04.02.022624 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2020-04-03

Utilizing activity assays and population-wide allele frequencies to assess the contribution of novel mutations in NAGLU to MPS IIIB incidence

Jonathan H. LeBowitz Wyatt T. Clark Guoying Karen Yu Mika Aoyagi-Scharber

10.1016/j.ymgme.2015.12.334 article EN Molecular Genetics and Metabolism 2016-02-01

Utilizing ExAC to Assess the Hidden Contribution of Variants of Unknown Significance to Sanfilippo Type B Incidence

Wyatt T. Clark Guoying Karen Yu Mika Aoyagi-Scharber Jonathan H. LeBowitz

Given the large and expanding quantity of publicly available sequencing data, it should be possible to extract incidence information for monogenic diseases from allele frequencies, provided one knows which mutations are causal. We tested this idea on a rare, monogenic, lysosomal storage disorder, Sanfilippo Type B (Mucopolysaccharidosis type IIIB). Sanfilippo Type B is caused by mutations in the gene encoding α-N-acetylglucosaminidase (NAGLU). There were 189 NAGLU missense variants found in the ExAC dataset that...

10.1101/253435 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2018-02-22
