Wyatt T. Clark

ORCID: 0000-0002-5041-3669
Research Areas
  • Bioinformatics and Genomic Networks
  • Lysosomal Storage Disorders Research
  • Biomedical Text Mining and Ontologies
  • RNA regulation and disease
  • Cytomegalovirus and herpesvirus research
  • Machine Learning in Bioinformatics
  • Genomics and Rare Diseases
  • Computational Drug Discovery Methods
  • Cellular transport and secretion
  • Genomics and Phylogenetic Studies
  • Genetics and Neurodevelopmental Disorders
  • RNA modifications and cancer
  • Trypanosoma species research and implications
  • Genomic variations and chromosomal abnormalities
  • Cancer Genomics and Diagnostics
  • Algorithms and Data Compression
  • Biochemical and Molecular Research
  • Genomics and Chromatin Dynamics
  • Advanced Proteomics Techniques and Applications
  • Microbial Metabolic Engineering and Bioproduction
  • Glycosylation and Glycoproteins Research
  • Fractal and DNA sequence analysis
  • Computability, Logic, AI Algorithms
  • Carbohydrate Chemistry and Synthesis
  • Chronic Lymphocytic Leukemia Research

BioMarin (United States)
2016-2023

Yale University
2014-2016

Whitney Museum of American Art
2015

Mayo Clinic
2015

Indiana University Bloomington
2008-2014

Miami University
2014

Indiana University
2008

Predrag Radivojac Wyatt T. Clark Tal Oron Alexandra M. Schnoes Tobias Wittkop Artem Sokolov Kiley Graim Christopher S. Funk Karin Verspoor Asa Ben‐Hur Gaurav Pandey Jeffrey M. Yunes Ameet Talwalkar Susanna Repo Michael L Souza Damiano Piovesan Rita Casadio Zheng Wang Jianlin Cheng Hai Fang Julian Gough Patrik Koskinen Petri Törönen Jussi Nokso-Koivisto Liisa Holm Domenico Cozzetto Daniel Buchan Kevin Bryson David T. Jones Bhakti Limaye Harshal Inamdar Avik Datta Sunitha K Manjari Rajendra Joshi Meghana Chitale Daisuke Kihara Andreas Martin Lisewski Serkan Erdin Eric Venner Olivier Lichtarge Robert Rentzsch Haixuan Yang Alfonso E. Romero Prajwal Bhat Alberto Paccanaro Tobias Hamp Rebecca Kaßner Stefan Seemayer Esmeralda Vicedo Christian Schaefer Dominik Achten Florian Auer Ariane C. Boehm Tatjana Braun Maximilian Hecht B. Mark Heron Peter Hönigschmid Thomas A. Hopf Stefanie Kaufmann Michael Kiening Denis Krompaß Cedric Landerer Yannick Mahlich Manfred Roos Jari Björne Tapio Salakoski Andrew Wong Hagit Shatkay Fanny Gatzmann I. Sommer Mark N. Wass Michael J.E. Sternberg Nives Škunca Fran Supek Matko Bošnjak Panče Panov Sašo Džeroski Tomislav Šmuc Yiannis Kourmpetis Aalt D. J. van Dijk Cajo J. F. ter Braak Yuanpeng Zhou Qingtian Gong Xinran Dong Weidong Tian Marco Falda Paolo Fontana Enrico Lavezzo Barbara Di Camillo Stefano Toppo Liang Lan Nemanja Djuric Yuhong Guo Slobodan Vučetić Amos Bairoch Michal Linial Patricia C. Babbitt Steven E. Brenner Christine Orengo Burkhard Rost

Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand...

10.1038/nmeth.2340 article EN cc-by-nc-sa Nature Methods 2013-01-27
Yuxiang Jiang Tal Oron Wyatt T. Clark Asma Bankapur Daniel D’Andrea Rosalba Lepore Christopher S. Funk Indika Kahanda Karin Verspoor Asa Ben‐Hur Da Chen Emily Koo Duncan Penfold-Brown Dennis Shasha Noah Youngs Richard Bonneau Alexandra J. Lin Sayed Mohammad Ebrahim Sahraeian Pier Luigi Martelli Giuseppe Profiti Rita Casadio Renzhi Cao Zhaolong Zhong Jianlin Cheng Adrian Altenhoff Nives Škunca Christophe Dessimoz Tunca Doğan Kai Hakala Suwisa Kaewphan Farrokh Mehryary Tapio Salakoski Filip Ginter Hai Fang Ben Smithers Matt E. Oates Julian Gough Petri Törönen Patrik Koskinen Liisa Holm Ching-Tai Chen Wen−Lian Hsu Kevin Bryson Domenico Cozzetto Federico Minneci David T. Jones Samuel Chapman Dukka Bkc Ishita Khan Daisuke Kihara Dan Ofer Nadav Rappoport Amos Stern Elena Cibrián–Uhalte Paul Denny Rebecca E. Foulger Reija Hieta Duncan Legge Ruth C. Lovering Michele Magrane Anna N. Melidoni Prudence Mutowo Klemens Pichler Aleksandra Shypitsyna Biao Li Pooya Zakeri Sarah ElShal Léon-Charles Tranchevent Sayoni Das Natalie L. Dawson David Lee Jonathan Lees Ian Sillitoe Prajwal Bhat Tamás Nepusz Alfonso E. Romero Rajkumar Sasidharan Haixuan Yang Alberto Paccanaro Jesse Gillis Adriana E. Sedeño-Cortés Paul Pavlidis Shou Feng Juan Miguel Cejuela Tatyana Goldberg Tobias Hamp Lothar Richter Asaf Salamov Toni Gabaldón Marina Marcet‐Houben Fran Supek Qingtian Gong Wei Ning Yuanpeng Zhou Weidong Tian Marco Falda Paolo Fontana Enrico Lavezzo Stefano Toppo Carlo Ferrari Manuel Giollo

A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function....

10.1186/s13059-016-1037-6 article EN cc-by Genome Biology 2016-09-07

A common assumption in comparative genomics is that orthologous genes share greater functional similarity than do paralogous genes (the "ortholog conjecture"). Many methods used to computationally predict protein function are based on this assumption, even though it is largely untested. Here we present the first large-scale test of the ortholog conjecture using comparative functional genomic data from human and mouse. We use the experimentally derived functions of more than 8,900 genes, as well as an independent microarray dataset, to directly...

10.1371/journal.pcbi.1002073 article EN cc-by PLoS Computational Biology 2011-06-09

One of the most important tasks of modern bioinformatics is the development of computational tools that can be used to understand and treat human disease. To date, a variety of methods have been explored and algorithms for candidate gene prioritization are gaining in their usefulness. Here, we propose an algorithm for detecting gene–disease associations based on the human protein–protein interaction network, known gene–disease associations, protein sequence, and protein functional information at the molecular level. Our method, PhenoPred,...

10.1002/prot.21989 article EN Proteins: Structure, Function, and Bioinformatics 2008-02-25

Understanding protein function is one of the keys to understanding life at the molecular level. It is also important in the context of human disease because many conditions arise as a consequence of alterations of protein function. The recent availability of relatively inexpensive sequencing technology has resulted in thousands of complete or partially sequenced genomes with millions of functionally uncharacterized proteins. Such a large volume of data, combined with the lack of high-throughput experimental assays to functionally annotate proteins, attributes...

10.1002/prot.23029 article EN Proteins: Structure, Function, and Bioinformatics 2011-03-21

Motivation: The development of effective methods for the prediction of ontological annotations is an important goal in computational biology, with protein function prediction and disease gene prioritization gaining wide recognition. Although various algorithms have been proposed for these tasks, evaluating their performance is difficult owing to problems caused both by the structure of biomedical ontologies and by biased or incomplete experimental annotations of genes and gene products. Results: We propose an information-theoretic framework...

10.1093/bioinformatics/btt228 article EN cc-by-nc Bioinformatics 2013-06-19

Investigating genomic structural variants at basepair resolution is crucial for understanding their formation mechanisms. We identify and analyse 8,943 deletion breakpoints in 1,092 samples from the 1000 Genomes Project. We find breakpoints have more nearby SNPs and indels than the genomic average, likely a consequence of relaxed selection. By investigating the correlation of breakpoints with DNA methylation, Hi–C interactions, and histone marks and the substitution patterns of nucleotides near them, we find that the signature of non-allelic homologous recombination...

10.1038/ncomms8256 article EN cc-by-nc-nd Nature Communications 2015-06-01

Significance: Pseudogenes have long been considered nonfunctional elements. However, recent studies have shown they can potentially regulate the expression of protein-coding genes. Capitalizing on available functional-genomics data and the finished annotation of human, worm, and fly, we compared the pseudogene complements across the three phyla. We found that in contrast to protein-coding genes, pseudogenes are highly lineage specific, reflecting genome history more so than the conservation of essential biological functions....

10.1073/pnas.1407293111 article EN Proceedings of the National Academy of Sciences 2014-08-25

Background: Metachromatic leukodystrophy (MLD) is a lysosomal storage disorder caused by mutations in the arylsulfatase A gene (ARSA) and categorized into three subtypes according to age of onset. The functional effect of most ARSA mutants remains unknown; better understanding of the genotype–phenotype relationship is required to support newborn screening (NBS) and guide treatment. Results: We collected a patient data set from the literature that relates disease severity to ARSA genotype in 489 individuals with MLD....

10.1186/s13059-023-03001-z article EN cc-by Genome Biology 2023-07-21

Motivation: The automated functional annotation of biological macromolecules is a problem of computational assignment of biological concepts or ontological terms to genes and gene products. A number of methods have been developed to computationally annotate genes using standardized nomenclature such as Gene Ontology (GO). However, questions remain about the possibility for development of accurate methods that can integrate disparate molecular data as well as about an unbiased evaluation of these methods. One important concern...

10.1093/bioinformatics/btu472 article EN cc-by-nc Bioinformatics 2014-08-22

Continued advances in variant effect prediction are necessary to demonstrate the ability of machine learning methods to accurately determine the clinical impact of variants of unknown significance (VUS). Towards this goal, the ARSA Critical Assessment of Genome Interpretation (CAGI) challenge was designed to characterize progress by utilizing 219 experimentally assayed missense VUS

10.1101/2024.05.16.594558 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2024-05-19

GM1 gangliosidosis is a rare autosomal recessive genetic disorder caused by the disruption of the GLB1 gene that encodes β-galactosidase, a lysosomal hydrolase that removes β-linked galactose from the non-reducing end of glycans. Deficiency of this catabolic enzyme leads to the accumulation of GM1 and its asialo derivative GA1 in β-galactosidase deficient patients and animal models. In addition to GM1 and GA1, there are other glycoconjugates that contain β-linked galactose whose metabolites are substrates for β-galactosidase. For example, a number of N-linked glycan...

10.1016/j.ymgmr.2019.100524 article EN cc-by-nc-nd Molecular Genetics and Metabolism Reports 2019-11-03

Prioritizing genes for translation to therapeutics for common diseases has been challenging. Here, we propose an approach to identify drug targets with high probability of success by focusing on genes with both gain of function (GoF) and loss of function (LoF) mutations associated with opposing effects on phenotype (Bidirectional Effect Selected Targets, BEST). We find 98 BEST genes for a variety of indications. Drugs targeting those genes are 3.8-fold more likely to be approved than non-BEST genes. We focus on five genes (IGF1R, NPPC, NPR2, FGFR3, and SHOX...

10.1038/s41467-021-21843-8 article EN cc-by Nature Communications 2021-04-13

Using Compression to Identify Classes of Inauthentic Texts. Mehmet M. Dalkilic, Wyatt T. Clark, James C. Costello, and Predrag Radivojac. Proceedings of the 2006 SIAM International Conference on Data Mining (SDM), pp. 604-608.

Recent events have made it clear that some kinds of technical texts,...

10.1137/1.9781611972764.69 article EN 2006-04-20

Next-generation sequencing (NGS) technologies are yielding ever higher volumes of human genome sequence data. Given this large amount of data, it has become both a possibility and a priority to determine how disease-causing single nucleotide polymorphisms (SNPs) detected within gene regulatory regions (rSNPs) exert their effects on gene expression. Recently, several studies have explored whether disease-causing polymorphisms have attributes that can distinguish them from those that are neutral, attaining moderate success at discriminating...

10.1002/humu.21559 article EN Human Mutation 2011-07-27

The NAGLU challenge of the fourth edition of the Critical Assessment of Genome Interpretation experiment (CAGI4) in 2016 invited participants to predict the impact of variants of unknown significance (VUS) on the enzymatic activity of the lysosomal hydrolase α-N-acetylglucosaminidase (NAGLU). Deficiencies in NAGLU activity lead to a rare, monogenic, recessive lysosomal storage disorder, Sanfilippo syndrome type B (MPS type IIIB). This challenge attracted 17 submissions from 10 groups. We observed that top models were able to predict the impact of missense mutations on enzymatic activity with Pearson's...

10.1002/humu.23875 article EN Human Mutation 2019-07-25

Given the large and expanding quantity of publicly available sequencing data, it should be possible to extract incidence information for monogenic diseases from allele frequencies, provided one knows which mutations are causal. We tested this idea on a rare, monogenic, lysosomal storage disorder, Sanfilippo Type B (Mucopolysaccharidosis type IIIB). Sanfilippo Type B is caused by mutations in the gene encoding α-N-acetylglucosaminidase (NAGLU). There were 189 NAGLU missense variants found in the ExAC dataset that comprises roughly...

10.1371/journal.pone.0200008 article EN cc-by PLoS ONE 2018-07-06

While GWAS of common diseases has delivered thousands of novel genetic findings, prioritizing genes for translation to therapeutics has been challenging. Here, we propose an approach to resolve that issue by identifying genes that have both gain of function (GoF) and loss of function (LoF) mutations associated with opposing effects on phenotype (Bidirectional Effect Selected Targets, BEST). Bidirectionality is a desirable feature of the best targets because it implies a causal role in one direction and that modulating the target...

10.1101/2020.04.02.022624 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2020-04-03

Given the large and expanding quantity of publicly available sequencing data, it should be possible to extract incidence information for monogenic diseases from allele frequencies, provided one knows which mutations are causal. We tested this idea on a rare, monogenic, lysosomal storage disorder, Sanfilippo Type B (Mucopolysaccharidosis type IIIB). Sanfilippo Type B is caused by mutations in the gene encoding α-N-acetylglucosaminidase (NAGLU). There were 189 NAGLU missense variants found in the ExAC dataset that...

10.1101/253435 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2018-02-22