- Bioinformatics and Genomic Networks
- Mitochondrial Function and Pathology
- Cancer, Hypoxia, and Metabolism
- Machine Learning in Bioinformatics
- Protein Structure and Dynamics
- Genomics and Phylogenetic Studies
- Enzyme Structure and Function
- Metabolomics and Mass Spectrometry Studies
- Microbial Metabolic Engineering and Bioproduction
- Cell Image Analysis Techniques
- Analytical Chemistry and Chromatography
- Gene expression and cancer classification
- Evolution and Genetic Dynamics
- Epigenetics and DNA Methylation
- Genetics, Bioinformatics, and Biomedical Research
- RNA and protein synthesis mechanisms
- Advanced Proteomics Techniques and Applications
- Tryptophan and brain disorders
- Mass Spectrometry Techniques and Applications
- Polyamine Metabolism and Applications
- Identity, Memory, and Therapy
- Evolutionary Algorithms and Applications
- DNA and Nucleic Acid Chemistry
- Genetics, Aging, and Longevity in Model Organisms
- Machine Learning and Data Classification
University of Glasgow
2021-2024
University College London
2008-2023
University of Chicago
2022
UCL Australia
2013
Helsinki Institute for Information Technology
2012
University of Helsinki
2012
Mathématiques et Informatique Appliquées du Génome à l'Environnement
2006
University of Warwick
1999-2001
University of York
2000
Summary: The PSIPRED protein structure prediction server allows users to submit a protein sequence, perform a prediction of their choice and receive the results of the prediction both textually via e-mail and graphically via the web. The user may select one of three prediction methods to apply to their sequence: PSIPRED, a highly accurate secondary structure prediction method; MEMSAT 2, a new version of a widely used transmembrane topology prediction method; or GenTHREADER, a sequence profile based fold recognition method. Availability: Freely available to non-commercial users at http://globin.bio.warwick.ac.uk/psipred/...
Here, we present the new UCL Bioinformatics Group’s PSIPRED Protein Analysis Workbench. The Workbench unites all of our previously available analysis methods into a single web-based framework. The web portal provides a greatly streamlined user interface with a number of features to allow users to better explore their results. We offer additional services to enable computationally scalable execution of our prediction methods; these include SOAP and XML-RPC server access and HADOOP packages. All software and services are available via the UCL Bioinformatics Group...
Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand...
A number of state-of-the-art protein structure prediction servers have been developed by researchers working in the Bioinformatics Unit at University College London. The popular PSIPRED server allows users to perform secondary structure prediction, transmembrane topology prediction and protein fold recognition. More recent servers include DISOPRED for the prediction of protein dynamic disorder and DomPred for protein domain boundary prediction. These servers are available from our software home page at http://bioinf.cs.ucl.ac.uk/software.html.
Dynamically disordered regions appear to be relatively abundant in eukaryotic proteomes. The DISOPRED server allows users to submit a protein sequence, and returns a probability estimate of each residue in the sequence being disordered. The results are sent in both plain text and graphical formats, and the server can also supply predictions of secondary structure to provide further structural information. The server can be accessed by non-commercial users at http://bioinf.cs.ucl.ac.uk/disopred/
A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function....
Lactobacillus delbrueckii ssp. bulgaricus (L. bulgaricus) is a representative of the group of lactic acid-producing bacteria, mainly known for its worldwide application in yogurt production. The genome sequence of this bacterium has been determined and shows the signs of ongoing specialization, with a substantial number of pseudogenes and incomplete metabolic pathways and relatively few regulatory functions. Several unique features of the L. bulgaricus genome support the hypothesis that the genome is in a phase of rapid evolution. (i) Exceptionally high numbers of rRNA and tRNA genes...
The UCL Bioinformatics Group web portal offers several high quality protein structure prediction and function annotation algorithms including PSIPRED, pGenTHREADER, pDomTHREADER, MEMSAT, MetSite, DISOPRED2, DomPred and FFPred for the prediction of secondary structure, protein fold, protein structural domain, transmembrane helix topology, metal binding sites, regions of protein disorder, protein domain boundaries and protein function, respectively. We also now offer a fully automated 3D modelling pipeline: BioSerf, which performed well in CASP8 and uses...
Metformin is the first-line therapy for treating type 2 diabetes and a promising anti-aging drug. We set out to address the fundamental question of how gut microbes and nutrition, key regulators of host physiology, affect the effects of metformin. Combining two tractable genetic models, the bacterium E. coli and the nematode C. elegans, we developed a high-throughput four-way screen to define the underlying host-microbe-drug-nutrient interactions. We show that microbes integrate cues from metformin and the diet through the phosphotransferase signaling...
Fluoropyrimidines are the first-line treatment for colorectal cancer, but their efficacy is highly variable between patients. We queried whether gut microbes, a known source of inter-individual variability, impacted drug efficacy. Combining two tractable genetic models, the bacterium E. coli and the nematode C. elegans, we performed three-way high-throughput screens that unraveled the complexity underlying host-microbe-drug interactions. We report that microbes can bolster or suppress the effects of fluoropyrimidines...
The results of the first Critical Assessment of Fully Automated Structure Prediction (CAFASP-1) are presented. The objective was to evaluate the success rates of fully automatic web servers for fold recognition which are available to the community. This study was based on the targets used in the third meeting on the Critical Assessment of Techniques for Protein Structure Prediction (CASP-3). However, unlike CASP-3, the study was not a blind trial, as it was held after the structures were known. The aim was to assess the performance of methods without the user intervention that several groups used in their CASP-3 submissions....
Background: Accurate protein function annotation is a severe bottleneck when utilizing the deluge of high-throughput, next generation sequencing data. Keeping database annotations up-to-date has become a major scientific challenge that requires the development of reliable automatic predictors of protein function. The CAFA experiment provided a unique opportunity to undertake comprehensive 'blind testing' of many diverse approaches for automated function prediction. We report on the methodology we used for this challenge and on the lessons...
We have implemented a genome annotation system for prokaryotes called AGMIAL. Our approach embodies a number of key principles. First, expert manual annotators are seen as a critical component of the overall system; user interfaces were cyclically refined to satisfy their needs. Second, the overall process should be orchestrated in terms of a global annotation strategy; this facilitates coordination between a team of annotators and automatic data analysis. Third, the annotation strategy should allow progressive and incremental annotation from a time when only a few draft contigs...
The introduction of pneumococcal conjugate vaccines necessitates continued monitoring of circulating strains to assess vaccine efficacy and replacement serotypes. Conventional serological methods are costly, labor-intensive, and prone to misidentification, while current DNA-based methods have limited serotype coverage requiring multiple PCR primers. In this study, a computer algorithm was developed to interrogate the capsulation locus (cps) of vaccine serotypes to locate primer pairs in conserved regions that border variable...
Background: In untargeted metabolomics studies, liquid chromatography tandem mass spectrometry (LC-MS/MS) is a powerful analytical platform. The fragmentation spectra produced can be used as "molecular fingerprints" to identify unknown metabolites. However, the high number of analytes that may be co-eluting limits the number of spectra that can be collected and potentially identified, presenting a serious bottleneck for many studies. There is a need for new strategies which are comprehensive, interpretable and robust, meaning they produce...
Analysis of our fold recognition results in the 3rd Critical Assessment of Structure Prediction (CASP3) experiment, using the programs THREADER 2 and GenTHREADER, shows an encouraging level of overall success. Of the 23 submitted predictions, 20 targets showed no clear sequence similarity to proteins of known 3D structure. These 20 targets can be divided into 22 domains, of which 20 domains either entirely match a previously known fold, or partially match a substantial region of a known fold. Of these 20 domains, we correctly assigned the folds in 10 cases.
A number of new and newly improved methods for predicting protein structure developed by the Jones-University College London group were used to make predictions for the CASP6 experiment. Structures were predicted with a combination of fold recognition methods (mGenTHREADER, nFOLD, and THREADER) and a substantially enhanced version of FRAGFOLD, our fragment assembly method. Attempts at automatic domain parsing were made using DomPred and DomSSEA, which are based on a secondary structure parsing algorithm and additionally for DomPred, a simple local sequence alignment...
Although monozygotic (MZ) twins share the majority of their genetic makeup, they can be phenotypically discordant on several traits and diseases. DNA methylation is an epigenetic mechanism that can be influenced by genetic, environmental and stochastic events and may have an important impact on individual variability. In this study we explored the differences in DNA methylation in peripheral blood samples in three MZ twin studies on major depressive disorder (MDD). Epigenetic data for twin pairs were collected as part of a previous study using...
High-throughput gene expression can be used to address a wide range of fundamental biological problems, but datasets of an appropriate size are often unavailable. Moreover, existing transcriptomics simulators have been criticized because they fail to emulate key properties of gene expression data. In this article, we develop a method based on a conditional generative adversarial network to generate realistic transcriptomics data for Escherichia coli and humans. We assess the performance of our approach across several tissues and cancer-types. We...
What constitutes a baseline level of success for protein fold recognition methods? As fold recognition benchmarks are often presented without any thought to the results that might be expected from a purely random set of predictions, an analysis of fold recognition baselines is long overdue. Given varying amounts of basic information about a protein, ranging from the length of the sequence to a knowledge of its secondary structure, to what extent can the fold be determined by intelligent guesswork? Can simple methods that make use of secondary structure information assign folds more accurately than and...
Data-Dependent and Data-Independent Acquisition modes (DDA and DIA, respectively) are both widely used to acquire MS2 spectra in untargeted liquid chromatography tandem mass spectrometry (LC-MS/MS) metabolomics analyses. Despite their wide use, little work has been attempted to systematically compare their MS/MS spectral annotation performance in untargeted settings due to the lack of ground truth and the costs involved in running a large number of acquisitions. Here, we present a systematic in silico comparison of these two acquisition...