- Metabolomics and Mass Spectrometry Studies
- Analytical Chemistry and Chromatography
- Bioinformatics and Genomic Networks
- Gaussian Processes and Bayesian Inference
- Mass Spectrometry Techniques and Applications
- Gene expression and cancer classification
- Blood properties and coagulation
- Erythrocyte Function and Pathophysiology
- Advanced Proteomics Techniques and Applications
- Computational Drug Discovery Methods
- Interactive and Immersive Displays
- Microbial Natural Products and Biosynthesis
- Control Systems and Identification
- Bayesian Methods and Mixture Models
- Microbial Metabolic Engineering and Bioproduction
- Tactile and Sensory Interactions
- Advanced Chemical Sensor Technologies
- Biomedical Text Mining and Ontologies
- Genomics and Phylogenetic Studies
- Gene Regulatory Network Analysis
- Advanced Vision and Imaging
- Spectroscopy and Chemometric Analyses
- Artificial Intelligence in Healthcare and Education
- Plant Stress Responses and Tolerance
- Explainable Artificial Intelligence (XAI)
University of Glasgow
2015-2024
National Health Service Scotland
2023-2024
Public Health Scotland
2024
University of Manchester
2014-2022
Glasgow Life
2019
Queen Alexandra Hospital
2019
Liverpool John Moores University
2015
Helsinki Institute for Information Technology
2014
University of Helsinki
2014
Aalto University
2014
Metabolomics has started to embrace computational approaches for chemical interpretation of large data sets. Yet, metabolite annotation remains a key challenge. Recently, molecular networking and MS2LDA emerged as mining tools that find families and substructures in mass spectrometry fragmentation data. Moreover, in silico annotation tools obtain and rank candidate molecules for fragmentation spectra. Ideally, all structural information obtained and inferred from these tools could be combined to increase the resulting insight one can obtain from a data set. However,...
Significance: Tandem MS is a technique for compound identification in untargeted metabolomics experiments. Because of a lack of reference spectra, most molecules cannot be identified, and many spectra cannot be used. We present MS2LDA, an unsupervised method (inspired by text-mining) that extracts common patterns of mass fragments and neutral losses (Mass2Motifs) from collections of fragmentation spectra. Structurally characterized Mass2Motifs can be used to annotate molecules for which no reference spectra exist and expose biochemical relationships between...
Spectral similarity is used as a proxy for structural similarity in many tandem mass spectrometry (MS/MS) based metabolomics analyses such as library matching and molecular networking. Although weaknesses in the relationship between spectral similarity scores and the true structural similarities have been described, little development of alternative scores has been undertaken. Here, we introduce Spec2Vec, a novel spectral similarity score inspired by a natural language processing algorithm, Word2Vec. Spec2Vec learns fragmental relationships within a large set of spectral data to derive...
A statistical methodology for estimating dataset size requirements for classifying microarray data using learning curves is introduced. The goal is to use existing classification results to estimate dataset size requirements for future classification experiments and to evaluate the gain in accuracy and significance of classifiers built with additional data. The method is based on fitting inverse power-law models to construct empirical learning curves. It also includes a permutation test procedure to assess the statistical significance of classification performance for a given dataset size. This procedure is applied to several molecular classification problems...
It is well known in the statistics literature that augmenting binary and polychotomous response models with Gaussian latent variables enables exact Bayesian analysis via Gibbs sampling from the parameter posterior. By adopting such a data augmentation strategy, dispensing with priors over regression coefficients in favor of Gaussian process (GP) priors over functions, and employing variational approximations to the full posterior, we obtain efficient computational methods for GP classification in the multiclass setting. 1 The model...
Motivation: Modern transcriptomics and proteomics enable us to survey the expression of RNAs and proteins at large scales. While these data are usually generated and analyzed separately, there is an increasing interest in comparing and co-analyzing transcriptome and proteome expression data. A major open question is whether transcriptome and proteome expression is linked and how it is coordinated. Results: Here we have developed a probabilistic clustering model that permits analysis of the links between transcriptomic and proteomic profiles in a sensible and flexible manner. Our...
We present a visually guided, dual-arm, industrial robot system that is capable of autonomously flattening garments by means of a novel visual perception pipeline that fully interprets high-quality RGB-D images of the clothing scene based on an active stereo robot head. A segmented clothing range map is B-Spline smoothed prior to being parsed by means of shape and topology analysis into 'wrinkle' structures. The length, width and height of each wrinkle are used to quantify the wrinkles and thereby rank them by size such that a greedy algorithm can identify the largest...
Motivation: We recently published MS2LDA, a method for the decomposition of sets of molecular fragment data derived from large metabolomics experiments. To make the method more widely available to the community, here we present ms2lda.org, a web application that allows users to upload their data, run MS2LDA analyses and explore the results through interactive visualizations. Results: Ms2lda.org takes tandem mass spectrometry data in many standard formats and allows the user to infer the sets of fragment and neutral loss features that co-occur together...
Mass spectrometry data is at the heart of numerous applications in the biomedical and life sciences. With growing use of high-throughput techniques, researchers need to analyze larger and more complex datasets. In particular, through joint effort in the research community, fragmentation mass spectrometry datasets are growing in size and number. Platforms such as MassBank (Horai et al., 2010), GNPS (Wang et al., 2016) or MetaboLights (Haug et al., 2020) serve as an open-access hub for sharing of raw, processed, or annotated fragmentation mass spectrometry data. Without suitable tools, however,...
Molecular networking has become a key method used to visualize and annotate the chemical space in non-targeted mass spectrometry-based experiments. However, distinguishing isomeric compounds and quantitative interpretation are currently limited. Therefore, we created Feature-based Molecular Networking (FBMN) as a new analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure. FBMN leverages feature detection and alignment tools to enhance quantitative analyses and isomer distinction, including from ion-mobility...
Users often struggle to enter text accurately on touchscreen keyboards. To address this, we present a flexible decoder for touchscreen text entry that combines probabilistic touch models with a language model. We investigate two different touch models. The first touch model is based on a Gaussian Process regression approach and implicitly models the inherent uncertainty of the touching process. The second touch model allows users to explicitly control the uncertainty via touch pressure. We show that the character error rate can be reduced by up to 7% over a baseline method, and by up to 1.3% over a leading...
Motivation: The use of liquid chromatography coupled to mass spectrometry has enabled the high-throughput profiling of the metabolite composition of biological samples. However, the large amount of data obtained can be difficult to analyse and often requires computational processing to understand which metabolites are present in a sample. This article looks at the dual problem of annotating peaks in a sample with a metabolite, together with putatively annotating whether a metabolite is present in the sample. The starting point of the approach is a Bayesian clustering of peaks into groups, each...
Integration of MS2LDA substructure discovery with MAGMa spectral annotations and ClassyFire term predictions, complemented by MotifDB, significantly advances metabolite annotation.
Tandem mass spectrometry (LC-MS/MS) is widely used to identify unknown ions in untargeted metabolomics. Data-dependent acquisition (DDA) chooses which ions to fragment based upon intensities observed in MS1 survey scans and typically only fragments a small subset of the ions present. Despite this inefficiency, relatively little work has addressed the development of new DDA methods, partly due to the high overhead associated with running the many extracts necessary to optimize approaches in busy MS facilities. In this work, we first...
Specialised metabolites from microbial sources are well-known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets for novel specialised metabolites, establishing links between Biosynthetic Gene Clusters (BGCs) and metabolites represents a promising way of finding such novel chemistry. However, due to the lack of detailed biosynthetic knowledge for the majority of predicted BGCs, and the large number of possible combinations, this is not a simple task. This problem...
There is currently much interest in reverse-engineering regulatory relationships between genes from microarray expression data. We propose a new algorithmic method for inferring such interactions between genes using data from gene knockout experiments. The algorithm we use is the Sparse Bayesian regression algorithm of Tipping and Faul. This method is highly suited to this problem as it does not require the data to be discretized, overcomes the need for an explicit topology search and, most importantly, requires no heuristic thresholding of the discovered...
High-accuracy mass spectrometry is a popular technology for high-throughput measurements of cellular metabolites (metabolomics). One of the major challenges is the correct identification of the observed mass peaks, including the assignment of their empirical formula, based on the measured mass. We propose a novel probabilistic method for the assignment of empirical formulas to mass peaks in high-throughput metabolomics measurements. The method incorporates information about possible biochemical transformations between the empirical formulas to assign higher probability to formulas that could be created from other metabolites in the sample....
We present a new computational technique (a software implementation, data sets, and supplementary information are available at http://www.enm.bris.ac.uk/lpd/) which enables the probabilistic analysis of cDNA microarray data, and we demonstrate its effectiveness in identifying features of biomedical importance. A hierarchical Bayesian model, called Latent Process Decomposition (LPD), is introduced in which each sample in the data set is represented as a combinatorial mixture over a finite set of latent processes, which are expected to correspond...
We present a flexible Machine Learning approach for learning user-specific touch input models to increase touch accuracy on mobile devices. The model is based on flexible, non-parametric Gaussian Process regression and is learned using recorded touch inputs. We demonstrate that significant touch accuracy improvements can be obtained when either raw sensor data is used as an input or the device's reported touch location is used as an input, with the latter marginally outperforming the former. We show that offset functions are highly nonlinear and that user-specific models outperform models trained on pooled...
In untargeted metabolomics approaches, the inability to structurally annotate relevant features and map them to biochemical pathways is hampering the full exploitation of many metabolomics experiments. Furthermore, variable metabolic content across samples results in sparse feature matrices that are statistically hard to handle. Here, we introduce MS2LDA+, which tackles both above-mentioned problems. Previously, we presented MS2LDA, which extracts biochemically relevant molecular substructures ("Mass2Motifs") from a collection...