- Machine Learning in Bioinformatics
- Genomics and Phylogenetic Studies
- RNA and protein synthesis mechanisms
- Protein Structure and Dynamics
- Bioinformatics and Genomic Networks
- Genetics, Bioinformatics, and Biomedical Research
- Plant and Fungal Interactions Research
- Biomedical Text Mining and Ontologies
- Autophagy in Disease and Therapy
- RNA Research and Splicing
- Molecular Biology Techniques and Applications
- Glycosylation and Glycoproteins Research
- Computational Drug Discovery Methods
- Enzyme Structure and Function
- Advanced Proteomics Techniques and Applications
- Fractal and DNA sequence analysis
- Scientific Computing and Data Management
- Gene expression and cancer classification
- Semantic Web and Ontologies
- Algorithms and Data Compression
- Genetic and Kidney Cyst Diseases
- Nuclear Structure and Function
- Metabolomics and Mass Spectrometry Studies
- RNA Interference and Gene Delivery
- Bacterial biofilms and quorum sensing
University of Cyprus
2016-2025
University of Padua
2023
Universitas Cokroaminoto Yogyakarta
2021
Centre for Research and Technology Hellas
2012-2019
University of Manchester
2019
University of Hong Kong
2019
Innovative Research (United States)
2019
Stanford University
2019
Erasmus MC
2019
University of Luxembourg
2014
Intrinsically disordered proteins, defying the traditional protein structure-function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in the prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646...
The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments with DisProt (version 8), including the doubling of protein entries, a new disorder ontology, improvements of the annotation format and a completely new website. The website includes a redesigned graphical interface, a better search engine, a clearer API for programmatic access and an interface that integrates text mining...
Macroautophagy was initially considered to be a nonselective process for bulk breakdown of cytosolic material. However, recent evidence points toward a selective mode of autophagy mediated by the so-called selective autophagy receptors (SARs). SARs act by recognizing and sorting diverse cargo substrates (e.g., proteins, organelles, pathogens) to the autophagic machinery. Known SARs are characterized by a short linear sequence motif (LIR-, LRS-, or AIM-motif) responsible for the interaction between SARs and proteins of the Atg8 family. Interestingly, many...
Atg8-family proteins are the best-studied proteins of the core autophagic machinery. They are essential for the elongation and closure of the phagophore into a proper autophagosome. Moreover, they are associated with the phagophore from the initiation of the autophagic process to, or just prior to, the fusion between autophagosomes and lysosomes. In addition to their implication in autophagosome biogenesis, they are crucial for selective autophagy through their ability to interact with selective autophagy receptor proteins necessary for the specific targeting of substrates for autophagic degradation. In the past few years it has been revealed that...
The Database of Intrinsically Disordered Proteins (DisProt, URL: https://disprot.org) is the major repository of manually curated annotations of intrinsically disordered proteins and regions from the literature. We report here recent updates of DisProt version 9, including a restyled web interface, a refactored Intrinsically Disordered Proteins Ontology (IDPO), improvements in the curation process and significant content growth of around 30%. Higher quality and consistency of annotations is provided by a newly implemented reviewing process and the training of curators. The increased...
DisProt (URL: https://disprot.org) is the gold standard database for intrinsically disordered proteins and regions, providing valuable information about their functions. The latest version of DisProt brings significant advancements, including a broader representation of functions and an enhanced curation process. These improvements aim to increase both the quality of annotations and their coverage at the sequence level. Higher coverage has been achieved by adopting additional evidence codes. Quality has been improved by systematically...
Low complexity regions (LCRs) in protein sequences are characterized by a less diverse amino acid composition compared to typically observed sequence diversity. Recent studies have shown that LCRs may co-occur with intrinsically disordered regions, are highly conserved in many organisms, and often play important roles in protein functions and in diseases. In previous decades, several methods have been developed to identify regions with LCRs or amino acid bias, but most of them as stand-alone applications, and currently there is no web-based tool...
The human microbiome has emerged as a central research topic in biology and biomedicine. Current studies generate high-throughput omics data across different body sites, populations, and life stages. Many of the challenges are similar to those in other studies: the quantitative analyses need to address the heterogeneity of the data, specific statistical properties, and the remarkable variation in composition across individuals and body sites. This has led to a broad spectrum of machine learning challenges that range from study design, data processing, and standardization...
Sensitive detection and masking of low-complexity regions in protein sequences. Filtered sequences can be used in sequence comparison without the risk of matching compositionally biased regions. The main advantage of the method over similar approaches is the selective masking of single residue types without affecting other, possibly important, regions. A novel algorithm for low-complexity region detection and selective masking. The algorithm is based on multiple-pass Smith-Waterman comparison of the query sequence against twenty homopolymers with infinite gap penalties. The output is both the masked query sequence for further analysis,...
We present a novel method that predicts transmembrane domains in proteins using solely information contained in the sequence itself. The PRED-TMR algorithm described refines a standard hydrophobicity analysis with a detection of potential termini ('edges', starts and ends) of transmembrane regions. This allows one both to discard highly hydrophobic regions not delimited by clear start and end configurations and to confirm putative transmembrane segments not distinguishable by their composition. The accuracy obtained on a test set of 101 non-homologous...
We provide the first high-throughput analysis of the properties and functional role of Low Complexity Regions (LCRs) in more than 1500 prokaryotic and phage proteomes. We observe that, contrary to a widespread belief based on older and sparse data, LCRs actually have a significant, persistent and highly conserved presence in many diverse prokaryotes. Their specific amino acid content is linked to proteins with certain molecular functions, such as the binding of RNA, DNA, metal-ions and polysaccharides. In addition, LCRs have been...
The ability to monitor interactions between individuals over time can provide us with information on life histories, mating systems, and behavioural and ecological interactions with the environment. Tracking individuals has traditionally been a time- and often cost-intensive exercise, and certain types of animals are particularly hard to monitor. Here we use canonical discriminant analysis (CDA) to identify individual Mexican Ant-thrushes using data extracted with a semi-automated procedure from song recordings. We test CDA over time, using recordings...
The function annotation process in computational biology has increasingly shifted from the traditional characterization of individual biochemical roles of protein molecules to the system-wide detection of entire metabolic pathways and genomic structures. The so-called genome-aware methods broaden misannotation inconsistencies in genome sequences beyond function assignments, encompassing phylogenetic anomalies and artifactual genomic regions. We outline three categories of error propagation in databases by providing striking examples -...
To assess the role of core metabolism genes in bacterial virulence - independently of their effect on growth - we correlated the genome, the transcriptome and the pathogenicity in flies and mice of 30 fully sequenced Pseudomonas strains. Gene presence correlates robustly with pathogenicity differences among all Pseudomonas species, but not among the P. aeruginosa strains. However, gene expression differences are evident between highly and lowly pathogenic P. aeruginosa strains in multiple virulence factors and a few metabolism genes. Moreover, 16.5%, a noticeable fraction of the core metabolism genes of strain PA14 (compared to 8.5%...
Genome-wide functional annotation either by manual or automatic means has raised considerable concerns regarding the accuracy of assignments and the reproducibility of methodologies. In addition, a performance evaluation of automated systems that attempt to tackle sequence analyses rapidly and reproducibly is generally missing. In order to quantify the accuracy of function assignments on a genome-wide scale, we have re-annotated the entire genome of Chlamydia trachomatis (serovar D), in a collaborative manner. We encoded all annotations in a structured...
Several selective macroautophagy receptor and adaptor proteins bind members of the Atg8 (autophagy related 8) family using short linear motifs (SLiMs), most often referred to as Atg8-family interacting motifs (AIMs) or LC3-interacting regions (LIRs). AIM/LIR motifs have been extensively studied during the last fifteen years, since they can uncover the underlying biological mechanisms and possible substrates for this key catabolic process of eukaryotic cells. Prompted by the fact that experimental information regarding LIR...
Protein Secondary Structure Prediction (PSSP) is regarded as a challenging task in bioinformatics, and numerous approaches to achieve more accurate prediction have been proposed. Accurate PSSP can be instrumental in inferring protein tertiary structure and function. Machine Learning, and in particular Deep Learning, approaches show promising results for the PSSP problem. In this paper, we deploy a Convolutional Neural Network (CNN) trained with the Subsampled Hessian Newton (SHN) method (a Hessian Free Optimisation variant), two-...
Motivation: Local compositionally biased and low complexity regions (LCRs) in amino acid sequences initially attracted the interest of researchers due to their implication in generating artifacts in sequence database searches. There is accumulating evidence of the biological significance of LCRs in both physiological and pathological situations. Nonetheless, LCR-related algorithms and tools have not gained wide appreciation across the research community, partly due to the fact that only a handful of user-friendly software...
Intrinsically disordered proteins (IDPs) contain regions lacking intrinsic globular structure (intrinsically disordered regions, IDRs). IDPs are present across the tree of life, with great variability of IDR type and frequency even between closely related taxa. To investigate the function of IDRs, we evaluated and compared the distribution of disorder content in 10,695 reference proteomes, confirming its high variability and finding a certain correlation along the Euteleostomi (bony vertebrates) lineage to the number of cell types. We used the comparison...
Intrinsically disordered regions (IDRs) in protein sequences are flexible, have low structural constraints and as a result have faster rates of evolution. This lack of evolutionary conservation greatly limits the use of sequence homology for the classification and functional assessment of IDRs, as opposed to globular domains. The study of IDRs requires other properties for their classification and functional prediction. While composition bias is not a necessary property of IDRs, compositionally biased regions (CBRs) have been noted as a frequent part of IDRs. We hypothesized that...
The vast cell-surface receptor family of G-protein coupled receptors (GPCRs) is the focus of both academic and pharmaceutical research due to their key role in cell physiology along with their amenability to drug intervention. As the data flow rate from the various genome and proteome projects continues to grow, so does the need for fast, automated and reliable screening for new members of the GPCR families. PRED-GPCR is a free Internet service for GPCR recognition and classification at the family level. A submitted sequence, or set of sequences, is queried against...
The iterative process of finding relevant information in biomedical literature and performing bioinformatics analyses might result in an endless loop for an inexperienced user, considering the exponential growth of scientific corpora and the plethora of tools designed to mine PubMed(®) and related biological databases. Herein, we describe BioTextQuest(+), a web-based interactive knowledge exploration platform with significant advances over its predecessor (BioTextQuest), aiming to bridge processes such as bioentity...
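Several of the abstracts above concern the detection and masking of low-complexity regions (LCRs) in protein sequences. As a rough illustration of the general idea only, the sketch below masks windows whose residue composition has low Shannon entropy. It is not the Smith-Waterman/homopolymer algorithm described above, nor the method of any tool listed here; the function names, the window length of 12, and the entropy threshold of 2.2 bits are assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq: str, window: int = 12, threshold: float = 2.2) -> str:
    """Replace every residue covered by a low-entropy window with 'X'.

    `window` and `threshold` are illustrative assumptions, not values
    taken from any of the publications listed above.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    masked = list(seq)
    # Slide a fixed-length window along the sequence; sequences shorter
    # than the window are returned unchanged.
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) <= threshold:
            for i in range(start, start + window):
                masked[i] = "X"
    return "".join(masked)


if __name__ == "__main__":
    # Hypothetical sequence: a diverse flank, a poly-Q run, a diverse flank.
    example = "ACDEFGHIKLMN" + "Q" * 12 + "ACDEFGHIKLMN"
    print(mask_low_complexity(example))
```

Because masking is applied per window, residues flanking a strongly biased run can also be masked when the window overlapping them still falls below the threshold. A composition-only score like this masks all residue types in a window, unlike the selective single-residue-type masking described in the abstract above.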