- Biomedical Text Mining and Ontologies
- Semantic Web and Ontologies
- Machine Learning and Data Classification
- Data Mining Algorithms and Applications
- Data Stream Mining Techniques
- Text and Document Classification Technologies
- Scientific Computing and Data Management
- Anomaly Detection Techniques and Applications
- Spacecraft Design and Technology
- Fault Detection and Control Systems
- Spacecraft and Cryogenic Technologies
- Time Series Analysis and Forecasting
- Rocket and Propulsion Systems Research
- Neural Networks and Applications
- Imbalanced Data Classification Techniques
- Machine Learning in Bioinformatics
- Gene Expression and Cancer Classification
- Research Data Management Practices
- Advanced Database Systems and Queries
- Rough Sets and Fuzzy Logic
- Statistical and Computational Modeling
- Machine Learning and Algorithms
- Face and Expression Recognition
- Metaheuristic Optimization Algorithms Research
- Remote Sensing and LiDAR Applications
Jožef Stefan Institute
2016-2025
Jožef Stefan International Postgraduate School
2019-2021
John Snow (United States)
2020
Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand...
Motivated by the need for unification of the field of data mining and the growing demand for formalized representation of the outcomes of research, we address the task of constructing an ontology of data mining. The proposed ontology, named OntoDM, is based on a recent proposal of a general framework for data mining, and includes definitions of basic data mining entities, such as datatype and dataset, data mining task, data mining algorithm and components thereof (e.g., distance function), etc. It also allows for the definition of more complex entities, e.g., constraints in constraint-based data mining, sets of constraints (inductive queries)...
Multi-label classification (MLC) tasks are encountered more and more frequently in machine learning applications. While MLC methods exist for the classical batch setting, only a few methods are available for the streaming setting. In this paper, we propose a new methodology for MLC via multi-target regression in a streaming setting. Moreover, we develop a streaming multi-target regressor, iSOUP-Tree, that uses this approach. We experimentally compare two variants of the iSOUP-Tree method (building regression and model trees), as well as ensembles of iSOUP-Trees, with state-of-the-art tree and ensemble methods for MLC on data streams....
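The idea in the abstract above, solving multi-label classification through multi-target regression on a stream, can be illustrated with a minimal sketch. This is not iSOUP-Tree: the `LeafMeanRegressor` class, the `LABELS` list, the routing `key`, and the 0.5 threshold are all assumptions made for illustration. Each label set is encoded as a 0/1 target vector, a toy incremental regressor predicts the running mean of each target, and the predictions are thresholded back into a label set.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set, Tuple

LABELS = ["sports", "politics", "tech"]  # hypothetical label space


def to_targets(labels: Set[str]) -> List[float]:
    """Encode a label set as a 0/1 target vector (one numeric target per label)."""
    return [1.0 if name in labels else 0.0 for name in LABELS]


def to_labels(targets: List[float], threshold: float = 0.5) -> Set[str]:
    """Decode predicted targets back into a label set by thresholding."""
    return {name for name, t in zip(LABELS, targets) if t >= threshold}


class LeafMeanRegressor:
    """Stand-in incremental multi-target regressor (an assumption, not iSOUP-Tree).

    Examples are routed by a single categorical key, like a fixed one-level
    tree, and each leaf predicts the running mean of every target.
    """

    def __init__(self, n_targets: int) -> None:
        self.n_targets = n_targets
        self.counts: Dict[Hashable, int] = defaultdict(int)
        self.means: Dict[Hashable, List[float]] = {}

    def predict_one(self, key: Hashable) -> List[float]:
        # An unseen leaf predicts all zeros, i.e. the empty label set.
        return list(self.means.get(key, [0.0] * self.n_targets))

    def learn_one(self, key: Hashable, y: List[float]) -> None:
        # Incremental mean update: no past examples are stored.
        self.counts[key] += 1
        n = self.counts[key]
        mean = self.means.setdefault(key, [0.0] * self.n_targets)
        for j, value in enumerate(y):
            mean[j] += (value - mean[j]) / n


def run_stream(
    stream: Iterable[Tuple[Hashable, Set[str]]]
) -> Tuple[LeafMeanRegressor, List[Set[str]]]:
    """Test-then-train loop: predict each example first, then learn from it."""
    model = LeafMeanRegressor(len(LABELS))
    predictions: List[Set[str]] = []
    for key, labels in stream:
        predictions.append(to_labels(model.predict_one(key)))
        model.learn_one(key, to_targets(labels))
    return model, predictions
```

Calling `run_stream` on an iterable of `(key, label_set)` pairs returns the trained model and the label set predicted for each example before it was learned. Any incremental multi-target regressor could replace the stand-in class, since the encoding and thresholding steps do not depend on it.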
We present OntoDT, a generic ontology for the representation of scientific knowledge about datatypes. OntoDT defines basic entities, such as datatype, properties of datatypes, specifications, characterizing operations, and a datatype taxonomy. We demonstrate the utility of OntoDT on several use cases. OntoDT was used within an Ontology of core data mining entities for constructing taxonomies of datasets, data mining tasks, generalizations and data mining algorithms. Furthermore, we show how OntoDT can be used to annotate and query dataset repositories. We also show how it can improve...
New microbial genomes are sequenced at a high pace, allowing insight into the genetics of not only cultured microbes, but a wide range of metagenomic collections such as the human microbiome. To understand the deluge of genomic data we face, computational approaches for gene functional annotation are invaluable. We introduce a novel model for computational annotation that refines two established concepts: annotation based on homology and annotation based on phyletic profiling. The phyletic profiling-based model includes both inferred orthologs and paralogs, that is, homologs separated by speciation...
The ML-Schema, proposed by the W3C Machine Learning Schema Community Group, is a top-level ontology that provides a set of classes, properties, and restrictions for representing and interchanging information on machine learning algorithms, datasets, and experiments. It can be easily extended and specialized, and it is also mapped to other more domain-specific ontologies developed in the area of machine learning and data mining. In this paper we overview existing state-of-the-art machine learning interchange formats and present the first release of ML-Schema, a canonical format...
We propose AiTLAS, an open-source, state-of-the-art toolbox for exploratory and predictive analysis of satellite imagery. It implements a range of deep-learning architectures and models tailored for the EO tasks illustrated in this case. The versatility and applicability of the toolbox are showcased in a variety of EO tasks, including image scene classification, semantic segmentation, object detection, and crop type prediction. These use cases demonstrate the potential of the toolbox to support the complete data pipeline, starting from data preparation...
An essential characteristic of data streams is the possibility of occurrence of concept drift, i.e., change in the distribution of the data in the stream over time. The capability to detect and adapt to changes in data stream mining methods is thus a necessity. While methods for multi-target prediction on data streams have recently appeared, they have largely remained without such capability. In this paper, we propose novel methods for change detection and adaptation in the context of incremental online learning of decision trees for multi-target regression. One of the approaches is ensemble based, while the other uses...
We present six datasets containing telemetry data of the Mars Express Spacecraft (MEX), a spacecraft orbiting Mars operated by the European Space Agency. The data, consisting of context data and thermal power consumption measurements, capture the status of the spacecraft over three Martian years, sampled at different time resolutions that range from 1 min to 60 min. From a data analysis point of view, these data are challenging even for the more sophisticated state-of-the-art artificial intelligence methods. In particular, given the heterogeneity,...