- Data Stream Mining Techniques
- Evolution and Paleontology Studies
- Machine Learning and Data Classification
- Time Series Analysis and Forecasting
- Species Distribution and Climate Change
- Pleistocene-Era Hominins and Archaeology
- Wildlife Ecology and Conservation
- Advanced Bandit Algorithms Research
- Anomaly Detection Techniques and Applications
- Human Mobility and Location-Based Analysis
- Fault Detection and Control Systems
- Evolutionary Game Theory and Cooperation
- Bat Biology and Ecology Studies
- Data Mining Algorithms and Applications
- Explainable Artificial Intelligence (XAI)
- Remote Sensing in Agriculture
- Machine Learning and Algorithms
- Advanced Radiotherapy Techniques
- Ethics and Social Impacts of AI
- Imbalanced Data Classification Techniques
- Primate Behavior and Ecology
- Transportation Planning and Optimization
- Traffic Prediction and Management Techniques
- Geology and Paleoclimatology Research
- Evolution and Genetic Dynamics
University of Helsinki
2015-2024
Finnish Museum of Natural History
2018-2024
Aalto University
2012-2017
Helsinki Institute for Information Technology
2013-2016
University of Technology
2014
Bournemouth University
2011-2013
Eindhoven University of Technology
2009-2011
Tamedia (Switzerland)
2011
Vilnius University
2005-2010
In learning to classify streaming data, obtaining the true labels may require major effort and may incur excessive cost. Active learning focuses on carefully selecting as few labeled instances as possible for learning an accurate predictive model. Streaming data poses additional challenges for active learning, since the data distribution may change over time (concept drift) and models need to adapt. Conventional active learning strategies concentrate on querying the most uncertain instances, which are typically concentrated around the decision boundary. Changes...
Every day, huge volumes of sensory, transactional, and web data are continuously generated as streams, which need to be analyzed online as they arrive. Streaming data can be considered one of the main sources of what is called big data. While predictive modeling for data streams has received a lot of attention over the last decade, many research approaches are typically designed for well-behaved, controlled problem settings, overlooking important challenges imposed by real-world applications. This article presents a discussion on...
Concept drift refers to a non-stationary learning problem over time. The training data and the application data often mismatch in real-life problems. In this report we present the context of the concept drift problem. We focus on the issues relevant to adaptive training set formation. We present the framework and terminology, and formulate a global picture of concept drift learners' design. We start with formalizing the framework for concept-drifting data. In Section 2 we discuss the adaptivity mechanisms of concept drift learners. In Section 3 we overview the principal mechanisms of concept drift learners. In this chapter we give a general picture of the available algorithms and categorize them based on their...
Although most business processes change over time, contemporary process mining techniques tend to analyze these processes as if they are in a steady state. Processes may change suddenly or gradually. The drift may be periodic (e.g., because of seasonal influences) or one-of-a-kind (e.g., the effects of new legislation). For process management, it is crucial to discover and understand such concept drifts in processes. This paper presents a generic framework and specific techniques to detect when a process changes and to localize the parts of the process that have changed. Different features...
Our study revisits the problem of the accuracy-fairness tradeoff in binary classification. We argue that comparison of non-discriminatory classifiers needs to account for different rates of positive predictions; otherwise conclusions about performance may be misleading, because the accuracy and discrimination of naive baselines on the same dataset vary with different rates of positive predictions. We provide methodological recommendations for sound comparison of non-discriminatory classifiers, and present a brief theoretical and empirical analysis of tradeoffs between accuracy and non-discrimination.
The global distribution of vegetation is largely determined by climatic conditions and feeds back into the climate system. To predict future vegetation changes in response to climate change, it is crucial to identify and understand the key patterns and processes that couple vegetation and climate. Dynamic global vegetation models (DGVMs) have been widely applied to describe the distribution of vegetation types and their future dynamics in response to climate change. As a process-based approach, it partly relies on hard-coded climate thresholds to constrain the distribution of vegetation. What thresholds to implement in DGVMs and how to replace them with more process-based descriptions remain among...
Learning from evolving streaming data has become a 'hot' research topic in the last decade, and many adaptive learning algorithms have been developed. This research was stimulated by rapidly growing amounts of industrial, transactional, sensor, and other business data that arrives in real time and needs to be mined in real time. Under such circumstances, constant manual adjustment of models is inefficient and, with increasing amounts of data, is becoming infeasible. Nevertheless, adaptive learning models are still rarely employed in business applications in practice. In the light of rapidly growing structurally...
We present the Eco-ISEA3H database, a compilation of global spatial data characterizing climate, geology, land cover, physical and human geography, and the geographic ranges of nearly 900 large mammalian species. The data are tailored for machine learning (ML)-based ecological modeling, and are intended primarily for continental- to global-scale ecometric and species distribution modeling. Such models are trained on present-day data and applied to the geologic past, or to future scenarios of climatic and environmental change. Model training...
Handling changes over time in supervised learning (concept drift) has lately received a great deal of attention, and a number of adaptive learning strategies have been developed. Most of them make an optimistic assumption that the new labels become available immediately. In real sequential classification tasks this is often unrealistic due to task-specific delayed labeling or associated labeling costs. We address the problem of change detectability, given that the new labels are not available. In this analytical study we look at space from...
Many supervised learning approaches that adapt to changes in data distribution over time (e.g., concept drift) have been developed. The majority of them assume that the data comes already preprocessed or that preprocessing is an integral part of a learning algorithm. In real-application tasks, data that comes from, e.g., sensor readings is typically noisy and contains missing values and redundant features, and a very large part of model development efforts is devoted to data preprocessing. As data is evolving over time, learning models need to be able to adapt to changes automatically. From a practical...