NFDI4DS | UHH-SEMS - Publication Details

A Benchmark of Data Stream Classification for Human Activity Recognition on Connected Objects

OPENALEX - Publications

Martin Khannouz, Tristan Glatard

This paper evaluates data stream classifiers from the perspective of connected devices, focusing on the use case of Human Activity Recognition. We measure both the classification performance and the resource consumption (runtime, memory, and power) of five usual algorithms, implemented in a consistent library, and applied to two real human activity datasets and three synthetic datasets. Regarding performance, the results show the overall superiority of the Hoeffding Tree, the Mondrian forest, and Naïve Bayes over the Feedforward Neural Network...

10.3390/s20226486 article EN cc-by Sensors 2020-11-13

OrpailleCC: a Library for Data Stream Analysis on Embedded Systems

Martin Khannouz, Bo Li, Tristan Glatard

The Internet of Things could benefit in several ways from mining data streams on connected objects rather than in the cloud. In particular, limiting network communication with cloud services would improve user privacy and reduce the energy consumption of devices. Besides, applications could leverage the computing power of connected objects for improved scalability.

10.21105/joss.01485 article EN cc-by The Journal of Open Source Software 2019-07-25

Mondrian forest for data stream classification under memory constraints

Martin Khannouz, Tristan Glatard

10.1007/s10618-023-00970-4 article EN Data Mining and Knowledge Discovery 2023-10-17

Reducing numerical precision preserves classification accuracy in Mondrian Forests

Marc Vicuna, Martin Khannouz, Gregory Kiar, Yohan Chatelain, Tristan Glatard

Mondrian Forests are a powerful data stream classification method, but their large memory footprint makes them ill-suited for low-resource platforms such as connected objects. We explored using reduced-precision floating-point representations to lower memory consumption and evaluated its effect on classification performance. We applied the Mondrian Forest implementation provided by OrpailleCC, a C++ collection of data stream algorithms, to two canonical datasets in human activity recognition: Recofit and Banos et al. Results show that precision...

10.1109/bigdata52589.2021.9671377 article EN 2021 IEEE International Conference on Big Data (Big Data) 2021-12-15

A benchmark of data stream classification for human activity recognition on connected objects

Martin Khannouz, Tristan Glatard

This paper evaluates data stream classifiers from the perspective of connected devices, focusing on the use case of Human Activity Recognition (HAR). We measure both the classification performance and the resource consumption (runtime, memory, and power) of five usual algorithms, implemented in a consistent library, and applied to two real human activity datasets and three synthetic datasets. Regarding performance, the results show an overall superiority of the Hoeffding Tree (HT), the Mondrian forest (MF), and Naïve Bayes (NB) over the Feedforward Neural Network (FNN) and the Micro Cluster Nearest Neighbor (MCNN) on 4 datasets out of 6, including the real ones. In addition, some...

10.48550/arxiv.2008.11880 preprint EN cc-by arXiv (Cornell University) 2020-01-01

Mondrian Forest for Data Stream Classification Under Memory Constraints

Martin Khannouz, Tristan Glatard

Supervised learning algorithms generally assume the availability of enough memory to store their data model during the training and test phases. However, in the Internet of Things, this assumption is unrealistic when data comes in the form of infinite data streams, or when learning algorithms are deployed on devices with reduced amounts of memory. In this paper, we adapt the online Mondrian forest classification algorithm to work with memory constraints on data streams. In particular, we design five out-of-memory strategies to update Mondrian trees with new data points when the memory limit is reached. Moreover, trimming...

10.48550/arxiv.2205.07871 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Dynamic Ensemble Size Adjustment for Memory Constrained Mondrian Forest

Martin Khannouz, Tristan Glatard

Supervised learning algorithms generally assume the availability of enough memory to store data models during the training and test phases. However, this assumption is unrealistic when data comes in the form of infinite data streams, or when learning algorithms are deployed on devices with reduced amounts of memory. Such memory constraints impact the model behavior and assumptions. In this paper, we show that under memory constraints, increasing the size of a tree-based ensemble classifier can worsen its performance. In particular, we experimentally show the existence of an optimal ensemble size for...

10.48550/arxiv.2210.05704 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Dynamic Ensemble Size Adjustment for Memory Constrained Mondrian Forest

Martin Khannouz, Tristan Glatard

Supervised learning algorithms generally assume the availability of enough memory to store data models during the training and test phases. However, this assumption is unrealistic when data comes in the form of infinite data streams, or when learning algorithms are deployed on devices with reduced amounts of memory. Such memory constraints impact the model behavior and assumptions. In this paper, we show that under memory constraints, increasing the size of a tree-based ensemble classifier can worsen its performance. In particular, we experimentally show the existence of an optimal ensemble size for...

10.1109/bigdata55660.2022.10020511 article EN 2022 IEEE International Conference on Big Data (Big Data) 2022-12-17

Reducing numerical precision preserves classification accuracy in Mondrian Forests

Marc Vicuna, Martin Khannouz, Gregory Kiar, Yohan Chatelain, Tristan Glatard

Mondrian Forests are a powerful data stream classification method, but their large memory footprint makes them ill-suited for low-resource platforms such as connected objects. We explored using reduced-precision floating-point representations to lower memory consumption and evaluated its effect on classification performance. We applied the Mondrian Forest implementation provided by OrpailleCC, a C++ collection of data stream algorithms, to two canonical datasets in human activity recognition: Recofit and Banos et al. Results show that...

10.48550/arxiv.2106.14340 preprint EN cc-by arXiv (Cornell University) 2021-01-01