Ronald Poppe

ORCID: 0000-0002-0843-7878
Research Areas
  • Human Pose and Action Recognition
  • Speech and dialogue systems
  • Hand Gesture Recognition Systems
  • Anomaly Detection Techniques and Applications
  • Video Surveillance and Tracking Methods
  • Advanced Vision and Imaging
  • Social Robot Interaction and HRI
  • Advanced Neural Network Applications
  • Innovative Human-Technology Interaction
  • Gait Recognition and Analysis
  • Multimodal Machine Learning Applications
  • Video Analysis and Summarization
  • Digital Games and Media
  • Human Motion and Animation
  • Advanced Image Processing Techniques
  • Deception detection and forensic psychology
  • Visual Attention and Saliency Detection
  • Advanced Image and Video Retrieval Techniques
  • Educational Games and Gamification
  • Domain Adaptation and Few-Shot Learning
  • Emotion and Mood Recognition
  • Action Observation and Synchronization
  • Interactive and Immersive Displays
  • Gaze Tracking and Assistive Technology
  • Psychopathy, Forensic Psychiatry, Sexual Offending

Netherlands Organisation for Applied Scientific Research
2025

Utrecht University
2015-2024

University of Twente
2006-2017

Arizona State University
2016

Lancaster University
2013

Carnegie Mellon University
2013

Microsoft Research (United Kingdom)
2013

University of Duisburg-Essen
2013

Human Media
2004-2006

10.1016/j.imavis.2009.11.014 article EN Image and Vision Computing 2009-12-12

10.1016/j.cviu.2006.10.016 article EN Computer Vision and Image Understanding 2007-01-26

Convolutional Neural Networks (CNNs) use pooling to decrease the size of activation maps. This process is crucial to increase the receptive fields and to reduce the computational requirements of subsequent convolutions. An important feature of the pooling operation is the minimization of information loss, with respect to the initial activation maps, without a significant impact on the computation and memory overhead. To meet these requirements, we propose SoftPool: a fast and efficient method for exponentially weighted activation downsampling. Through experiments across...

10.1109/iccv48922.2021.01019 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01
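
The exponentially weighted downsampling named in the SoftPool abstract above can be illustrated with a small sketch. It assumes softmax weights computed inside each non-overlapping k x k window of a single-channel map; the function name `soft_pool_2d` and the pure-Python list representation are hypothetical choices for illustration, not the authors' implementation.

```python
import math


def soft_pool_2d(activations, k=2):
    """Downsample a 2D list of floats with softmax-weighted k x k windows.

    Illustrative sketch only: windows do not overlap, and trailing rows or
    columns that do not fill a whole window are dropped.
    """
    rows, cols = len(activations), len(activations[0])
    pooled = []
    for r in range(0, rows - k + 1, k):
        out_row = []
        for c in range(0, cols - k + 1, k):
            window = [activations[r + i][c + j]
                      for i in range(k) for j in range(k)]
            # Subtract the window maximum before exponentiating so that
            # large activations do not overflow; the weights are unchanged.
            peak = max(window)
            weights = [math.exp(a - peak) for a in window]
            total = sum(weights)
            # Each output is the weighted sum of the window's activations.
            out_row.append(sum(w * a for w, a in zip(weights, window)) / total)
        pooled.append(out_row)
    return pooled


if __name__ == "__main__":
    feature_map = [[0.0, 1.0], [2.0, 3.0]]
    print(soft_pool_2d(feature_map))
```

With these weights, each output falls between the average and the maximum of its window, so larger activations dominate without the smaller ones being discarded entirely.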

Light field imaging presents an attractive alternative to RGB imaging because of the recording of the direction of the incoming light. The detection of salient regions in a light field image benefits from the additional modeling of angular patterns. For RGB imaging, methods using CNNs have achieved excellent results on a range of tasks, including saliency detection. However, it is not trivial to use CNN-based methods for light field images because these methods are not specifically designed for processing light field inputs. In addition, current light field datasets are not sufficiently large to train CNNs. To...

10.1109/tip.2020.2970529 article EN IEEE Transactions on Image Processing 2020-01-01

Greenness in the urban living environment is inconsistently associated with mental health. Satellite-derived measures of greenness may inadequately characterize how people encounter greenness visually on site, but systematic comparisons are lacking. We aimed 1) to compare associations between remotely sensed and street view (SV) greenness, and 2) to examine whether these greenness metrics are differently associated with mental health outcomes. We used cross-sectional depressive and anxiety symptoms data of adults in Amsterdam, the Netherlands. We employed a...

10.1016/j.landurbplan.2021.104181 article EN cc-by Landscape and Urban Planning 2021-07-06

Pooling layers are essential building blocks of convolutional neural networks (CNNs), to reduce computational overhead and increase the receptive fields of proceeding convolutional operations. Their goal is to produce downsampled volumes that closely resemble the input volume while, ideally, also being computationally and memory efficient. Meeting both of these requirements remains a challenge. To this end, we propose an adaptive and exponentially weighted pooling method: adaPool. Our method learns a regional-specific fusion...

10.1109/tip.2022.3227503 article EN cc-by-nc-nd IEEE Transactions on Image Processing 2022-12-12

10.1016/j.cviu.2019.102799 article EN Computer Vision and Image Understanding 2019-08-19

Few-shot instance segmentation methods are promising when labeled training data for novel classes is scarce. However, current approaches do not facilitate flexible addition of novel classes. They also require that examples of each class are provided at train and test time, which is memory intensive. In this paper, we address these limitations by presenting the first incremental approach to few-shot instance segmentation: iMTFA. We learn discriminative embeddings for object instances that are merged into class representatives. Storing...

10.1109/cvpr46437.2021.00124 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

In our daily life everything and everyone occupies an amount of space, simply by "being there". Edward Hall coined the term proxemics for the studies of man's use of this space. This paper presents a study on proxemics in Human-Robot Interaction and particularly on a robot's approaching of groups of people. As social psychology research found proxemics to be culturally dependent, we focus on the question of the appropriateness of the robot's approach behavior in different cultures. We present an online survey (N=181) that was distributed in three countries: China, the U.S....

10.1145/2631488.2631499 article EN 2014-08-20

Matching objects across partially overlapping camera views is crucial in multi-camera systems and requires a view-invariant feature extraction network. Training such a network with cycle-consistency circumvents the need for labor-intensive labeling. In this paper, we extend the mathematical formulation of cycle-consistency to handle partial overlap. We then introduce a pseudo-mask which directs the training loss to take partial overlap into account. We additionally present several new cycle variants that complement each other...

10.48550/arxiv.2501.06000 preprint EN arXiv (Cornell University) 2025-01-10

10.5220/0013080900003912 article EN Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications 2025-01-01

For an artifact such as a robot or a virtual agent to respond appropriately to human social touch behavior, it should be able to automatically detect and recognize touch. This paper describes the data collection of CoST: Corpus of Social Touch, a data set containing 7805 captures of 14 different social touch gestures. All touch gestures were performed in three variants: gentle, normal and rough, on a pressure sensor grid wrapped around a mannequin arm. Recognition of these 14 gesture classes using various classifiers yielded accuracies up to 60 %;...

10.1007/s12193-016-0232-9 article EN cc-by Journal on Multimodal User Interfaces 2016-10-21

Touch behavior is of great importance during social interaction. To transfer the tactile modality from interpersonal interaction to other areas such as Human-Robot Interaction (HRI) and remote communication, automatic recognition of social touch is necessary. This paper introduces CoST: Corpus of Social Touch, a collection containing 7805 instances of 14 different social touch gestures. The gestures were performed in three variations: gentle, normal and rough, on a sensor grid wrapped around a mannequin arm. Recognition of the rough...

10.1145/2663204.2663242 article EN 2014-11-12

In this work we employ multitask learning to capitalize on the structure that exists in related supervised tasks to train complex neural networks. It allows training a network for multiple objectives in parallel, in order to improve performance on at least one of them by capitalizing on a shared representation that is developed to accommodate more information than it otherwise would for a single task. We employ this idea to tackle action recognition in egocentric videos by introducing additional supervised tasks. We consider the verbs and nouns from which labels...

10.1109/iccvw.2019.00540 article EN 2019-10-01

Egocentric vision is an emerging field of computer vision that is characterized by the acquisition of images and video from the first person perspective. In this paper we address the challenge of egocentric human action recognition by utilizing the presence and position of detected regions of interest in the scene explicitly, without further use of visual features. Initially, we recognize that hands are essential in the execution of actions and focus on obtaining their movements as the principal cues that define actions. We employ object detection and region tracking...

10.1109/smartworld-uic-atc-scalcom-iop-sci.2019.00185 article EN 2019-08-01

We manually designed rules for a backchannel (BC) prediction model based on pitch and pause information. In short, the model predicts a BC when there is a pause of a certain length that is preceded by a falling or rising pitch. This model was validated against the Dutch IFADV Corpus in a corpus-based evaluation method. The results showed that our model performs slightly better than another well-known rule-based BC prediction model that uses only pitch information. We observed that the length of the pause preceding a BC is one of the important features in this model, next to the duration of the pitch slope at the end of an utterance. Further,...

10.21437/interspeech.2010-59 article EN Interspeech 2010 2010-09-26
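
The rule summarized in the backchannel abstract above (a pause of a certain length, preceded by falling or rising pitch) can be sketched as a single predicate. The function name, the parameter names, and both threshold values below are hypothetical placeholders chosen for illustration; the abstract does not state the actual thresholds or feature definitions used in the paper.

```python
def predict_backchannel(pause_s, pitch_slope_hz_per_s,
                        min_pause_s=0.4, min_abs_slope=10.0):
    """Return True when a backchannel would be predicted at this pause.

    pause_s: length of the current pause in seconds.
    pitch_slope_hz_per_s: slope of the pitch contour just before the pause;
        negative values mean falling pitch, positive values rising pitch.
    min_pause_s, min_abs_slope: assumed thresholds, not taken from the paper.
    """
    pause_long_enough = pause_s >= min_pause_s
    # Either a clear fall or a clear rise counts; a flat contour does not.
    pitch_is_moving = abs(pitch_slope_hz_per_s) >= min_abs_slope
    return pause_long_enough and pitch_is_moving


if __name__ == "__main__":
    print(predict_backchannel(0.5, -20.0))  # long pause after falling pitch
    print(predict_backchannel(0.5, 2.0))    # long pause after flat pitch
    print(predict_backchannel(0.1, 30.0))   # pause too short
```

Keeping the thresholds as parameters makes it straightforward to tune them against an annotated corpus, in the spirit of the corpus-based evaluation the abstract mentions.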