- Multimodal Machine Learning Applications
- Human Pose and Action Recognition
- Generative Adversarial Networks and Image Synthesis
- Advanced Image and Video Retrieval Techniques
- Domain Adaptation and Few-Shot Learning
- Video Analysis and Summarization
- Video Surveillance and Tracking Methods
- Advanced Neural Network Applications
- Face Recognition and Analysis
- Evacuation and Crowd Dynamics
- Image Retrieval and Classification Techniques
- Human Motion and Animation
- Anomaly Detection Techniques and Applications
- Data Visualization and Analytics
- Quantum-Dot Cellular Automata
- Music and Audio Processing
- Image Enhancement Techniques
- Slime Mold and Myxomycetes Research
- Traffic Control and Management
- Advanced Image Processing Techniques
- Gait Recognition and Analysis
- Humor Studies and Applications
- Advanced Memory and Neural Computing
- Cell Image Analysis Techniques
- Visual Attention and Saliency Detection
École Polytechnique
2021-2024
Laboratoire d'Informatique de l'École Polytechnique
2021-2024
Centre National de la Recherche Scientifique
2021-2024
University of Oxford
2019-2021
Oxford Research Group
2020
Democritus University of Thrace
2010-2019
University of Edinburgh
2015-2017
Université Grenoble Alpes
2015-2017
Laboratoire Jean Kuntzmann
2015
Institut national de recherche en informatique et en automatique
2015
Current state-of-the-art approaches for spatio-temporal action localization rely on detections at the frame level that are then linked or tracked across time. In this paper, we leverage the temporal continuity of videos instead of operating at the frame level. We propose the ACtion Tubelet detector (ACT-detector), which takes as input a sequence of frames and outputs tubelets, i.e., sequences of bounding boxes with associated scores. In the same way that state-of-the-art object detectors rely on anchor boxes, our ACT-detector is based on anchor cuboids. We build upon SSD...
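The tubelet abstraction above can be sketched in a few lines: an anchor cuboid (one box replicated over the frames) is regressed into a per-frame box sequence and scored by averaging per-frame action scores. The data layout, the delta encoding, and all names below are illustrative assumptions, not the actual ACT-detector implementation:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Tubelet:
    """A sequence of per-frame boxes (x1, y1, x2, y2) with one action score."""
    boxes: List[Tuple[float, float, float, float]]
    score: float

def score_tubelet(anchor_box, frame_deltas, frame_scores):
    """Turn an anchor cuboid into a tubelet by applying per-frame
    regression deltas, and score it as the mean of per-frame scores."""
    x1, y1, x2, y2 = anchor_box
    boxes = [(x1 + dx1, y1 + dy1, x2 + dx2, y2 + dy2)
             for dx1, dy1, dx2, dy2 in frame_deltas]
    return Tubelet(boxes=boxes, score=sum(frame_scores) / len(frame_scores))
```

For instance, a two-frame cuboid anchored at `(0, 0, 10, 10)` with small per-frame shifts yields two shifted boxes sharing one averaged score.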
While most existing approaches for detection in videos focus on objects or human actions separately, we aim at jointly detecting objects performing actions, such as a cat eating or a dog jumping. We introduce an end-to-end multitask objective that learns object-action relationships. We compare it with different training objectives, validate its effectiveness for detecting objects and actions in videos, and show that both tasks of object and action detection benefit from this joint learning. Moreover, the proposed architecture can be used for zero-shot...
In this article, the problem of real-time robot exploration and map building (active SLAM) is considered. A single stereo vision camera is exploited by a fully autonomous robot to navigate, localize itself, map its surroundings, and avoid any possible obstacle, with the aim of maximizing the mapped region while following an optimal route. A modified version of the so-called cognitive-based adaptive optimization algorithm is introduced to successfully complete these tasks in real time without local minima entrapment. The method's effectiveness...
Object detection is one of the most important challenges in computer vision. Object detectors are usually trained on bounding-boxes from still images. Recently, video has been used as an alternative source of data. Yet, for a given test domain (image or video), the performance of the detector depends on the domain it was trained on. In this paper, we examine the reasons behind this performance gap. We define and evaluate different domain shift factors: spatial location accuracy, appearance diversity, image quality and aspect distribution. We study the impact of these factors by...
Capturing the 'mutual gaze' of people is essential for understanding and interpreting the social interactions between them. To this end, this paper addresses the problem of detecting people Looking At Each Other (LAEO) in video sequences. For this purpose, we propose LAEO-Net, a new deep CNN for determining LAEO in videos. In contrast to previous works, LAEO-Net takes spatio-temporal tracks as input and reasons about the whole track. It consists of three branches, one for each character's tracked head and one for their relative position. Moreover,...
Reinforcement Learning is an area of Machine Learning focused on how agents can be trained to make sequential decisions and achieve a particular goal within an arbitrary environment. While learning, they repeatedly take actions based on their observation of the environment and receive appropriate rewards, which define the objective. This experience is then used to progressively improve the policy controlling the agent's behavior, typically represented by a neural network. This trained module can be reused for similar problems, which makes this...
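The act-observe-reward loop described above can be illustrated with a minimal tabular Q-learning sketch on a toy chain environment, where a lookup table stands in for the neural network. The environment, hyperparameters, and all names are assumptions chosen for illustration only:

```python
import random

def train_q_learning(n_states=5, episodes=300, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    """Tabular Q-learning on a toy chain: action 1 moves right, action 0 moves
    left; taking action 1 in the last state yields reward 1 and ends the episode."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        s = rng.randrange(n_states)  # exploring starts: random initial state
        for _ in range(50):          # step limit per episode
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] >= q[s][1] else 1
            if a == 1 and s == n_states - 1:
                r, s2, done = 1.0, s, True              # goal reached
            else:
                r = 0.0
                s2 = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
                done = False
            # temporal-difference update toward the bootstrapped target
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q
```

After training, the greedy policy (argmax over each row of `q`) moves right everywhere, i.e. the learned table encodes the behavior that maximizes cumulative reward.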
Adapting a segmentation model from a labeled source domain to a target domain, where a single unlabeled datum is available, is one of the most challenging problems in domain adaptation and is otherwise known as one-shot unsupervised domain adaptation (OSUDA). Most prior works have addressed the problem by relying on style transfer techniques, where the source images are stylized to have the appearance of the target domain. Departing from the common notion of transferring only the "texture" information, we leverage text-to-image diffusion models (e.g., Stable Diffusion) to generate synthetic...
Automatically understanding funny moments (i.e., the moments that make people laugh) when watching comedy is challenging, as they relate to various features, such as body language, dialogues and culture. In this paper, we propose FunnyNet-W, a model that relies on cross- and self-attention for visual, audio and text data to predict funny moments in videos. Unlike most methods that rely on ground truth in the form of subtitles, in this work we exploit modalities that come naturally with videos: (a) video frames, as they contain visual information indispensable...
In all living organisms, self-preservation behaviour is almost universal. Even the most simple of organisms, like slime mould, are typically under intense selective pressure to evolve a response that ensures their evolution and safety in the best possible way. On the other hand, the evacuation of a place can easily be characterized as one of the most stressful situations for the individuals taking part in it. Taking inspiration from slime mould behaviour, we introduce a computational bio-inspired crowd evacuation model. Cellular Automata (CA) were...
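A minimal CA-style evacuation sketch, assuming a static floor field (BFS distance to the exits) and a greedy update in which each pedestrian steps to the free neighbour closest to an exit. This is an illustrative toy under those assumptions, not the bio-inspired model proposed in the paper:

```python
from collections import deque

def distance_field(grid, exits):
    """BFS distance from every free cell to the nearest exit (4-neighbourhood).
    grid: 2D list where 0 = free cell, 1 = wall."""
    h, w = len(grid), len(grid[0])
    dist = [[float("inf")] * w for _ in range(h)]
    dq = deque()
    for r, c in exits:
        dist[r][c] = 0
        dq.append((r, c))
    while dq:
        r, c = dq.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and grid[nr][nc] == 0 \
               and dist[nr][nc] > dist[r][c] + 1:
                dist[nr][nc] = dist[r][c] + 1
                dq.append((nr, nc))
    return dist

def step(pedestrians, dist, exits):
    """One CA update: pedestrians (processed nearest-exit first) move to the
    unoccupied neighbour with the smallest exit distance; pedestrians standing
    on an exit leave the grid."""
    occupied = set(pedestrians)
    for r, c in sorted(pedestrians, key=lambda p: dist[p[0]][p[1]]):
        occupied.discard((r, c))
        if (r, c) in exits:
            continue  # evacuated
        best = (r, c)
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < len(dist) and 0 <= nc < len(dist[0]) \
               and (nr, nc) not in occupied \
               and dist[nr][nc] < dist[best[0]][best[1]]:
                best = (nr, nc)
        occupied.add(best)
    return occupied
```

On a 1x5 corridor with an exit at the right end, three pedestrians queued at the left file out one per step until the grid is empty.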
Gait recognition systems typically rely solely on silhouettes for extracting gait signatures. Nevertheless, these approaches struggle with changes in body shape and dynamic backgrounds; a problem that can be alleviated by learning from multiple modalities. However, in many real-life systems some modalities can be missing, and therefore most existing multimodal frameworks fail to cope with missing modalities. To tackle this problem, in this work we propose UGaitNet, a unifying framework for gait recognition that is robust to missing modalities. UGaitNet handles and mingles various...
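One simple way to make a multimodal pipeline tolerate missing modalities, in the spirit described above, is to fuse only the embeddings that are actually present. The `fuse_modalities` helper and the averaging rule below are illustrative assumptions, not the actual UGaitNet mingling mechanism:

```python
import numpy as np

def fuse_modalities(embeddings):
    """Fuse per-modality embedding vectors into one gait signature,
    skipping modalities marked as missing (None). Averaging the available
    embeddings keeps the output dimensionality fixed regardless of which
    modalities are present."""
    available = [e for e in embeddings.values() if e is not None]
    if not available:
        raise ValueError("at least one modality is required")
    return np.mean(available, axis=0)
```

With silhouette and optical-flow embeddings present but depth missing, the signature is simply the mean of the two available vectors.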
Image style transfer has attracted widespread attention in the past few years. Despite its remarkable results, it requires additional style images available as references, making it less flexible and inconvenient. Using text is the most natural way to describe a style. More importantly, text can describe implicit abstract styles, like the styles of specific artists or art movements. In this paper, we propose a text-driven image style transfer (TxST) method that leverages advanced image-text encoders to control arbitrary style transfer. We introduce...
Quantum-dot fabrication and characterization is a well-established technology, which is used in photonics, quantum optics, and nanoelectronics. Four quantum-dots placed at the corners of a square form a unit cell, which can hold a bit of information and serve as the basis for quantum-dot cellular automata (QCA) nanoelectronic circuits. Although several basic QCA circuits have been designed, fabricated, and tested, proving that QCA can deliver functional, fast and low-power circuits, QCA nanoelectronics still remains in its infancy. One of the reasons for this...
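The logic behaviour of the basic QCA circuits mentioned above is conventionally built from the three-input majority gate: fixing one input to a constant 0 or 1 reduces it to AND or OR. A behavioural sketch of that truth-table logic (plain Boolean functions, not a physical cell simulation):

```python
def majority(a, b, c):
    """Three-input majority vote: 1 iff at least two inputs are 1.
    This is the fundamental logic gate of QCA circuit design."""
    return (a & b) | (b & c) | (a & c)

def qca_and(a, b):
    """Fixing one majority input to 0 yields a two-input AND."""
    return majority(a, b, 0)

def qca_or(a, b):
    """Fixing one majority input to 1 yields a two-input OR."""
    return majority(a, b, 1)
```

Together with an inverter, the majority gate is functionally complete, which is why it anchors QCA circuit design.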
Video Object Segmentation (VOS) is crucial for several applications, from video editing to video data generation. Training a VOS model requires an abundance of manually labeled training videos. The de-facto traditional way of annotating objects requires humans to draw detailed segmentation masks on the target objects at each frame. This annotation process, however, is tedious and time-consuming. To reduce this cost, in this paper, we propose EVA-VOS, a human-in-the-loop annotation framework for video object segmentation. Unlike the traditional approach, we introduce...
The objective of this work is person-clustering in videos – grouping characters according to their identity. Previous methods focus on the narrower task of face-clustering, and for the most part ignore other cues such as the person's voice, their overall appearance (hair, clothes, posture), and the editing structure of the videos. Similarly, current datasets evaluate only face-clustering, rather than person-clustering. This limits their applicability to downstream applications such as story understanding, which require person-level, rather than face-level, reasoning. In...
Modern works on style transfer focus on transferring the style from a single image. Recently, some approaches study multiple style transfer; these, however, are either too slow or fail to mix multiple styles. We propose ST-VAE, a Variational AutoEncoder for latent space-based style transfer. It performs multiple style transfer by projecting nonlinear styles onto a linear latent space, enabling us to merge styles via linear interpolation before transferring the new style to the content image. To evaluate ST-VAE, we experiment on COCO and also present a case study revealing that ST-VAE outperforms other methods while being faster,...
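The linear-latent-space idea above can be illustrated by merging style codes with a convex combination before decoding. `mix_styles` is a hypothetical helper operating on plain vectors, not the ST-VAE API:

```python
import numpy as np

def mix_styles(latents, weights):
    """Merge style codes by convex combination in a linear latent space.
    Weights are normalized to sum to 1, so the result stays an
    interpolation of the input codes."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * zi for wi, zi in zip(w, np.asarray(latents)))
```

Mixing two style codes with equal weights lands exactly halfway between them, which is what makes interpolation-based style blending well defined in such a space.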
Image style transfer has attracted widespread attention in the past years. Despite its remarkable results, it requires additional style images available as references, making it less flexible and inconvenient. Using text is the most natural way to describe a style. Text can describe implicit abstract styles, like the styles of specific artists or art movements. In this work, we propose a text-driven style transfer (TxST) method that leverages advanced image-text encoders to control arbitrary style transfer. We introduce a contrastive training strategy...
Capturing the 'mutual gaze' of people is essential for understanding and interpreting the social interactions between them. To this end, this paper addresses the problem of detecting people Looking At Each Other (LAEO) in video sequences. For this purpose, we propose LAEO-Net++, a new deep CNN for determining LAEO in videos. In contrast to previous works, LAEO-Net++ takes spatio-temporal tracks as input and reasons about the whole...