- Domain Adaptation and Few-Shot Learning
- Advanced Neural Network Applications
- Multimodal Machine Learning Applications
- Advanced Image and Video Retrieval Techniques
- Remote Sensing and LiDAR Applications
- Robotics and Sensor-Based Localization
- Video Surveillance and Tracking Methods
- COVID-19 Diagnosis Using AI
- Human Pose and Action Recognition
- Anomaly Detection Techniques and Applications
- Advanced Optical Sensing Technologies
- Animal Disease Management and Epidemiology
- Medical Image Segmentation Techniques
- Image Retrieval and Classification Techniques
- Handwritten Text Recognition Techniques
- Machine Learning and Algorithms
- Industrial Vision Systems and Defect Detection
- 3D Shape Modeling and Analysis
- Image Processing and 3D Reconstruction
- Underwater Acoustics Research
- AI in Cancer Detection
- Human Mobility and Location-Based Analysis
- Advanced Vision and Imaging
- Forecasting Techniques and Applications
- Image Enhancement Techniques
Valeo (France)
2019-2025
École nationale des ponts et chaussées
2015-2019
Google (United States)
2018
Laboratoire d'Informatique Gaspard-Monge
2015-2018
Université Gustave Eiffel
2015-2016
Over the last years, deep convolutional neural networks (ConvNets) have transformed the field of computer vision thanks to their unparalleled capacity to learn high level semantic image features. However, in order to successfully learn those features, they usually require massive amounts of manually labeled data, which is both expensive and impractical to scale. Therefore, unsupervised feature learning, i.e., learning without requiring manual annotation effort, is of crucial importance in order to harvest the vast amount of visual data...
The human visual system has the remarkable ability to effortlessly learn novel concepts from only a few examples. Mimicking the same behavior on machine learning vision systems is an interesting and very challenging research problem with many practical advantages in real world applications. In this context, the goal of our work is to devise a few-shot visual learning system that, during test time, will be able to efficiently learn novel categories from only a few training data while at the same time not forgetting the initial categories on which it was trained (here called base categories). To...
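A common building block in such systems is a cosine-similarity classifier whose novel-class weights are generated from the few support examples, so that base and novel classes are scored in one unified way. The following is a minimal, prototype-style sketch of that idea (illustrative only; the actual weight generator in the work is a learned module, not this simple average):

```python
import math

def _normalize(v):
    # L2-normalize a vector; guard against the zero vector.
    n = math.sqrt(sum(a * a for a in v)) or 1.0
    return [a / n for a in v]

def novel_class_weight(support_features):
    """Build a classification weight for a novel class as the normalized
    mean of its L2-normalized support features (a class prototype)."""
    d = len(support_features[0])
    normed = [_normalize(f) for f in support_features]
    mean = [sum(f[i] for f in normed) / len(normed) for i in range(d)]
    return _normalize(mean)

def cosine_score(weight, feature):
    """Cosine-similarity score between a class weight and a query feature.
    Base and novel weights are scored identically, which is what lets both
    kinds of classes coexist in a single classifier without forgetting."""
    return sum(a * b for a, b in zip(_normalize(weight), _normalize(feature)))
```

Because scores depend only on directions, a freshly generated novel-class weight is directly comparable with the pre-trained base-class weights.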
We propose an object detection system that relies on a multi-region deep convolutional neural network (CNN) that also encodes semantic segmentation-aware features. The resulting CNN-based representation aims at capturing a diverse set of discriminative appearance factors and exhibits localization sensitivity that is essential for accurate object localization. We exploit the above properties of our recognition module by integrating it on an iterative localization mechanism that alternates between scoring a box proposal and refining its location...
Few-shot learning and self-supervised learning address different facets of the same problem: how to train a model with little or no labeled data. Few-shot learning aims for optimization methods and models that can learn efficiently to recognize patterns in the low data regime. Self-supervised learning focuses instead on unlabeled data and looks into it for the supervisory signal to feed high capacity deep neural networks. In this work we exploit the complementarity of these two domains and propose an approach for improving few-shot learning through self-supervision. We use...
Given an initial recognition model already trained on a set of base classes, the goal of this work is to develop a meta-model for few-shot learning. The meta-model, given as input some novel classes with few training examples per class, must properly adapt the existing recognition model into a new model that can correctly classify in a unified way both the novel and the base classes. To accomplish this goal, it must learn to output the appropriate classification weight vectors for those two types of classes. To build our meta-model we make use of two main innovations: we propose the use of a Denoising Autoencoder network...
Casting semantic segmentation of outdoor LiDAR point clouds as a 2D problem, e.g., via range projection, is an effective and popular approach. These projection-based methods usually benefit from fast computations and, when combined with techniques which use other point cloud representations, achieve state-of-the-art results. Today, projection-based methods leverage 2D CNNs, but recent advances in computer vision show that vision transformers (ViTs) have achieved state-of-the-art results in many image-based benchmarks. In this work, we question if...
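The range projection mentioned above maps each 3D point to a pixel via its spherical coordinates (yaw gives the column, pitch the row), producing a 2D range image that standard image backbones can consume. A minimal sketch, with an illustrative image size and vertical field of view rather than any particular sensor's settings:

```python
import math

def range_project(points, h=8, w=16, fov_up=3.0, fov_down=-25.0):
    """Project 3D LiDAR points (x, y, z) onto a 2D range image.

    Toy sketch: h, w and the vertical field of view (degrees) are
    illustrative defaults, not a real sensor configuration.
    """
    fov_up_r = math.radians(fov_up)
    fov_down_r = math.radians(fov_down)
    fov = fov_up_r - fov_down_r
    img = [[0.0] * w for _ in range(h)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue
        yaw = math.atan2(y, x)                       # horizontal angle in [-pi, pi]
        pitch = math.asin(z / r)                     # vertical angle
        u = 0.5 * (1.0 - yaw / math.pi) * w          # column from yaw
        v = (1.0 - (pitch - fov_down_r) / fov) * h   # row from pitch
        ui = min(w - 1, max(0, int(u)))
        vi = min(h - 1, max(0, int(v)))
        img[vi][ui] = r                              # store range as the pixel value
    return img
```

Real pipelines typically store extra channels per pixel (intensity, x, y, z) and resolve collisions by keeping the closest point, but the geometry is the same.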
Pixel wise image labeling is an interesting and challenging problem with great significance in the computer vision community. In order for a dense labeling algorithm to be able to achieve accurate and precise results, it has to consider the dependencies that exist in the joint space of both the input and the output variables. An implicit approach for modeling those dependencies is by training a deep neural network that, given as input an initial estimate of the output labels and the input image, will predict a new refined estimate for the labels. In this context, our work is concerned with what is the optimal architecture...
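Stripped of the network details, the scheme above is a simple fixed-point loop: feed the current label estimate back in, replace it with the refined prediction, and repeat. A toy sketch, where the `refine` callable stands in for the trained network (hypothetical here):

```python
def iterative_labeling(refine, image, init_labels, steps=3):
    """Repeatedly refine a dense label map. Each step conditions on both
    the input image and the current label estimate, which is how the
    joint input-output dependencies get modeled implicitly."""
    labels = init_labels
    for _ in range(steps):
        labels = refine(image, labels)
    return labels
```

With a contractive `refine`, successive estimates converge toward a fixed point consistent with the image.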
We propose a novel object localization methodology with the purpose of boosting the localization accuracy of state-of-the-art object detection systems. Our model, given a search region, aims at returning the bounding box of an object of interest inside this region. To accomplish its goal, it relies on assigning conditional probabilities to each row and column of this region, where these probabilities provide useful information regarding the location of the boundaries of the object inside the search region and allow accurate inference of the object bounding box under a simple probabilistic framework. For implementing our localization model we make use...
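Under this probabilistic view, the simplest possible inference step is to take the most likely column for each vertical boundary and the most likely row for each horizontal one. A minimal sketch of such a decoding step (illustrative only; it covers just the border-probability case, not other parameterizations of the row/column probabilities):

```python
def decode_box(p_left, p_right, p_top, p_bottom):
    """Decode a bounding box from per-column (left/right) and per-row
    (top/bottom) boundary probabilities by taking each argmax."""
    def argmax(p):
        return max(range(len(p)), key=p.__getitem__)
    left, right = argmax(p_left), argmax(p_right)
    top, bottom = argmax(p_top), argmax(p_bottom)
    return (left, top, right, bottom)
```

Because each boundary is inferred from a 1D distribution over rows or columns, decoding stays cheap even for fine-grained search regions.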
Self-supervised representation learning targets learning convnet-based image representations from unlabeled data. Inspired by the success of NLP methods in this area, in this work we propose a self-supervised approach based on spatially dense image descriptions that encode discrete visual concepts, here called visual words. To build such discrete representations, we quantize the feature maps of a first pre-trained convnet, over a k-means based vocabulary. Then, as a self-supervised task, we train another convnet to predict the histogram of visual words of an image (i.e., its...
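The target construction described above reduces to two steps: assign every spatial feature vector to its nearest vocabulary word, then build a normalized word histogram for the image. A minimal sketch under the assumption that features and vocabulary entries are plain lists of floats (in practice these would be convnet feature maps and k-means centroids):

```python
def quantize(features, vocabulary):
    """Assign each spatial feature vector to its nearest (squared-L2) visual word."""
    def nearest(f):
        return min(range(len(vocabulary)),
                   key=lambda k: sum((a - b) ** 2 for a, b in zip(f, vocabulary[k])))
    return [nearest(f) for f in features]

def bow_target(features, vocabulary):
    """Normalized histogram of visual words over an image's features:
    the prediction target for the second convnet."""
    hist = [0.0] * len(vocabulary)
    for w in quantize(features, vocabulary):
        hist[w] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]
```

Discarding the spatial arrangement and keeping only the histogram is what makes the target robust to the perturbations applied to the input image.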
Segmenting or detecting objects in sparse Lidar point clouds are two important tasks in autonomous driving to allow a vehicle to act safely in its 3D environment. The best performing methods in semantic segmentation or object detection rely on a large amount of annotated data. Yet annotating 3D Lidar data for these tasks is tedious and costly. In this context, we propose a self-supervised pre-training method for 3D perception models that is tailored to autonomous driving data. Specifically, we leverage the availability of synchronized and calibrated image and Lidar sensors in autonomous driving setups...
Learning image representations without human supervision is an important and active research field. Several recent approaches have successfully leveraged the idea of making such a representation invariant under different types of perturbations, especially via contrastive-based instance discrimination training. Although effective visual representations should indeed exhibit such invariances, there are other important characteristics, such as encoding contextual reasoning skills, for which alternative reconstruction-based approaches might be...
The problem of computing category agnostic bounding box proposals is utilized as a core component in many computer vision tasks and thus has lately attracted a lot of attention. In this work we propose a new approach to tackle this problem that is based on an active strategy for generating box proposals. It starts from a set of seed boxes, which are uniformly distributed on the image, and then progressively moves its attention to the promising image areas where it is more likely to discover well localized proposals. We call our approach AttractioNet and a CNN-based...
Localizing objects in image collections without supervision can help to avoid expensive annotation campaigns. We propose a simple approach to this problem, that leverages the activation features of a vision transformer pre-trained in a self-supervised manner. Our method, LOST, does not require any external object proposal nor any exploration of the image collection; it operates on a single image. Yet, we outperform state-of-the-art object discovery methods by up to 8 CorLoc points on PASCAL VOC 2012. We also show that training...
Semantic image segmentation models typically require extensive pixel-wise annotations, which are costly to obtain and prone to biases. Our work investigates learning semantic segmentation in urban scenes without any manual annotation. We propose a novel method for learning pixel-wise semantic segmentation using raw, uncurated data from vehicle-mounted cameras and LiDAR sensors, thus eliminating the need for manual labeling. Our contributions are as follows. First, we develop a novel approach for cross-modal unsupervised learning of semantic segmentation by leveraging synchronized data. A crucial element...
Semantic future prediction is important for autonomous systems navigating dynamic environments. This paper introduces FUTURIST, a method for multimodal future semantic prediction that uses a unified and efficient visual sequence transformer architecture. Our approach incorporates a masked visual modeling objective and a novel masking mechanism designed for multimodal training. This allows the model to effectively integrate visible information from various modalities, improving prediction accuracy. Additionally, we propose a VAE-free hierarchical tokenization...
Latent generative models have emerged as a leading approach for high-quality image synthesis. These models rely on an autoencoder to compress images into a latent space, followed by a generative model that learns the latent distribution. We identify that existing autoencoders lack equivariance to semantic-preserving transformations like scaling and rotation, resulting in complex latent spaces that hinder generative performance. To address this, we propose EQ-VAE, a simple regularization approach that enforces equivariance in the latent space, reducing its complexity without degrading reconstruction...
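The regularization idea can be sketched as penalizing the mismatch between encoding a transformed image and transforming the encoded latent. A toy version with list-valued "images" and "latents" and a hypothetical `encode` callable (the real method operates on autoencoder feature maps with image-space scaling and rotation):

```python
def eq_regularizer(encode, x, t_input, t_latent):
    """Mean squared error between encode(T(x)) and T(encode(x)).

    Zero on this input exactly when the encoder commutes with the
    transformation, i.e. when it is equivariant to it."""
    z_of_t = encode(t_input(x))      # transform first, then encode
    t_of_z = t_latent(encode(x))     # encode first, then transform
    return sum((a - b) ** 2 for a, b in zip(z_of_t, t_of_z)) / len(z_of_t)
```

Adding such a term to the reconstruction loss pushes the latent space to respond to scaling and rotation as simply as the pixel space does.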
We describe an approach to predict open-vocabulary 3D semantic voxel occupancy maps from input 2D images with the objective of enabling 3D grounding, segmentation and retrieval of free-form language queries. This is a challenging problem because of the 2D-3D ambiguity and the open-vocabulary nature of the target tasks, where obtaining annotated training data in 3D is difficult. The contributions of this work are three-fold. First, we design a new model architecture for open-vocabulary 3D semantic occupancy prediction. It consists of a 2D-3D encoder together with occupancy prediction and 3D-language heads. The output...