- Generative Adversarial Networks and Image Synthesis
- Advanced Image and Video Retrieval Techniques
- Image Retrieval and Classification Techniques
- Domain Adaptation and Few-Shot Learning
- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Face Recognition and Analysis
- Anomaly Detection Techniques and Applications
- Face and Expression Recognition
- Multimodal Machine Learning Applications
- Advanced Vision and Imaging
- Digital Media Forensic Detection
- Visual Attention and Saliency Detection
- Learning Styles and Cognitive Differences
- Advanced Neural Network Applications
- Open Education and E-Learning
- Robotics and Sensor-Based Localization
- Remote-Sensing Image Classification
- Image and Object Detection Techniques
- Video Analysis and Summarization
- Computer Graphics and Visualization Techniques
- Semantic Web and Ontologies
- Advanced Image Processing Techniques
- Intelligent Tutoring Systems and Adaptive Learning
- Gaze Tracking and Assistive Technology
University of Modena and Reggio Emilia
2022-2024
University of Trento
2014-2022
Italian Institute of Technology
2011-2013
Sapienza University of Rome
1997-2011
Centro di Ricerca in Matematica Pura ed Applicata
2002-2006
University of Salerno
2006
Roma Tre University
2003-2006
In this paper we address the problem of generating person images conditioned on a given pose. Specifically, given an image of a person and a target pose, we synthesize a new image of that person in the novel pose. In order to deal with pixel-to-pixel misalignments caused by the pose differences, we introduce deformable skip connections in the generator of our Generative Adversarial Network. Moreover, a nearest-neighbour loss is proposed instead of the common L1...
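A minimal NumPy sketch of the nearest-neighbour idea mentioned above (not the paper's implementation; the single-channel images, neighbourhood size `k`, and edge padding are illustrative assumptions): instead of a strict pixel-to-pixel L1 comparison, each generated pixel is compared against the best-matching pixel in a small neighbourhood of the target, which tolerates small spatial misalignments.

```python
import numpy as np

def nn_loss(generated, target, k=1):
    """Toy nearest-neighbour loss: for every pixel of `generated`, take the
    minimum absolute difference over a (2k+1)x(2k+1) neighbourhood of
    `target`, then average. A plain L1 loss is the special case k=0."""
    h, w = generated.shape
    padded = np.pad(target, k, mode="edge")
    best = np.full((h, w), np.inf)
    for dy in range(2 * k + 1):          # scan all offsets in the window
        for dx in range(2 * k + 1):
            shifted = padded[dy:dy + h, dx:dx + w]
            best = np.minimum(best, np.abs(generated - shifted))
    return best.mean()
```

With `k=1`, an image that is a one-pixel translation of the target incurs zero loss, whereas a plain L1 loss would penalise every misaligned pixel.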
In this paper we address the abnormality detection problem in crowded scenes. We propose to use Generative Adversarial Nets (GANs), which are trained using normal frames and the corresponding optical-flow images in order to learn an internal representation of the scene normality. Since our GANs are trained with only normal data, they are not able to generate abnormal events. At testing time the real data are compared with both the appearance and the motion representations reconstructed by our GANs, and abnormal areas are detected by computing local differences. Experimental results on...
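The detection step above can be sketched in a few lines of NumPy (a toy stand-in, assuming single-channel frames, a fixed patch size, and a hand-picked threshold; the real method reconstructs both appearance and optical flow with trained GANs): patches that the normality model fails to reconstruct get a high local error and are flagged as abnormal.

```python
import numpy as np

def local_difference_map(real, reconstructed, patch=4, thresh=0.5):
    """Score abnormality by patch-wise squared differences between a real
    frame and its reconstruction; a generator trained only on normal data
    is assumed to reconstruct normal regions well, so high-error patches
    are flagged as abnormal."""
    h, w = real.shape
    err = (real - reconstructed) ** 2
    # Average the per-pixel error inside each non-overlapping patch.
    scores = err.reshape(h // patch, patch, w // patch, patch).mean(axis=(1, 3))
    return scores, scores > thresh
```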
Most of the crowd abnormal event detection methods rely on complex hand-crafted features to represent the crowd motion and appearance. Convolutional Neural Networks (CNNs) have shown to be a powerful instrument with excellent representational capacities, which can alleviate the need for hand-crafted features. In this paper, we show that keeping track of the changes in the CNN features across time can be used to effectively detect local anomalies. Specifically, we propose to measure abnormality by combining semantic information (inherited from...
Abnormal crowd behaviour detection attracts a large interest due to its importance in video surveillance scenarios. However, the ambiguity and the lack of sufficient abnormal ground truth data make end-to-end training of deep networks hard in this domain. In this paper we propose to use Generative Adversarial Nets (GANs), which are trained to generate only the normal distribution of the data. During the adversarial GAN training, the discriminator (D) is used as a supervisor for the generator network (G) and vice versa. At testing time, D...
A classifier trained on a given dataset seldom works on other datasets obtained under different conditions, due to domain shift. This problem is commonly addressed by domain adaptation methods. In this work we introduce a novel deep learning framework which unifies different paradigms in unsupervised domain adaptation. Specifically, we propose domain alignment layers which implement feature whitening for the purpose of matching source and target feature distributions. Additionally, we leverage the unlabeled target data by proposing the Min-Entropy Consensus loss, ...
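The feature-whitening ingredient above can be illustrated with a minimal ZCA-style sketch in NumPy (an assumption-laden stand-in, not the paper's domain-alignment layers, which operate per domain inside the network): each batch of features is centred and transformed so that its covariance becomes approximately the identity, so source and target batches end up with matching second-order statistics.

```python
import numpy as np

def whiten(features, eps=1e-5):
    """ZCA-style batch whitening sketch: centre the batch, then apply the
    inverse square root of its covariance so the output covariance is
    (approximately) the identity. `eps` guards against tiny eigenvalues."""
    x = features - features.mean(axis=0)
    cov = x.T @ x / (x.shape[0] - 1)
    vals, vecs = np.linalg.eigh(cov)               # cov is symmetric PSD
    w = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return x @ w
```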
Previous works on facial expression analysis have shown that person-specific models are advantageous with respect to generic ones for recognizing the expressions of new users added to the gallery set. This finding is not surprising, due to the often significant inter-individual variability: different persons have different morphological aspects and express their emotions in different ways. However, acquiring person-specific labeled data for learning is a very time consuming process. In this work we propose a transfer learning method to compute...
In a weakly-supervised scenario, object detectors need to be trained using image-level annotation alone. Since bounding-box-level ground truth is not available, most of the solutions proposed so far are based on an iterative, Multiple Instance Learning framework in which the current classifier is used to select the highest-confidence boxes in each image, which are treated as pseudo-ground truth in the next training iteration. However, the errors of an immature classifier can make the process drift, usually introducing many false positives into the training dataset. To...
In this paper, we aim to understand whether current language and vision (LaVi) models truly grasp the interaction between the two modalities. To this end, we propose an extension of the MSCOCO dataset, FOIL-COCO, which associates images with both correct and "foil" captions, that is, descriptions of the image that are highly similar to the original ones, but contain one single mistake ("foil word"). We show that current LaVi models fall into the traps of this data and perform badly on three tasks: a) caption classification (correct vs. foil); b) foil word...
In this paper, we study the problem of Novel Class Discovery (NCD). NCD aims at inferring novel object categories in an unlabeled set by leveraging the prior knowledge of a labeled set containing different, but related, classes. Existing approaches tackle this problem by considering multiple objective functions, usually involving specialized loss terms for the labeled and the unlabeled samples respectively, and often requiring auxiliary regularization terms. In this paper we depart from this traditional scheme and introduce a UNified Objective function (UNO) for discovering...
Hashing methods have been recently found very effective in the retrieval of remote sensing (RS) images due to their computational efficiency and fast search speed. The traditional hashing methods in RS usually exploit hand-crafted features to learn hash functions and obtain binary codes, which can be insufficient to optimally represent the information content of RS images. To overcome this problem, in this paper we introduce a metric-learning based hashing network, which learns: 1) a semantic-based metric space for effective feature representation; 2)...
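The retrieval mechanics behind any such hashing scheme can be sketched in NumPy (a toy stand-in: a random linear projection plays the role of the learned hash functions, and both function names are illustrative): features are mapped to short binary codes, and search reduces to ranking by Hamming distance, which is why hashing-based retrieval is fast.

```python
import numpy as np

def hash_codes(features, projection):
    """Binarize features via the sign of a linear projection; here the
    projection is random, standing in for learned hash functions."""
    return (features @ projection > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code."""
    dists = (db_codes != query_code).sum(axis=1)
    return np.argsort(dists, kind="stable"), dists
```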
Facial expression and gesture recognition algorithms are key enabling technologies for human-computer interaction (HCI) systems. State of the art approaches for automatic detection of body movements and for analyzing emotions from facial features heavily rely on advanced machine learning algorithms. Most of these methods are designed for the average user, but the "one-size-fits-all" assumption ignores diversity in cultural background, gender, ethnicity, and personal behavior, and limits their applicability in real-world scenarios. A...
Most of the current self-supervised representation learning (SSL) methods are based on the contrastive loss and the instance-discrimination task, where augmented versions of the same image instance ("positives") are contrasted with instances extracted from other images ("negatives"). For the learning to be effective, many negatives should be compared with a positive pair, which is computationally demanding. In this paper, we propose a different direction and a new loss function for SSL, which is based on the whitening of the latent-space features. The whitening operation has...
In this paper, we address the problem of generating person images conditioned on both pose and appearance information. Specifically, given an image x_a of a person and a target pose P(x_b), extracted from a different image x_b, we synthesize a new image of that person in the pose P(x_b) while preserving the visual details of x_a. In order to deal with pixel-to-pixel misalignments caused by the pose differences between P(x_a) and P(x_b), we introduce deformable...
Visual Transformers (VTs) are emerging as an architectural paradigm alternative to Convolutional Neural Networks (CNNs). Differently from CNNs, VTs can capture global relations between image elements and they potentially have a larger representation capacity. However, the lack of the typical convolutional inductive bias makes these models more data-hungry than common CNNs. In fact, some local properties of the visual domain which are embedded in the CNN architectural design should instead be learned from samples. In this paper, we empirically...
Figure 1: Our method generates smooth interpolations within and across domains in various image-to-image translation tasks. Here, we show gender, age and smile translations from CelebA-HQ [20] and animal translations from AFHQ [10].
The way in which human beings express emotions depends on their specific personality and cultural background. As a consequence, person-independent facial expression classifiers usually fail to accurately recognize expressions that vary between different individuals. On the other hand, training a person-specific classifier for each new user is a time consuming activity which involves collecting hundreds of labeled samples. In this paper we present a personalization approach in which only unlabeled target-specific data are...
We present an approach to the automatic localization of facial feature points which deals with pose, expression, and identity variations by combining 3D shape models with local image patch classification. The latter is performed by means of densely extracted SURF-like features, which we call DU-SURF, while the former is based on a multiclass version of the Hausdorff distance to address classification errors and nonvisible points. The final system is able to localize facial feature points in real-world scenarios, dealing with out-of-plane head rotations, expression...
Denoising Diffusion Probabilistic Models have shown an impressive generation quality, although their long sampling chain leads to high computational costs. In this paper, we observe that a long sampling chain also leads to an error accumulation phenomenon, which is similar to the exposure bias problem in autoregressive text generation. Specifically, we note that there is a discrepancy between training and testing, since the former is conditioned on the ground truth samples, while the latter is conditioned on the previously generated results. To alleviate this problem, we propose...
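The source of the train/test discrepancy is visible in the standard DDPM ancestral sampling loop, sketched below in NumPy on a toy scalar case (the sigma_t^2 = beta_t noise schedule and the stub noise predictor are illustrative assumptions, not the paper's method): at every step the network is fed the previously generated x_t, whereas during training it only ever saw ground-truth noisy samples, so per-step errors can compound along the chain.

```python
import numpy as np

def ddpm_sample(eps_model, betas, shape, rng):
    """Standard DDPM ancestral sampling loop. `eps_model(x, t)` predicts
    the noise in x at step t. Note: each iteration conditions the model
    on the previously *generated* x, which is where exposure bias enters."""
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)
    x = rng.normal(size=shape)                     # start from pure noise
    for t in range(len(betas) - 1, -1, -1):
        z = rng.normal(size=shape) if t > 0 else 0.0
        eps = eps_model(x, t)
        x = (x - betas[t] / np.sqrt(1.0 - abar[t]) * eps) / np.sqrt(alphas[t]) \
            + np.sqrt(betas[t]) * z                # sigma_t^2 = beta_t choice
    return x
```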
Owing to the power of vision-language foundation models, e.g., CLIP, the area of image synthesis has seen recent important advances. Particularly, for style transfer, CLIP enables transferring more general and abstract styles without collecting style images in advance, as the style can be efficiently described with natural language, and the result is optimized by minimizing the CLIP similarity between the text description and the stylized image. However, directly using CLIP to guide style transfer leads to undesirable artifacts (mainly written words and unrelated...
We address the problem of the automatic extraction of foreground objects from videos. The goal is to provide a method for the unsupervised collection of samples which can be further used for object detection training without any human intervention. We use the well known Selective Search approach to produce an initial still-image based segmentation of the video frames. This set of proposals is pruned and temporally extended using optical flow and transductive learning. Specifically, we propose to use Dense Trajectories in order to robustly match...