- Face recognition and analysis
- Handwritten Text Recognition Techniques
- Digital Media Forensic Detection
- Biometric Identification and Security
- Face and Expression Recognition
- Remote-Sensing Image Classification
- Advanced Image and Video Retrieval Techniques
- Remote Sensing and Land Use
- Image Retrieval and Classification Techniques
- Speech and Audio Processing
- Digital and Cyber Forensics
- Multimodal Machine Learning Applications
- Advanced Image Fusion Techniques
- Natural Language Processing Techniques
- Phonetics and Phonology Research
- Text and Document Classification Technologies
- Infrared Target Detection Methodologies
- Domain Adaptation and Few-Shot Learning
- Advanced Steganography and Watermarking Techniques
- Hand Gesture Recognition Systems
- Categorization, perception, and language
- Blind Source Separation Techniques
- Topic Modeling
- Gaze Tracking and Assistive Technology
- Geographic Information Systems Studies
Sorbonne Université
2024-2025
Université Sorbonne Paris Nord
2023-2025
La Rochelle Université
2017-2023
Laboratoire Informatique, Image et Interaction (L3i)
2017-2023
Laboratoire Bordelais de Recherche en Informatique
2015-2018
Grenoble Images Parole Signal Automatique
2013
Centre Hospitalier Universitaire de Grenoble
2010
The topic of text document image classification has been explored extensively over the past few years. Most recent approaches handle this task by jointly learning the visual features of document images and their corresponding textual contents. Due to the various structures of document images, extracting semantic information from their content is beneficial for processing tasks such as retrieval, extraction, and classification. In this work, a two-stream neural architecture is proposed to perform this task. We conduct an exhaustive...
Real-time detection in Remote Sensing Imagery (RSI) with a wide background and small targets is challenging in various fields. Multimodal data fusion that enhances CNNs with Transformers can improve detection performance. The approach combines complementary information from different modalities and leverages CNNs' feature extraction capabilities. Transformers capture global information and learn sequence dependency without requiring large samples. The goal is to achieve accurate and efficient target detection in applications such as fire detection, military...
Recently, benefiting from the advances of deep convolutional neural networks (CNNs), significant progress has been made in the field of face verification and face recognition. In particular, the performance of FaceNet has surpassed the human level in terms of accuracy on the datasets "Labeled Faces in the Wild (LFW)" and "YouTube Faces (YTF)". The triplet loss used in FaceNet has proved its effectiveness for face verification. However, the number of possible triplets is explosive when using a large-scale dataset to train the model. In this paper, we propose a simple class-wise triplet loss based...
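To make the triplet explosion mentioned above concrete, here is a minimal sketch of the standard margin-based triplet loss on squared Euclidean distances, together with a count of all valid (anchor, positive, negative) triplets in a labeled dataset. The function names, the margin value, and the class sizes are illustrative assumptions and are not taken from the paper; this is the generic triplet formulation, not the class-wise loss the abstract proposes.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: push the negative at least `margin`
    farther from the anchor than the positive (margin is an assumed value)."""
    d_ap = squared_distance(anchor, positive)
    d_an = squared_distance(anchor, negative)
    return max(0.0, d_ap - d_an + margin)


def count_triplets(class_sizes):
    """Number of ordered (anchor, positive, negative) triplets.

    For a class with n images there are n * (n - 1) anchor/positive
    pairs, and each pair can be combined with any image outside the class.
    """
    total = sum(class_sizes)
    return sum(n * (n - 1) * (total - n) for n in class_sizes)


if __name__ == "__main__":
    # A well-separated triplet produces no loss.
    print(triplet_loss([0.0, 0.0], [0.0, 0.0], [1.0, 0.0]))
    # Hypothetical dataset: 1000 identities with 100 images each.
    print(count_triplets([100] * 1000))
```

With only 100,000 images in this hypothetical dataset, the number of candidate triplets is already close to a trillion, which is why practical systems sample or mine triplets rather than enumerate them.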
Automatic facial expression recognition has emerged over two decades. The recognition of posed facial expressions and the detection of Action Units (AUs) have already made great progress. More recently, the automatic estimation of the variation of facial expression, either in terms of the intensities of AUs or in terms of the values of dimensional emotions, has emerged in the field of facial expression analysis. However, discriminating different intensities of AUs is a far more challenging task than AU detection due to several intractable problems. Aiming at continuing standardized evaluation procedures and surpassing the limits of current research,...
Face Presentation Attack Detection (PAD) is an important measure to prevent spoof attacks on face biometric systems. Many works based on Convolutional Neural Networks (CNNs) for face PAD formulate the problem as an image-level binary classification task without considering the context. Alternatively, Vision Transformers (ViT), which use self-attention to attend to the context of an image, have become mainstream in face PAD. Inspired by ViT, we propose a Video-based Transformer for face PAD (ViTransPAD) with short/long-range spatio-temporal...
Various government and commercial services, including, but not limited to, e-government, fintech, banking, and the sharing economy, widely use smartphones to simplify service access and user authorization. Many organizations involved in these areas use identity document analysis systems in order to improve personal-data-input processes. The tasks of such systems are not only ID document data recognition and extraction but also fraud prevention by detecting document forgery or by checking whether the document is genuine. Modern systems of this kind are often expected to operate...
The effectiveness of state-of-the-art face verification/recognition algorithms and the convenience of face recognition greatly boost face-related biometric authentication applications. However, existing face verification architectures seldom integrate any liveness detection, or keep such a stage isolated from face verification as if it were irrelevant. This may potentially result in the system being exposed to spoof attacks between the two stages. This work introduces FaceLiveNet, a holistic end-to-end deep network which can perform...
Benefiting from the joint learning of multiple tasks in deep multi-task networks, many applications have shown promising performance compared to single-task learning. However, the performance of the multi-task learning framework is highly dependent on the relative weights of the tasks. How to assign the weight of each task is a critical issue in multi-task learning. Instead of tuning the weights manually, which is exhausting and time-consuming, in this paper we propose an approach which can dynamically adapt the weights of the tasks according to the difficulty of training each task. Specifically, the proposed method does not introduce...
As a fundamental step of document-related tasks, document classification has been widely adopted in various document image processing applications. Unlike the general image classification problem in the computer vision field, text document images contain both the visual cues and the corresponding text within the image. However, how to bridge these two different modalities and leverage textual and visual features to classify text document images remains challenging. In this paper, we present a cross-modal deep network that captures both the textual content and the visual information included in document images. Thanks to the efficient jointly...
In this paper, we propose to fuse multiple sources of remotely sensed datasets, such as hyperspectral (HS) and Light Detection and Ranging (LiDAR)-derived digital surface model (DSM), using a novel deep learning method. Morphological openings and closings with partial reconstruction are taken into account to model spatial and elevation information for both sources. Then, the stacked features are directly input to a classifier, namely Deep Forest (DF). In particular, DF can be viewed as a cascade or ensembles of Rotation Forests (RoF)...
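The morphological feature extraction step described above can be illustrated with a small sketch. It applies plain grey-scale opening and closing with a flat structuring element to a 1-D elevation profile and stacks the results into per-sample feature vectors. This is a simplification under stated assumptions: the abstract uses openings and closings with partial reconstruction on 2-D rasters, which this sketch does not implement, and the function names and window sizes are hypothetical.

```python
def erode(signal, size=3):
    """Grey-scale erosion: sliding-window minimum (window truncated at borders)."""
    r = size // 2
    n = len(signal)
    return [min(signal[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


def dilate(signal, size=3):
    """Grey-scale dilation: sliding-window maximum (window truncated at borders)."""
    r = size // 2
    n = len(signal)
    return [max(signal[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


def opening(signal, size=3):
    """Erosion followed by dilation: removes bright structures narrower than the window."""
    return dilate(erode(signal, size), size)


def closing(signal, size=3):
    """Dilation followed by erosion: fills dark gaps narrower than the window."""
    return erode(dilate(signal, size), size)


def morphological_profile(signal, sizes=(3, 5)):
    """Stack the raw value with openings and closings at several window sizes.

    Returns one feature vector per sample, ready to feed to a classifier.
    """
    features = [list(signal)]
    for s in sizes:
        features.append(opening(signal, s))
        features.append(closing(signal, s))
    return [list(column) for column in zip(*features)]


if __name__ == "__main__":
    # Toy elevation profile: a one-sample spike and a three-sample plateau.
    profile = [0, 0, 5, 0, 0, 3, 3, 3, 0]
    print(opening(profile, 3))
    print(morphological_profile(profile, sizes=(3,))[2])
```

On the toy profile, the opening removes the isolated spike while keeping the wider plateau, which is the kind of size-selective spatial information such features are meant to capture.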
Benefiting from the advance of deep convolutional neural network (CNN) approaches, many face detection algorithms have achieved state-of-the-art performance in terms of accuracy and very high speed in unconstrained applications. However, due to the lack of public datasets and due to the variation of the orientation of face images, complex background and lighting, defocus, and the varying illumination of camera-captured images, face detection on identity documents under unconstrained environments has not been sufficiently studied. To address this problem more efficiently, we survey...
The widespread deployment of face recognition-based biometric systems has made face Presentation Attack Detection (face anti-spoofing) an increasingly critical issue. This survey thoroughly investigates the face Presentation Attack Detection (PAD) methods that only require the RGB cameras of generic consumer devices, over the past two decades. We present an attack scenario-oriented typology of the existing face PAD methods and we provide a review of 50 of the most recent face PAD methods and their related issues. We adopt a comprehensive presentation of the methods that have influenced the following proposed...
In cases of digital enrolment via mobile and online services, identity document (ID) verification is critical to efficiently detect forgery and therefore build user trust in the digital world. In this paper, we propose a public copy-move forgery dataset, called FMIDV (forged mobile ID video dataset), containing forged IDs with respect to guilloche patterns. Also, we propose two fraud detection models on the guilloche patterns of IDs, which are based on contrastive and adversarial learning. In the sequel, each proposed model manages to read the entire ID and recognize the guilloche pattern...
In this work, a novel deep rotation forest is proposed to fuse hyperspectral (HS) and LiDAR data. First, we extract the spatial and elevation information of the two datasets by using morphological filters. Then, superpixel segmentation is applied to each feature source, and the results are treated as the input of the deep rotation forest. In the deep rotation forest, the relationships are fully considered, and the output probability of each layer is used as the input of the next layer. Experimental results demonstrate the excellent performance of the proposed method.
Object detection in remote sensing imagery plays a vital role in various Earth observation applications. However, unlike object detection in natural scene images, this task is particularly challenging due to the abundance of small, often barely visible objects across diverse terrains. To address these challenges, multimodal learning can be used to integrate features from different data modalities, thereby improving detection accuracy. Nonetheless, the performance of multimodal learning is often constrained by the limited size of labeled datasets. In this paper, we...