- Advanced Image Processing Techniques
- Generative Adversarial Networks and Image Synthesis
- Advanced Vision and Imaging
- Image and Signal Denoising Methods
- Face Recognition and Analysis
- Image Processing Techniques and Applications
- Advanced Image Fusion Techniques
- Video Surveillance and Tracking Methods
- Advanced Image and Video Retrieval Techniques
- Emotion and Mood Recognition
- Face and Expression Recognition
- Human Pose and Action Recognition
- Image and Video Quality Assessment
- Visual Attention and Saliency Detection
- Image Enhancement Techniques
- Biometric Identification and Security
- Multimodal Machine Learning Applications
- Neural Networks and Applications
- Advanced Neural Network Applications
- Advanced Optical Imaging Technologies
- Medical Image Segmentation Techniques
- Cell Image Analysis Techniques
- Human Motion and Animation
- Anomaly Detection Techniques and Applications
- Cloud Data Security Solutions
Adobe Systems (United States)
2022-2024
Shanghai Institute of Organic Chemistry
2024
Chengdu University of Information Technology
2023-2024
Nanjing University of Aeronautics and Astronautics
2023
University of Illinois Urbana-Champaign
2018-2022
International University of the Caribbean
2019-2021
York University
2019-2020
Hong Kong University of Science and Technology
2017-2019
University of Hong Kong
2017-2019
Megvii (China)
2019
Domain adaptation in person re-identification (re-ID) has always been a challenging task. In this work, we explore how to harness the similar natural characteristics existing in samples from the target domain for learning to conduct person re-ID in an unsupervised manner. Concretely, we propose a Self-similarity Grouping (SSG) approach, which exploits the potential similarity (from the global body to local parts) of unlabeled samples to build multiple clusters from different views automatically. These independent clusters are then assigned with...
Both Non-Local (NL) operation and sparse representation are crucial for Single Image Super-Resolution (SISR). In this paper, we investigate their combinations and propose a novel Non-Local Sparse Attention (NLSA) with a dynamic sparse attention pattern. NLSA is designed to retain the long-range modeling capability from the NL operation while enjoying the robustness and high efficiency of sparse representation. Specifically, NLSA rectifies non-local attention with spherical locality sensitive hashing (LSH) that partitions the input space into hash buckets of related...
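As a rough illustration of the bucketing idea mentioned in the abstract above, the Python sketch below assigns feature vectors to hash buckets with one common angular LSH scheme: random projections followed by an argmax over the projections and their negations. The function name `spherical_lsh_buckets`, its parameters, and the choice of this particular scheme are assumptions made for illustration; the truncated abstract does not say how NLSA constructs its buckets.

```python
import random


def spherical_lsh_buckets(vectors, n_buckets, seed=0):
    """Assign each vector to one of n_buckets using an angular LSH.

    Hypothetical helper: each vector is projected onto n_buckets // 2 random
    Gaussian directions, and the bucket is the index of the largest entry in
    the concatenation [projections, -projections]. The result depends only on
    the direction of the vector, not on its length.
    """
    if n_buckets % 2 != 0:
        raise ValueError("n_buckets must be even")
    rng = random.Random(seed)
    dim = len(vectors[0])
    half = n_buckets // 2
    # One random direction per pair of opposite buckets.
    directions = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(half)]

    buckets = []
    for v in vectors:
        proj = [sum(d_i * v_i for d_i, v_i in zip(d, v)) for d in directions]
        scores = proj + [-p for p in proj]
        buckets.append(max(range(n_buckets), key=scores.__getitem__))
    return buckets


features = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [-1.0, -2.0, -3.0]]
print(spherical_lsh_buckets(features, n_buckets=8))
```

Vectors that point in the same direction land in the same bucket, so an attention module built on such a hash could restrict its comparisons to features within one bucket instead of the whole image.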
Despite the remarkable progress in person re-identification (Re-ID), such approaches still suffer from failure cases where the discriminative body parts are missing. To mitigate this type of failure, we propose a simple yet effective Horizontal Pyramid Matching (HPM) approach to fully exploit various partial information of a given person, so that correct person candidates can be identified even if some key parts are missing. With HPM, we make the following contributions to produce more robust feature representations for the Re-ID task: 1)...
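The idea of using partial information at several horizontal granularities can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the function name `horizontal_pyramid_pool`, the single-channel input, the default scales, and the choice to sum average and max pooling per strip are all assumptions.

```python
def horizontal_pyramid_pool(feature_map, scales=(1, 2, 4)):
    """Pool a single-channel H x W feature map over horizontal strips.

    Hypothetical helper: for each scale n, the map is split into n equal
    horizontal strips, and each strip is summarized by the sum of its average
    and its maximum. Returns one value per strip, ordered by scale and then
    from top to bottom.
    """
    height = len(feature_map)
    parts = []
    for n in scales:
        if height % n != 0:
            raise ValueError(f"height {height} is not divisible by scale {n}")
        strip_h = height // n
        for s in range(n):
            rows = feature_map[s * strip_h:(s + 1) * strip_h]
            values = [x for row in rows for x in row]
            parts.append(sum(values) / len(values) + max(values))
    return parts


fmap = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(horizontal_pyramid_pool(fmap, scales=(1, 2)))  # → [12.5, 6.5, 14.5]
```

Because every strip yields its own descriptor, a match can still be made on the strips that remain visible when some body parts are missing.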
Deep convolution-based single image super-resolution (SISR) networks embrace the benefits of learning from large-scale external image resources for local recovery, yet most existing works have ignored the long-range feature-wise similarities in natural images. Some recent works have successfully leveraged this intrinsic feature correlation by exploring non-local attention modules. However, none of the current deep models have studied another inherent property of images: cross-scale feature correlation. In this paper, we propose the first...
Self-similarity refers to the image prior widely used in image restoration algorithms that small but similar patterns tend to occur at different locations and scales. However, recent advanced deep convolutional neural network-based methods for image restoration do not take full advantage of self-similarities by relying on self-attention neural modules that only process information at the same scale. To solve this problem, we present a novel Pyramid Attention module for image restoration, which captures long-range feature correspondences...
Deep image inpainting has made impressive progress with recent advances in image generation and processing algorithms. We claim that the performance of inpainting algorithms can be better judged by the generated structures and textures. Structures refer to the generated object boundary or novel geometric structures within the hole, while texture refers to high-frequency details, especially man-made repeating patterns filled inside the structural regions. We believe that better structures are usually obtained from a coarse-to-fine GAN-based generator network, while repeating patterns nowadays can be better modeled...
This paper reviews the NTIRE 2019 challenge on real image denoising with a focus on the proposed methods and their results. The challenge has two tracks for quantitatively evaluating image denoising performance in (1) the Bayer-pattern raw-RGB and (2) the standard RGB (sRGB) color spaces. The tracks had 216 and 220 registered participants, respectively. A total of 15 teams, proposing 17 methods, competed in the final phase of the challenge. The proposed methods by the 15 teams represent the current state-of-the-art performance in image denoising targeting real noisy images.
Discriminative learning based image denoisers have achieved promising performance on synthetic noises such as Additive White Gaussian Noise (AWGN). The synthetic noises adopted in most previous work are pixel-independent, but real noises are mostly spatially/channel-correlated and spatially/channel-variant. This domain gap yields unsatisfied performance on images with real noises if the model is only trained with AWGN. In this paper, we propose a novel approach to boost the performance of a real image denoiser which is trained only with synthetic pixel-independent noise data dominated by AWGN. First, we train a deep...
This paper reviews the NTIRE 2020 challenge on real image denoising with a focus on the newly introduced dataset, the proposed methods and their results. The challenge is a new version of the previous NTIRE 2019 challenge on real image denoising that was based on the SIDD benchmark. This challenge is based on newly collected validation and testing image datasets, and hence, named SIDD+. This challenge has two tracks for quantitatively evaluating image denoising performance in (1) the Bayer-pattern rawRGB and (2) the standard RGB (sRGB) color spaces. Each track had ~250 registered participants. A total of 22 teams, proposing 24 methods, competed in the final phase...
The new trend of full-screen devices encourages us to position a camera behind a screen. Removing the bezel and centralizing the camera under the screen brings a larger display-to-body ratio and enhances eye contact in video chat, but also causes image degradation. In this paper, we focus on a newly-defined Under-Display Camera (UDC), as a novel real-world single image restoration problem. First, we take a 4k Transparent OLED (T-OLED) and a phone Pentile OLED (P-OLED) and analyze their optical systems to understand the degradation. Second, we design a Monitor-Camera...
Face detection is a well-explored problem. Many challenges on face detectors like extreme pose, illumination, low resolution and small scales are studied in the previous work. However, previous proposed models are mostly trained and tested on good-quality images, which is not always the case for practical applications like surveillance systems. In this paper, we first review the current state-of-the-art face detectors and their performance on the benchmark dataset FDDB, and compare the design protocols of the algorithms. Secondly, we investigate their performance degradation while...
Photorealistic facial expression synthesis from a single face image can be widely applied to face recognition, data augmentation for emotion recognition or entertainment. This problem is challenging, in part due to a paucity of labeled facial expression data, making it difficult for algorithms to disambiguate changes due to identity and changes due to expression. In this paper, we propose the conditional difference adversarial autoencoder (CDAAE) for facial expression synthesis. The CDAAE takes a facial image of a previously unseen person and generates an image of that person's face with a target emotion or facial action unit...
Self-similarity refers to the image prior widely used in image restoration algorithms that small but similar patterns tend to occur at different locations and scales. However, recent advanced deep convolutional neural network based methods for image restoration do not take full advantage of self-similarities by relying on self-attention neural modules that only process information at the same scale. To solve this problem, we present a novel Pyramid Attention module for image restoration, which captures long-range feature correspondences from...
In this paper, we present new data pre-processing and augmentation techniques for DNN-based raw image denoising. Compared with traditional RGB image denoising, performing this task on direct camera sensor readings presents new challenges such as how to effectively handle various Bayer patterns from different data sources, and subsequently how to perform valid data augmentation with raw images. To address the first problem, we propose a Bayer pattern unification (BayerUnify) method to unify different Bayer patterns. This allows us to fully utilize a heterogeneous dataset to train...
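One way to picture the handling of different Bayer patterns is the small Python sketch below. It assumes, purely for illustration, that a raw mosaic in any of the four common layouts can be brought to RGGB by cropping a leading row and/or column; the truncated abstract does not state how BayerUnify actually works, and the names `R_OFFSET` and `unify_to_rggb` are hypothetical.

```python
# Position (row, column) of the red sample inside each 2x2 Bayer tile.
R_OFFSET = {"RGGB": (0, 0), "GRBG": (0, 1), "GBRG": (1, 0), "BGGR": (1, 1)}


def unify_to_rggb(raw, pattern):
    """Crop a raw mosaic so that its top-left 2x2 tile is RGGB.

    Hypothetical helper: `raw` is a list of rows (H x W), `pattern` names the
    layout of its top-left tile. Rows and columns before the first red sample
    are dropped, then the result is trimmed to even height and width so that
    it consists of whole 2x2 tiles.
    """
    dy, dx = R_OFFSET[pattern]
    cropped = [row[dx:] for row in raw[dy:]]
    height = len(cropped) - len(cropped) % 2
    width = len(cropped[0]) - len(cropped[0]) % 2
    return [row[:width] for row in cropped[:height]]


# Letters stand in for sensor values to make the layout visible.
bggr = [list("BGBG"), list("GRGR"), list("BGBG"), list("GRGR")]
print(unify_to_rggb(bggr, "BGGR"))  # → [['R', 'G'], ['G', 'B']]
```

With every image mapped to one layout, data from sensors with different patterns could be fed to the same network without per-pattern handling.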
Image inpainting is the task of plausibly restoring missing pixels within a hole region that is to be removed from a target image. Most existing technologies exploit patch similarities within the image, or leverage large-scale training data to fill the hole using learned semantic and texture information. However, due to the ill-posed nature of the inpainting task, such methods struggle to complete larger holes containing complicated scenes. In this paper, we propose TransFill, a multi-homography transformed fusion method to fill the hole by referring to another source...
Image matting is a key technique for image and video editing and composition. Conventionally, deep learning approaches take the whole input image and an associated trimap to infer the alpha matte using convolutional neural networks. Such approaches set state-of-the-arts in image matting; however, they may fail in real-world matting applications due to hardware limitations, since real-world input images for matting are mostly of very high resolution. In this paper, we propose HDMatt, the first deep learning based image matting approach for high-resolution inputs. More concretely, HDMatt runs matting in a patch-based...
Image rasterization is a mature technique in computer graphics, while image vectorization, the reverse path of rasterization, remains a major challenge. Recent advanced deep learning-based models achieve vectorization and semantic interpolation of vector graphs and demonstrate a better topology of generating new figures. However, deep models cannot be easily generalized to out-of-domain testing data. The generated SVGs also contain complex and redundant shapes that are not quite convenient for further editing....
Talking face synthesis has been widely studied in either appearance-based or warping-based methods. Previous works mostly utilize a single face image as a source, and generate novel facial animations by merging other person's facial features. However, some facial regions like eyes or teeth, which may be hidden in the source image, can not be synthesized faithfully and stably. In this paper, we present a landmark driven two-stream network to generate faithful talking facial animation, in which more facial details are created, preserved and transferred from...
Spatial-temporal feature learning is of vital importance for video emotion recognition. Previous deep network structures often focused on macro-motion which extends over long time scales, e.g., on the order of seconds. We believe integrating structures capturing information about both micro- and macro-motion will benefit emotion prediction, because humans perceive both micro- and macro-expressions. In this paper, we propose to combine micro- and macro-motion features to improve video emotion recognition with a two-stream recurrent network, named MIMAMO (Micro-Macro-Motion) Net....
Facial expression recognition plays an increasingly important role in human behavior analysis and human computer interaction. Facial action units (AUs) coded by the Facial Action Coding System (FACS) provide rich cues for the interpretation of facial expressions. Much past work on AU analysis used only frontal view images, but natural images contain a much wider variety of poses. The FG 2017 Facial Expression Recognition and Analysis challenge (FERA 2017) requires participants to estimate the AU occurrence and intensity under nine different pose...
Facial expression recognizers based on handcrafted features have achieved satisfactory performance on many databases. Recently, deep neural networks, e.g., deep convolutional neural networks (CNNs), have been shown to boost performance on vision tasks. However, the mechanisms exploited by CNNs are not well established. In this paper, we establish the existence and utility of feature maps selective to action units in a deep CNN trained by transfer learning. We transfer a network pre-trained on the Image-Net dataset to the facial expression recognition task using the Karolinska...
Surface-based geodesic topology provides strong cues for object semantic analysis and geometric modeling. However, such connectivity information is lost in point clouds. Thus we introduce GeoNet, the first deep learning architecture trained to model the intrinsic structure of surfaces represented as point clouds. To demonstrate the applicability of learned geodesic-aware representations, we propose fusion schemes which use GeoNet in conjunction with other baseline or backbone networks, such as PU-Net and PointNet++, for down-stream point cloud...
The integration of information across multiple modalities and across time is a promising way to enhance the emotion recognition performance of affective systems. Much previous work has focused on instantaneous emotion recognition. The 2018 One-Minute Gradual-Emotion Recognition (OMG-Emotion) challenge, which was held in conjunction with the IEEE World Congress on Computational Intelligence, encouraged participants to address long-term emotion recognition by integrating cues from multiple modalities, including facial expression, audio and language....