- Image and Signal Denoising Methods
- Advanced Image Fusion Techniques
- Image Enhancement Techniques
- Music and Audio Processing
- Speech and Audio Processing
- Video Analysis and Summarization
- Multimodal Machine Learning Applications
- Visual Attention and Saliency Detection
- Advanced Image Processing Techniques
- Epigenetics and DNA Methylation
- Advanced Image and Video Retrieval Techniques
- Recommender Systems and Techniques
- Diverse Musicological Studies
- Domain Adaptation and Few-Shot Learning
- Noise Effects and Management
- Generative Adversarial Networks and Image Synthesis
- Face Recognition and Analysis
- Multisensory Perception and Integration
McGill University
2023-2024
Institute of Art
2023
Audio-visual scene understanding is a challenging problem due to the unstructured spatial-temporal relations that exist in audio signals and the spatial layouts of different objects in visual images. Recently, many studies have focused on abstracting features from convolutional neural networks, while the learning of explicit semantically relevant frames of sound signals and visual images has been overlooked. To this end, we present an end-to-end framework, namely attentional graph convolutional network (AGCN), for structure-aware...
Natural image matting aims to precisely separate foreground objects from backgrounds using alpha mattes. Fully automatic natural image matting without external annotations is challenging. Well-performing methods usually require an accurate, labor-intensive handcrafted trimap as an extra input, while the performance of the trimap generation method, e.g., erosion/dilation manipulation on segmentation, fluctuates with segmentation quality. Therefore, we argue that how to produce a high-quality coarse [...] is a major issue in matting. In...
Environmental sound classification (ESC) is a challenging problem due to the unstructured spatial-temporal relations that exist in sound signals. Recently, many studies have focused on abstracting features from convolutional neural networks, while the learning of semantically relevant frames of sound signals has been overlooked. To this end, we present an end-to-end framework, namely feature pyramid attention network (FPAM), focusing on abstracting the semantically relevant features for ESC. We first extract the feature maps of the preprocessed spectrogram of the sound waveform by a backbone...
Audio‐visual scene classification (AVSC) poses a formidable challenge owing to the intricate spatial‐temporal relationships exhibited by audio‐visual signals, coupled with the complex spatial patterns of objects and textures found in visual images. The focus of recent studies has predominantly revolved around extracting features from diverse neural network structures, inadvertently neglecting the acquisition of semantically meaningful regions and crucial components within audio‐visual data. The authors present...
Unbiased scene graph generation (USGG) is a challenging task that requires predicting diverse and heavily imbalanced predicates between objects in an image. To address this, we propose a novel framework, peer learning, that uses predicate sampling and consensus voting (PSCV) to encourage multiple peers to learn from each other. Predicate sampling divides the predicate classes into sub-distributions based on frequency and assigns different peers to handle each sub-distribution or combinations of them. Consensus voting ensembles the peers' complementary...
In this paper, we extend blind-spot based self-supervised denoising by using affinity learning to remove noise from affected pixels. Inspired by inpainting, we introduce a novel Mask Guided Residual Convolution (MGRConv) to learn a neighboring image pixel affinity map that gradually removes noise and refines the blind-spot denoising process. We show that mask convolution plays an important role in denoising, since it is theoretically aligned with $\mathcal{J}$-invariance, which self-supervised denoising frameworks are built upon. The theoretical analysis further shows...
We propose a generative framework based on a generative adversarial network (GAN) to enhance facial attractiveness while preserving facial identity and high fidelity. Given a portrait image as input, we apply gradient descent to recover a latent vector that this framework can use to synthesize an image resembling the input image; beauty semantic editing on the corresponding recovered latent vector, based on InterFaceGAN, then enables the framework to achieve facial beautification. This paper compares our system with Beholder-GAN and our proposed result-enhanced version of Beholder-GAN....
Scene graph generation (SGG) has made tremendous progress in recent years. However, its underlying long-tailed distribution of predicate classes is a challenging problem. For extremely unbalanced distributions, existing approaches usually construct complicated context encoders to extract the intrinsic relevance of scene context to predicates, and complex networks to improve the learning ability of network models for highly imbalanced distributions. To address the unbiased SGG problem, we introduce a simple yet effective...
In recent years, self-supervised denoising methods have shown impressive performance; they circumvent the painstaking collection of noisy-clean image pairs required in supervised denoising and boost the applicability of denoising in the real world. One of the well-known strategies is the blind-spot training scheme. However, only a few works attempt to improve the blind-spot based self-denoiser in the aspect of network architecture. In this paper, we take an intuitive view of the blind-spot strategy and consider its process of using neighbor pixels to predict manipulated pixels as an inpainting process....
Natural image matting aims to precisely separate foreground objects from the background using an alpha matte. Fully automatic natural image matting without external annotation is challenging. Well-performing methods usually require an accurate, labor-intensive handcrafted trimap as extra input, while the performance of the trimap generation method of dilating the segmentation fluctuates with segmentation quality. Therefore, we argue that how to handle the trade-off of additional information input is a major issue in matting. This paper presents a semantic-guided...
Audio-visual scene understanding is a challenging problem due to the unstructured spatial-temporal relations that exist in audio signals and the spatial layouts of different objects and various texture patterns in visual images. Recently, many studies have focused on abstracting features from convolutional neural networks, while the learning of explicit semantically relevant frames of sound signals and visual images has been overlooked. To this end, we present an end-to-end framework, namely attentional graph convolutional network (AGCN), for...