- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Multimodal Machine Learning Applications
- Advanced Vision and Imaging
- Visual Attention and Saliency Detection
- Medical Image Segmentation Techniques
- Image Retrieval and Classification Techniques
- Video Analysis and Summarization
- Video Surveillance and Tracking Methods
- Face and Expression Recognition
- Image and Signal Denoising Methods
- Image Enhancement Techniques
- Image Processing and 3D Reconstruction
- Generative Adversarial Networks and Image Synthesis
- Computer Graphics and Visualization Techniques
- Face Recognition and Analysis
- Domain Adaptation and Few-Shot Learning
- Biometric Identification and Security
- Radiomics and Machine Learning in Medical Imaging
- Advanced Data Processing Techniques
- Image and Object Detection Techniques
- Thermography and Photoacoustic Techniques
- Robotics and Sensor-Based Localization
- Sustainable Urban and Rural Development
- Guidance and Control Systems
North University of China
2024
Xidian University
2015-2024
Hohai University
2024
Microsoft Research Asia (China)
2021-2023
Microsoft Research (United Kingdom)
2023
Jiangsu University
2023
Yanshan University
2022
Guilin University of Technology
2022
Xiangtan University
2018
University of Houston
2006-2009
Previous works on video object segmentation (VOS) are trained on densely annotated videos. Nevertheless, acquiring annotations at the pixel level is expensive and time-consuming. In this work, we demonstrate the feasibility of training a satisfactory VOS model on sparsely annotated videos: we merely require two labeled frames per training video while the performance is sustained. We term this novel training paradigm two-shot video object segmentation, or two-shot VOS for short. The underlying idea is to generate pseudo labels for unlabeled frames during training and to optimize the model on the combination...
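The pseudo-labeling idea in the abstract above can be illustrated with a minimal sketch. It assumes per-pixel class probabilities are already available from some segmentation model and that only confident pixels are kept. The function names, the 0.9 threshold, and the toy values are hypothetical choices for illustration, not the authors' training procedure.

```python
import numpy as np

def pseudo_labels(probs, threshold=0.9):
    """Turn predictions on an unlabeled frame into pseudo labels.

    probs: (C, H, W) per-pixel class probabilities (assumed model output).
    Returns the arg-max class per pixel and a boolean mask of the pixels
    whose top probability reaches the (assumed) confidence threshold.
    """
    labels = probs.argmax(axis=0)
    confident = probs.max(axis=0) >= threshold
    return labels, confident

def masked_cross_entropy(probs, labels, mask):
    """Mean negative log-likelihood over the pixels selected by mask."""
    h, w = labels.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    picked = probs[labels, rows, cols]          # probability of the target class
    nll = -np.log(np.clip(picked, 1e-12, 1.0))
    return float(nll[mask].mean()) if mask.any() else 0.0

# Toy unlabeled frame: 2 classes, 1 x 2 pixels.
probs = np.array([[[0.95, 0.60]],
                  [[0.05, 0.40]]])
labels, confident = pseudo_labels(probs)
pseudo_loss = masked_cross_entropy(probs, labels, confident)
```

In this toy frame only the first pixel passes the threshold, so only it contributes to the pseudo-label loss. In training, a term like this would be added to the supervised loss on the labeled frames.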
Talking head generation aims to generate a video based on a given source identity and target motion. However, current methods face several challenges that limit the quality and controllability of the generated videos. First, the generated face often has unexpected deformation and severe distortions. Second, the driving image does not explicitly disentangle movement-relevant information, such as poses and expressions, which restricts the manipulation of different attributes during generation. Third, the generated videos tend to have flickering artifacts due...
The normal operation of insulator strings affects the safety and stability of the power system, and insulator string flashover is one of the important faults. In this paper, considering the characteristics of insulator strings in complex environments, noise is added to the collected insulator string data to simulate the actual environment; the data are then rotationally transformed and filtered using an improved non-local mean filtering algorithm. To accurately locate the insulator strings, a geodesic active contour correction algorithm is employed to segment the image. This algorithm is developed based on the level set model, and replaces...
The Multiplane Image (MPI), containing a set of fronto-parallel $RGB_{\alpha}$ layers, is an effective and efficient representation for view synthesis from sparse inputs. Yet, its fixed structure limits the performance, especially for surfaces imaged at oblique angles. We introduce the Structural MPI (S-MPI), where the plane structure approximates 3D scenes concisely. Conveying $RGB_{\alpha}$ contexts with geometrically-faithful...
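As a rough illustration of how a stack of $RGB_{\alpha}$ layers becomes an image, the sketch below composites layers back to front with the standard "over" operator. The function name `composite_mpi`, the array layout, and the toy values are assumptions for illustration. The sketch omits the warping of planes into a novel view and is not the S-MPI method.

```python
import numpy as np

def composite_mpi(rgb, alpha):
    """Composite D fronto-parallel layers ordered back (index 0) to front.

    rgb:   (D, H, W, 3) float array, colour of each layer
    alpha: (D, H, W)    float array in [0, 1], opacity of each layer
    """
    out = np.zeros(rgb.shape[1:], dtype=np.float64)
    for layer_rgb, layer_alpha in zip(rgb, alpha):
        a = layer_alpha[..., None]               # broadcast over colour channels
        out = layer_rgb * a + out * (1.0 - a)    # "over" operator
    return out

# Toy MPI: an opaque red back layer and a 25%-opaque blue front layer, 1 x 1 pixel.
rgb = np.array([[[[1.0, 0.0, 0.0]]],
                [[[0.0, 0.0, 1.0]]]])
alpha = np.array([[[1.0]],
                  [[0.25]]])
image = composite_mpi(rgb, alpha)
```

For the toy input the single output pixel is 75% red and 25% blue, because the front layer covers a quarter of the opaque back layer.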
Multimodal and multi-domain stylization are two important problems in the field of image style transfer. Currently, there are few methods that can perform multimodal and multi-domain stylization simultaneously. In this study, we propose a unified framework for multimodal and multi-domain style transfer with support for both exemplar-based reference and randomly sampled guidance. The key component of our method is a novel style distribution alignment module that eliminates the explicit distribution gaps between various style domains and reduces the risk of mode collapse. The multimodal diversity is ensured by either guidance from...
We analyze localized textural consistencies in high-resolution X-ray computed tomography (CT) scans of coronary arteries to identify the appearance of diagnostically relevant changes in tissue. For efficient and accurate processing of CT volume data, we use fast wavelet algorithms associated with three-dimensional isotropic multiresolution wavelets that implement a redundant, frame-based image encoding without directional preference. Our algorithm identifies textural consistencies by correlating coefficients...
Instance segmentation is a challenging task aiming at classifying and segmenting all object instances of specific classes. While two-stage box-based methods achieve top performance in the image domain, they cannot easily extend their superiority into the video domain. This is because they usually deal with features or images cropped from the detected bounding boxes without alignment, failing to capture pixel-level temporal consistency. We embrace the observation that bottom-up methods dealing with box-free features could offer...
Recently, transformer-based image segmentation methods have achieved notable success against previous solutions. For video domains, however, how to effectively model temporal context with the attention of object instances across frames remains an open problem. In this paper, we propose an online video instance segmentation framework with a novel instance-aware temporal fusion method. We first leverage the representation, i.e., a latent code in the global context (instance code) and CNN feature maps, to represent instance- and pixel-level features. Based...
Referring Video Object Segmentation (R-VOS) is a challenging task that aims to segment an object in a video based on a linguistic expression. Most existing R-VOS methods have a critical assumption: the referred object must appear in the video. This assumption, which we refer to as semantic consensus, is often violated in real-world scenarios, where the expression may be queried against false videos. In this work, we highlight the need for a robust R-VOS model that can handle semantic mismatches. Accordingly, we propose an extended task called Robust R-VOS,...
To tackle the challenge of time-varying formation control for underactuated robots under model parameter uncertainties and environmental disturbances, this study proposes an affine formation control approach enhanced by an Extended State Observer. Initially, using affine positioning theory and polynomial interpolation, guidelines for selecting leader vehicles and trajectory planning methods are established, whereby the follower formation is uniquely determined through the stress matrix. To address the cumulative disturbances arising from factors impacting...
To extract useful information from X-ray fluorescence (XRF) spectra and establish a high-accuracy prediction model of soil heavy metal contents, a hybrid model combining a deep belief network (DBN) with a tree-based model was proposed. The DBN was first introduced into the feature extraction of XRF spectral data, which can obtain layer features of the spectra. Owing to the strong regression ability of the tree-based model, it can offset the deficiency of the DBN in prediction, so it was used for predicting heavy metal contents based on the extracted features. To further improve the performance...
Current studies on road edge detection are mainly focused on algorithms for finding and tracking road edges through optical images (Y. Wang et al., 1998) (R. 2002) (B. Ma 1999). In this study, the researchers developed a new road/trail edge detection system which is based on frequency-modulated continuous-wave (FMCW) radars. This system is able to provide much more information than optical images do. The key features of this system are as follows: 1) FMCW radars, a radar technology which works effectively during both daytime and nighttime, on any type of terrain, and in a variety...
Error propagation is a general but crucial problem in online semi-supervised video object segmentation. We aim to suppress error propagation through a correction mechanism with high reliability. The key insight is to disentangle the correction from the conventional mask propagation process with reliable cues. We introduce two modulators, propagation and correction modulators, to separately perform channel-wise re-calibration on the target frame embeddings according to local temporal correlations and reliable references respectively. Specifically, we assemble the modulators with a cascaded propagation-correction...
Here, an efficient framework is developed to address the problem of unconstrained face verification. In particular, an unsupervised feature learning method for face image representation and a novel similarity metric model are discussed. First, the authors propose an unsupervised feature learning method with a sparse auto-encoder (SAE) based on local descriptor (SAELD). A set of filter operators is learned with the SAE from local patches, and local descriptors are extracted by applying the filter operators to convolve the images. This can address the discriminative issue. Then, the pairwise SAELD are projected into a weighted...
Remote sensing images can be degraded by a variety of causes during acquisition, transmission, compression, storage and reconstruction. Noise is one of the most important degradation factors. Quantifying its impact on image quality may be useful for applications such as improving the acquisition system and thus the quality of the produced images. Objective Image Quality Measure (IQA) methods can be classified by whether a reference image, representing the original signal, exists. In the case of remote sensing, the ideal un-degraded image is not available....