- Video Analysis and Summarization
- 3D Shape Modeling and Analysis
- Advanced Image and Video Retrieval Techniques
- Image Retrieval and Classification Techniques
- Computer Graphics and Visualization Techniques
- Advanced Vision and Imaging
- Visual Attention and Saliency Detection
- Multimedia Communication and Technology
- Image and Video Quality Assessment
- 3D Surveying and Cultural Heritage
- Human Pose and Action Recognition
- Human Motion and Animation
- Image Enhancement Techniques
- Generative Adversarial Networks and Image Synthesis
- Aesthetic Perception and Analysis
- Topic Modeling
- Lattice Boltzmann Simulation Studies
- Optical measurement and interference techniques
- Multimodal Machine Learning Applications
- Sentiment Analysis and Opinion Mining
- Cavitation Phenomena in Pumps
- Blood properties and coagulation
- Interactive and Immersive Displays
- Advanced Image Fusion Techniques
- Nanopore and Nanochannel Transport Studies
Sun Yat-sen University
2014-2025
Academy of Military Medical Sciences
2024
Nanyang Technological University
2019-2022
Guilin University of Electronic Technology
2017-2021
Shenyang Ligong University
2009
Point cloud registration is a popular topic that has been widely used in 3D model reconstruction, location, and retrieval. In this paper, we propose a new registration method, KSS-ICP, to address the rigid registration task in Kendall shape space (KSS) with Iterative Closest Point (ICP). The KSS is a quotient space that removes the influences of translations, scales, and rotations for shape feature-based analysis. Such influences can be summarized as similarity transformations, which do not change the shape feature. The point cloud representation in KSS is invariant to similarity transformations. We utilize such...
Mesh reconstruction from a 3D point cloud is an important topic in the fields of computer graphics, computer vision, and multimedia analysis. In this paper, we propose a voxel structure-based mesh reconstruction framework. It provides an intrinsic metric to improve the accuracy of local region detection. Based on the detected local regions, an initial reconstructed mesh can be obtained. With the mesh optimization in our framework, the initial mesh is optimized into an isotropic one with geometric features such as external and internal edges. The experimental results indicate that...
A point cloud, as an information-intensive 3D representation, usually requires a large amount of transmission, storage, and computing resources, which seriously hinders its usage in many emerging fields. In this paper, we propose a novel point cloud simplification method, Approximate Intrinsic Voxel Structure (AIVS), to meet the diverse demands of real-world application scenarios. The method includes point cloud pre-processing (denoising and down-sampling), AIVS-based realization for isotropic simplification, and flexible simplification with intrinsic control...
Camouflaged object detection (COD) is an important yet challenging task, with great application value in industrial defect detection, medical care, etc. The challenges mainly come from the high intrinsic similarities between target objects and the background. In this paper, inspired by biological studies showing that object detection consists of two steps, i.e., search and identification, we propose a novel framework, named DCNet, for accurate COD. DCNet explores candidate objects and extra object-related edges through constraints...
Reconstructing 3D models from a single image remains a challenge in computer graphics and vision, especially when dealing with free-hand sketches. Fragmented strokes and distorted lines often introduce ambiguities, leading to reconstructions that deviate from the intended shape. Moreover, variations in sketching styles frequently result in incomplete object representations. To address the above challenges, we present a sketch-orientated autoencoder, SkFC-AE, for high-quality voxelized 3D model reconstruction. Our approach features...
With the rapid development of 3D scanning technology, point cloud based research and applications are becoming more popular. However, major difficulties still exist which affect the performance of point cloud utilization. Such difficulties include the lack of local adjacency information, non-uniform point density, and the control of point numbers. In this paper, we propose a two-step intrinsic and isotropic (I&I) resampling framework to address these three difficulties. The efficient intrinsic control provides geodesic measurement for a point cloud to improve local region detection...
Blind image quality assessment (BIQA) aims to predict the perceptual quality of an image without any reference information. However, known methods have considerable room for performance improvement due to limited efforts in distortion knowledge usage. This paper proposes a novel multitask learning based BIQA method termed KGANet, which takes distortion classification as an auxiliary task and uses the knowledge learned from the auxiliary task to assist accurate quality prediction. Different from existing CNN-based methods, KGANet adopts a transformer as the backbone for feature...
Personalized image aesthetics assessment (IAA) aims to estimate aesthetic experiences of images subject to the preferences of individual users, contrary to generic IAA, which estimates average aesthetic preferences. Most existing personalized IAA methods treat personalized preferences as deviations from a generic aesthetic experience, and therefore, personalized models are designed to build upon prior knowledge on generic IAA. However, we propose that acquiring generic prior knowledge is not necessary for building a personalized IAA model. Instead of modeling on the basis of generic IAA, this work proposes to directly model the interactions between image contents and user (i.e.,...
With recent developments and advances in distance learning and MOOCs, the amount of open educational videos on the Internet has grown dramatically in the past decade. However, most of these videos are lengthy and lack high-quality indexing and annotations, which triggers an urgent demand for efficient and effective tools that facilitate video content navigation and exploration. In this paper, we propose a novel visual navigation system for exploring open educational videos. The system tightly integrates multimodal cues obtained from the visual, audio, and textual channels...
The rapid development of distance education technologies, e.g. MOOCs, provides learners with unprecedented access to high-quality online lecture videos at scale, anytime and anywhere. Unfortunately, these valuable resources are often underutilized by learners. One prevailing reason is the lack of support for, and the resulting difficulty in, exploring and locating content of interest among lengthy recordings of course lectures. To address this deficiency, we introduce a novel visual interface that supports efficient search...
In this work, we investigate deep learning based solutions to blind quality assessment of stitched panoramic images (SPIs). The main problem we tackle is that the ground truth data is usually insufficient. As a result, the learned model can easily overfit to specific content. Because most distortions of SPIs lie within local regions, the problem cannot be alleviated by commonly used patch-wise training, which assumes that local quality equals global quality. We propose a multi-task learning strategy that encourages the representation to be less dependent on...
Due to the high dimensionality of point cloud data and the irregularity and complexity of its geometric structure, effective attribute compression of point clouds remains a very challenging task. Many recent efforts have focused on transforming point clouds into images and leveraging existing sophisticated image/video codecs to improve coding efficiency. However, how to synthesize coherent and correlation-preserving images is still inadequately addressed by existing studies, which hinders exploiting the merits of the well-developed image/video compression infrastructure. In this...
Objective: Biomedical videos as open educational resources (OERs) are increasingly proliferating on the Internet. Unfortunately, seeking personally valuable content from among the vast corpus of quality yet diverse OER videos is nontrivial due to limitations of today’s keyword- and content-based video retrieval techniques. To address this need, this study introduces a novel visual navigation system that facilitates users’ information seeking from biomedical OER videos in mass quantity by interactively offering visual and textual...
Given the rapid development witnessed by open educational resources (OER) in the past few decades, a considerable number of online educational videos have emerged on various MOOC platforms such as Coursera and YouTube. Nevertheless, most internet videos are lengthy and lack elaborate annotations, which poses a challenge for learners to explore and locate content of interest efficiently. To address this, we present an automatic note-generating method to establish correspondences between visual entities in slide-based lecture videos and their...
This paper proposes a new task of commonsense question generation, which aims to yield deep-level and to-the-point questions from the text. Their answers need to reason over disjoint relevant contexts and external commonsense knowledge, such as encyclopedic facts and causality. The knowledge may not be explicitly mentioned in the text but is used by most humans for problem-solving. Such complex reasoning with hidden contexts involves deep semantic understanding. Thus, this task has great application value, such as making high-quality...
With the increasing popularity of open educational resources in the past few decades, more and more users watch online videos to gain knowledge. However, most educational videos only provide monotonous navigation tools and lack elaborate annotations. This makes the task of locating interesting content time-consuming. To address this limitation, in this article, we propose a slide-based video navigation tool that is able to extract the hierarchical structure and semantic relationships of visual entities in videos by integrating multichannel information. Features...
Thumbnails provide an efficient way to perceive video content and give online viewers the instant gratification of making relevance judgements. In this paper, we propose an automatic approach to generate a magazine-cover-like thumbnail using the salient visual and textual metadata extracted from the video. Compared with a traditional snapshot, the synthesized thumbnail is more informative and attractive, which would be helpful for video selection.
As an effective and efficient way to graphically portray ideas and concepts, free-hand sketches are playing an increasingly significant role in today's image retrieval systems. There are many factors that contribute to a successful sketch-based image retrieval (SBIR) system, but perhaps the most essential ones are the design of powerful feature descriptors and perceptual similarity metrics in the feature space. However, the painting experiences and styles of users vary hugely, which poses a great challenge for consistently accurate sketch representation and matching. We...
In this paper, we introduce a novel 3D shape reconstruction method from a single-view sketch image based on a deep neural network. The proposed pipeline is mainly composed of three modules. The first module is sketch component segmentation based on multi-modal DNN fusion and is used to segment a given sketch into a series of basic units and build a transformation template by the knots between them. The second module is a non-linear transformation network for multifarious sketch generation with the obtained transformation template. It creates representation extracting features an input samples....
Thumbnails play a vital role in boosting the discovery and viewership of online videos. Although a thumbnail can be easily obtained by simply selecting a certain image from the video sequence, most popular videos on today's video sharing platforms come with elaborately designed custom thumbnails to better showcase the highlights within the video. Unfortunately, both the selection of salient content from thousands of frames and the creation of an eye-catching thumbnail are very time-consuming and require highly specialized skills. In this paper, we present a fully...