- Visual Attention and Saliency Detection
- Advanced Neural Network Applications
- EEG and Brain-Computer Interfaces
- Advanced Vision and Imaging
- Advanced Image and Video Retrieval Techniques
- Adversarial Robustness in Machine Learning
- Image Processing and 3D Reconstruction
- Human Pose and Action Recognition
- Quantum Information and Cryptography
- Domain Adaptation and Few-Shot Learning
- Image Processing Techniques and Applications
- Time Series Analysis and Forecasting
- Multimodal Machine Learning Applications
- Machine Learning and Data Classification
- CCD and CMOS Imaging Sensors
- Gait Recognition and Analysis
- Brain Tumor Detection and Classification
- Cell Image Analysis Techniques
- Context-Aware Activity Recognition Systems
- Video Analysis and Summarization
- Software Engineering Research
- Machine Learning in Materials Science
- Diabetic Foot Ulcer Assessment and Management
- Advanced Image Processing Techniques
- Software Testing and Debugging Techniques
Communication University of China
2022-2024
State Key Laboratory of Media Convergence and Communication
2024
Lingnan Normal University
2022
Chinese University of Hong Kong
2018-2021
Beijing Forestry University
2021
Huazhong University of Science and Technology
2019
University of Hong Kong
2017
Nanjing University of Aeronautics and Astronautics
2010-2011
As an essential problem in computer vision, salient object detection (SOD) has attracted an increasing amount of research attention over the years. Recent advances in SOD are predominantly led by deep learning-based solutions (named deep SOD). To enable an in-depth understanding of deep SOD, in this paper we provide a comprehensive survey covering various aspects, ranging from algorithm taxonomy to unsolved issues. In particular, we first review deep SOD algorithms from different perspectives, including network architecture, level...
Increasing numbers of explanatory variables tend to result in information redundancy and “dimensional disaster” in the quantitative remote sensing of forest aboveground biomass (AGB). Feature selection of model factors is an effective method for improving the accuracy of AGB estimates. Machine learning algorithms are also widely used in AGB estimation, although little research has addressed the use of the categorical boosting algorithm (CatBoost) for AGB estimation. Both feature selection and regression for AGB estimation models are typically performed with...
One unique property of time series is that the temporal relations are largely preserved after downsampling into two sub-sequences. By taking advantage of this property, we propose a novel neural network architecture that conducts sample convolution and interaction for temporal modeling and forecasting, named SCINet. Specifically, SCINet is a recursive downsample-convolve-interact architecture. In each layer, we use multiple convolutional filters to extract distinct yet valuable temporal features from the downsampled sub-sequences or...
This paper proposes a novel residual attentive learning network architecture for predicting dynamic eye-fixation maps. The proposed model emphasizes two essential issues, i.e., effective spatiotemporal feature integration and multi-scale saliency learning. For the first problem, appearance and motion streams are tightly coupled via dense residual cross connections, which integrate appearance information with multi-layer, comprehensive motion features in a residual and dense way. Beyond traditional two-stream models learning appearance and motion features separately, such design...
In this paper, we propose a two-stage fully 3D network, namely DeepFuse, to estimate human pose in 3D space by fusing body-worn Inertial Measurement Unit (IMU) data and multi-view images deeply. The first stage is designed for pure vision estimation. To preserve the data primitiveness of the multi-view inputs, the vision stage uses a multi-channel volume as the data representation and 3D soft-argmax as the activation layer. The second one is the IMU refinement stage, which introduces an IMU-bone layer to fuse the IMU and vision data earlier, at the data level. Without requiring a given skeleton model a priori,...
The human visual system can selectively attend to parts of a scene for quick perception, a biological mechanism known as Human attention. Inspired by this, recent deep learning models encode attention mechanisms to focus on the most task-relevant parts of the input signal for further processing, which is called Machine/Neural/Artificial attention. Understanding the relation between human...
The narrow annulus in small‐bore horizontal wells causes marked differences in cuttings transport compared to conventional wells. To address this issue, a CFD‐based numerical model for solid‐liquid two‐phase flow in the annulus was developed, accounting for the eccentricity of the drill string. The study examines the effects of key factors, including flow rate, drill pipe rotation speed, well inclination angle, and drilling fluid properties, on cuttings transport. Results show that increasing the rotation speed enhances the tangential and axial annular velocities by up...
The success of current deep saliency models heavily depends on large amounts of annotated human fixation data to fit the highly non-linear mapping between the stimuli and visual saliency. Such fully supervised data-driven approaches are annotation-intensive and often fail to consider the underlying mechanisms of visual attention. In contrast, in this paper, we introduce a model based on various cognitive theories of visual saliency, which learns visual attention patterns in a weakly supervised manner. Our approach incorporates insights from cognitive science as...
We present a method for synopsizing multiple videos captured by a set of surveillance cameras with some overlapping fields of view. Currently, object-based approaches that directly shift objects along the time axis are already able to compute compact synopsis results for multiple surveillance videos. The challenge is how to present the multiple synopsis results in a more compact and understandable way. Previous approaches show them side by side on the screen, which however is difficult for the user to comprehend. In this paper, we solve the problem by joint object-shifting and camera view-switching. Firstly,...
In this paper, we present an empirical study introducing a nuanced evaluation framework for text-to-image (T2I) generative models, applied to human image synthesis. Our framework categorizes evaluations into two distinct groups: the first focusing on image qualities such as aesthetics and realism, and the second examining text conditions through concept coverage and fairness. We introduce an innovative aesthetic score prediction model that assesses the visual appeal of generated images and unveils the first dataset marked with...
Adversarial examples (AEs) pose severe threats to the applications of deep neural networks (DNNs) to safety-critical domains, e.g., autonomous driving. While there has been a vast body of AE defense solutions, to the best of our knowledge, they all suffer from some weaknesses, e.g., defending against only a subset of AEs or causing a relatively high accuracy loss for legitimate inputs. Moreover, most existing solutions cannot defend against adaptive attacks, wherein attackers are knowledgeable about the defense mechanisms and craft...
Deep learning (DL) has achieved unprecedented success in a variety of tasks. However, DL systems are notoriously difficult to test and debug due to the lack of explainability of DL models and the huge test input space to cover. Generally speaking, it is relatively easy to collect a massive amount of test data, but the labeling cost can be quite high. Consequently, it is essential to conduct test selection and label only those selected "high quality" bug-revealing test inputs for test cost reduction. In this paper, we propose a novel test prioritization technique that brings...
The partially occluded image recognition (POIR) problem has been a challenge for artificial intelligence for a long time. A common strategy to handle the POIR problem is using the non-occluded features for classification. Unfortunately, this strategy will lose effectiveness when the image is severely occluded, since the visible parts can only provide limited information. Several studies in neuroscience reveal that feature restoration, which fills in the occluded information and is called amodal completion, is essential for human brains to recognize partially occluded images. However,...
In this paper, we propose a two-stage fully 3D network, namely DeepFuse, to estimate human pose in 3D space by fusing body-worn Inertial Measurement Unit (IMU) data and multi-view images deeply. The first stage is designed for pure vision estimation. To preserve the data primitiveness of the multi-view inputs, the vision stage uses a multi-channel volume as the data representation and 3D soft-argmax as the activation layer. The second one is the IMU refinement stage, which introduces an IMU-bone layer to fuse the IMU and vision data earlier, at the data level. Without requiring a given skeleton model...
We present DeepSAT, a novel end-to-end learning framework for the Boolean satisfiability (SAT) problem. Unlike existing solutions trained on random SAT instances with relatively weak supervision, we propose applying the knowledge of the well-developed electronic design automation (EDA) field for SAT solving. Specifically, we first resort to logic synthesis algorithms to pre-process SAT instances into optimized and-inverter graphs (AIGs). By doing so, the distribution diversity among various SAT instances can be dramatically reduced, which...
Sensitive and accurate determination of aflatoxin B1 (AFB1) in food samples is urgently required for food safety, environment monitoring and human health due to its high toxicity. Here, we constructed an aptasensor with two fluorescent signals for AFB1 by combining AIEgens (TPE-Z) as the fluorescent probe, the highly specific aptamer with the molecular label Cy5 as the recognition element, and GO as the fluorescence quencher. Upon addition of AFB1, the signal platform turns from "off" to "on" signals. The aptasensor provided high sensitivity (0.04 ng/mL), specificity and reliable...
The selective visual attention mechanism in the human visual system (HVS) restricts the amount of information to reach visual awareness for perceiving natural scenes, allowing near real-time information processing with limited computational capacity. This kind of selectivity acts as an ‘Information Bottleneck (IB)’, which seeks a trade-off between information compression and predictive accuracy. However, such constraints are rarely explored in deep neural networks (DNNs). In this paper, we propose an IB-inspired spatial attention module for DNN structures...
This paper describes a novel method of real-time virtual view generation for an augmented virtuality system. The aim is to rectify the displacement between the eye position and the video cameras which are attached on the head mounted display. It consists of three steps: Firstly, stereo calibration was used to generate the cameras' internal and external parameters, then the image planes were rectified to the canonical configuration state that brought them into alignment, with corresponding points on the same scan lines. At last, the images of the two virtual views were generated...
Nonlocal correlation represents the key feature of quantum mechanics, which is exploited as a resource in quantum information processing. However, the loophole issues hamper the practical applications. We report the first demonstration of steering nonlocality with the detection loophole closed at telecommunication wavelengths. In this endeavour, we design and fabricate a low-loss silicon chip for efficient entanglement generation, and further apply the direct modulation technique to its optical pump to eliminate the phase-encoding loss at the steering side....