Zhenghui Hu

ORCID: 0000-0002-6106-0416
Research Areas
  • Advanced Neural Network Applications
  • Visual Attention and Saliency Detection
  • Advanced Image and Video Retrieval Techniques
  • Video Surveillance and Tracking Methods
  • Robotics and Sensor-Based Localization
  • Music Technology and Sound Studies
  • Music and Audio Processing
  • Optical measurement and interference techniques
  • Smart Agriculture and AI
  • Remote-Sensing Image Classification
  • Domain Adaptation and Few-Shot Learning
  • Image and Object Detection Techniques
  • Retinal Imaging and Analysis
  • Image Processing and 3D Reconstruction
  • Medical Image Segmentation Techniques
  • Image and Video Quality Assessment
  • Anomaly Detection Techniques and Applications
  • Cardiovascular Health and Disease Prevention
  • Electrical and Bioimpedance Tomography
  • Speech and Audio Processing
  • Remote Sensing and Land Use
  • Magnetic Field Sensors Techniques
  • Analog and Mixed-Signal Circuit Design
  • Color Science and Applications
  • IoT Networks and Protocols

Beihang University
2022-2024

State Key Laboratory of Virtual Reality Technology and Systems
2024

Shandong University
2024

Hangzhou Dianzi University
2024

Depth images and thermal images contain spatial geometry information and surface temperature information, which can act as complementary information for the RGB modality. However, the quality of depth images is often unreliable in some challenging scenarios, which results in performance degradation of two-modal salient object detection (SOD). Meanwhile, researchers pay attention to the triple-modal SOD task, namely visible-depth-thermal (VDT) SOD, where they attempt to explore the complementarity of the RGB image, depth image, and thermal image. However, existing methods fail...

10.1109/tip.2024.3393365 article EN IEEE Transactions on Image Processing 2024-01-01

Detecting objects from aerial images poses significant challenges due to the following factors: 1) Aerial images typically have very large sizes, generally with millions or even hundreds of millions of pixels, while computational resources are limited. 2) Small object size leads to insufficient information for effective detection. 3) Non-uniform object distribution leads to computational resource wastage. To address these issues, we propose YOLC (You Only Look Clusters), an efficient and effective framework that builds on an anchor-free detector,...

10.1109/tits.2024.3386928 article EN IEEE Transactions on Intelligent Transportation Systems 2024-04-23

Object counting, which aims to count the accurate number of object instances in images, has been attracting more and more attention. However, challenges such as large-scale variation, complex background interference, and nonuniform density distribution greatly limit counting accuracy, and are particularly striking in remote-sensing imagery. To mitigate the above issues, this article proposes a novel framework for dense object counting that incorporates a pyramidal scale module (PSM) and a global context module (GCM), dubbed PSGCNet, where PSM is used...

10.1109/tgrs.2022.3153946 article EN IEEE Transactions on Geoscience and Remote Sensing 2022-01-01

Human pose estimation and tracking are fundamental tasks for understanding human behaviors in videos. Existing top-down framework-based methods usually perform three-stage tasks: human detection, pose estimation, and tracking. Although promising results have been achieved, these methods rely heavily on high-performance detectors and may fail to track persons who are occluded or miss-detected. To overcome these problems, in this paper, we develop a novel keypoint confidence network and a tracking pipeline to improve detection in top-down approaches. Specifically, the network is...

10.1109/tmm.2023.3330532 article EN IEEE Transactions on Multimedia 2023-11-06

A coarse-to-fine multi-view stereo network with Transformer (MVS-T) is proposed to solve the problems of sparse point clouds and low accuracy in reconstructing 3D scenes from low-resolution images. The network uses a coarse-to-fine strategy to estimate the depth of the image progressively and reconstruct the 3D point cloud. First, pyramids of image features are constructed to transfer semantic and spatial information among features at different scales. Then, the Transformer module is employed to aggregate the image's global context and capture the internal correlation of the feature map. Finally, the depth is inferred by...

10.3390/s22197659 article EN cc-by Sensors 2022-10-09

Detecting objects from aerial images poses significant challenges due to the following factors: 1) Aerial images typically have very large sizes, generally with millions or even hundreds of millions of pixels, while computational resources are limited. 2) Small object size leads to insufficient information for effective detection. 3) Non-uniform object distribution leads to computational resource wastage. To address these issues, we propose YOLC (You Only Look Clusters), an efficient and effective framework that builds on an anchor-free detector,...

10.48550/arxiv.2404.06180 preprint EN arXiv (Cornell University) 2024-04-09

Topological building extraction in remote sensing images is vital for city planning, disaster assessment, and other real-world applications. To meet the requirements of these applications, existing approaches predict topological buildings by vectorization of binary masks using multiple refinement stages, leading to complex methodology and poor generalization. To tackle this issue, we propose an approach that directly predicts the serialized vertices of each building instance. We observe that the order of vertices from one instance is inherently bidirectional,...

10.1109/jstars.2024.3399251 article EN cc-by-nc-nd IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2024-01-01

Although multi-view 3D object detection based on the Bird's-Eye-View (BEV) paradigm has garnered widespread attention as an economical and deployment-friendly perception solution for autonomous driving, there is still a performance gap compared to LiDAR-based methods. In recent years, several cross-modal distillation methods have been proposed to transfer beneficial information from teacher models to student models, with the aim of enhancing performance. However, these methods face challenges due...

10.48550/arxiv.2407.10135 preprint EN arXiv (Cornell University) 2024-07-14

Unsupervised domain adaptation (UDA) is attracting more attention from researchers for boosting task-specific generalization on the target domain. It focuses on addressing the domain shift between the labeled source domain and the unlabeled target domain. Recent biclassifier-based UDA models perform category-level alignment to reduce the domain shift; meanwhile, self-training is used for improving the discriminability of target instances. However, the error accumulation problem of instances with high semantic uncertainty may cause degradation and misalignment. To solve this...

10.1109/tnnls.2024.3431283 article EN IEEE Transactions on Neural Networks and Learning Systems 2024-01-01

10.1109/cvpr52733.2024.01608 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Existing symbolic music generation methods usually utilize a discriminator to improve the quality of generated music via global perception of the music. However, considering the complexity of information in music, such as rhythm and melody, a single discriminator cannot fully reflect the differences in these two primary dimensions of music. In this work, we propose to decouple the melody and rhythm from music and design corresponding fine-grained discriminators to tackle the aforementioned issues. Specifically, equipped with a pitch augmentation strategy, the melody discriminator discerns variations...

10.48550/arxiv.2408.01696 preprint EN arXiv (Cornell University) 2024-08-03

Given a set of 2D scattered points from an edge detection operator, the aim of ellipse fitting is to construct an elliptic equation that best fits the observations. The collected data often contain noise, uncertainty, and incompleteness, which constitutes a considerable challenge for all algorithms. To address this issue, a method of direct ellipse fitting by minimizing the L0 algebraic distance is presented. Unlike its L2 counterparts, which assume the error follows a Gaussian distribution, our method tries to model the outliers using the L0 norm between the ideal...

10.1109/iccece58074.2023.10135531 article EN 2023-01-06

Seed sorting is critical for the breeding industry to improve agricultural yield. Seed sorting methods based on convolutional neural networks (CNNs) have achieved excellent recognition accuracy with large-scale pretrained network models. However, CNN inference is a computationally intensive process that often requires hardware acceleration to operate in real time. For embedded devices, the high power consumption of graphics processing units (GPUs) is generally prohibitive, and the field programmable gate array (FPGA)...

10.1155/2022/5608573 article EN cc-by Journal of Electrical and Computer Engineering 2022-11-17

Semantic segmentation is the process of categorizing all pixels in an image. Given the inherent challenges of attaining fine labels, researchers have recently embraced weak labels to mitigate the annotation burden of segmentation. Current work on weakly supervised semantic segmentation (WSSS) mainly focuses on expanding the pseudo-label seeds in the salient regions of the image, but there are also many objects outside the salient area that have not been discovered. In this work, we propose an innovative WSSS method by exploring non-significant areas...

10.1145/3633637.3633697 article EN 2023-10-27

The biological model of mammalian visual mechanisms is very beneficial to feature learning in motionless images. It has been proved that it can improve the performance of hand-crafted methods and CNN methods. Recently, CNNs learn discriminative and robust features by changing the backbone, processing multi-scale feature maps, adding attention mechanisms, etc. However, they fall relatively short of network structures modeled on the human retina, which have been proven to have a strong feature extraction capability in traditional image descriptors. To address this...

10.1145/3581807.3581869 article EN 2022-11-17

To achieve long-term economic growth, competitiveness, and sustainability, speed and accuracy are the key requirements when it comes to seed purity sorting. However, current seed sorting methods suffer from a large number of model parameters and high computational complexity, which make it a great challenge to deploy them in real-time applications, especially on devices with limited resources. To address the above problems, in this paper, a lightweight and efficient network with pyramid dilated convolution, namely LEPD-Net, is proposed for...

10.1145/3581807.3581888 article EN 2022-11-17