Binglu Wang

ORCID: 0000-0002-9266-4685
About
Research Areas
  • Multimodal Machine Learning Applications
  • Anomaly Detection Techniques and Applications
  • Advanced Image Processing Techniques
  • Advanced Image Fusion Techniques
  • Advanced Image and Video Retrieval Techniques
  • Human Pose and Action Recognition
  • Gaze Tracking and Assistive Technology
  • Image Enhancement Techniques
  • Advanced Neural Network Applications
  • Domain Adaptation and Few-Shot Learning
  • Video Surveillance and Tracking Methods
  • Visual Attention and Saliency Detection
  • Image and Signal Denoising Methods
  • Target Tracking and Data Fusion in Sensor Networks
  • Advanced Computing and Algorithms
  • Topic Modeling
  • Scientometrics and Bibliometrics Research
  • Remote-Sensing Image Classification
  • Advanced Vision and Imaging
  • Infrared Target Detection Methodologies
  • Hand Gesture Recognition Systems
  • Image Retrieval and Classification Techniques
  • Gaussian Processes and Bayesian Inference
  • COVID-19 diagnosis using AI
  • Tactile and Sensory Interactions

Central South University
2023-2025

Northwestern Polytechnical University
2020-2025

Beijing Institute of Technology
2023-2024

Xi'an University of Architecture and Technology
2022-2024

Wuhan Polytechnic University
2024

Third Xiangya Hospital
2024

Ocean University of China
2023

China Railway Group (China)
2022

Peking Union Medical College Hospital
2021

Chinese Academy of Medical Sciences & Peking Union Medical College
2021

The combination of infrared and visible videos aims to gather more comprehensive feature information from multiple sources and to achieve superior results on various practical tasks, such as detection and segmentation, over those of a single modality. However, most existing dual-modality object detection algorithms ignore the modal differences and fail to consider the correlation between feature extraction and fusion, which leads to incomplete and inadequate fusion features. Hence, an issue arises: how to preserve the unique features of each modality while fully utilizing...

10.1109/tcsvt.2023.3289142 article EN IEEE Transactions on Circuits and Systems for Video Technology 2023-06-26

Object detection is a fundamental task in remote sensing image processing; as one of its core components, small or tiny object detection plays an important role. Despite considerable advancements achieved with the integration of CNN and transformer networks, there remains untapped potential for enhancing the extraction and utilization of information associated with small objects. Particularly within transformer structures, this arises from a disregard for the complex and intertwined interplay between spatial context and channel information during global modeling...

10.3390/rs15163970 article EN cc-by Remote Sensing 2023-08-10

This article presents U2PNet, a novel unsupervised underwater image restoration network using polarization for improving the signal-to-noise ratio and image quality in underwater imaging environments. Traditional methods require specific cues or paired datasets, which limits their practical applications. Our proposed method requires only one mosaicked polarized image of the scene and needs no datasets, pretraining, or cues. We design two subnetworks (T-net and B∞-net) to accurately estimate the transmission map...
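As background for the transmission-map and veiling-light estimates mentioned above, the snippet below sketches the standard polarization-era underwater image formation model and its inversion. This is a generic illustration, not U2PNet itself: the uniform transmission map `T` and scalar `B_inf` are fixed placeholders standing in for what the learned T-net and B∞-net would estimate.

```python
import numpy as np

# Hypothetical inputs: a toy scene, an assumed transmission map T, and an
# assumed veiling (background) light B_inf. In practice these estimates would
# come from learned subnetworks; here they are fixed placeholders.
rng = np.random.default_rng(0)
clean = rng.uniform(0.2, 0.8, size=(4, 4))   # "true" scene radiance (toy)
T = np.full((4, 4), 0.6)                     # transmission map estimate
B_inf = 0.9                                  # veiling light estimate

# Forward degradation model: I = L * T + B_inf * (1 - T)
degraded = clean * T + B_inf * (1.0 - T)

# Inverting the model recovers the scene radiance when T and B_inf are known.
restored = (degraded - B_inf * (1.0 - T)) / np.clip(T, 1e-6, None)

print(np.allclose(restored, clean))  # True: exact inversion with known T, B_inf
```

The restoration quality therefore hinges entirely on how well `T` and `B_inf` are estimated, which is why the paper devotes one subnetwork to each.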

10.1109/tcyb.2024.3365693 article EN IEEE Transactions on Cybernetics 2024-02-29

Super-resolution neural networks have recently achieved great progress in restoring high-quality remote sensing images at low zoom-in magnitudes. However, these networks often struggle with challenges like shape distortion and blurring effects due to the severe absence of structure and texture details in large-factor image super-resolution. To address these challenges, we propose a novel Two-Stage Spatial-Frequency Joint Learning Network (TSFNet). TSFNet innovatively merges insights from both the spatial and frequency...
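To make the spatial-frequency idea concrete, here is a minimal sketch (not the TSFNet architecture) of the kind of decomposition a frequency branch can operate on: an FFT mask splits an image into complementary low- and high-frequency parts, the latter carrying the structure and texture detail that large-factor super-resolution tends to lose. The function name and mask radius are assumptions for illustration.

```python
import numpy as np

def split_frequencies(img, radius):
    """Return (low, high) frequency components of a 2D image via an FFT mask."""
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    # Circular low-pass mask centered on the DC component.
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
    high = np.real(np.fft.ifft2(np.fft.ifftshift(F * ~mask)))
    return low, high

img = np.arange(64, dtype=float).reshape(8, 8)
low, high = split_frequencies(img, radius=2)
# The two branches are complementary: they sum back to the input exactly.
print(np.allclose(low + high, img))  # True
```

A spatial branch would then refine `low`-like content while a frequency-aware branch restores `high`-like detail, before the two are merged.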

10.1109/tgrs.2024.3357173 article EN IEEE Transactions on Geoscience and Remote Sensing 2024-01-01

We propose a novel graph Laplacian-guided coupled tensor decomposition (gLGCTD) model for the fusion of a hyperspectral image (HSI) and a multispectral image (MSI) for spatial and spectral resolution enhancement. The Tucker decomposition is employed to capture the global interdependencies across different modes and fully exploit the intrinsic spatial-spectral information. To preserve local characteristics, the complementary submanifold structures embedded in the high-resolution (HR)-HSI are encoded by graph Laplacian regularizations...
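The graph Laplacian regularization mentioned above can be illustrated with a toy sketch (this is not the gLGCTD model): for a similarity graph with affinity matrix W, the quadratic form xᵀLx with L = D − W equals the sum of squared differences of the signal across graph edges, so minimizing it encourages smoothness along the local manifold structure. The chain graph and signal below are invented for illustration.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W of a symmetric affinity matrix."""
    return np.diag(W.sum(axis=1)) - W

# Four samples on a chain 0-1-2-3 (assumed affinities, all edge weights 1).
W = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
L = graph_laplacian(W)
x = np.array([1.0, 2.0, 4.0, 7.0])  # a signal living on the graph nodes

# x' L x = sum over edges of (x_i - x_j)^2 = (1-2)^2 + (2-4)^2 + (4-7)^2.
print(x @ L @ x)  # 14.0
```

In the fusion setting, such a penalty on the decomposition factors ties neighboring pixels or spectra together while the Tucker core captures the global structure.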

10.1109/tgrs.2020.2992788 article EN IEEE Transactions on Geoscience and Remote Sensing 2020-05-18

This paper presents a mosaic convolution-attention network (MCAN) for demosaicing spectral images captured using multispectral filter array (MSFA) imaging sensors. MSFA-based systems acquire the spectral information of a scene in a single snapshot operation, and a complete spectral image is reconstructed from a raw mosaicked image. To avoid aliasing and artifacts in demosaicing, we utilize the joint spatial-spectral correlation of the raw image. The proposed MCAN includes a mosaic convolution module (MCM) and a mosaic attention module (MAM). The MCM extracts features via learning...
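To clarify what the raw MSFA input looks like, the sketch below simulates a snapshot mosaic (this is not the MCAN network): with an assumed 2x2 filter array, each pixel samples exactly one of four bands, so every band is observed at only a quarter of the pixels and demosaicking must fill in the rest.

```python
import numpy as np

pattern = np.array([[0, 1],
                    [2, 3]])              # assumed 2x2 MSFA band layout
h, w = 4, 4
band_of_pixel = np.tile(pattern, (h // 2, w // 2))

# Toy 4-band scene with strictly positive values (so zeros mean "unobserved").
cube = np.arange(1, 1 + 4 * h * w, dtype=float).reshape(4, h, w)

# Snapshot acquisition: each pixel records only its own band's value.
raw = np.take_along_axis(cube, band_of_pixel[None], axis=0)[0]

# Scatter the raw measurements into a sparse cube: each band keeps only the
# pixels where the MSFA actually sampled it.
sparse = np.where(band_of_pixel[None] == np.arange(4)[:, None, None], raw, 0.0)
print((sparse != 0).mean())  # 0.25: one quarter of each band is observed
```

The joint spatial-spectral correlation the paper exploits comes from the fact that the missing 75% of each band is strongly correlated with both its observed neighbors and the other bands at the same pixel.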

10.1109/tci.2021.3102052 article EN IEEE Transactions on Computational Imaging 2021-01-01

Weakly supervised object detection (WSOD) has recently attracted much attention in the field of remote sensing, where only image-level labels that indicate the existence of an object in images are required. However, existing methods frequently treat the most discriminative area as the optimal solution and, meanwhile, ignore the fact that more than one instance of a certain class may exist in a remote sensing image (RSI). To address this issue, we propose a unique multiple instance graph (MIG) learning framework for WSOD in RSIs. The motivation of this work is...
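The multiple-instance setup underlying WSOD can be sketched in a few lines (this is a bare-bones illustration, not the MIG framework): with only an image-level label available, per-proposal class scores are pooled so the image-level prediction reflects the best-matching proposal, which is exactly why naive max pooling latches onto the single most discriminative area and misses other instances. Names and sizes below are invented.

```python
import numpy as np

rng = np.random.default_rng(3)
proposal_scores = rng.uniform(size=(6, 4))     # 6 region proposals x 4 classes (toy)

# Image-level prediction via max pooling over proposals: trainable with only
# image-level labels, but credits exactly one proposal per class.
image_scores = proposal_scores.max(axis=0)
top_proposal = proposal_scores.argmax(axis=0)  # the one proposal that "wins" per class

print(image_scores.shape, top_proposal.shape)  # (4,) (4,)
```

Relating proposals to each other (e.g., through a graph over instances, as in the paper) is one way to spread credit beyond that single winning region.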

10.1109/tgrs.2021.3123231 article EN IEEE Transactions on Geoscience and Remote Sensing 2021-10-26

Gaze object prediction is a newly proposed task that aims to discover the objects being stared at by humans. It is of great application significance but still lacks a unified solution framework. An intuitive solution is to incorporate an object detection branch into an existing gaze prediction method. However, previous methods usually use two different networks to extract features from the scene image and the head image, which would lead to a heavy network architecture and prevent joint optimization of each branch. In this paper, we build a novel framework named...

10.1109/cvpr52688.2022.01898 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Expert feedback lays the foundation of rigorous research. However, the rapid growth of scholarly production and intricate knowledge specialization challenge conventional scientific feedback mechanisms. High-quality peer reviews are increasingly difficult to obtain. Researchers who are more junior or from under-resourced settings have an especially hard time getting timely feedback. With the breakthrough of large language models (LLMs) such as GPT-4, there is growing interest in using LLMs to generate feedback on research...

10.48550/arxiv.2310.01783 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Temporal action localization research aims to discover action instances from untrimmed videos, representing a fundamental step in the field of intelligent video understanding. With the advent of deep learning, backbone networks have been instrumental in providing representative spatiotemporal features, while the end-to-end learning paradigm has enabled the development of high-quality models through data-driven training. Both fully supervised and weakly supervised approaches have contributed to the rapid progress of temporal action localization, resulting...

10.1109/tpami.2023.3330794 article EN cc-by-nc-nd IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-11-06

This paper presents CORE, a conceptually simple, effective and communication-efficient model for multi-agent cooperative perception. It addresses the task from a novel perspective of cooperative reconstruction, based on two key insights: 1) cooperating agents together provide a more holistic observation of the environment, and 2) the holistic observation can serve as valuable supervision to explicitly guide the model in learning how to reconstruct the ideal observation based on collaboration. CORE instantiates the idea with three major components: a compressor on each agent to create compact...

10.1109/iccv51070.2023.00800 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Recently, remote sensing image super-resolution (RSISR) has drawn considerable attention and made great breakthroughs based on convolutional neural networks (CNNs). Owing to the scale richness of texture and structural information that frequently recurs inside the same remote sensing image (RSI) but varies greatly across different RSIs, state-of-the-art CNN-based methods have begun to explore multiscale global features in RSIs by using attention mechanisms. However, they are still insufficient at mining significant content clues in RSIs. In...

10.1109/tgrs.2023.3283769 article EN cc-by IEEE Transactions on Geoscience and Remote Sensing 2023-01-01

The task of long-term action anticipation demands solutions that can effectively model temporal dynamics over extended periods while deeply understanding the inherent semantics of actions. Traditional approaches, which primarily rely on recurrent units or Transformer layers to capture long-term dependencies, often fall short in addressing these challenges. Large Language Models (LLMs), with their robust sequential modeling capabilities and extensive commonsense knowledge, present new opportunities for...

10.48550/arxiv.2501.00795 preprint EN arXiv (Cornell University) 2025-01-01

Modeling cross-video relationships is an important issue for the weakly supervised temporal action localization task. To this end, traditional methods operate at the video level and rely on complicated strategies to prepare triplet samples, which only mine relationships among three videos from two categories. In this work, we observe that action instances from different categories could exhibit similar motion patterns, i.e., sub-actions, and propose to elaborately explore cross-video relationships at sub-action granularity. However, given...

10.1109/tcsvt.2021.3089323 article EN IEEE Transactions on Circuits and Systems for Video Technology 2021-06-14

Temporal action localization aims at discovering action instances in untrimmed videos, where RGB and optical flow are two widely used feature modalities. Specifically, RGB chiefly reveals appearance, while optical flow mainly depicts motion. Given the two features, previous methods employ the early fusion or late fusion paradigm to mine the complementarity between them. By concatenating the raw features, early fusion is implicitly achieved by the network, but it partly discards the particularity of each modality. Late fusion independently maintains two branches to explore each modality and only fuses...
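The two fusion paradigms contrasted above can be sketched with toy per-snippet features (all names, shapes, and the linear classifiers are assumptions for illustration): early fusion concatenates the raw RGB and flow features and feeds one model, while late fusion keeps two independent branches and merges only their predictions.

```python
import numpy as np

rng = np.random.default_rng(1)
rgb = rng.standard_normal((10, 8))      # 10 snippets, 8-dim appearance features
flow = rng.standard_normal((10, 8))     # 10 snippets, 8-dim motion features
w_early = rng.standard_normal((16, 3))  # one classifier over the fused features
w_rgb = rng.standard_normal((8, 3))     # per-modality classifiers
w_flow = rng.standard_normal((8, 3))

# Early fusion: concatenate raw features; one network mines the complementarity,
# but modality-specific structure is entangled from the first layer on.
early_scores = np.concatenate([rgb, flow], axis=1) @ w_early

# Late fusion: each branch models its own modality; only predictions are merged,
# so cross-modal interaction happens solely at the score level.
late_scores = 0.5 * (rgb @ w_rgb + flow @ w_flow)

print(early_scores.shape, late_scores.shape)  # (10, 3) (10, 3)
```

The trade-off the abstract points at sits exactly between these extremes: early fusion risks discarding each modality's particularity, late fusion risks fusing too little, too late.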

10.1109/lsp.2021.3061289 article EN IEEE Signal Processing Letters 2021-01-01

Intent perception is a novel task that aims to understand the intention behind images. Regular classification methods usually perform unsatisfactorily on intent perception due to the semantic ambiguity problem, i.e., the intra-class variety problem, in which images of the same class may contain objects of different categories, and the inter-class confusion problem, in which images of different classes may contain objects of similar categories. To address this problem, this paper introduces prototype learning into intent perception and proposes a unified framework named PIP-Net to reduce the influence of semantic ambiguity. Specifically, for...
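As a generic sketch of the prototype-learning idea invoked above (PIP-Net's specifics are not reproduced here): each class keeps one or more prototype vectors in the embedding space, and an image feature is classified by its distance to every prototype, which makes the decision depend on class-level structure rather than individual object categories. The prototypes and feature below are invented toy values.

```python
import numpy as np

protos = np.array([[0.0, 0.0],
                   [4.0, 0.0],
                   [0.0, 4.0]])     # one prototype per class (toy embedding)
feature = np.array([3.5, 0.5])      # embedded image feature

d2 = ((feature - protos) ** 2).sum(axis=1)  # squared distance to each prototype
pred = int(np.argmin(d2))
print(pred)  # 1: the nearest prototype decides the class
```

In practice the prototypes are learned jointly with the feature extractor, and multiple prototypes per class can absorb the intra-class variety the abstract describes.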

10.1109/tmm.2023.3234817 article EN IEEE Transactions on Multimedia 2023-01-01

Denoising and demosaicking of long-wave infrared (LWIR) division-of-focal-plane (DoFP) polarization images are crucial for various vision applications. However, existing methods rely on the sequential application of individual denoising and demosaicking processes, which may result in the accumulation of errors produced by each process. To address this issue, we propose a joint denoising and demosaicking method for LWIR DoFP images based on a three-stage progressive deep convolutional neural network. To ensure the generalization ability of the network, it is essential to...

10.1109/tip.2023.3327590 article EN IEEE Transactions on Image Processing 2023-01-01

Many multi-view camera-based 3D object detection models transform image features into Bird's-Eye-View (BEV) representations via the Lift-Splat-Shoot (LSS) mechanism, which "lifts" 2D camera-view features to voxel space based on a predicted depth distribution and then "splats" them onto a BEV plane for subsequent detection. However, the BEV feature quality in such a one-stage view transformation scheme heavily relies on the quality of the image features, which further determines the final detection performance. In this paper, we propose BEVRefiner, a model that performs dual refinement of both...
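The "lift" step of LSS mentioned above reduces to a per-pixel outer product, sketched below with toy sizes (this illustrates the general LSS mechanism, not BEVRefiner): each 2D pixel feature is spread along D candidate depths, weighted by a softmaxed depth distribution, yielding a frustum of 3D features that is later "splatted" onto the BEV plane.

```python
import numpy as np

rng = np.random.default_rng(2)
H, W, C, D = 2, 3, 4, 5
feat = rng.standard_normal((H, W, C))          # 2D camera-view features
depth_logits = rng.standard_normal((H, W, D))
depth_prob = np.exp(depth_logits)
depth_prob /= depth_prob.sum(axis=-1, keepdims=True)  # softmax over depth bins

# Lift: frustum[h, w, d, c] = depth_prob[h, w, d] * feat[h, w, c]
frustum = depth_prob[..., None] * feat[..., None, :]
print(frustum.shape)  # (2, 3, 5, 4)

# Since the depth distribution sums to 1, summing over depth recovers the
# original 2D feature: the lift only redistributes it along the ray.
print(np.allclose(frustum.sum(axis=2), feat))  # True
```

This also makes the paper's point visible: any error in `feat` or `depth_prob` propagates directly into every voxel of the frustum, so refining both sides pays off.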

10.1109/tits.2024.3394550 article EN IEEE Transactions on Intelligent Transportation Systems 2024-05-10