Bin Wang

ORCID: 0000-0002-4198-8823
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Human Pose and Action Recognition
  • Image Retrieval and Classification Techniques
  • Gait Recognition and Analysis
  • Video Surveillance and Tracking Methods
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Advanced Vision and Imaging
  • Anomaly Detection Techniques and Applications
  • Remote-Sensing Image Classification
  • Text and Document Classification Technologies
  • Robotics and Sensor-Based Localization
  • Power Systems Fault Detection
  • Advanced Clustering Algorithms Research
  • Context-Aware Activity Recognition Systems
  • Image Enhancement Techniques
  • Microgrid Control and Optimization
  • Face and Expression Recognition
  • Hand Gesture Recognition Systems
  • Graph Theory and Algorithms
  • Optimal Power Flow Distribution
  • Video Analysis and Summarization
  • Robotic Path Planning Algorithms
  • Handwritten Text Recognition Techniques
  • Electromagnetic Wave Absorption Materials

Tsinghua University
2018-2025

Institute for Infocomm Research
2024

Agency for Science, Technology and Research
2024

Nanjing University of Finance and Economics
2024

Shanghai Normal University
2014-2023

National University of Defense Technology
2012-2023

China Aerodynamics Research and Development Center
2023

Xi'an University of Technology
2022

China United Network Communications Group (China)
2022

Guangzhou Electronic Technology (China)
2022

We introduce Perceiving Stroke-Semantic Context (PerSec), a new approach to self-supervised representation learning tailored for the Scene Text Recognition (STR) task. Considering that scene text images carry both visual and semantic properties, we equip PerSec with dual context perceivers that can contrast and learn latent representations from low-level stroke and high-level contextual spaces simultaneously via hierarchical contrastive learning on unlabeled image data. Experiments in un- and semi-supervised...

10.1609/aaai.v36i2.20062 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

Finding visually identical images in large image collections is important for many applications such as intellectual property protection and search result presentation. Several algorithms have been reported in the literature, but they are not suitable for large image collections. In this paper, a novel algorithm is proposed to handle this situation, in which each image is compactly represented by a hash code. To detect duplicate images, only the hash codes are required. In addition, a very efficient method is implemented to quickly group images with similar...

10.1109/icme.2006.262509 article EN 2006-07-01

Impedance matching modulation of electromagnetic wave (EMW) absorbers toward a broad effective absorption bandwidth (EAB) is the ultimate aim in EMW attenuation applications. Here, a Joule heating strategy is reported for the preparation of a Co‐loaded carbon (Co/C) absorber with tunable impedance characteristics. Typically, the size of Co can be regulated to range from single atoms to clusters and nanocrystals. The varied Co sizes combined with different graphitization degrees result in relative input impedances...

10.1002/smll.202308970 article EN Small 2023-12-28

10.1561/116.00000177 article EN cc-by-nc APSIPA Transactions on Signal and Information Processing 2024-01-01

10.1016/j.compag.2024.108741 article EN Computers and Electronics in Agriculture 2024-02-21

Cross-modal hashing has attracted considerable attention for large-scale multimodal data. Recent supervised cross-modal hashing methods using multi-label networks utilize the semantics of multi-labels to enhance retrieval accuracy, where label hash codes are learned independently. However, all these methods assume that the annotations reliably reflect the relevance between their corresponding instances, which is not true in real applications. In this paper, we propose a novel framework called Bidirectional...

10.1609/aaai.v36i9.21268 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

Early activity prediction, which aims to recognize class labels before actions are fully performed, is a very challenging task since partially observed action sequences contain insufficient class-discrimination information, and thus many partial sequences belonging to different categories may look similar. Therefore, in this paper, we propose a novel guidance aware network (GA-Net) to boost the ability...

10.1109/tmm.2021.3137745 article EN IEEE Transactions on Multimedia 2021-12-23

Content-based image retrieval has been the most important technique for managing huge amounts of images. The fundamental yet highly challenging problem in this field is how to measure content-level similarity based on low-level features. The primary difficulties lie in the great variance within images, e.g. background, illumination, viewpoint and pose. Intuitively, an ideal similarity measure should be able to adapt to the data distribution, discover and highlight the content-level information, and be robust to those variances. Motivated by these observations, we...

10.3837/tiis.2014.08.019 article EN KSII Transactions on Internet and Information Systems 2014-08-29

We propose a Multiscale Locality‐Constrained Spatiotemporal Coding (MLSC) method to improve the traditional bag of features (BoF) algorithm, which ignores the spatiotemporal relationship of local features, for human action recognition in video. To model this relationship, MLSC involves the spatiotemporal position of local features in the coding processing. It projects local features into a sub space‐time‐volume (sub‐STV) and encodes them with locality‐constrained linear coding. A group of sub‐STV features obtained from one video with max‐pooling are used to classify this video. In...

10.1155/2013/405645 article EN cc-by The Scientific World JOURNAL 2013-01-01

Action recognition in video understanding is a challenging task, largely because of the complexity and difficulty of temporal modeling, making it suffer from motion information loss and misalignment of attention in spatial dimensions. To overcome these difficulties, we propose a novel temporal modeling method called Adjoint Enhancement Network (AE-Net), which can fully explore motion clues in time and long-range structure. The...

10.1109/tmm.2022.3193057 article EN IEEE Transactions on Multimedia 2022-07-21

The potential for the research of object tracking in computer vision has been well established, but previous object-tracking methods, which consider only continuous and smooth motion, are limited in handling abrupt motions. We introduce an efficient algorithm to tackle this limitation. A feature-driven (FD) motion model based on features from accelerated segment test (FAST) feature matching is proposed in the particle-filtering framework. Various evaluations have demonstrated that this motion model can improve...

10.1117/1.oe.51.4.047203 article EN Optical Engineering 2012-04-17

10.1007/s10044-023-01195-3 article EN Pattern Analysis and Applications 2023-09-27

In many practical engineering applications, the number of actions that have been finished should be known. Particularly for an untrimmed video sequence that includes an event with a series of actions, it is important to know the number of actions that have been finished. In this paper, we term this process visual event progress estimation (EPE). However, research related to this problem is scarce in the community. To solve this problem, a human action analysis-based framework, namely one-shot simultaneous action detection and identification (SADI)-EPE, is presented in this paper. The EPE...

10.1109/tcsvt.2018.2847305 article EN IEEE Transactions on Circuits and Systems for Video Technology 2018-06-14

This paper presents a low bit-rate MDCT coder, which is adopted as part of the recently standardized codec for Enhanced Voice Services. To maximize the performance for NB to SWB input signals at low bit-rates (7.2 to 16.4 kbps), new adaptive bit-allocation and spectrum quantization schemes, which emphasize the perceptually important spectrum while efficiently coding the full spectrum, were introduced into the coder. Further, small symbol switched Huffman coding is exploited for reducing the bit consumption for quantizing the band energies of the spectrum. Finally,...

10.1109/icassp.2015.7179100 article EN 2015-04-01

With the development of renewable energy, inverter-based resources (IBRs) in power grids are in rapid development, where grid-following control (GFL) is widely used. Meanwhile, grid-forming control-based (GFM) devices are receiving more attention and have been installed in the grid to provide active support for frequency and voltage. In future power systems with high penetration of or 100% IBRs, the combination of GFL and GFM will be a promising way to provide dynamic voltage support (DVS) and improve short-term voltage security (STVS) during short-circuit...

10.1109/aeees61147.2024.10545005 article EN 2022 4th Asia Energy and Electrical Engineering Symposium (AEEES) 2024-03-28

At present, many indoor scene classification algorithms are based on abstract feature descriptions and lack semantic interpretation for the classification results. For robot applications, these methods fail to reuse prior knowledge of environmental objects, resulting in a waste of computing resources, and are not conducive to interaction. In addition, a large number of existing methods have not been verified in applications because training and testing are performed on the same dataset. Here, an intermediate structure, called the object vector, is proposed...

10.1145/3512826.3512846 article EN 2022-01-11

Learning the content-level similarity over images is a fundamental problem in content-based image retrieval, and is highly challenging due to the variance within images. Such variance requires the similarity measure to be adaptive enough. In this paper, we propose a similarity learning approach based on probabilistic modeling of images. First, we derive a similarity measure, the free energy score space kernel (FESS kernel), from the probabilistic models. The FESS kernel is essentially a function of the observed data, hidden variables and model parameters, where the hidden variables are very informative but absent in previous...

10.1109/icip.2013.6738533 article EN 2013-09-01