Bing Su

ORCID: 0000-0001-8560-1910
Research Areas
  • Human Pose and Action Recognition
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Advanced Image and Video Retrieval Techniques
  • Gait Recognition and Analysis
  • Video Analysis and Summarization
  • Video Surveillance and Tracking Methods
  • Anomaly Detection Techniques and Applications
  • Image and Signal Denoising Methods
  • Face and Expression Recognition
  • COVID-19 diagnosis using AI
  • Gear and Bearing Dynamics Analysis
  • Image Retrieval and Classification Techniques
  • Handwritten Text Recognition Techniques
  • Time Series Analysis and Forecasting
  • Advanced Graph Theory Research
  • Music and Audio Processing
  • Optimization and Search Problems
  • Advanced Image Processing Techniques
  • Image Processing Techniques and Applications
  • Human Motion and Animation
  • Machine Learning and Algorithms
  • Hand Gesture Recognition Systems
  • Interconnection Networks and Systems
  • Digital Imaging for Blood Diseases

Renmin University of China
2020-2025

Beijing Institute of Big Data Research
2022-2024

Henan University of Science and Technology
2011-2022

Institute of Software
2016-2020

Chinese Academy of Sciences
2008-2019

Academia Sinica
2018

Tsinghua University
2010-2015

Xi'an Technological University
2012-2014

Xi'an Jiaotong University
2004-2008

Institute of Geographic Sciences and Natural Resources Research
2008

In long-term time series forecasting, most Transformer-based methods adopt the standard point-wise attention mechanism, which not only has high complexity but also cannot explicitly capture predictive dependencies from contexts, since the corresponding key and value are transformed from the same point. This paper proposes a model called Preformer. Preformer introduces a novel efficient Multi-Scale Segment-Correlation mechanism that divides time series into segments and utilizes segment-wise correlation-based attention to replace...

10.1109/icassp49357.2023.10096881 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Person Re-IDentification (P-RID), as an instance-level recognition problem, still remains challenging in the computer vision community. Many P-RID works aim to learn faithful and discriminative features/metrics from offline training data and directly use them for the unseen online testing data. However, their performance is largely limited due to the severe shifting issue between the training and testing data. Therefore, we propose a joint multi-metric adaptation model to adapt the learned models by learning a series of metrics for all...

10.1109/cvpr42600.2020.00298 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

How to estimate the uncertainty of a given model is a crucial problem. Current calibration techniques treat different classes equally and thus implicitly assume that the distribution of training data is balanced, but ignore the fact that real-world data often follows a long-tailed distribution. In this paper, we explore the problem of calibrating a model trained from a long-tailed distribution. Due to the difference between the imbalanced training distribution and the balanced test distribution, existing calibration methods such as temperature scaling cannot generalize well to this problem. Specific calibration methods for domain adaptation are...

10.1109/cvpr52729.2023.01913 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

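We present a new distance measure between sequences that can tackle local temporal distortion and periodic sequences with arbitrary starting points. Through viewing the instances of sequences as empirical samples of an unknown distribution, we cast the calculation of the distance between sequences as an optimal transport problem. To preserve the inherent temporal relationships of the instances in sequences, we smooth the optimal transport problem with two novel temporal regularization terms. The inverse difference moment regularization enforces transport with local homogeneous structures, and the KL-divergence with a prior distribution regularization prevents transport between instances with far temporal positions. We show that this problem can be...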

10.1109/cvpr.2017.310 article EN 2017-07-01

10.1016/j.colsurfa.2024.136061 article EN Colloids and Surfaces A Physicochemical and Engineering Aspects 2025-01-01

Single image defocus deblurring (SIDD) aims to restore an all-in-focus image from a defocused one. Distribution shifts in defocused images generally lead to performance degradation of existing methods during out-of-distribution inferences. In this work, we gauge the intrinsic reason behind the performance degradation, which is identified as the heterogeneity of lens-specific point spread functions. Empirical evidence supports this finding, motivating us to employ a continual test-time adaptation (CTTA) paradigm for SIDD. However, traditional...

10.48550/arxiv.2501.09052 preprint EN arXiv (Cornell University) 2025-01-15

Long-tailed learning has garnered increasing attention due to its practical significance. Among the various approaches, the fine-tuning paradigm has gained considerable interest with the advent of foundation models. However, most existing methods primarily focus on leveraging knowledge from these models, overlooking the inherent biases introduced by the imbalanced training data they rely on. In this paper, we examine how such imbalances from pre-training affect long-tailed downstream tasks. Specifically, we find...

10.48550/arxiv.2501.15955 preprint EN arXiv (Cornell University) 2025-01-27

10.1109/icassp49660.2025.10887833 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Since the observables at particular time instants in a temporal sequence exhibit dependencies, they are not independent samples. Thus, it is not plausible to apply i.i.d. assumption-based dimensionality reduction methods to sequence data. This paper presents a novel supervised dimensionality reduction approach for sequence data, called Linear Sequence Discriminant Analysis (LSDA). It learns a linear discriminative projection of the feature vectors in sequences to a lower-dimensional subspace by maximizing the separability of the sequence classes such that the entire sequences are holistically...

10.1109/tpami.2017.2665545 article EN publisher-specific-oa IEEE Transactions on Pattern Analysis and Machine Intelligence 2017-02-07

A meaningful video is semantically coherent and changes smoothly. However, most existing fine-grained video representation learning methods learn frame-wise features by aligning frames across videos or exploring relevance between multiple views, neglecting the inherent dynamic process of each video. In this paper, we propose to learn video representations by modeling Video as Stochastic Processes (VSP) via a novel process-based contrastive learning framework, which aims to discriminate between video processes and simultaneously capture...

10.1109/cvpr52729.2023.00221 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

Recognizing and segmenting actions from long videos is a challenging problem. Most existing methods focus on designing temporal convolutional models. However, these models are limited in their flexibility and ability to model long-term dependencies. Transformers have recently been used in various tasks, but the lack of inductive bias and the inefficiency of handling long video sequences limit the application of Transformers in action segmentation. In this paper, we present a pure Transformer-based model without temporal convolutions for action segmentation, called...

10.1109/icme55011.2023.00178 article EN 2023 IEEE International Conference on Multimedia and Expo (ICME) 2023-07-01

From the online point of view, we study the Canadian Traveller Problem (CTP), in which the traveller knows in advance the structure of the graph and the costs of all edges. However, some edges may fail, and the traveller only observes that upon reaching a vertex adjacent to the blocked edge. The goal is to find the least-cost route from the source O to the destination D, or more precisely, an adaptive strategy minimizing the competitive ratio, which compares the performance of this strategy with that of a hypothetical offline algorithm that knows the entire topology in advance. In this paper, we present two adaptive strategies: a...

10.1007/s10878-008-9156-y article EN cc-by-nc Journal of Combinatorial Optimization 2008-04-08

Dimensionality reduction for vectors in sequences is challenging since labels are attached to sequences as a whole. This paper presents a model-based dimensionality reduction method for vector sequences, namely linear sequence discriminant analysis (LSDA), which attempts to find a subspace in which sequences of the same class are projected together while those of different classes are projected as far apart as possible. For each sequence class, an HMM is built, from which the statistics of its states are extracted. Means of these states are linked in order to form a mean sequence, and the variance of the sequence class is defined as the sum of all variances...

10.1109/iccv.2013.115 article EN 2013-12-01

We present new distance measures between sequences that can tackle local temporal distortion and periodic sequences with arbitrary starting points. Through viewing the instances of each sequence as empirical samples of an unknown distribution, we cast the calculations of the distances between sequences as optimal transport problems. To preserve the inherent temporal relationships of the instances in sequences, we propose two methods through incorporating the temporal information into the spatial ground metric and concentrating the transport with novel regularization terms, respectively. The inverse...

10.1109/tpami.2018.2870154 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2018-09-14

Unsupervised domain adaptation (UDA) requires source domain samples with clean ground truth labels during training. Accurately labeling a large number of source domain samples is time-consuming and laborious. An alternative is to utilize samples with noisy labels for training. However, training with noisy labels can greatly reduce the performance of UDA. In this paper, we address the problem of learning UDA models with access only to noisy labels and propose a novel method called robust local preserving and global aligning network (RLPGA). RLPGA improves the robustness to label noise from two aspects. One...

10.1109/tkde.2021.3112815 article EN IEEE Transactions on Knowledge and Data Engineering 2021-01-01

Many discriminant analysis methods such as LDA and HLDA actually maximize the average pairwise distances between classes, which often causes the class separation problem. Max-min distance analysis (MMDA) addresses this problem by maximizing the minimum pairwise distance in the latent subspace, but it is developed under the homoscedastic assumption. This paper proposes Heteroscedastic MMDA (HMMDA) methods that explore the discriminative information in the difference of intra-class scatters for dimensionality reduction. WHMMDA maximizes the minimal pairwise Chernoff...

10.1109/cvpr.2015.7299084 article EN 2015-06-01

Video moment localization aims to retrieve the target segment of an untrimmed video according to a natural language query. Weakly supervised methods have gained attention recently, as the precise temporal location of the target segment is not always available. However, one of the greatest challenges encountered by weakly supervised methods lies in the mismatch between the video and the language induced by the coarse temporal annotations. To refine the vision-language alignment, recent works contrast the cross-modality similarities driven by reconstructing masked queries between positive and negative...

10.1145/3581783.3612495 article EN 2023-10-26

Generally, the evolution of an action is not uniform across the video, but exhibits quite complex rhythms and non-stationary dynamics. To model such non-uniform temporal dynamics, in this paper, we describe a novel hierarchical dynamic parsing and encoding method to capture both the locally smooth dynamics and the globally drastic dynamic changes. It parses the dynamics of an action into different layers and encodes the multi-layer temporal information into a joint representation for action recognition. At the first layer, the action sequence is parsed in an unsupervised manner into several...

10.1109/tip.2017.2745212 article EN publisher-specific-oa IEEE Transactions on Image Processing 2017-08-25

HMM-based analytical methods have been widely used for Arabic handwriting recognition. A key factor influencing the performance of HMM-based systems is the features extracted from a sliding window. In this paper, we propose a novel baseline-independent feature set with a wider sliding window to directly capture contextual information. This feature set is a combination of center of mass based log-space distribution features and inverse percentile features. Center of mass based features use a normalized histogram to describe the distribution of foreground pixels in different directions and distances with...

10.1109/icdar.2013.253 article EN 2013-08-01