Wenbo Li

ORCID: 0000-0002-3122-5400
Research Areas
  • Video Surveillance and Tracking Methods
  • Human Pose and Action Recognition
  • Generative Adversarial Networks and Image Synthesis
  • Topic Modeling
  • Advanced Image Processing Techniques
  • Natural Language Processing Techniques
  • Geological and Geochemical Analysis
  • Anomaly Detection Techniques and Applications
  • Geochemistry and Geologic Mapping
  • Advanced Image and Video Retrieval Techniques
  • Face Recognition and Analysis
  • Image Retrieval and Classification Techniques
  • Gait Recognition and Analysis
  • Visual Attention and Saliency Detection
  • Advanced Vision and Imaging
  • Multimodal Machine Learning Applications
  • Text and Document Classification Technologies
  • Earthquake and Tectonic Studies
  • Video Analysis and Summarization
  • Advanced Text Analysis Techniques
  • High-Pressure Geophysics and Materials
  • Medical Image Segmentation Techniques
  • Hand Gesture Recognition Systems
  • Image Enhancement Techniques
  • Primate Behavior and Ecology

Xidian University
2023-2025

Anhui University
2022-2025

Chinese Academy of Sciences
2007-2024

Changsha University of Science and Technology
2023-2024

Samsung (United States)
2020-2024

Research!America (United States)
2020-2024

Shenzhen Institutes of Advanced Technology
2024

Nazarbayev University
2024

Second Xiangya Hospital of Central South University
2023

Central South University
2023

The Visual Object Tracking challenge VOT2017 is the fifth annual tracker benchmarking activity organized by the VOT initiative. Results of 51 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or journals in recent years. The evaluation included the standard VOT and other popular methodologies and a new "real-time" experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. Performance of the tested trackers typically far exceeds standard baselines. The source...

10.1109/iccvw.2017.230 preprint EN 2017-10-01

In this paper, we propose Object-driven Attentive Generative Adversarial Networks (Obj-GANs) that allow attention-driven, multi-stage refinement for synthesizing complex images from text descriptions. With a novel object-driven attentive generative network, the Obj-GAN can synthesize salient objects by paying attention to their most relevant words in the text descriptions and their pre-generated class label. In addition, an object-wise discriminator based on the Fast R-CNN model is proposed to provide rich...

10.1109/cvpr.2019.01245 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Existing pose estimation approaches fall into two categories: single-stage and multi-stage methods. While multi-stage methods are seemingly more suited for the task, their performance in current practice is not as good as that of single-stage methods. This work studies this issue. We argue that the multi-stage methods' unsatisfactory performance comes from the insufficiency of various design choices. We propose several improvements, including the single-stage module design, cross stage feature aggregation, and coarse-to-fine supervision. The resulting method establishes the new state-of-the-art on...

10.48550/arxiv.1901.00148 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Multi-target tracking is an interesting but challenging task in the computer vision field. Most previous data association based methods merely consider the relationships (e.g. appearance and motion pattern similarities) between detections in a local, limited temporal domain, leading to difficulties in handling long-term occlusion and in distinguishing spatially close targets with similar appearance in crowded scenes. In this paper, a novel data association approach based on an undirected hierarchical relation hypergraph is proposed, which...

10.1109/cvpr.2014.167 article EN 2014 IEEE Conference on Computer Vision and Pattern Recognition 2014-06-01

In this work, we present the RNN Tree (RNN-T), an adaptive learning framework for skeleton based human action recognition. Our method categorizes action classes and uses multiple Recurrent Neural Networks (RNNs) in a tree-like hierarchy. The RNNs in RNN-T are co-trained with the action category hierarchy, which determines the structure of RNN-T. Actions in skeletal representations are recognized via a hierarchical inference process, during which individual RNNs differentiate finer-grained action classes with increasing confidence. Inference in RNN-T ends when any...

10.1109/iccv.2017.161 article EN 2017-10-01

Existing human action recognition systems for 3D sequences obtained from the depth camera are designed to cope with only one action category, either single-person action or two-person interaction, and are difficult to extend to scenarios where both action categories co-exist. In this paper, we propose the category-blind human action recognition method (CHARM), which can recognize a human action without making assumptions about the action category. In our CHARM approach, we represent a human action (either a single-person action or a two-person interaction) class using a co-occurrence of motion primitives. Subsequently, we classify an...

10.1109/iccv.2015.505 article EN 2015-12-01

ChatGPT is a powerful large language model (LLM) that covers knowledge resources such as Wikipedia and supports natural language question answering using its own knowledge. Therefore, there is growing interest in exploring whether ChatGPT can replace traditional knowledge-based question answering (KBQA) models. Although there have been some works analyzing the question answering performance of ChatGPT, there is still a lack of large-scale, comprehensive testing of various types of complex questions to analyze the limitations of the model. In this paper, we present a framework that follows...

10.48550/arxiv.2303.07992 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Recent advances in online visual tracking focus on designing part-based models to handle the deformation and occlusion challenges. However, previous methods usually consider only the pairwise structural dependences of target parts in two consecutive frames rather than the higher order constraints in multiple frames, making them less effective in handling large deformation and occlusion challenges. This paper describes a new and efficient method for online deformable object tracking. Different from most existing methods, this paper exploits higher order structural dependences of different parts of the tracking target in multiple consecutive frames. We...

10.1109/tip.2016.2570556 article EN IEEE Transactions on Image Processing 2016-05-18

This paper presents Video-P2P, a novel framework for real-world video editing with cross-attention control. While attention control has proven effective for image editing with pre-trained image generation models, there are currently no large-scale video generation models publicly available. Video-P2P addresses this limitation by adapting an image generation diffusion model to complete various video editing tasks. Specifically, we propose to first tune a Text-to-Set (T2S) model to complete an approximate inversion and then optimize a shared unconditional embedding to achieve accurate video inversion with a small...

10.48550/arxiv.2303.04761 preprint EN other-oa arXiv (Cornell University) 2023-01-01

The nighttime behavior of diurnal species is a “black box.” Although diurnal animals spend approximately half their lives in the dark, research has, for too long, relied on the simplifying assumption that what we can't observe isn't important. Advances in our ability to monitor nighttime behavior reveal that this assumption is incorrect; essential biological and behavioral processes play out in the dark which are critical to understanding species' ecology and evolution. We conducted a study from November 2021 to January 2022, using noninvasive 4G...

10.1002/ajp.70016 article EN American Journal of Primatology 2025-02-27

Multi-function radar (MFR) work mode recognition is an important research component of the electronic reconnaissance field. When facing MFR systems equipped with complex mode-waveform mapping relationships and flexible beam scanning techniques, the intercepted work mode pulse sequences have a wide temporal range of feature distributions and variable durations, which bring significant challenges for accurate recognition. To address this issue, this study constructs a novel hierarchical signal model waveform...

10.3390/rs17061054 article EN cc-by Remote Sensing 2025-03-17

Global climate change and human activities are significant threats to biodiversity, contributing to the endangerment of approximately 41% of amphibian species worldwide. In this study, we applied field survey methods, the MaxEnt model, and integrated human activity data to predict potential changes in diversity and distribution in Huangshan Mountain, China. We found 23 species, belonging to two orders, eight families, and 18 genera. The models showed that distance from farmland (contributing 26.2%), shrubs (15.6%),...

10.3390/ani15070938 article EN cc-by Animals 2025-03-25

10.3724/sp.j.1016.2008.00620 article EN Chinese Journal of Computers 2009-09-28

In this paper, we explore synthesizing person images with multiple conditions for various backgrounds. To this end, we propose a framework named "MISC" for conditional image generation and image compositing. For conditional image generation, we improve the existing condition injection mechanisms by leveraging the inter-condition correlations. For image compositing, we theoretically prove the weaknesses of the cutting-edge methods, and make it more robust by removing the spatially-invariance constraint and enabling the bounding mechanism and spatial adaptability. We show...

10.1109/cvpr42600.2020.00776 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Permian granitoid emplacement represents one of the most important tectonothermal events in the northern margin of the North China Craton (NCC). In this study, we collected geochronological and geochemical data on regional granitoids in the northwestern NCC and investigated the Dongshengmiao pluton, using it as an example to constrain the petrogenesis and its geodynamic settings. The pluton contains porphyritic granite and quartz diorite. LA-ICP-MS zircon U-Pb dating results have constrained its emplacement to be ca. 287‒275 Ma. The granitoids have a SiO2 range...

10.1080/00206814.2015.1039087 article EN International Geology Review 2015-04-27