Shuang Wu

ORCID: 0000-0001-9245-6037
Research Areas
  • Human Pose and Action Recognition
  • Video Surveillance and Tracking Methods
  • Video Analysis and Summarization
  • Advanced Image and Video Retrieval Techniques
  • Multimodal Machine Learning Applications
  • Music and Audio Processing
  • Anomaly Detection Techniques and Applications
  • Gait Recognition and Analysis
  • Advanced Chemical Sensor Technologies
  • Advanced Neural Network Applications
  • Robotics and Sensor-Based Localization
  • Visual Attention and Saliency Detection
  • Automated Road and Building Extraction
  • Biometric Identification and Security
  • Remote Sensing and LiDAR Applications
  • Phonocardiography and Auscultation Techniques
  • Face recognition and analysis
  • COVID-19 diagnosis using AI
  • Spectroscopy and Chemometric Analyses
  • Diverse Aspects of Tourism Research
  • Inertial Sensor and Navigation
  • Retinal Imaging and Analysis
  • Advanced Sensor and Energy Harvesting Materials
  • Advanced Control Systems Optimization
  • Structural Engineering and Materials Analysis

Nvidia (United States)
2025

Beijing Institute of Technology
2024

Shanghai Tongji Urban Planning and Design Institute
2024

Chongqing University of Technology
2023-2024

Qingdao University
2024

Shandong University of Science and Technology
2024

Sichuan University
2020-2023

Sesame Workshop
2023

Qilu University of Technology
2023

Shanghai Jiao Tong University
2012-2022

Camouflaged object detection (COD) aims to identify objects that are perfectly embedded in their environment, which has various downstream applications in fields such as medicine, art, and agriculture. However, it is an extremely challenging task to spot camouflaged objects with the perception ability of human eyes. Hence, we claim that the goal of the COD task is not just to mimic the human visual ability in a single RGB domain, but to go beyond human biological vision. We then introduce the frequency domain as an additional clue to better detect camouflaged objects from backgrounds. To well...

10.1109/cvpr52688.2022.00446 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Combining multiple low-level visual features is a proven and effective strategy for a range of computer vision tasks. However, limited attention has been paid to combining such features with information from other modalities, such as audio and videotext, for large scale analysis of web videos. In our work, we rigorously analyze and combine a large set of low-level features that capture appearance, color, motion, audio, and audio-visual co-occurrence patterns in videos. We also evaluate the utility of high-level (i.e., semantic) visual information obtained from detecting scene, object, and action...

10.1109/cvpr.2012.6247814 article EN 2012 IEEE Conference on Computer Vision and Pattern Recognition 2012-06-01

Current state-of-the-art systems for visual content analysis require large training sets for each class of interest, and performance degrades rapidly with fewer examples. In this paper, we present a general framework for the zero-shot learning problem of performing high-level event detection with no training exemplars, using only textual descriptions. This task goes beyond the traditional zero-shot framework of adapting a given set of classes with training data to unseen classes. We leverage video and image collections with free-form text descriptions from...

10.1109/cvpr.2014.341 article EN 2014 IEEE Conference on Computer Vision and Pattern Recognition 2014-06-01

The ABC/2 method is usually applied to evaluate intracerebral hemorrhage (ICH) volume on computed tomography (CT), although it might be inaccurate and not applicable in estimating extradural or subdural hemorrhage (EDH, SDH) volume due to their irregular hematoma shapes. This study aimed to evaluate a deep framework optimized for the segmentation and quantification of ICH, EDH, and SDH. The training datasets were 3,000 images retrospectively collected from a collaborating hospital (Hospital A) and segmented by the Dense U-Net framework. Three...

10.3389/fnins.2020.541817 article EN cc-by Frontiers in Neuroscience 2021-01-11

10.1016/j.imavis.2024.105039 article EN Image and Vision Computing 2024-04-23

Emphasizing self-improvement, simulation-driven refinement, and reduced human oversight for autonomous machines development.

10.1145/3708012 article EN Communications of the ACM 2025-03-19

Pneumoconiosis staging has been a very challenging task, both for certified radiologists and computer-aided detection algorithms. Although deep learning has shown proven advantages in the detection of pneumoconiosis, it remains challenging in the staging of pneumoconiosis due to the stage ambiguity and noisy samples caused by misdiagnosis when they are used in training deep learning models. In this article, we propose a fully deep learning pneumoconiosis staging paradigm that comprises a segmentation procedure and a staging procedure. The segmentation procedure extracts lung fields in chest radiographs through an Asymmetric...

10.1109/jbhi.2022.3190923 article EN IEEE Journal of Biomedical and Health Informatics 2022-07-14

Vectorized high-definition (HD) map is essential for autonomous driving, providing detailed and precise environmental information for advanced perception and planning. However, current map vectorization methods often exhibit deviations, and the existing evaluation metric for map vectorization lacks sufficient sensitivity to detect these deviations. To address these limitations, we propose integrating the philosophy of rasterization into map vectorization. Specifically, we introduce a new rasterization-based evaluation metric, which has superior sensitivity and is better...

10.48550/arxiv.2306.10502 preprint EN cc-by-nc-sa arXiv (Cornell University) 2023-01-01

This paper presents a novel method for traffic sign detection and visibility evaluation from mobile Light Detection and Ranging (LiDAR) point clouds and the corresponding images. Our algorithm involves two steps. Firstly, based on the high retro-reflectivity of traffic signs, a detection method for MLS point clouds is designed in complicated road scenes. To solve the spatial features of traffic signs, we also create geo-referenced relations between signs and roads according to the normal of the ground. Secondly, we propose a visibility estimation method to evaluate the visibility level based on the combination of visual appearance...

10.1109/igarss.2015.7325826 article EN 2015-07-01

In this paper, we propose a novel deep end-to-end network to automatically learn the spatial-temporal fusion features for video-based person re-identification. Specifically, the proposed network consists of a CNN and an RNN to jointly learn both the spatial and the temporal features of input image sequences. The network is optimized by utilizing the siamese and softmax losses simultaneously to pull the instances of the same person closer and push the instances of different persons apart. Our network is trained on full-body and part-body image sequences respectively to learn complementary representations from holistic and local...

10.1109/cvprw.2017.191 article EN 2017-07-01

UAV remote sensing has been widely used in emergency rescue, disaster relief, environmental monitoring, urban planning, and so on. Image recognition and image location in environmental monitoring have become an academic hotspot in the field of computer vision. The convolution neural network model is the most commonly used image processing model. Compared with the traditional artificial neural network model, the convolution neural network model has more hidden layers. Its unique convolution and pooling operations have higher efficiency in image processing. It has incomparable advantages in image processing and other forms of two-dimensional...

10.1186/s13640-018-0391-6 article EN cc-by EURASIP Journal on Image and Video Processing 2018-12-01

10.1016/j.patcog.2017.06.026 article EN publisher-specific-oa Pattern Recognition 2017-06-28

10.1007/s11042-017-4568-2 article EN Multimedia Tools and Applications 2017-03-20

In this paper, we propose a graph correspondence transfer (GCT) approach for person re-identification. Unlike existing methods, the GCT model formulates person re-identification as an off-line graph matching and on-line correspondence transferring problem. Specifically, during training, the GCT model aims to learn a set of correspondence templates from positive training pairs with various pose-pair configurations via patch-wise graph matching. During testing, for each pair of test samples, we select a few training pairs with the most similar pose-pair configurations as references, and transfer the correspondences of these references to the test pair for feature...

10.48550/arxiv.1804.00242 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Face recognition using a single sample per person is a challenging problem in computer vision. In this scenario, due to the lack of training samples, it is difficult to distinguish between inter-class variations caused by identity and intra-class variations caused by external factors such as illumination, pose, etc. To address this problem, we propose a scheme to improve the recognition rate by both generating additional samples to enrich the intra-variation and eliminating external factors to extract invariant features. Firstly, a 3D face modeling module is proposed to recover...

10.3390/app10020601 article EN cc-by Applied Sciences 2020-01-14

Crowd video retrieval is an important problem in surveillance video management in the era of big data, e.g., video indexing and browsing. In this paper, we address this issue from the motion-level perspective by using hand-drawn sketches as queries. Motion sketch based crowd video retrieval naturally suffers from challenges in crowd motion representation. We tackle them by leveraging a motion structure coding algorithm to extract robust and structure-preserved motion descriptors. For indexing, we use motion decomposition to separate sub-motion vector fields with typical patterns...

10.1109/icip.2016.7532549 article EN 2016 IEEE International Conference on Image Processing (ICIP) 2016-08-17