Song Wang

ORCID: 0000-0003-4152-5295
Research Areas
  • Video Surveillance and Tracking Methods
  • Advanced Image and Video Retrieval Techniques
  • Human Pose and Action Recognition
  • Medical Image Segmentation Techniques
  • Image Retrieval and Classification Techniques
  • Advanced Neural Network Applications
  • Image Enhancement Techniques
  • Advanced Vision and Imaging
  • Anomaly Detection Techniques and Applications
  • Image Processing and 3D Reconstruction
  • Gait Recognition and Analysis
  • Advanced Image Processing Techniques
  • Handwritten Text Recognition Techniques
  • Visual Attention and Saliency Detection
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Image and Object Detection Techniques
  • Image and Signal Denoising Methods
  • Robotics and Sensor-Based Localization
  • Industrial Vision Systems and Defect Detection
  • Advanced Image Fusion Techniques
  • Natural Language Processing Techniques
  • Video Analysis and Summarization
  • Adversarial Robustness in Machine Learning
  • 3D Shape Modeling and Analysis

Shenzhen Technology University
2024-2025

Shenzhen University
2024-2025

Xinyang Normal University
2012-2025

Jilin Academy of Agricultural Sciences
2023-2025

University of South Carolina
2015-2024

Wuhan Polytechnic University
2019-2024

Toronto Metropolitan University
2024

Tianjin University
2018-2024

Hitachi (Japan)
2024

Northeastern University
2013-2024

How to effectively learn the temporal variation of target appearance, exclude the interference of cluttered background, and maintain real-time response is an essential problem in visual object tracking. Recently, Siamese networks have shown the great potential of matching-based trackers in achieving balanced accuracy and beyond-real-time speed. However, they still have a big gap to classification-and-updating-based trackers in tolerating changes of objects and imaging conditions. In this paper, we propose a dynamic Siamese network, via fast...

10.1109/iccv.2017.196 article EN 2017-10-01
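As context for the entry above, here is a minimal sketch of the cross-correlation matching step that Siamese trackers build on; the feature shapes and the shared-backbone assumption are illustrative, and this does not include the paper's dynamic transformations.

import torch
import torch.nn.functional as F

def siamese_response(template_feat, search_feat):
    """Cross-correlate template features with search-region features.

    template_feat: (1, C, Ht, Wt) features of the target exemplar
    search_feat:   (1, C, Hs, Ws) features of the current search region
    Returns a (1, 1, Hs-Ht+1, Ws-Wt+1) response map whose peak indicates
    the estimated target location.
    """
    # Using the template as a convolution kernel implements the matching
    # step shared by Siamese trackers.
    return F.conv2d(search_feat, template_feat)

# Toy usage with random features standing in for a shared backbone's output.
z = torch.randn(1, 256, 6, 6)     # template (exemplar) features
x = torch.randn(1, 256, 22, 22)   # search-region features
score_map = siamese_response(z, x)
print(score_map.shape)            # torch.Size([1, 1, 17, 17])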

Cracks are typical line structures that are of interest in many computer-vision applications. In practice, many cracks, e.g., pavement cracks, show poor continuity and low contrast, which brings great challenges to image-based crack detection using low-level features. In this paper, we propose DeepCrack, an end-to-end trainable deep convolutional neural network for automatic crack detection by learning high-level features for crack representation. In this method, multi-scale deep convolutional features learned at hierarchical stages are fused together to capture the line structures...

10.1109/tip.2018.2878966 article EN IEEE Transactions on Image Processing 2018-10-31
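A minimal sketch of the multi-scale fusion idea mentioned above: per-stage side outputs are upsampled to the input resolution and fused by a 1x1 convolution. Layer sizes, channel counts, and class names here are assumptions for illustration, not the published DeepCrack architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusionHead(nn.Module):
    """Fuse per-stage crack predictions into one full-resolution map."""

    def __init__(self, stage_channels=(64, 128, 256, 512)):
        super().__init__()
        # One 1x1 conv per backbone stage produces a single-channel side output.
        self.side_convs = nn.ModuleList(
            [nn.Conv2d(c, 1, kernel_size=1) for c in stage_channels])
        # The fused prediction combines all upsampled side outputs.
        self.fuse = nn.Conv2d(len(stage_channels), 1, kernel_size=1)

    def forward(self, stage_feats, out_size):
        side_maps = []
        for conv, feat in zip(self.side_convs, stage_feats):
            side = conv(feat)
            # Upsample every side output to the input resolution before fusion.
            side = F.interpolate(side, size=out_size, mode="bilinear",
                                 align_corners=False)
            side_maps.append(side)
        fused = self.fuse(torch.cat(side_maps, dim=1))
        return fused, side_maps  # both can be supervised with crack masks

# Toy usage with random multi-scale features from a hypothetical backbone.
feats = [torch.randn(1, c, 256 // s, 256 // s)
         for c, s in zip((64, 128, 256, 512), (2, 4, 8, 16))]
head = MultiScaleFusionHead()
fused, sides = head(feats, out_size=(256, 256))
print(fused.shape)  # torch.Size([1, 1, 256, 256])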

Software defect prediction, which predicts defective code regions, can help developers find bugs and prioritize their testing efforts. To build accurate prediction models, previous studies focus on manually designing features that encode the characteristics of programs and exploring different machine learning algorithms. Existing traditional features often fail to capture the semantic differences between programs, yet such a capability is needed for building accurate prediction models.

10.1145/2884781.2884804 article EN Proceedings of the 44th International Conference on Software Engineering 2016-05-13

Human visual perception shows good consistency for many multi-label image classification tasks under certain spatial transforms, such as scaling, rotation, flipping and translation. This has motivated the data augmentation strategy widely used in CNN classifier training -- transformed images are included for training by assuming the same class labels as their original images. In this paper, we further propose an assumption of perceptual consistency of visual attention regions, i.e., the attention region for a classification follows the same transform if the input image is spatially...

10.1109/cvpr.2019.00082 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01
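A minimal sketch of the consistency idea described above, using horizontal flipping as the transform: the attention maps predicted for a flipped image should match the flipped attention maps of the original. The loss form and function names are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def attention_consistency_loss(attn_original, attn_transformed):
    """Penalize disagreement between attention maps predicted for flipped
    images and the flipped attention maps of the original images.

    attn_original:    (N, K, H, W) attention maps of the original images
    attn_transformed: (N, K, H, W) attention maps of the flipped images
    """
    # Apply the same transform (horizontal flip) to the original maps, then
    # compare with the maps the network produced for the flipped inputs.
    attn_expected = torch.flip(attn_original, dims=[-1])
    return F.mse_loss(attn_transformed, attn_expected)

# Toy usage: attention maps for a batch of 4 images and 20 labels.
a_orig = torch.rand(4, 20, 14, 14)
a_flip = torch.rand(4, 20, 14, 14)
loss = attention_consistency_loss(a_orig, a_flip)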

After going through a deep network, some pedestrian information is lost and gradients may vanish, causing inaccurate detection. This paper improves the network structure of the YOLO algorithm and proposes a new network, YOLO-R. First, three Passthrough layers are added to the original network. Each Passthrough layer consists of a Route layer and a Reorg layer. Its role is to connect shallow features to deep features and to link high- and low-resolution features: the Route layer passes the feature information of a specified layer to the current layer, which then uses the Reorg layer to reorganize the feature...

10.1109/icma.2018.8484698 article EN 2018 IEEE International Conference on Mechatronics and Automation (ICMA) 2018-08-01
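A minimal sketch of a Reorg (space-to-depth) step like the one described above: it lets a passthrough connection bring high-resolution shallow features onto a coarser layer's grid, after which a route step simply concatenates them with deep features. Tensor sizes and variable names are illustrative assumptions.

import torch
import torch.nn.functional as F

def reorg(x, stride=2):
    """Space-to-depth: turn a (N, C, H, W) map into (N, C*stride^2, H/s, W/s),
    so shallow high-resolution features can be concatenated with deeper,
    lower-resolution features (the 'passthrough' idea)."""
    return F.pixel_unshuffle(x, downscale_factor=stride)

# Toy usage: pass a shallow 26x26 feature map down to a 13x13 deep layer.
shallow = torch.randn(1, 64, 26, 26)
deep = torch.randn(1, 1024, 13, 13)
passthrough = reorg(shallow, stride=2)        # (1, 256, 13, 13)
combined = torch.cat([passthrough, deep], 1)  # route: concatenate features
print(combined.shape)                         # torch.Size([1, 1280, 13, 13])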

Just like many other topics in computer vision, image classification has achieved significant progress recently by using deep neural networks, especially Convolutional Neural Networks (CNNs). Most of the existing works focus on classifying very clear natural images, as evidenced by the widely used image databases, such as Caltech-256, PASCAL VOCs, and ImageNet. However, in real applications, the acquired images may contain certain degradations that lead to various kinds of blurring, noise, and distortions. One...

10.1109/tpami.2019.2950923 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-11-01

Semantic segmentation of nighttime images plays an equally important role as that of daytime images in autonomous driving, but the former is much more challenging due to poor illumination and arduous human annotation. In this paper, we propose a novel domain adaptation network (DANNet) for nighttime semantic segmentation without using labeled nighttime image data. It employs adversarial training with a labeled daytime dataset and an unlabeled dataset that contains coarsely aligned day-night image pairs. Specifically, for these pairs, we use the pixel-level predictions of static object...

10.1109/cvpr46437.2021.01551 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

This paper proposes a new cost function, the cut ratio, for segmenting images using graph-based methods. The cut ratio is defined as the ratio of the corresponding sums of two different weights of edges along the cut boundary, and it models the mean affinity between the segments separated by the boundary per unit boundary length. This cost function allows the image perimeter to be segmented, guarantees that the segments produced by bipartitioning are connected, and does not introduce a size, shape, smoothness, or boundary-length bias. The latter allows it to produce segmentations where boundaries are aligned...

10.1109/tpami.2003.1201819 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2003-06-01
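A hedged formalization of the cost described above, with notation assumed rather than taken from the paper: for a bipartition (A, B) of the image graph with two edge weights w_1 and w_2 on the boundary edges,

\mathrm{Rcut}(A,B) = \frac{\sum_{e \in \partial(A,B)} w_1(e)}{\sum_{e \in \partial(A,B)} w_2(e)},
\qquad
\partial(A,B) = \{\, (u,v) \in E : u \in A,\ v \in B \,\}

If w_2(e) is read as an edge-length weight, the ratio measures mean affinity per unit boundary length, matching the description above.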

Recognizing human activities in partially observed videos is a challenging problem and has many practical applications. When the unobserved subsequence is at the end of the video, the problem is reduced to activity prediction from an unfinished activity stream, which has been studied by many researchers. However, in the general case, an unobserved subsequence may occur at any time, yielding a temporal gap in the video. In this paper, we propose a new method that can recognize human activities in this general case. Specifically, we formulate the problem into a probabilistic framework: 1) dividing each activity into multiple ordered temporal segments,...

10.1109/cvpr.2013.343 article EN 2013 IEEE Conference on Computer Vision and Pattern Recognition 2013-06-01

We propose a single network trained by pixel-to-label deep learning to address the general issue of automatic multiple organ segmentation in three-dimensional (3D) computed tomography (CT) images. Our method can be described as a voxel-wise multiple-class classification scheme for automatically assigning labels to each pixel/voxel in a 2D/3D CT image. We simplify the segmentation of anatomical structures (including multiple organs) in a CT image (generally 3D) to a majority voting over the semantic segmentation of multiple 2D slices drawn from different...

10.1002/mp.12480 article EN cc-by-nc Medical Physics 2017-07-21
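A minimal sketch of the majority-voting step described above: per-orientation 2D segmentations are stacked back into the 3D grid and each voxel takes its most frequent label. Array names and shapes are illustrative assumptions.

import numpy as np

def majority_vote(label_volumes):
    """Fuse per-orientation voxel label volumes by majority voting.

    label_volumes: list of (D, H, W) integer label arrays, one per viewing
                   direction (e.g., axial, coronal, sagittal predictions
                   resampled back into the same 3D grid).
    Returns a single (D, H, W) array with the most frequent label per voxel.
    """
    stacked = np.stack(label_volumes, axis=0)            # (V, D, H, W)
    n_labels = stacked.max() + 1
    # Count votes for each label, then pick the label with the most votes.
    votes = np.stack([(stacked == k).sum(axis=0) for k in range(n_labels)])
    return votes.argmax(axis=0).astype(stacked.dtype)

# Toy usage with three random 3-class label volumes.
vols = [np.random.randint(0, 3, size=(8, 16, 16)) for _ in range(3)]
fused = majority_vote(vols)
print(fused.shape)  # (8, 16, 16)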

Because of their varied appearance (different writers, writing styles, noise, etc.), handwritten character recognition is one of the most challenging tasks in pattern recognition. Through decades of research, traditional methods have reached their limit, while the emergence of deep learning provides a new way to break this limit. In this paper, a CNN-based handwritten character recognition framework is proposed. In this framework, proper sample generation, training scheme and CNN network structure are employed according to the properties of handwritten characters. In the experiments, the proposed...

10.1109/acpr.2015.7486592 article EN 2015-11-01

Hash coding has been widely used in approximate nearest neighbor search for large-scale image retrieval. Recently, many deep hashing methods have been proposed and have shown largely improved performance over traditional feature-learning methods. Most of these methods examine pairwise similarity on semantic-level labels, where the similarity is generally defined in a hard-assignment way. That is, the similarity is “1” if two images share no less than one class label and “0” if they do not share any. However, such a definition cannot reflect the similarity ranking of pairs of images that hold...

10.1109/tmm.2019.2929957 article EN IEEE Transactions on Multimedia 2019-07-31
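To make the limitation above concrete, here is a minimal sketch contrasting the hard-assignment similarity with one soft multi-label alternative (fraction of shared labels over the union). The soft definition shown is illustrative, not necessarily the paper's.

import numpy as np

def hard_similarity(labels_a, labels_b):
    """'1' if the two images share at least one class label, else '0'."""
    return float(np.any(np.logical_and(labels_a, labels_b)))

def soft_similarity(labels_a, labels_b):
    """A soft alternative: shared labels over the union (illustrative);
    a graded value preserves a ranking among image pairs."""
    inter = np.logical_and(labels_a, labels_b).sum()
    union = np.logical_or(labels_a, labels_b).sum()
    return inter / union if union > 0 else 0.0

# Toy multi-label annotations (1 = label present).
a = np.array([1, 1, 0, 1, 0])
b = np.array([1, 0, 0, 1, 1])
c = np.array([1, 0, 0, 0, 0])
print(hard_similarity(a, b), hard_similarity(a, c))  # 1.0 1.0 (no ranking)
print(soft_similarity(a, b), soft_similarity(a, c))  # 0.5 vs ~0.33 (ranked)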

Existing shadow detection methods suffer from an intrinsic limitation in relying on limited labeled datasets, and they may produce poor results in some complicated situations. To boost the performance, this paper presents a multi-task mean teacher model for semi-supervised shadow detection by leveraging unlabeled data and exploring the learning of multiple kinds of shadow information simultaneously. To be specific, we first build a multi-task baseline model to simultaneously detect shadow regions, shadow edges, and shadow count by leveraging their complementary information, and assign this baseline model to the student network...

10.1109/cvpr42600.2020.00565 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01
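A minimal sketch of the mean-teacher mechanism the model above relies on: the teacher's weights are an exponential moving average (EMA) of the student's, and a consistency loss on unlabeled images aligns their predictions. The tiny network and single output here are illustrative stand-ins for the multi-task detector.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def update_teacher(student: nn.Module, teacher: nn.Module, alpha: float = 0.99):
    """Exponential-moving-average update of teacher weights from the student."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(alpha).add_(s_param, alpha=1.0 - alpha)

# Toy setup: a single conv layer standing in for the multi-task network.
student = nn.Conv2d(3, 1, kernel_size=3, padding=1)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)

unlabeled = torch.randn(2, 3, 64, 64)
# Consistency loss: student predictions should match teacher predictions on
# unlabeled images (the paper applies this to several shadow-related tasks).
consistency = F.mse_loss(student(unlabeled), teacher(unlabeled))
consistency.backward()
update_teacher(student, teacher)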

Building on studies of the hybrid media system and the attention economy, we develop the concept of amplification to explore how the activities of social media-based publics may enlarge the attention paid to a given person or message. We apply it to the 2016 US election, asking who constituted Donald Trump's enormous Twitter following that contributed to his success at attracting attention, including from the mainstream press. Using spectral clustering based on network similarity, we identify key publics and demonstrate how particular publics amplified his presence in...

10.1177/1461444817744390 article EN New Media & Society 2017-12-04

The accuracy of stereo matching has been greatly improved by using deep learning with convolutional neural networks. To further capture the details of disparity maps, in this paper, we propose a novel semantic stereo network named SSPCV-Net, which includes newly designed pyramid cost volumes for describing semantic and spatial information on multiple levels. The semantic features are inferred by a semantic segmentation subnetwork, while the spatial features are derived by hierarchical spatial pooling. In the end, we design a 3D multi-cost aggregation module to integrate the extracted...

10.1109/iccv.2019.00758 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01
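For background on the entry above, here is a minimal sketch of a concatenation-based stereo cost volume built by shifting right-image features across candidate disparities. Feature sizes are illustrative assumptions, and this omits the semantic/spatial pyramids and the 3D aggregation module that the paper adds on top.

import torch

def build_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation cost volume for stereo matching.

    left_feat, right_feat: (N, C, H, W) feature maps of the stereo pair.
    Returns (N, 2C, max_disp, H, W): for each candidate disparity d, left
    features are paired with right features shifted by d pixels.
    """
    n, c, h, w = left_feat.shape
    volume = left_feat.new_zeros(n, 2 * c, max_disp, h, w)
    for d in range(max_disp):
        if d == 0:
            volume[:, :c, d] = left_feat
            volume[:, c:, d] = right_feat
        else:
            volume[:, :c, d, :, d:] = left_feat[:, :, :, d:]
            volume[:, c:, d, :, d:] = right_feat[:, :, :, :-d]
    return volume

# Toy usage with random features; a 3D CNN would then aggregate this volume.
vol = build_cost_volume(torch.randn(1, 32, 32, 64),
                        torch.randn(1, 32, 32, 64), max_disp=8)
print(vol.shape)  # torch.Size([1, 64, 8, 32, 64])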

We propose a new learning-based method for estimating 2D human pose from a single image, using Dual-Source Deep Convolutional Neural Networks (DS-CNN). Recently, many methods have been developed to estimate human pose by using pose priors that are estimated from physiologically inspired graphical models or learned from a holistic perspective. In this paper, we integrate both the local (body) part appearance and the holistic view of each part for more accurate pose estimation. Specifically, the proposed DS-CNN takes a set of image patches (category-independent...

10.1109/cvpr.2015.7298740 article EN 2015-06-01

With a good balance between tracking accuracy and speed, the correlation filter (CF) has become one of the best object tracking frameworks, based on which many successful trackers have been developed. Recently, the spatially regularized CF (SRDCF) was developed to remedy the annoying boundary effects of CF tracking, thus further boosting the performance. However, SRDCF uses a fixed spatial regularization map constructed from a loose bounding box, and its performance inevitably degrades when the target or background shows significant...

10.1109/tip.2019.2895411 article EN IEEE Transactions on Image Processing 2019-01-25
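As a hedged sketch of the objective behind spatially regularized CF trackers such as SRDCF (notation assumed, not taken from the paper): x_d are the feature channels of the training sample, f_d the filter channels, y the desired Gaussian response, \star circular correlation, and w the spatial regularization map that penalizes filter energy outside the target region, the map whose fixed construction the entry above identifies as a limitation.

\min_{f}\;
  \Big\| \sum_{d=1}^{D} x_d \star f_d - y \Big\|_2^2
  \;+\; \sum_{d=1}^{D} \big\| w \odot f_d \big\|_2^2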

Remote sensing (RS) scene classification plays an important role in the field of earth observation. With the rapid development of RS techniques, a large number of RS images are available. As manually labeling a large-scale data set is both labor and time consuming, when a new unlabeled data set is obtained, how to use the existing labeled data sets to classify it has become an important research direction. Different RS images may be taken by different types of sensors and vary in imaging modality, spatial resolution, and scale, so a distribution discrepancy exists among...

10.1109/lgrs.2019.2896411 article EN publisher-specific-oa IEEE Geoscience and Remote Sensing Letters 2019-02-25

Gait has been considered a promising and unique biometric for person identification. Traditionally, gait data are collected using either color sensors, such as a CCD camera, depth sensors, such as Microsoft Kinect, or inertial sensors, such as an accelerometer. However, a single type of sensor may only capture part of the dynamic gait features, making the recognition sensitive to complex covariate conditions and leading to fragile gait-based identification systems. In this paper, we propose to combine all three types of sensors for gait data collection and recognition, which can...

10.1109/tcyb.2017.2682280 article EN IEEE Transactions on Cybernetics 2017-03-31

We propose a new learning-based method for estimating 2D human pose from a single image, using Dual-Source Deep Convolutional Neural Networks (DS-CNN). Recently, many methods have been developed to estimate human pose by using pose priors that are estimated from physiologically inspired graphical models or learned from a holistic perspective. In this paper, we integrate both the local (body) part appearance and the holistic view of each part for more accurate pose estimation. Specifically, the proposed DS-CNN takes a set of image patches (category-independent...

10.48550/arxiv.1504.07159 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Shadow removal is a computer-vision task that aims to restore the image content in shadow regions. While almost all recent shadow-removal methods require shadow-free images for training, in ECCV 2020 Le and Samaras introduced an innovative approach without this requirement by cropping patches with and without shadows from shadow images as training samples. However, it is still laborious and time-consuming to construct a large amount of such unpaired patches. In this paper, we propose a new G2R-ShadowNet, which leverages shadow generation...

10.1109/cvpr46437.2021.00489 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Shadow removal can significantly improve the image visual quality and has many applications in computer vision. Deep learning methods based on CNNs have become the most effective approach for shadow removal, training on either paired data, where both the shadow and underlying shadow-free versions of an image are known, or unpaired images that are totally different with no correspondence. In practice, CNN training on unpaired data is preferred given the easiness of data collection. In this paper, we present a new Lightness-Guided Shadow Removal Network (LG-ShadowNet)...

10.1109/tip.2020.3048677 article EN IEEE Transactions on Image Processing 2021-01-01

Long-tailed data distribution is common in many multi-label visual recognition tasks, and the direct use of these data for training usually leads to relatively low performance on tail classes. While re-balanced sampling can improve the performance on tail classes, it may also hurt the performance on head classes due to label co-occurrence. In this paper, we propose a new approach to train on both uniform and re-balanced samplings in a collaborative way, resulting in performance improvement on both head and tail classes. More specifically, we design a network with two branches: one takes the uniform sampling as input while the other takes the re-balanced sampling as input. For...

10.1109/cvpr46437.2021.01484 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
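A minimal sketch of one way to realize the re-balanced sampling and two-branch fusion mentioned above: images containing rare labels are up-weighted when sampling for the re-balanced branch, and the two branches' logits are averaged at inference. The weighting scheme and the fusion rule are illustrative assumptions, not necessarily the paper's exact design.

import numpy as np

def rebalanced_sample_weights(label_matrix):
    """Per-image sampling weights that up-weight images with rare labels.

    label_matrix: (N, K) binary multi-label annotations.
    Returns (N,) weights: each image is weighted by the inverse frequency of
    its rarest present label (one simple re-balancing choice among several).
    """
    class_freq = label_matrix.sum(axis=0).clip(min=1)   # (K,) label counts
    inv_freq = 1.0 / class_freq
    # An image's weight follows its rarest present label.
    weights = (label_matrix * inv_freq).max(axis=1)
    return weights / weights.sum()

# Toy usage: 6 images, 3 classes with a long-tailed label distribution.
labels = np.array([[1, 0, 0], [1, 0, 0], [1, 0, 0],
                   [1, 1, 0], [1, 0, 0], [0, 0, 1]])
w = rebalanced_sample_weights(labels)
resampled = np.random.choice(len(labels), size=4, replace=True, p=w)

# At inference, the two branches' logits can simply be averaged.
logits_uniform = np.random.randn(4, 3)
logits_balanced = np.random.randn(4, 3)
fused_logits = 0.5 * (logits_uniform + logits_balanced)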