NFDI4DS | UHH-SEMS - Publication Details

Sheng Guo

ORCID: 0000-0003-1385-3152

About

Contact & Profiles

A5100687098

Research Areas

Advanced Image and Video Retrieval Techniques
Advanced Neural Network Applications
Domain Adaptation and Few-Shot Learning
Human Pose and Action Recognition
Multimodal Machine Learning Applications
Image Retrieval and Classification Techniques
Video Surveillance and Tracking Methods
Higher Education and Teaching Methods
Advanced Vision and Imaging
Generative Adversarial Networks and Image Synthesis
Infrared Target Detection Methodologies
Remote-Sensing Image Classification
Image Enhancement Techniques
Anomaly Detection Techniques and Applications
Education and Work Dynamics
Hand Gesture Recognition Systems
Robotics and Sensor-Based Localization
Face recognition and analysis
Advanced Image Processing Techniques
Machine Learning and Data Classification
Image and Object Detection Techniques
Advanced Image Fusion Techniques
Higher Education Learning Practices
Hospitality and Tourism Education
Advanced Measurement and Detection Methods

Hubei University of Technology
2024

Beihang University
2017-2024

Wuhan University of Science and Technology
2022

Alibaba Group (United States)
2021

Wilmington University
2020

MSIGHT Technologies (China)
2019-2020

State Key Laboratory of Information Engineering in Surveying Mapping and Remote Sensing
2018

Wuhan University
2018

Shenzhen Institutes of Advanced Technology
2015-2017

University of Chinese Academy of Sciences
2014-2017

AdaMixer: A Fast-Converging Query-Based Object Detector

OPENALEX - Publications

Ziteng Gao Limin Wang Bing Han Sheng Guo

Traditional object detectors employ the dense paradigm of scanning over locations and scales in an image. The recent query-based object detectors break this convention by decoding image features with a set of learnable queries. However, this paradigm still suffers from slow convergence, limited performance, and design complexity of extra networks between backbone and decoder. In this paper, we find that the key to these issues is the adaptability of decoders for casting queries to varying objects. Accordingly, we propose a fast-converging query-based detector, named...

10.1109/cvpr52688.2022.00529 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Knowledge Guided Disambiguation for Large-Scale Scene Classification With Multi-Resolution CNNs

Limin Wang Sheng Guo Weilin Huang Yuanjun Xiong Yu Qiao

Convolutional neural networks (CNNs) have made remarkable progress on scene recognition, partially due to these recent large-scale scene datasets, such as the Places and Places2. Scene categories are often defined by multi-level information, including local objects, global layout, and background environment, thus leading to large intra-class variations. In addition, with the increasing number of scene categories, label ambiguity has become another crucial issue in large-scale classification. This paper focuses on large-scale scene recognition...

10.1109/tip.2017.2675339 article EN IEEE Transactions on Image Processing 2017-02-24

Places205-VGGNet Models for Scene Recognition

Limin Wang Sheng Guo Weilin Huang Yu Qiao

VGGNets have turned out to be effective for object recognition in still images. However, it is unable to yield good performance by directly adapting the VGGNet models trained on the ImageNet dataset for scene recognition. This report describes our implementation of training the VGGNets on the large-scale Places205 dataset. Specifically, we train three VGGNet models, namely VGGNet-11, VGGNet-13, and VGGNet-16, by using a Multi-GPU extension of Caffe toolbox with high computational efficiency. We verify the performance of trained Places205-VGGNet models on three datasets:...

10.48550/arxiv.1508.01667 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Locally Supervised Deep Hybrid Model for Scene Recognition

Sheng Guo Weilin Huang Limin Wang Yu Qiao

Convolutional neural networks (CNN) have recently achieved remarkable successes in various image classification and understanding tasks. The deep features obtained at the top fully-connected layer of the CNN (FC-features) exhibit rich global semantic information and are extremely effective in image classification. On the other hand, the convolutional features in the middle layers of the CNN also contain meaningful local information, but are not fully explored for image representation. In this paper, we propose a novel Locally-Supervised Deep Hybrid...

10.1109/tip.2016.2629443 article EN IEEE Transactions on Image Processing 2016-11-16

The iMaterialist Fashion Attribute Dataset

Sheng Guo Weilin Huang Xiao Zhang Prasanna Srikhanta Yin Cui and 4 more

Many Large-scale image databases such as ImageNet have significantly advanced image classification and other visual recognition tasks. However much of these datasets are constructed only for single-label and coarse object-level classification. For real-world applications, multiple labels and fine-grained categories are often needed, yet very few such datasets exist publicly, especially those of large-scale and high quality. In this work, we contribute to the community a new dataset called iMaterialist Fashion Attribute...

10.1109/iccvw.2019.00377 preprint EN 2019-10-01

Neutralizing the impact of atmospheric turbulence on complex scene imaging via deep learning

Darui Jin Ying Chen Yi Lu Junzhang Chen Peng Wang and 3 more

10.1038/s42256-021-00392-1 article EN Nature Machine Intelligence 2021-10-14

Multiple Feature Analysis for Infrared Small Target Detection

Yanguang Bi Xiangzhi Bai Ting Jin Sheng Guo

Detection of small target has been an important and challenging task in infrared systems. Most detection algorithms which only use single metric are difficult to separate target from clutter completely. The false alarm may be high when there exists complex backgrounds. In this letter, multiple novel features are proposed from four aspects to establish an elaborate description. Each feature reflects specific characteristic of target. The best feature vector is selected to apply these features for detection. Then, learning-based classifier...

10.1109/lgrs.2017.2711047 article EN IEEE Geoscience and Remote Sensing Letters 2017-07-03

Label-PEnet: Sequential Label Propagation and Enhancement Networks for Weakly Supervised Instance Segmentation

Weifeng Ge Weilin Huang Sheng Guo Matthew R. Scott

Weakly-supervised instance segmentation aims to detect and segment object instances precisely, given image-level labels only. Unlike previous methods which are composed of multiple offline stages, we propose Sequential Label Propagation and Enhancement Networks (referred as Label-PEnet) that progressively transforms image-level labels to pixel-wise labels in a coarse-to-fine manner. We design four cascaded modules including multi-label classification, object detection, instance refinement and instance segmentation, which are implemented sequentially by sharing...

10.1109/iccv.2019.00344 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Brain SegNet: 3D local refinement network for brain lesion segmentation

Xiaojun Hu Weijian Luo Jiliang Hu Sheng Guo Weilin Huang and 4 more

MR images (MRIs) accurate segmentation of brain lesions is important for improving cancer diagnosis, surgical planning, and prediction of outcome. However, manual and accurate segmentation of brain lesions from 3D MRIs is highly expensive, time-consuming, and prone to user biases. We present an efficient yet conceptually simple brain segmentation network (referred as Brain SegNet), which is a 3D residual framework for automatic voxel-wise segmentation of brain lesion. Our model is able to directly predict dense voxel segmentation of brain tumor or ischemic stroke regions in 3D brain MRIs. The proposed 3D segmentation network can run at about 0.5s per MRIs -...

10.1186/s12880-020-0409-2 article EN cc-by BMC Medical Imaging 2020-02-11

V4D:4D Convolutional Neural Networks for Video-level Representation Learning

Shiwen Zhang Sheng Guo Weilin Huang Matthew R. Scott Limin Wang

Most existing 3D CNNs for video representation learning are clip-based methods, and thus do not consider video-level temporal evolution of spatio-temporal features. In this paper, we propose Video-level 4D Convolutional Neural Networks, referred as V4D, to model the evolution of long-range spatio-temporal representation with 4D convolutions, and at the same time, to preserve strong 3D spatio-temporal representation with residual connections. Specifically, we design a new 4D residual block able to capture inter-clip interactions, which could enhance the representation power of the original clip-level 3D CNNs. The 4D residual blocks can be easily...

10.48550/arxiv.2002.07442 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer

Chulun Zhou Liangyu Chen Jiachen Liu Xinyan Xiao Jinsong Su and 2 more

Unsupervised style transfer aims to change the style of an input sentence while preserving its original content without using parallel training data. In current dominant approaches, owing to the lack of fine-grained control on the influence from the target style, they are unable to yield desirable output sentences. In this paper, we propose a novel attentional sequence-to-sequence (Seq2seq) model that dynamically exploits the relevance of each output word to the target style for unsupervised style transfer. Specifically, we first pretrain a style classifier, where the relevance of each input word to the original style can be...

10.18653/v1/2020.acl-main.639 article EN cc-by 2020-01-01

Infrared small target detection based on multi-directionality and sparse low-rank recovery

Heng Sun Sheng Guo Xiangzhi Bai

10.1016/j.infrared.2025.105828 article EN Infrared Physics & Technology 2025-04-08

Symmetry Information Based Fuzzy Clustering for Infrared Pedestrian Segmentation

Xiangzhi Bai Yingfan Wang Haonan Liu Sheng Guo

Pedestrian detection in infrared images is always a challenging task. Segmentation an important step of pedestrian detection. An accurate segmentation could provide more information for further analysis. In this paper, improved Fuzzy C-Means clustering method, which incorporates geometric symmetry information, proposed segmentation. the introduced by Markov random field theory. Moreover, new metric utilized to handle weak pedestrian. addition, whole procedure extract pedestrians. The...

10.1109/tfuzz.2017.2756827 article EN IEEE Transactions on Fuzzy Systems 2017-09-25

Cross-Architecture Self-supervised Video Representation Learning

Sheng Guo Zihua Xiong Yujie Zhong Limin Wang Xiaobo Guo and 2 more

In this paper, we present a new cross-architecture contrastive learning (CACL) framework for self-supervised video representation learning. CACL consists of a 3D CNN and a video transformer which are used in parallel to generate diverse positive pairs for contrastive learning. This allows the model to learn strong representations from such diverse yet meaningful pairs. Furthermore, we introduce a temporal self-supervised learning module able to predict an Edit distance explicitly between two video sequences in the temporal order. This enables the model to learn a rich temporal representation that compensates strongly to the video-level representation learned by...

10.1109/cvpr52688.2022.01867 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

MHSCNET: A Multimodal Hierarchical Shot-Aware Convolutional Network for Video Summarization

Wujiang Xu Runzhong Wang Xiaobo Guo Shaoshuai Li Qiongxu Ma and 4 more

Video summarization is an essential problem in signal processing, which intends to produce a concise summary of the original video. Existing video approaches regard task as keyframe selection and generally construct frame-wise representation by combining long-range temporal dependency with either unimodal or bimodal information. The optimal should offer semantic whole content exploiting multimodal shot-level hierarchical natures videos, however, such are not fully exploited existing methods....

10.1109/icassp49357.2023.10096265 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Image-Based Localization Aided Indoor Pedestrian Trajectory Estimation Using Smartphones

Yan Zhou Xianwei Zheng Ruizhi Chen Hanjiang Xiong Sheng Guo

Accurately determining pedestrian location in indoor environments using consumer smartphones is a significant step in the development of ubiquitous localization services. Many different map-matching methods have been combined with pedestrian dead reckoning (PDR) to achieve low-cost and bias-free pedestrian tracking. However, this works only in areas with dense map constraints and the error accumulates in open areas. In order to achieve reliable localization without map constraints, an improved image-based localization aided pedestrian trajectory estimation method is proposed in this paper. The...

10.3390/s18010258 article EN cc-by Sensors 2018-01-17

Better Exploiting OS-CNNs for Better Event Recognition in Images

Limin Wang Zhe Wang Sheng Guo Yu Qiao

Event recognition from still images is one of the most important problems for image understanding. However, compared with object recognition and scene recognition, event recognition has received much less research attention in computer vision community. This paper addresses the problem of cultural event recognition in still images and focuses on applying deep learning methods on this problem. In particular, we utilize the successful architecture of Object-Scene Convolutional Neural Networks (OS-CNNs) to perform event recognition. OS-CNNs are composed of object nets and scene nets, which...

10.1109/iccvw.2015.46 article EN 2015-12-01

InsCLR: Improving Instance Retrieval with Self-Supervision

Zelu Deng Yujie Zhong Sheng Guo Weilin Huang

This work aims at improving instance retrieval with self-supervision. We find that fine-tuning using the recently developed self-supervised learning (SSL) methods, such as SimCLR and MoCo, fails to improve the performance of instance retrieval. In this work, we identify that the learnt representations for instance retrieval should be invariant to large variations in viewpoint and background etc., whereas self-augmented positives applied by the current SSL methods can not provide strong enough signals for learning robust instance-level representations. To...

10.1609/aaai.v36i1.19930 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

Knowledge Integration Networks for Action Recognition

Shiwen Zhang Sheng Guo Limin Wang Weilin Huang Matthew R. Scott

In this work, we propose Knowledge Integration Networks (referred as KINet) for video action recognition. KINet is capable of aggregating meaningful context features which are of great importance to identifying an action, such as human information and scene context. We design a three-branch architecture consisting of a main branch for action recognition, and two auxiliary branches for human parsing and scene recognition which allow the model to encode the knowledge of human and scene for action recognition. We explore two pre-trained models as teacher networks to distill the knowledge of human and scene for training the auxiliary tasks of KINet....

10.1609/aaai.v34i07.6983 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

A Closer Look at Few-Shot Video Classification: A New Baseline and Benchmark

Zhenxi Zhu Limin Wang Sheng Guo Gangshan Wu

The existing few-shot video classification methods often employ a meta-learning paradigm by designing customized temporal alignment module for similarity calculation. While significant progress has been made, these methods fail to focus on learning effective representations, and heavily rely on the ImageNet pre-training, which might be unreasonable for the few-shot recognition setting due to semantics overlap. In this paper, we aim to present an in-depth study on few-shot video classification by making three contributions. First, we perform a consistent comparative...

10.48550/arxiv.2110.12358 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Label-PEnet: Sequential Label Propagation and Enhancement Networks for Weakly Supervised Instance Segmentation

Weifeng Ge Sheng Guo Weilin Huang Matthew R. Scott

Weakly-supervised instance segmentation aims to detect and segment object instances precisely, given image-level labels only. Unlike previous methods which are composed of multiple offline stages, we propose Sequential Label Propagation and Enhancement Networks (referred as Label-PEnet) that progressively transform image-level labels to pixel-wise labels in a coarse-to-fine manner. We design four cascaded modules including multi-label classification, object detection, instance refinement and instance segmentation, which are implemented sequentially by...

10.48550/arxiv.1910.02624 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Traffic thermal infrared texture generation based on siamese semantic CycleGAN

Peng Wang Heng Sun Xiangzhi Bai Sheng Guo Darui Jin

10.1016/j.infrared.2021.103748 article EN Infrared Physics & Technology 2021-04-24

Infrared simulation of large-scale urban scene through LOD

Sheng Guo Xixian Xiong Zichao Liu Xiangzhi Bai Fugen Zhou

The growing use of infrared (IR) imaging systems places increasing demands for simulating infrared images of real scenes. Utilizing images captured from unmanned aerial vehicles (UAV), we propose a semi-automatic pipeline to generate large-scale IR urban scenes in the form of levels of detail (LODs). It significantly reduces the cost of labor and time while providing detailed structures. Starting from surface meshes generated by multi-view stereo (MVS) systems, we produce watertight LODs via semantic segmentation and structure-aware...

10.1364/oe.26.023980 article EN cc-by Optics Express 2018-08-31
