NFDI4DS | UHH-SEMS - Publication Details

Benchmarking Single-Image Dehazing and Beyond

OPENALEX - Publications

Boyi Li Wenqi Ren Dengpan Fu Dacheng Tao Dan Feng and 2 more

In this paper, we present a comprehensive study and evaluation of existing single image dehazing algorithms, using a new large-scale benchmark consisting of both synthetic and real-world hazy images, called REalistic Single Image DEhazing (RESIDE). RESIDE highlights diverse data sources and image contents, and is divided into five subsets, each serving different training or evaluation purposes. We further provide a rich variety of criteria for dehazing algorithm evaluation, ranging from full-reference metrics, to no-reference metrics, to subjective...

10.1109/tip.2018.2867951 article EN publisher-specific-oa IEEE Transactions on Image Processing 2018-08-30

Image-adaptive watermarking using visual models

Christine Podilchuk Wenjun Zeng

The huge success of the Internet allows for the transmission, wide distribution, and access of electronic data in an effortless manner. Content providers are faced with the challenge of how to protect their electronic data. This problem has generated a flurry of research activity in the area of digital watermarking of electronic content for copyright protection. The challenge here is to introduce a digital watermark that does not alter the perceived quality of the electronic content, while being extremely robust to attack. For instance, in the case of image data, editing the picture or illegal tampering should...

10.1109/49.668975 article EN IEEE Journal on Selected Areas in Communications 1998-05-01

Co-Occurrence Feature Learning for Skeleton Based Action Recognition Using Regularized Deep LSTM Networks

Wentao Zhu Cuiling Lan Junliang Xing Wenjun Zeng Yanghao Li and 2 more

Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions. Considering that recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) can learn feature representations and model long-term temporal dependencies automatically, we propose an end-to-end fully connected deep LSTM network for skeleton based action recognition. Inspired by the observation that the co-occurrences of the joints intrinsically characterize...

10.1609/aaai.v30i1.10451 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2016-03-05

An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data

Sijie Song Cuiling Lan Junliang Xing Wenjun Zeng Jiaying Liu

Human action recognition is an important task in computer vision. Extracting discriminative spatial and temporal features to model the spatial and temporal evolutions of different actions plays a key role in accomplishing this task. In this work, we propose an end-to-end spatial and temporal attention model for human action recognition from skeleton data. We build our model on top of the Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM), which learns to selectively focus on discriminative joints of the skeleton within each frame of the inputs and pays different levels of attention to the outputs of different frames. Furthermore, to ensure effective...

10.1609/aaai.v31i1.11212 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2017-02-12

A Twofold Siamese Network for Real-Time Object Tracking

Anfeng He Chong Luo Xinmei Tian Wenjun Zeng

Observing that Semantic features learned in an image classification task and Appearance features learned in a similarity matching task complement each other, we build a twofold Siamese network, named SA-Siam, for real-time object tracking. SA-Siam is composed of a semantic branch and an appearance branch. Each branch is a similarity-learning Siamese network. An important design choice in SA-Siam is to separately train the two branches to keep the heterogeneity of the two types of features. In addition, we propose a channel attention mechanism for the semantic branch. Channel-wise weights are computed according...

10.1109/cvpr.2018.00508 preprint EN 2018-06-01

View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data

Pengfei Zhang Cuiling Lan Junliang Xing Wenjun Zeng Jianru Xue and 1 more

Skeleton-based human action recognition has recently attracted increasing attention due to the popularity of 3D skeleton data. One main challenge lies in the large view variations in captured human actions. We propose a novel view adaptation scheme to automatically regulate observation viewpoints during the occurrence of an action. Rather than re-positioning the skeletons based on a human defined prior criterion, we design a view adaptive recurrent neural network (RNN) with LSTM architecture, which enables the network itself to adapt to the most suitable observation viewpoints from...

10.1109/iccv.2017.233 article EN 2017-10-01

Relation-Aware Global Attention for Person Re-Identification

Zhizheng Zhang Cuiling Lan Wenjun Zeng Xin Jin Zhibo Chen

For person re-identification (re-id), attention mechanisms have become attractive as they aim at strengthening discriminative features and suppressing irrelevant ones, which matches well the key of re-id, i.e., discriminative feature learning. Previous approaches typically learn attention using local convolutions, ignoring the mining of knowledge from global structure patterns. Intuitively, the affinities among spatial positions/nodes in the feature map provide clustering-like information and are helpful for inferring semantics and thus...

10.1109/cvpr42600.2020.00325 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Generalizing to Unseen Domains: A Survey on Domain Generalization

Jindong Wang Cuiling Lan Chang Liu Yidong Ouyang Tao Qin and 4 more

Machine learning systems generally assume that the training and testing distributions are the same. To this end, a key requirement is to develop models that can generalize to unseen distributions. Domain generalization (DG), i.e., out-of-distribution generalization, has attracted increasing interest in recent years. Domain generalization deals with a challenging setting where one or several different but related domain(s) are given, and the goal is to learn a model that can generalize to an unseen test domain. Great progress has been made in the area of domain generalization for years. This paper presents...

10.1109/tkde.2022.3178128 article EN IEEE Transactions on Knowledge and Data Engineering 2022-01-01

Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition

Pengfei Zhang Cuiling Lan Wenjun Zeng Junliang Xing Jianru Xue and 1 more

Skeleton-based human action recognition has attracted great interest thanks to the easy accessibility of human skeleton data. Recently, there is a trend of using very deep feedforward neural networks to model the 3D coordinates of joints without considering the computational efficiency. In this paper, we propose a simple yet effective semantics-guided neural network (SGN) for skeleton-based action recognition. We explicitly introduce the high level semantics of joints (joint type and frame index) into the network to enhance the feature representation capability....

10.1109/cvpr42600.2020.00119 preprint EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data

Sijie Song Cuiling Lan Junliang Xing Wenjun Zeng Jiaying Liu

Human action recognition is an important task in computer vision. Extracting discriminative spatial and temporal features to model the spatial and temporal evolutions of different actions plays a key role in accomplishing this task. In this work, we propose an end-to-end spatial and temporal attention model for human action recognition from skeleton data. We build our model on top of the Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM), which learns to selectively focus on discriminative joints of the skeleton within each frame of the inputs and pays different levels of attention to the outputs of different frames. Furthermore, to ensure effective...

10.48550/arxiv.1611.06067 preprint EN other-oa arXiv (Cornell University) 2016-01-01

View Adaptive Neural Networks for High Performance Skeleton-Based Human Action Recognition

Pengfei Zhang Cuiling Lan Junliang Xing Wenjun Zeng Jianru Xue and 1 more

Skeleton-based human action recognition has recently attracted increasing attention thanks to the accessibility and the popularity of 3D skeleton data. One of the key challenges in action recognition lies in the large variations of action representations when they are captured from different viewpoints. In order to alleviate the effects of view variations, this paper introduces a novel view adaptation scheme, which automatically determines the virtual observation viewpoints over the course of an action in a learning based data driven manner. Instead of re-positioning the skeletons...

10.1109/tpami.2019.2896631 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-01-31

CNSA: a data repository for archiving omics data

Xueqin Guo Fengzhen Chen Fei Gao Ling Li Ke Liu and 21 more

With the application and development of high-throughput sequencing technology in life and health sciences, massive multi-omics data brings the problem of efficient management and utilization. Database development and biocuration are the prerequisites for the reuse of these big data. Here, relying on China National GeneBank (CNGB), we present CNGB Sequence Archive (CNSA) for archiving omics data, including raw data and its further analyzed results which are organized into six objects, namely Project, Sample, Experiment, Run, Assembly...

10.1093/database/baaa055 article EN cc-by Database 2020-01-01

Style Normalization and Restitution for Generalizable Person Re-Identification

Xin Jin Cuiling Lan Wenjun Zeng Zhibo Chen Li Zhang

Existing fully-supervised person re-identification (ReID) methods usually suffer from poor generalization capability caused by domain gaps. The key to solving this problem lies in filtering out identity-irrelevant interference and learning domain-invariant person representations. In this paper, we aim to design a generalizable person ReID framework which trains a model on source domains yet is able to generalize/perform well on target domains. To achieve this goal, we propose a simple yet effective Style Normalization and Restitution...

10.1109/cvpr42600.2020.00321 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Densely Semantically Aligned Person Re-Identification

Zhizheng Zhang Cuiling Lan Wenjun Zeng Zhibo Chen

We propose a densely semantically aligned person re-identification (re-ID) framework. It fundamentally addresses the body misalignment problem caused by pose/viewpoint variations, imperfect person detection, occlusion, etc. By leveraging the estimation of the dense semantics of a person image, we construct a set of densely semantically aligned part images (DSAP-images), where the same spatial positions have the same semantics across different images. We design a two-stream network that consists of a main full image stream (MF-Stream) and a densely semantically-aligned guiding stream (DSAG-Stream)....

10.1109/cvpr.2019.00076 preprint EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

SPM-Tracker: Series-Parallel Matching for Real-Time Visual Object Tracking

Guangting Wang Chong Luo Zhiwei Xiong Wenjun Zeng

The greatest challenge facing visual object tracking is the simultaneous requirements on robustness and discrimination power. In this paper, we propose a SiamFC-based tracker, named SPM-Tracker, to tackle this challenge. The basic idea is to address the two requirements in two separate matching stages. Robustness is strengthened in the coarse matching (CM) stage through generalized training while discrimination power is enhanced in the fine matching (FM) stage through a distance learning network. The two stages are connected in series as the input proposals of the FM stage are generated by the CM stage. They are also connected in parallel...

10.1109/cvpr.2019.00376 preprint EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

PHASEN: A Phase-and-Harmonics-Aware Speech Enhancement Network

Dacheng Yin Chong Luo Zhiwei Xiong Wenjun Zeng

Time-frequency (T-F) domain masking is a mainstream approach for single-channel speech enhancement. Recently, focuses have been put to phase prediction in addition to amplitude prediction. In this paper, we propose a phase-and-harmonics-aware deep neural network (DNN), named PHASEN, for this task. Unlike previous methods which directly use a complex ideal ratio mask to supervise the DNN learning, we design a two-stream network, where the amplitude stream and phase stream are dedicated to amplitude and phase prediction. We discover that the two streams should communicate with each...

10.1609/aaai.v34i05.6489 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition

Yizhou Zhou Xiaoyan Sun Zheng-Jun Zha Wenjun Zeng

Human actions in videos are three-dimensional (3D) signals. Recent attempts use 3D convolutional neural networks (CNNs) to explore spatio-temporal information for human action recognition. Though promising, 3D CNNs have not achieved high performance on this task with respect to their well-established two-dimensional (2D) counterparts for visual recognition in still images. We argue that the high training complexity of spatio-temporal fusion and the huge memory cost of 3D convolution hinder current 3D CNNs, which stack 3D convolutions layer...

10.1109/cvpr.2018.00054 article EN 2018-06-01

Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks

Wentao Zhu Cuiling Lan Junliang Xing Wenjun Zeng Yanghao Li and 2 more

Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions. Considering that recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) can learn feature representations and model long-term temporal dependencies automatically, we propose an end-to-end fully connected deep LSTM network for skeleton based action recognition. Inspired by the observation that the co-occurrences of the joints intrinsically characterize...

10.48550/arxiv.1603.07772 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Spatio-Temporal Attention-Based LSTM Networks for 3D Action Recognition and Detection

Sijie Song Cuiling Lan Junliang Xing Wenjun Zeng Jiaying Liu

Human action analytics has attracted a lot of attention for decades in computer vision. It is important to extract discriminative spatio-temporal features to model the spatial and temporal evolutions of different actions. In this paper, we propose a spatial and temporal attention model to explore the spatial and temporal discriminative features for human action recognition and detection from skeleton data. We build our networks based on the recurrent neural networks with long short-term memory units. The learned model is capable of selectively focusing on discriminative joints of skeletons within each input frame and paying different levels of attention to the outputs of different frames. To...

10.1109/tip.2018.2818328 article EN IEEE Transactions on Image Processing 2018-03-22

Cross View Fusion for 3D Human Pose Estimation

Haibo Qiu Chunyu Wang Jingdong Wang Naiyan Wang Wenjun Zeng

We present an approach to recover absolute 3D human poses from multi-view images by incorporating multi-view geometric priors in our model. It consists of two separate steps: (1) estimating the 2D poses in multi-view images and (2) recovering the 3D poses from the multi-view 2D poses. First, we introduce a cross-view fusion scheme into CNN to jointly estimate 2D poses for multiple views. Consequently, the 2D pose estimation for each view already benefits from other views. Second, we present a recursive Pictorial Structure Model to recover the 3D pose from the multi-view 2D poses. It gradually improves the accuracy of the 3D pose with affordable computational cost. We test our method on...

10.1109/iccv.2019.00444 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

FairMOT: On the Fairness of Detection and Re-identification in Multiple Object Tracking

Yifu Zhang Chunyu Wang Xinggang Wang Wenjun Zeng Wenyu Liu

10.1007/s11263-021-01513-4 article EN International Journal of Computer Vision 2021-09-03

Tracking by Instance Detection: A Meta-Learning Approach

Guangting Wang Chong Luo Xiaoyan Sun Zhiwei Xiong Wenjun Zeng

We consider the tracking problem as a special type of object detection problem, which we call instance detection. With proper initialization, a detector can be quickly converted into a tracker by learning the new instance from a single image. We find that model-agnostic meta-learning (MAML) offers a strategy to initialize the detector that satisfies our needs. We propose a principled three-step approach to build a high-performance tracker. First, pick any modern object detector trained with gradient descent. Second, conduct offline training (or...

10.1109/cvpr42600.2020.00632 preprint EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Pseudo-sequence-based light field image compression

Dong Liu Lizhi Wang Li Li Zhiwei Xiong Feng Wu and 1 more

We propose a pseudo-sequence-based scheme for light field image compression. In our scheme, the raw image captured by a light field camera is decomposed into multiple views according to the lenslet array of that camera. These views constitute a pseudo sequence like video, and the redundancy between views is exploited by a video encoder. The specific coding order of views, prediction structure, and rate allocation have been investigated for encoding the pseudo sequence. Experimental results show the superior performance of our scheme, which achieves as high as 6.6 dB gain compared...

10.1109/icmew.2016.7574674 article EN 2016-07-01