- Advanced Image and Video Retrieval Techniques
- Multimodal Machine Learning Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Neural Network Applications
- Image Retrieval and Classification Techniques
- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Video Analysis and Summarization
- Visual Attention and Saliency Detection
- Remote-Sensing Image Classification
- Topic Modeling
- Advanced Vision and Imaging
- Face and Expression Recognition
- Natural Language Processing Techniques
- Generative Adversarial Networks and Image Synthesis
- Medical Image Segmentation Techniques
- Anomaly Detection Techniques and Applications
- Image Processing Techniques and Applications
- Robotics and Sensor-Based Localization
- Image Enhancement Techniques
- Music and Audio Processing
- Gait Recognition and Analysis
- Image and Object Detection Techniques
- Text and Document Classification Technologies
- Infrared Target Detection Methodologies
- Chinese Academy of Sciences (2016-2025)
- Shandong Institute of Automation (2013-2025)
- Institute of Microelectronics (2025)
- Institute of Automation (2014-2024)
- Mitsubishi Electric (United States) (2024)
- University of Chinese Academy of Sciences (2018-2024)
- Beijing Academy of Artificial Intelligence (2020-2024)
- Shandong University of Traditional Chinese Medicine (2017-2024)
- Jinling Institute of Technology (2024)
- China University of Mining and Technology (2024)
In this paper, we address the scene segmentation task by capturing rich contextual dependencies based on the self-attention mechanism. Unlike previous works that capture contexts by multi-scale feature fusion, we propose a Dual Attention Network (DANet) to adaptively integrate local features with their global dependencies. Specifically, we append two types of attention modules on top of the traditional dilated FCN, which model the semantic interdependencies in spatial and channel dimensions, respectively. The position attention module...
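As a rough illustration of the position attention idea described above, here is a minimal PyTorch sketch of a spatial self-attention module over a feature map. The reduced query/key width (C // 8) and the zero-initialized residual weight gamma are common choices for this kind of module, assumed here for illustration rather than taken from the abstract.

```python
import torch
import torch.nn as nn

class PositionAttention(nn.Module):
    """Aggregates the feature at each position by a weighted sum over all positions."""
    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)       # (B, HW, C//8)
        k = self.key(x).flatten(2)                         # (B, C//8, HW)
        attn = torch.softmax(q @ k, dim=-1)                # (B, HW, HW) spatial affinities
        v = self.value(x).flatten(2)                       # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)  # attend over all positions
        return self.gamma * out + x                        # residual fusion

x = torch.randn(2, 64, 32, 32)
print(PositionAttention(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```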
In this paper, a new unsupervised learning algorithm, namely Nonnegative Discriminative Feature Selection (NDFS), is proposed. To exploit the discriminative information in unsupervised scenarios, we perform spectral clustering to learn the cluster labels of the input samples, during which feature selection is performed simultaneously. The joint learning of the cluster labels and the feature selection matrix enables NDFS to select the most discriminative features. To learn more accurate cluster labels, a nonnegative constraint is explicitly imposed on the class indicators. To reduce redundant or even noisy features,...
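The abstract describes a joint optimization; the NumPy/scikit-learn sketch below approximates it in two decoupled stages (spectral clustering for pseudo cluster labels, then a ridge regression scored by row norms of the transform). This conveys the selection criterion but deliberately omits NDFS's joint l2,1-regularized objective and the nonnegative constraint; the dataset, lam, and k are illustrative choices.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import load_iris

X = load_iris().data                      # (n_samples, n_features)
n_clusters, lam, k = 3, 1.0, 2            # illustrative hyperparameters

# Step 1: pseudo cluster labels via spectral clustering.
labels = SpectralClustering(n_clusters=n_clusters, random_state=0).fit_predict(X)
Y = np.eye(n_clusters)[labels]            # one-hot cluster indicator matrix

# Step 2: ridge regression X W ~ Y, then rank features by row norms of W
# (a stand-in for the row-sparse transform NDFS learns jointly).
d = X.shape[1]
W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)
scores = np.linalg.norm(W, axis=1)
selected = np.argsort(scores)[::-1][:k]
print("selected feature indices:", selected)
```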
In this article, we propose a Dual Relation-aware Attention Network (DRANet) to handle the task of scene segmentation. How to efficiently exploit context is essential for pixel-level recognition. To address this issue, we adaptively capture contextual information based on a relation-aware attention mechanism. Specifically, we append two types of attention modules on top of a dilated fully convolutional network (FCN), which model the contextual dependencies in spatial and channel dimensions, respectively. In these modules, we adopt a self-attention mechanism...
Self-attention (SA) networks have shown profound value in image captioning. In this paper, we improve SA from two aspects to promote the performance of image captioning. First, we propose Normalized Self-Attention (NSA), a reparameterization of SA that brings the benefits of normalization inside SA. While normalization was previously only applied outside SA, we introduce a novel method and demonstrate that it is both possible and beneficial to perform it on the hidden activations inside SA. Second, to compensate for the major limit that the Transformer fails to model the geometry structure of the input...
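To make "normalization inside SA" concrete, here is a minimal single-head sketch that applies LayerNorm to the query activations inside the attention block; the choice of LayerNorm on queries is an assumption standing in for the paper's actual reparameterization.

```python
import torch
import torch.nn as nn

class NormalizedSelfAttention(nn.Module):
    """Single-head SA with normalization applied to hidden activations inside
    the module (here: the queries), rather than only outside it."""
    def __init__(self, dim: int):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(dim, dim) for _ in range(3))
        self.norm = nn.LayerNorm(dim)  # normalization *inside* SA
        self.scale = dim ** -0.5

    def forward(self, x):
        q = self.norm(self.q(x))       # normalize hidden activations
        k, v = self.k(x), self.v(x)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v

x = torch.randn(2, 10, 64)
print(NormalizedSelfAttention(64)(x).shape)  # torch.Size([2, 10, 64])
```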
Recent progress in semantic segmentation has been driven by improving the spatial resolution under Fully Convolutional Networks (FCNs). To this end, we propose a Stacked Deconvolutional Network (SDN) for semantic segmentation. In SDN, multiple shallow deconvolutional networks, called SDN units, are stacked one by one to integrate contextual information and bring fine recovery of localization information. Meanwhile, inter-unit and intra-unit connections are designed to assist network training...
Recent works attempt to improve scene parsing performance by exploring different levels of contexts, and typically train a well-designed convolutional network to exploit useful contexts across all pixels equally. However, in this paper, we find that context demands vary across pixels and regions in each image. Based on this observation, we propose an Adaptive Context Network (ACNet) to capture pixel-aware contexts by a competitive fusion of global and local context according to per-pixel demands. Specifically, for a given pixel, the global context demand is...
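Below is a toy sketch of per-pixel competitive fusion between a global (pooled) feature and a local convolutional feature; the sigmoid gate standing in for ACNet's learned per-pixel demand is an assumption for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class AdaptiveContextFusion(nn.Module):
    """Per-pixel gated fusion of global and local context: each pixel decides
    how much image-level context it needs versus local detail."""
    def __init__(self, channels: int):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)                    # global context
        self.local = nn.Conv2d(channels, channels, 3, padding=1)
        self.gate = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x):
        g = self.gap(x).expand_as(x)   # broadcast global feature to all pixels
        l = self.local(x)              # local context
        a = self.gate(x)               # per-pixel demand in [0, 1]
        return a * g + (1 - a) * l     # competitive fusion

x = torch.randn(2, 64, 32, 32)
print(AdaptiveContextFusion(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```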
In person re-identification (re-ID), extracting part-level features from person images has been verified to be crucial for offering fine-grained information. Most of the existing CNN-based methods only locate human parts coarsely, or rely on pretrained human parsing models and fail in locating identifiable nonhuman parts (e.g., a knapsack). In this article, we introduce an alignment scheme into the transformer architecture for the first time and propose the Auto-Aligned transformer (AAformer) to automatically locate both kinds of parts at the patch level. We introduce "part tokens...
In this paper, we propose an adversarial learning network for the task of multi-style image captioning (MSCap), using a standard factual caption dataset and a multi-stylized language corpus without paired images. How to learn a single model from such unpaired data is a challenging and necessary task, yet one rarely studied in previous works. The proposed framework mainly includes four contributive modules following a typical image encoder. First, a style-dependent caption generator outputs a sentence conditioned on an encoded image and a specified...
Image captioning attempts to generate a sentence composed of several linguistic words that describe the objects, attributes, and interactions in an image, denoted as visual semantic units in this paper. Based on this view, we propose to explicitly model the object semantics and geometry with Graph Convolutional Networks (GCNs) and to fully exploit the alignment between linguistic words and visual semantic units for image captioning. Particularly, we construct a semantic graph and a geometry graph, where each node corresponds to a visual semantic unit, i.e., an object, an attribute, or a (geometrical)...
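To make the graph-based encoding concrete, here is a generic graph-convolution layer over a handful of "visual semantic unit" nodes. The symmetric normalization and the toy adjacency matrix are standard GCN ingredients assumed for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step: each node (object / attribute / interaction)
    aggregates its neighbors' features through a normalized adjacency matrix."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, h, adj):
        # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}
        a_hat = adj + torch.eye(adj.size(0))
        d_inv_sqrt = a_hat.sum(-1).clamp(min=1e-6).pow(-0.5)
        a_norm = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
        return torch.relu(self.proj(a_norm @ h))

# 4 nodes (e.g., two objects, an attribute, a relation), 16-d features.
h = torch.randn(4, 16)
adj = torch.tensor([[0, 1, 1, 0], [1, 0, 0, 1],
                    [1, 0, 0, 0], [0, 1, 0, 0]], dtype=torch.float)
print(GCNLayer(16, 32)(h, adj).shape)  # torch.Size([4, 32])
```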
In this paper, we consider the image captioning task from a new sequence-to-sequence prediction perspective and propose the CaPtion TransformeR (CPTR), which takes sequentialized raw images as input to the Transformer. Compared with the "CNN+Transformer" design paradigm, our model can capture global context at every encoder layer from the beginning and is totally convolution-free. Extensive experiments demonstrate the effectiveness of the proposed model, which surpasses conventional methods on the MSCOCO dataset. Besides, we provide detailed visualizations...
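A compact sketch of the convolution-free encoder path: the image is sequentialized into patch tokens (via a patch-sized projection) and fed to a stock Transformer encoder. Patch size, width, and depth here are illustrative assumptions, and CPTR's caption decoder is omitted.

```python
import torch
import torch.nn as nn

class PatchSequencer(nn.Module):
    """Sequentializes a raw image into patch tokens for a Transformer encoder,
    so global context is available at every layer."""
    def __init__(self, img=224, patch=16, dim=512):
        super().__init__()
        n = (img // patch) ** 2
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, img):
        tokens = self.embed(img).flatten(2).transpose(1, 2)  # (B, N, dim)
        return self.encoder(tokens + self.pos)

x = torch.randn(1, 3, 224, 224)
print(PatchSequencer()(x).shape)  # torch.Size([1, 196, 512])
```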
Hashing has shown great potential in large-scale image retrieval due to its storage and computation efficiency, especially the recent deep supervised hashing methods. To achieve promising performance, these methods require a large amount of training data from different classes. However, when images of new categories emerge, existing methods have to retrain the CNN model and generate hash codes for all the database images again, which is impractical for a large-scale retrieval system. In this paper, we propose a novel framework, called Deep Incremental Hashing Network...
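A toy sketch of the incremental idea: the hash codes already stored for old database images stay frozen, and only a small hashing head is trained on images of new categories to agree with them, so the database never needs re-encoding. The similarity loss, dimensions, and names below are illustrative assumptions, not the paper's objective.

```python
import torch
import torch.nn as nn

bits, n_old, n_new = 32, 100, 20
old_codes = torch.sign(torch.randn(n_old, bits))            # frozen database codes
new_feats = torch.randn(n_new, 128)                         # CNN features of new images
sim = torch.randint(0, 2, (n_new, n_old)).float() * 2 - 1   # +1 similar / -1 dissimilar

hash_layer = nn.Linear(128, bits)
opt = torch.optim.Adam(hash_layer.parameters(), lr=1e-2)
for step in range(200):
    u = torch.tanh(hash_layer(new_feats))                   # relaxed new codes
    # Code inner products should match semantic similarity; old codes stay fixed.
    loss = ((u @ old_codes.t() / bits - sim) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

new_codes = torch.sign(torch.tanh(hash_layer(new_feats)))
print(new_codes.shape)  # torch.Size([20, 32])
```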
In intelligent traffic systems, real-time and accurate detection of vehicles in image and video data is important and challenging work. Especially in situations with complex scenes, different vehicle models, and high density, it is difficult to accurately locate and classify vehicles in traffic flows. Therefore, we propose a single-stage deep neural network, YOLOv3-DL, based on the TensorFlow framework, to address this problem. The structure is optimized by introducing the idea of spatial pyramid pooling, and then the loss function...
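The abstract mentions introducing spatial pyramid pooling; below is a generic SPP block (sketched in PyTorch rather than the paper's TensorFlow) with the kernel sizes commonly used in YOLOv3-SPP, which may differ from YOLOv3-DL's exact configuration.

```python
import torch
import torch.nn as nn

class SPPBlock(nn.Module):
    """YOLOv3-style spatial pyramid pooling: concatenate max-poolings with
    different receptive fields so multi-scale context reaches the detector."""
    def __init__(self, kernel_sizes=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in kernel_sizes
        )

    def forward(self, x):
        return torch.cat([x] + [p(x) for p in self.pools], dim=1)

x = torch.randn(1, 256, 13, 13)
print(SPPBlock()(x).shape)  # torch.Size([1, 1024, 13, 13])
```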
Image captioning is a challenging task, and it is important for the machine to understand the meaning of an image well. In recent years, models usually use long short-term memory (LSTM) as the decoder to generate the sentence, and these models show excellent performance. Although LSTM can memorize dependencies, its structure is complicated and inherently sequential across time. To address these issues, recent works have shown the benefits of the Transformer for machine translation. Inspired by their success, we develop a Captioning Transformer (CT) model...
The recently proposed Vision Transformers (ViT) with pure attention have achieved promising performance on image recognition tasks, such as classification. However, the routine of the current ViT model is to maintain a full-length patch sequence during inference, which is redundant and lacks hierarchical representation. To this end, we propose a Hierarchical Visual Transformer (HVT) that progressively pools visual tokens to shrink the sequence length and hence reduce the computational cost, analogous to feature map downsampling in...
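A minimal sketch of hierarchical token pooling: max-pool along the token axis between Transformer stages, roughly halving the sequence length the way CNN feature maps are downsampled. Stage width, depth, and the pooling parameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PooledStage(nn.Module):
    """A Transformer stage followed by 1D max-pooling over the token axis,
    roughly halving the sequence length between stages."""
    def __init__(self, dim=192, depth=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=3, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=depth)
        self.pool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1)

    def forward(self, tokens):                 # (B, N, dim)
        tokens = self.blocks(tokens)
        return self.pool(tokens.transpose(1, 2)).transpose(1, 2)  # (B, ~N/2, dim)

t = torch.randn(2, 196, 192)
for stage in [PooledStage(), PooledStage()]:
    t = stage(t)
print(t.shape)  # torch.Size([2, 49, 192])
```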
Transformers have become one of the dominant architectures in deep learning, particularly as a powerful alternative to convolutional neural networks (CNNs) in computer vision. However, Transformer training and inference in previous works can be prohibitively expensive due to the quadratic complexity of self-attention over long sequence representations, especially for high-resolution dense prediction tasks. To this end, we present a novel Less attention vIsion Transformer (LIT), building upon the fact that the early layers still...
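A sketch of the "less attention" idea under the assumption that early, long-sequence stages use attention-free MLP blocks while later, shorter stages keep standard self-attention; the exact block design here is illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MLPBlock(nn.Module):
    """Attention-free block for early, high-resolution stages: an MLP over each
    token avoids the quadratic cost of attention on long sequences."""
    def __init__(self, dim=96, hidden=384):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(),
                                 nn.Linear(hidden, dim))

    def forward(self, x):                      # (B, N, dim)
        return x + self.mlp(self.norm(x))

def build_stages(dims=(96, 192), early=2, late=2):
    """Early stages: pure MLP blocks; later, shorter stages: standard attention."""
    early_blocks = nn.Sequential(*[MLPBlock(dims[0]) for _ in range(early)])
    layer = nn.TransformerEncoderLayer(d_model=dims[1], nhead=4, batch_first=True)
    late_blocks = nn.TransformerEncoder(layer, num_layers=late)
    return early_blocks, late_blocks

early_blocks, _ = build_stages()
print(early_blocks(torch.randn(2, 3136, 96)).shape)  # long sequence, no attention
```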
Multi-agent collaborative perception, as a potential application of vehicle-to-everything communication, could significantly improve the performance of autonomous vehicles over single-agent perception. However, several challenges remain in achieving pragmatic information sharing in this emerging research area. In this paper, we propose SCOPE, a novel framework that aggregates spatio-temporal awareness characteristics across on-road agents in an end-to-end manner. Specifically, SCOPE has three distinct...
In this paper, we propose the Vision-Audio-Language Omni-peRception pretraining model (VALOR) for multimodal understanding and generation. Unlike widely-studied vision-language models, VALOR jointly models relationships among vision, audio, and language in an end-to-end manner. It consists of three separate encoders for single-modality representations and a decoder for conditional text generation. We design two pretext tasks to pretrain the model: Multimodal Grouping Alignment (MGA) and Multimodal Grouping Captioning (MGC). MGA projects vision, language,...
Image annotation has been an active research topic in recent years due to its potential impact on both image understanding and web retrieval. Existing relevance-model-based methods perform annotation by maximizing the joint probability of images and words, which is calculated as an expectation over training images. However, the semantic gap and the dependence on training data restrict their performance and scalability. In this paper, a dual cross-media relevance model (DCMRM) is proposed for automatic annotation, which estimates the joint probability by an expectation over words in a pre-defined...
This paper tries to separate fine-grained images by jointly learning the encoding parameters and codebooks through low-rank sparse coding (LRSC) with general and class-specific codebook generation. Instead of treating each local feature independently, we encode the local features within a spatial region jointly by LRSC. This ensures that spatially nearby features with similar visual characteristics are encoded with correlated parameters. In this way, we can make the encoding more consistent for image representation. Besides, we also learn a number of class-specific codebooks in combination...