NFDI4DS | UHH-SEMS - Publication Details

Xiangyang Li

ORCID: 0000-0002-3944-4704

About

Contact & Profiles

A5100341799

Research Areas

Advanced Image and Video Retrieval Techniques
Multimodal Machine Learning Applications
Domain Adaptation and Few-Shot Learning
Advanced Neural Network Applications
Human Pose and Action Recognition
Collaboration in agile enterprises
Adversarial Robustness in Machine Learning
Image Retrieval and Classification Techniques
COVID-19 diagnosis using AI
Innovation and Knowledge Management
Topic Modeling
Sparse and Compressive Sensing Techniques
Cryptography and Data Security
Optical measurement and interference techniques
Video Analysis and Summarization
Supply Chain and Inventory Management
Metaheuristic Optimization Algorithms Research
Image Processing Techniques and Applications
Robotics and Sensor-Based Localization
Service-Oriented Architecture and Web Services
Safety and Risk Management
Supply Chain Resilience and Risk Management
Machine Learning and Data Classification
Generative Adversarial Networks and Image Synthesis
Digital Image Processing Techniques

Institute of Computing Technology
2016-2024

University of Chinese Academy of Sciences
2019-2024

University of Science and Technology of China
2020-2024

Aerospace Information Research Institute
2023

South China University of Technology
2023

Chinese Academy of Sciences
2016-2023

Yunnan University
2023

Accenture (United States)
2022

China Three Gorges University
2022

Harbin Institute of Technology
2006-2019

Scene Recognition with CNNs: Objects, Scales and Dataset Bias

OPENALEX - Publications

Luis Herranz Shuqiang Jiang Xiangyang Li

Since scenes are composed in part of objects, accurate recognition of scenes requires knowledge about both scenes and objects. In this paper we address two related problems: 1) scale-induced dataset bias in multi-scale convolutional neural network (CNN) architectures, and 2) how to effectively combine scene-centric and object-centric knowledge (i.e. Places and ImageNet) in CNNs. An earlier attempt, Hybrid-CNN[23], showed that incorporating ImageNet did not help much. Here we propose an alternative method taking the scale into account, resulting...

10.1109/cvpr.2016.68 preprint EN 2016-06-01

Know More Say Less: Image Captioning Based on Scene Graphs

Xiangyang Li Shuqiang Jiang

Automatically describing the content of an image has been attracting considerable research attention in the multimedia field. To represent an image, many approaches directly utilize convolutional neural networks (CNNs) to extract visual representations, which are fed into recurrent neural networks to generate natural language. Recently, some approaches have detected semantic concepts from images and then encoded them into high-level representations. Although substantial progress has been achieved, most previous methods treat entities...

10.1109/tmm.2019.2896516 article EN IEEE Transactions on Multimedia 2019-01-30

KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation

Xiangyang Li Zihan Wang Jiahao Yang Yaowei Wang Shuqiang Jiang

Vision-and-language navigation (VLN) is the task of enabling an embodied agent to navigate to a remote location following a natural language instruction in real scenes. Most previous approaches utilize entire features or object-centric features to represent navigable candidates. However, these representations are not efficient enough for an agent to perform actions to arrive at the target location. As knowledge provides crucial information which is complementary to visible content, in this paper, we propose Knowledge Enhanced Reasoning...

10.1109/cvpr52729.2023.00254 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

GridMM: Grid Memory Map for Vision-and-Language Navigation

Zihan Wang Xiangyang Li Jiahao Yang Yeqi Liu Shuqiang Jiang

Vision-and-language navigation (VLN) enables the agent to navigate to a remote location following a natural language instruction in 3D environments. To represent the previously visited environment, most approaches for VLN implement memory using recurrent states, topological maps, or top-down semantic maps. In contrast to these approaches, we build an egocentric and dynamically growing Grid Memory Map (i.e., GridMM) to structure the environment. From a global perspective, historical observations are projected into...

10.1109/iccv51070.2023.01432 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Learning Object Context for Dense Captioning

Xiangyang Li Shuqiang Jiang Jungong Han

Dense captioning is a challenging task which not only detects visual elements in images but also generates natural language sentences to describe them. Previous approaches do not leverage object information for this task. However, objects provide valuable cues to help predict the locations of caption regions, as caption regions often highly overlap with objects (i.e. they are usually parts of objects or combinations of them). Meanwhile, objects are important for describing a target region, as the corresponding description depicts its properties and involves interactions...

10.1609/aaai.v33i01.33018650 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

SampleLLM: Optimizing Tabular Data Synthesis in Recommendations

Jingtong Gao Z. Z. Du Xiaopeng Li Yichao Wang Xiangyang Li and 3 more

Tabular data synthesis is crucial in machine learning, yet existing general methods, primarily based on statistical or deep learning models, are highly data-dependent and often fall short in recommender systems. This limitation arises from their difficulty in capturing complex distributions and understanding complicated feature relations from sparse and limited data, along with their inability to grasp semantic relations. Recently, Large Language Models (LLMs) have shown potential in generating synthetic data through few-shot...

10.32388/a9u1sh preprint EN cc-by 2025-02-07

FFCG: Effective and Fast Family Column Generation for Solving Large-Scale Linear Program

Y. Hu Fengshi Wu Shaoang Li Yonghao Zhao Xiangyang Li

Column Generation (CG) is an effective and iterative algorithm to solve large-scale linear programs (LP). During each CG iteration, new columns are added to improve the solution of the LP. Typically, CG greedily selects the one column with the most negative reduced cost, which can be improved by adding more columns at once. However, selecting all columns with negative reduced costs would lead to the addition of redundant columns that do not improve the objective value. Therefore, selecting the appropriate columns to add is still an open problem, and previous machine-learning-based approaches for CG only add a constant...

10.1609/aaai.v39i11.33222 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Dataset Bias in Few-Shot Image Recognition

Shuqiang Jiang Yaohui Zhu Chenlong Liu Xinhang Song Xiangyang Li and 1 more

The goal of few-shot image recognition (FSIR) is to identify novel categories with a small number of annotated samples by exploiting transferable knowledge from training data (base categories). Most current studies assume that the transferable knowledge can be well used to identify novel categories. However, such transferable capability may be impacted by dataset bias, and this problem has rarely been investigated before. Besides, most few-shot learning methods are biased toward different datasets, which is also an important issue that needs to be investigated deeply. In this paper, we first...

10.1109/tpami.2022.3153611 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-02-24

Image Captioning with both Object and Scene Information

Xiangyang Li Xinhang Song Luis Herranz Yaohui Zhu Shuqiang Jiang

Recently, automatic generation of image captions has attracted great interest not only because of its extensive applications but also because it connects computer vision and natural language processing. By combining convolutional neural networks (CNNs), which learn visual representations from images, and recurrent neural networks (RNNs), which translate the learned features into text sequences, the content of an image can be transformed into linguistic sequences. Existing approaches typically focus on visual features extracted from an object-oriented CNN (train...

10.1145/2964284.2984069 article EN Proceedings of the 24th ACM International Conference on Multimedia 2016-09-29

Visual relationship detection with object spatial distribution

Yaohui Zhu Shuqiang Jiang Xiangyang Li

Recently, object recognition techniques have been rapidly developed. Most existing work has focused on recognizing several independent concepts. The relationship between objects is also an important problem, which shows in-depth semantic information of images. In this work, toward general visual relationship detection, we propose a method to integrate the spatial distribution of objects to facilitate visual relation detection. Spatial distribution can not only reflect the positional relation of objects but also describe structural information between objects. Spatial distributions are described with...

10.1109/icme.2017.8019448 article EN 2017 IEEE International Conference on Multimedia and Expo (ICME) 2017-07-01

Bundled Object Context for Referring Expressions

Xiangyang Li Shuqiang Jiang

Referring expressions are natural language descriptions of objects within a given scene. Context is of crucial importance for a referring expression, as the description not only depicts properties of the object but also involves relationships of the referred object with other ones. Most previous work uses either the whole image or one particular contextual object as the context. However, the context in these approaches is holistic and insufficient, as a referring expression often describes multiple objects in an image. To leverage rich context information from all objects in an image,...

10.1109/tmm.2018.2811621 article EN IEEE Transactions on Multimedia 2018-03-07

PIC: Enable Large-Scale Privacy Preserving Content-Based Image Search on Cloud

Lan Zhang Taeho Jung Puchun Feng Kebin Liu Xiangyang Li and 1 more

Many cloud platforms have emerged to meet urgent requirements for large-volume personal image storage, sharing and search. Though most would agree that images contain rich sensitive information (e.g., people, location and event) and that people's privacy concerns hinder their participation in untrusted services, today's cloud platforms provide little support for image privacy protection. Facing large-scale images from multiple users, it is extremely challenging for the cloud to maintain the index structure and schedule parallel computation without learning anything about...

10.1109/icpp.2015.104 article EN 2015-09-01

Detecting Insulator Strings as Linked Chain Structure in Smart Grid Inspection

Ning Wei Xiangyang Li Jiaqi Jin Peng Chen Shuifa Sun

In high-voltage power systems, insulators are essential components in transmission lines for increasing shooting distance and securing wires. Unmanned aerial vehicle imaging has become a common way of inspecting the state of insulators. However, automatic detection of insulators with complex backgrounds is still a challenging task. Most existing object detection methods are based on anchors, which do not have sufficient ability to describe objects that have a string-like structure. To tackle this, inspired by the keypoints-based method, we...

10.1109/tii.2022.3224956 article EN IEEE Transactions on Industrial Informatics 2022-11-28

Class Agnostic Image Common Object Detection

Shuqiang Jiang Sisi Liang Chengpeng Chen Yaohui Zhu Xiangyang Li

Learning the similarity of two images is an important problem in computer vision and has many potential applications. Most previous works focus on generating image similarities in three aspects: global feature distance computing, local feature matching, and image concepts comparison. However, the task of directly detecting class agnostic common objects from two images has not been studied before, which goes one step further to capture image similarities at the region level. In this paper, we propose an end-to-end Common Object Detection Network (CODN) to detect...

10.1109/tip.2019.2891124 article EN IEEE Transactions on Image Processing 2019-01-09

Multi-branch Attentive Transformer

Fan Yang Shufang Xie Yingce Xia Lijun Wu Tao Qin and 2 more

While the multi-branch architecture is one of the key ingredients to the success of computer vision tasks, it has not been well investigated in natural language processing, especially sequence learning tasks. In this work, we propose a simple yet effective variant of Transformer called multi-branch attentive Transformer (briefly, MAT), where the attention layer is the average of multiple branches and each branch is an independent multi-head attention layer. We leverage two training techniques to regularize the training: drop-branch, which randomly drops...

10.48550/arxiv.2006.10270 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Modality-specific and hierarchical feature learning for RGB-D hand-held object recognition

Lv Xiong Xinda Liu Xiangyang Li Xue Li Shuqiang Jiang and 1 more

10.1007/s11042-016-3375-5 article EN Multimedia Tools and Applications 2016-03-04

FenceMask: A Data Augmentation Approach for Pre-extracted Image Features

Pu Li Xiangyang Li Xiang Long

We propose a novel data augmentation method named 'FenceMask' that exhibits outstanding performance in various computer vision tasks. It is based on the 'simulation of object occlusion' strategy, which aims to achieve a balance between object occlusion and information retention of the input data. By enhancing the sparsity and regularity of the occlusion block, our method overcomes the difficulty of small object augmentation and notably improves performance over baselines. Sufficient experiments prove that our method performs better than other approaches that simulate object occlusion. We tested it on CIFAR10, CIFAR100 and ImageNet datasets...

10.48550/arxiv.2006.07877 preprint EN other-oa arXiv (Cornell University) 2020-01-01

A cross-region transfer learning method for classification of community service cases with small datasets

Zhao-ge Liu Xiangyang Li Limin Qiao Dilawar Khan Durrani

10.1016/j.knosys.2019.105390 article EN Knowledge-Based Systems 2019-12-18

Cloud-based Privacy Preserving Image Storage, Sharing and Search

Lan Zhang Taeho Jung Puchun Feng Xiangyang Li Yunhao Liu

High-resolution cameras produce a huge volume of high quality images every day. It is extremely challenging to store, share and especially search those images, for which an increasing number of cloud services are presented to support such functionalities. However, images tend to contain rich sensitive information (e.g., people, location and event), and people's privacy concerns hinder their ready participation in the services provided by untrusted third parties. In this work, we introduce PIC: a Privacy-preserving large-scale...

10.48550/arxiv.1410.6593 preprint EN other-oa arXiv (Cornell University) 2014-01-01

MemBridge: Video-Language Pre-Training With Memory-Augmented Inter-Modality Bridge

Jiahao Yang Xiangyang Li Mao Zheng Zihan Wang Yongqing Zhu and 4 more

Video-language pre-training has attracted considerable attention recently for its promising performance on various downstream tasks. Most existing methods utilize modality-specific or modality-joint representation architectures for cross-modality pre-training. Different from previous methods, this paper presents a novel architecture named Memory-augmented Inter-Modality Bridge (MemBridge), which uses learnable intermediate modality representations as the bridge for the interaction between videos and...

10.1109/tip.2023.3283916 article EN IEEE Transactions on Image Processing 2023-01-01

A Systems Thinking Model for Innovation Management: The Knowledge Management Perspective

Xiangyu Kong Xiangyang Li

Innovation is a complex process, involving a variety of factors at different levels. The innovation process and the factors that affect it should be coordinated and managed in a systematic way. However, a coherent framework for innovation management does not yet exist. In this paper, an attempt is made to develop a systems thinking model that simultaneously addresses exploitative and exploratory innovation. By placing innovation in the larger context of systems thinking, the factors influencing its success or failure can be better recognized and understood. Drawing on the theory of knowledge...

10.1109/icmse.2007.4422055 article EN International Conference on Management Science and Engineering 2007-08-01
