NFDI4DS | UHH-SEMS - Publication Details

Xun Yang

ORCID: 0000-0003-0201-1638

About

Contact & Profiles

A5034737032

Research Areas

Multimodal Machine Learning Applications
Human Pose and Action Recognition
Advanced Image and Video Retrieval Techniques
Domain Adaptation and Few-Shot Learning
Video Analysis and Summarization
Video Surveillance and Tracking Methods
Wireless Networks and Protocols
Anomaly Detection Techniques and Applications
Recommender Systems and Techniques
Generative Adversarial Networks and Image Synthesis
Advanced Wireless Network Optimization
Wireless Communication Networks Research
Gait Recognition and Analysis
Image Retrieval and Classification Techniques
Advanced MIMO Systems Optimization
Face recognition and analysis
Indoor and Outdoor Localization Technologies
3D Shape Modeling and Analysis
IPv6, Mobility, Handover, Networks, Security
Topic Modeling
Consumer Market Behavior and Pricing
Intraocular Surgery and Lenses
Intelligent Tutoring Systems and Adaptive Learning
Stochastic processes and financial applications
Traumatic Ocular and Foreign Body Injuries

University of Science and Technology of China
2022-2025

Zhejiang University of Science and Technology
2023-2024

Zhejiang Cancer Hospital
2022-2024

Chinese Academy of Sciences
2024

Shanghai Jiao Tong University
2024

Zhengzhou University
2022-2024

Soochow University
2019-2023

Chongqing Dazu District People's Hospital
2023

Chongqing University
2023

University of Chinese Academy of Sciences
2022-2023

Person Re-Identification With Metric Learning Using Privileged Information

OPENALEX - Publications

Xun Yang Meng Wang Dacheng Tao

Despite the promising progress made in recent years, person re-identification remains a challenging task due to the complex variations in human appearances from different camera views. This paper presents a logistic discriminant metric learning method for this problem. Unlike most existing algorithms, it exploits both original data and auxiliary data during training, which is motivated by the new machine learning paradigm of learning using privileged information. Such information is a kind of knowledge, only available...

10.1109/tip.2017.2765836 article EN IEEE Transactions on Image Processing 2017-10-23

Dual Encoding for Video Retrieval by Text

Jianfeng Dong Xirong Li Chaoxi Xu Xun Yang Gang Yang and 2 more

This paper attacks the challenging problem of video retrieval by text. In such a paradigm, an end user searches for unlabeled videos by ad-hoc queries described exclusively in the form of a natural-language sentence, with no visual example provided. Given videos as sequences of frames and queries as sequences of words, effective sequence-to-sequence cross-modal matching is crucial. To that end, the two modalities need to be first encoded into real-valued vectors and then projected into a common space. In this paper we achieve this by proposing a dual deep encoding...

10.1109/tpami.2021.3059295 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-01-01

Deconfounded Video Moment Retrieval with Causal Intervention

Xun Yang Fuli Feng Wei Ji Meng Wang Tat‐Seng Chua

We tackle the task of video moment retrieval (VMR), which aims to localize a specific moment in a video according to a textual query. Existing methods primarily model the matching relationship between query and moment by complex cross-modal interactions. Despite their effectiveness, current models mostly exploit dataset biases while ignoring the video content, thus leading to poor generalizability. We argue that the issue is caused by the hidden confounder in VMR, i.e., the temporal location of moments, which spuriously correlates the model input and prediction. How to design...

10.1145/3404835.3462823 article EN Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2021-07-11

Video Moment Retrieval With Cross-Modal Neural Architecture Search

Xun Yang Shanshan Wang Jian Dong Jianfeng Dong Meng Wang and 1 more

The task of video moment retrieval (VMR) is to retrieve the specific moment from an untrimmed video, according to a textual query. It is a challenging task that requires effective modeling of the complex cross-modal matching relationship. Recent efforts primarily model the cross-modal interactions by hand-crafted network architectures. Despite their effectiveness, they rely heavily on expert experience to select architectures and have numerous hyperparameters that need to be carefully tuned, which significantly limit their applications in real-world...

10.1109/tip.2022.3140611 article EN IEEE Transactions on Image Processing 2022-01-01

Joint Transmit and Reflective Beamforming for IRS-Assisted Integrated Sensing and Communication

Xianxin Song Ding Zhao Haocheng Hua Tony Xiao Han Xun Yang and 1 more

This paper studies an intelligent reflecting surface (IRS)-assisted integrated sensing and communication (ISAC) system, in which one IRS with a uniform linear array (ULA) is deployed to not only assist the wireless communication from a multi-antenna base station (BS) to a single-antenna communication user (CU), but also create virtual line-of-sight (LoS) links for sensing potential targets at areas where the LoS links are blocked. We consider that the BS transmits combined information and sensing signals for ISAC. Under this setup, we jointly optimize the transmit beamforming...

10.1109/wcnc51071.2022.9771801 article EN 2022 IEEE Wireless Communications and Networking Conference (WCNC) 2022-04-10

Repetitive Action Counting with Hybrid Temporal Relation Modeling

Kun Li Xinge Peng Dan Guo Xun Yang Meng Wang

10.1109/tmm.2025.3535385 article EN IEEE Transactions on Multimedia 2025-01-01

Person Reidentification via Structural Deep Metric Learning

Xun Yang Peicheng Zhou Meng Wang

Despite the promising progress made in recent years, person reidentification (re-ID) remains a challenging task due to the complex variations in human appearances from different camera views. This paper proposes to tackle this by jointly learning the feature representation and distance metric in an end-to-end manner. Existing deep learning-based re-ID methods usually encounter the following two weaknesses: 1) most works based on pairwise or triplet constraints often suffer from slow convergence and poor local optima,...

10.1109/tnnls.2018.2861991 article EN IEEE Transactions on Neural Networks and Learning Systems 2018-08-24

Annotating Objects and Relations in User-Generated Videos

Xindi Shang Donglin Di Junbin Xiao Yu Cao Xun Yang and 1 more

Understanding the objects and relations between them is indispensable to fine-grained video content analysis, which is widely studied in recent research works in multimedia and computer vision. However, existing works are limited to evaluating with either small datasets or indirect metrics, such as the performance over images. The underlying reason is that the construction of a large-scale video dataset with dense annotation is tricky and costly. In this paper, we address several main issues in annotating objects and relations in user-generated videos, and propose an...

10.1145/3323873.3325056 article EN 2019-06-05

Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval

Xun Yang Jianfeng Dong Yixin Cao Xun Wang Meng Wang and 1 more

The rapid growth of user-generated videos on the Internet has intensified the need for text-based video retrieval systems. Traditional methods mainly favor the concept-based paradigm with simple queries, which is usually ineffective for complex queries that carry far more semantics. Recently, the embedding-based paradigm has emerged as a popular approach. It aims to map the queries and videos into a shared embedding space where semantically-similar texts and videos are much closer to each other. Despite its simplicity, it forgoes the exploitation of the syntactic...

10.1145/3397271.3401151 article EN 2020-07-25

Enhancing Person Re-identification in a Self-Trained Subspace

Xun Yang Meng Wang Richang Hong Qi Tian Yong Rui

Despite the promising progress made in recent years, person re-identification (re-ID) remains a challenging task due to the complex variations in human appearances from different camera views. For this problem, a large variety of algorithms have been developed in the fully supervised setting, requiring access to a large amount of labeled training data. However, the main bottleneck for re-ID is the limited availability of labeled samples. To address this, we propose a self-trained subspace learning paradigm that effectively utilizes both labeled and...

10.1145/3089249 article EN ACM Transactions on Multimedia Computing Communications and Applications 2017-06-28

Budget Constrained Bidding by Model-free Reinforcement Learning in Display Advertising

Di Wu Xiujun Chen Xun Yang Hao Wang Qing Tan and 3 more

Real-time bidding (RTB) is an important mechanism in online display advertising, where a proper bid for each page view plays an essential role in good marketing results. Budget constrained bidding is a typical scenario in RTB where the advertisers hope to maximize the total value of the winning impressions under a pre-set budget constraint. However, the optimal bidding strategy is hard to derive due to the complexity and volatility of the auction environment. To address these challenges, in this paper, we formulate budget constrained bidding as a Markov Decision Process and propose...

10.1145/3269206.3271748 preprint EN 2018-10-17

Saliency Detection on Light Field

Jun Zhang Meng Wang Liang Lin Xun Yang Jun Gao and 1 more

Saliency detection has recently received increasing research interest on using high-dimensional datasets beyond two-dimensional images. Despite the many available capturing devices and algorithms, there still exists a wide spectrum of challenges that need to be addressed to achieve accurate saliency detection. Inspired by the success of the light-field technique, in this article, we propose a new computational scheme to detect salient regions by integrating multiple visual cues from the light field. First, prior maps are...

10.1145/3107956 article EN ACM Transactions on Multimedia Computing Communications and Applications 2017-07-27

Interpretable Fashion Matching with Rich Attributes

Xun Yang Xiangnan He Xiang Wang Yunshan Ma Fuli Feng and 2 more

Understanding the mix-and-match relationships of fashion items receives increasing attention in the fashion industry. Existing methods have primarily utilized the visual content to learn compatibility and performed matching in a latent space. Despite their effectiveness, these methods work like a black box and cannot reveal the reasons that two items match well. The rich attributes associated with fashion items, e.g., off-shoulder dress and skinny jean, which describe the semantics of items in a human-interpretable way, have largely been ignored.

10.1145/3331184.3331242 article EN 2019-07-18

TransNFCM: Translation-Based Neural Fashion Compatibility Modeling

Xun Yang Yunshan Ma Lizi Liao Meng Wang Tat‐Seng Chua

Identifying mix-and-match relationships between fashion items is an urgent task in a fashion e-commerce recommender system. It will significantly enhance user experience and satisfaction. However, due to the challenges of inferring the rich yet complicated set of compatibility patterns in a large corpus of fashion items, this task is still underexplored. Inspired by the recent advances in multirelational knowledge representation learning and deep neural networks, this paper proposes a novel Translation-based Neural Fashion Compatibility Modeling...

10.1609/aaai.v33i01.3301403 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Multi-Agent Reinforcement Learning-Based Distributed Channel Access for Next Generation Wireless Networks

Ziyang Guo Zhenyu Chen Peng Liu Jianjun Luo Xun Yang and 1 more

In the next generation wireless networks, more applications will emerge, covering virtual reality movies, augmented reality, holographic three-dimensional telepresence, haptic telemedicine and so on, which require the provisioning of high bandwidth efficiency and low latency services. In order to better support the aforementioned services, novel distributed channel access (DCA) schemes are necessary. Therefore, we propose a new MAC protocol, QMIX-advanced Listen-Before-Talk (QLBT), based on the cutting-edge...

10.1109/jsac.2022.3143251 article EN IEEE Journal on Selected Areas in Communications 2022-01-14

Transformer-Based Visual Grounding with Cross-Modality Interaction

Kun Li Jiaxiu Li Dan Guo Xun Yang Meng Wang

This article tackles the challenging yet important task of Visual Grounding (VG), which aims to localize a visual region in a given image referred to by a natural language query. Existing efforts on VG are twofold: (1) two-stage methods first extract proposals and then rank them according to their similarities with the referring expression, which usually leads to suboptimal results due to the quality of the proposals; (2) one-stage methods predict all possible coordinates of the target online by leveraging modern object detection architectures,...

10.1145/3587251 article EN ACM Transactions on Multimedia Computing Communications and Applications 2023-03-09

Emotional Video Captioning With Vision-Based Emotion Interpretation Network

Peipei Song Dan Guo Xun Yang Shengeng Tang Meng Wang

Effectively summarizing and re-expressing video content by natural languages in a more human-like fashion is one of the key topics in the field of multimedia understanding. Despite the good progress made in recent years, existing efforts usually overlooked the emotions in user-generated videos, thus making the generated sentence a bit boring and soulless. To fill the research gap, this paper presents a novel emotional video captioning framework in which we design a Vision-based Emotion Interpretation Network to effectively capture the emotions conveyed...

10.1109/tip.2024.3359045 article EN IEEE Transactions on Image Processing 2024-01-01

Learning Hierarchical Visual Transformation for Domain Generalizable Visual Matching and Recognition

Xun Yang Tianyu Chang Tianzhu Zhang Shanshan Wang Richang Hong and 1 more

10.1007/s11263-024-02106-7 article EN International Journal of Computer Vision 2024-05-27

Flexible Single Microwire X‐Ray Detector with Ultrahigh Sensitivity for Portable Radiation Detection System

Yancheng Chen Shifeng Niu Ying Li Wenjie Dou Xun Yang and 2 more

A sensitive, flexible, and low false alarm rate X‐ray detector is crucial for medical diagnosis, industrial inspection, and scientific research. However, most semiconductor X‐ray detectors are susceptible to interference from ambient light, and their high thickness hinders their application in wearable electronics. Herein, a flexible visible‐blind and ultraviolet‐blind X‐ray detector based on an Indium‐doped Gallium oxide (Ga₂O₃:In) single microwire is prepared. Joint experiment−theory characterizations reveal that the Ga...

10.1002/adma.202404656 article EN Advanced Materials 2024-08-19

Dual-State Personalized Knowledge Tracing With Emotional Incorporation

Shanshan Wang Fangzheng Yuan Keyang Wang Xun Yang Xingyi Zhang and 1 more

10.1109/tkde.2025.3538121 article EN IEEE Transactions on Knowledge and Data Engineering 2025-01-01

Bid Optimization by Multivariable Control in Display Advertising

Xun Yang Yasong Li Hao Wang Di Wu Qing Tan and 2 more

Real-Time Bidding (RTB) is an important paradigm in display advertising, where advertisers utilize extended information and algorithms served by Demand Side Platforms (DSPs) to improve advertising performance. A common problem for DSPs is to help advertisers gain as much value as possible with budget constraints. However, advertisers would routinely add certain key performance indicator (KPI) constraints that the advertising campaign must meet due to practical reasons. In this paper, we study the case where advertisers aim to maximize the quantity of conversions, and set...

10.1145/3292500.3330681 preprint EN 2019-07-25

Interventional Video Relation Detection

Yicong Li Xun Yang Xindi Shang Tat‐Seng Chua

Video Visual Relation Detection (VidVRD) aims to semantically describe the dynamic interactions across visual concepts localized in a video in the form of (subject, predicate, object). It can help to mitigate the semantic gap between vision and language in video understanding, thus receiving increasing attention in multimedia communities. Existing efforts primarily leverage multimodal/spatio-temporal feature fusion to augment the representation of object trajectories as well as their interactions, and formulate the prediction of predicates as a multi-class...

10.1145/3474085.3475540 article EN Proceedings of the 30th ACM International Conference on Multimedia 2021-10-17

Knowledge Enhanced Neural Fashion Trend Forecasting

Yunshan Ma Yujuan Ding Xun Yang Lizi Liao Wai Keung Wong and 1 more

Fashion trend forecasting is a crucial task for both academia and industry. Although some efforts have been devoted to tackling this challenging task, they only studied limited fashion elements with highly seasonal or simple patterns, which could hardly reveal the real fashion trends. Towards insightful fashion trend forecasting, this work focuses on investigating fine-grained fashion element trends for specific user groups. We first contribute a large-scale fashion trend dataset (FIT) collected from Instagram with extracted time series fashion element records and...

10.1145/3372278.3390677 article EN 2020-06-02

Self-Supervised Graph Learning for Long-Tailed Cognitive Diagnosis

Shanshan Wang Zhen Zeng Xun Yang Xingyi Zhang

Cognitive diagnosis is a fundamental yet critical research task in the field of intelligent education, which aims to discover the proficiency level of different students on specific knowledge concepts. Despite the effectiveness of existing efforts, previous methods always considered the mastery level of the students as a whole, so they still suffer from the Long Tail Effect. A large number of students who have sparse interaction records are usually wrongly diagnosed during inference. To relieve the situation, we proposed a Self-supervised Cognitive Diagnosis...

10.1609/aaai.v37i1.25082 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26

MJPNet-S*: Multistyle Joint-Perception Network With Knowledge Distillation for Drone RGB-Thermal Crowd Density Estimation in Smart Cities

Wujie Zhou Xun Yang Xiena Dong Meixin Fang Weiqing Yan and 1 more

Crowd density estimation has gained significant research interest owing to its potential in various industries and social applications. Therefore, this paper proposes a multistyle joint-perception network based on a knowledge distillation-trained student (MJPNet-S*) for drone-based red–green–blue, thermal/depth (RGB-T/D) crowd density estimation tasks. To provide superior accuracy and efficiency, a novel trimodal working module effectively combines the modalities to facilitate comprehensive feature extraction and utilization. A...

10.1109/jiot.2024.3369642 article EN IEEE Internet of Things Journal 2024-02-26
