NFDI4DS | UHH-SEMS - Publication Details

Xun Jiang

ORCID: 0000-0003-2209-651X

About

Contact & Profiles

A5101786662

Research Areas

Multimodal Machine Learning Applications
Video Analysis and Summarization
Human Pose and Action Recognition
Advanced Image and Video Retrieval Techniques
Advanced Computational Techniques and Applications
Sentiment Analysis and Opinion Mining
Music and Audio Processing
Image Retrieval and Classification Techniques
Anomaly Detection Techniques and Applications
Data Mining Algorithms and Applications
Time Series Analysis and Forecasting
Scientific Computing and Data Management
Data Analysis with R
Technology and Security Systems
Embedded Systems and FPGA Design
Layered Double Hydroxides Synthesis and Applications
Research Data Management Practices
Embedded Systems Design Techniques
Software Engineering Techniques and Practices
Advanced Text Analysis Techniques
Quantum Computing Algorithms and Architecture
Diabetes Treatment and Management
Advanced Materials and Mechanics
Dynamics and Control of Mechanical Systems
Advanced Software Engineering Methodologies

University of Electronic Science and Technology of China
2012-2025

Collaborative Innovation Center of Advanced Microstructures
2024

Nanjing University
2012-2024

Liaoning University
2024

NARI Group (China)
2023

Amgen (United States)
2021-2022

Yale University
2021

Zhejiang Ocean University
2019

Chongqing University of Posts and Telecommunications
2017

University of Connecticut
2013

Accelerating materials property predictions using machine learning

OPENALEX - Publications

Ghanshyam Pilania Chenchen Wang Xun Jiang Sanguthevar Rajasekaran Rampi Ramprasad

The materials discovery process can be significantly expedited and simplified if we learn effectively from available knowledge and data. In the present contribution, we show that efficient and accurate prediction of a diverse set of properties of material systems is possible by employing machine (or statistical) learning methods trained on quantum mechanical computations in combination with the notions of chemical similarity. Using a family of one-dimensional chain systems, we present a general formalism that allows us to discover...

10.1038/srep02810 article EN cc-by Scientific Reports 2013-09-30

pH-sensitive ZnO/carboxymethyl cellulose/chitosan bio-nanocomposite beads for colon-specific release of 5-fluorouracil

Xiaoxiao Sun Chao Liu Ahmed M. Omer Wuhuan Lu Shuxing Zhang and 4 more

10.1016/j.ijbiomac.2019.01.140 article EN International Journal of Biological Macromolecules 2019-01-26

Semi-supervised Video Paragraph Grounding with Contrastive Encoder

Xun Jiang Xing Xu Jingran Zhang Fumin Shen Zuo Cao and 1 more

Video events grounding aims at retrieving the most relevant moments from an untrimmed video in terms of a given natural language query. Most previous works focus on Video Sentence Grounding (VSG), which localizes the moment with a sentence query. Recently, researchers extended this task to Video Paragraph Grounding (VPG) by retrieving multiple events with a paragraph. However, we find that existing VPG methods may not perform well on context modeling and rely heavily on video-paragraph annotations. To tackle this problem, we propose a novel VPG method termed Semi-supervised...

10.1109/cvpr52688.2022.00250 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion

Zixian Gao Xun Jiang Xing Xu Fumin Shen Yujie Li and 1 more

10.1109/cvpr52733.2024.02538 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

DHHN: Dual Hierarchical Hybrid Network for Weakly-Supervised Audio-Visual Video Parsing

Xun Jiang Xing Xu Zhiguo Chen Jingran Zhang Jingkuan Song and 3 more

The Weakly-Supervised Audio-Visual Video Parsing (AVVP) task aims to parse a video into temporal segments and predict their event categories in terms of modalities, labeling them as either audible, visible, or both. Since the boundaries and modality annotations are not provided, and only video-level labels are available, this task is more challenging than conventional video understanding tasks. Most previous works attempt to analyze videos by jointly modeling the audio and video data and then learning information from the segment-level...

10.1145/3503161.3548309 article EN Proceedings of the 30th ACM International Conference on Multimedia 2022-10-10

Multi-Grained Attention Network With Mutual Exclusion for Composed Query-Based Image Retrieval

Shenshen Li Xing Xu Xun Jiang Fumin Shen Xin Liu and 1 more

The Composed Query-Based Image Retrieval (CQBIR) task aims to precisely obtain the preserved and modified parts, based on the multi-grained semantics learned from the composed query. Since the composed query includes a reference image and modification text, not just a single modality, this task is more challenging than general image retrieval tasks. Most previous methods attempt to learn the preserved and modified parts via different attention modules and fuse...

10.1109/tcsvt.2023.3306738 article EN IEEE Transactions on Circuits and Systems for Video Technology 2023-08-21

So Far Yet So Near: Time Series Data Augmentation with Exploring non-Semantic Boundaries based on Reinforcement Learning

Haoran Li Zhibo Zhang Jiarong Kang Xun Jiang Xiaoli Gong and 3 more

10.1109/icassp49660.2025.10889129 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Bridging 3D Molecular Structures and Artificial Intelligence by a Conformation Description Language

Jiacheng Xiong Yuqi Shi Wei Zhang Runze Zhang Zhiyi Chen and 5 more

Artificial intelligence, particularly language models (LMs), is reshaping research paradigms across scientific domains. In the fields of chemistry and pharmacy, chemical language models (CLMs) have achieved remarkable success in two-dimensional (2D) molecular modeling tasks by leveraging one-dimensional (1D) representations of molecules, such as SMILES and SELFIES. However, extending these successes to three-dimensional (3D) molecular modeling remains a significant challenge, largely due to the absence of effective 1D representations for capturing 3D...

10.1101/2025.05.07.652440 preprint EN cc-by-nc-nd 2025-05-12

Zero-Shot Video Moment Retrieval With Angular Reconstructive Text Embeddings

Xun Jiang Xing Xu Zailei Zhou Yang Yang Fumin Shen and 1 more

10.1109/tmm.2024.3396272 article EN IEEE Transactions on Multimedia 2024-01-01

Faster Video Moment Retrieval with Point-Level Supervision

Xun Jiang Zailei Zhou Xing Xu Yang Yang Guoqing Wang and 1 more

Video Moment Retrieval (VMR) aims at retrieving the most relevant events from an untrimmed video with natural language queries. Existing VMR methods suffer from two defects: (1) massive, expensive temporal annotations are required to obtain satisfying performance; (2) complicated cross-modal interaction modules are deployed, which lead to high computational cost and low efficiency for the retrieval process. To address these issues, we propose a novel method termed Cheaper and Faster Moment Retrieval (CFMR), which balances the retrieval accuracy,...

10.1145/3581783.3612394 article EN 2023-10-26

Joint Searching and Grounding: Multi-Granularity Video Content Retrieval

Zhiguo Chen Xun Jiang Xing Xu Zuo Cao Yijun Mo and 1 more

Text-based video retrieval (TVR) is a well-studied task aimed at retrieving relevant videos from a large collection in response to a given text query. Most existing TVR works assume that videos are already trimmed and fully relevant to the query, thus ignoring that in most real-world scenarios untrimmed videos contain massive irrelevant content. Moreover, as users' queries are relevant only to events rather than complete videos, it is also more practical to provide specific events rather than an untrimmed video list. In this paper, we introduce a challenging but realistic task called...

10.1145/3581783.3612349 article EN 2023-10-26

SDN: Semantic Decoupling Network for Temporal Language Grounding

Xun Jiang Xing Xu Jingran Zhang Fumin Shen Zuo Cao and 1 more

Temporal language grounding (TLG) is one of the most challenging cross-modal video understanding tasks, which aims at retrieving the most relevant video segment from an untrimmed video according to a natural language sentence. The existing methods can be separated into two dominant types: 1) proposal-based and 2) proposal-free methods, where the former conduct contextual interactions and the latter localize timestamps flexibly. However, the constant-scale candidates in proposal-based methods limit the localization precision and bring extra computational costs. In...

10.1109/tnnls.2022.3211850 article EN IEEE Transactions on Neural Networks and Learning Systems 2022-11-03

Cross-Modal Attention Preservation with Self-Contrastive Learning for Composed Query-Based Image Retrieval

Shenshen Li Xing Xu Xun Jiang Fumin Shen Zhe Sun and 1 more

In this article, we study the challenging cross-modal image retrieval task, Composed Query-Based Image Retrieval (CQBIR), in which the query is not a single text query but a composed query, i.e., a reference image and a modification text. Compared with the conventional image-text retrieval task, CQBIR is more challenging as it requires properly preserving and modifying the specific image region according to the multi-level semantic information learned from the multi-modal query. Most recent works focus on extracting the preserved and modified information and compositing it into a unified...

10.1145/3639469 article EN ACM Transactions on Multimedia Computing Communications and Applications 2024-01-09

Joint Objective and Subjective Fuzziness Denoising for Multimodal Sentiment Analysis

Xun Jiang Xing Xu Huimin Lu Lianghua He Heng Tao Shen

Multimodal Sentiment Analysis (MSA) aims at teaching computers or robotics to understand human sentiment with diverse multimodal signals, including audio, vision, and text. Current MSA approaches primarily concentrate on devising fusion strategies for multimodal signals and trying to learn better joint representations. However, employing multimodal signals directly is not appropriate since the psychological states are fuzzy and cannot be categorized easily, which undermines the effectiveness of existing methods. In this paper, we regard...

10.1109/tfuzz.2024.3405541 article EN IEEE Transactions on Fuzzy Systems 2024-01-01

Fuzzy Multimodal Graph Reasoning for Human-Centric Instructional Video Grounding

Yujie Li Xun Jiang Xing Xu Huimin Lu Heng Tao Shen

10.1109/tfuzz.2024.3436030 article EN IEEE Transactions on Fuzzy Systems 2024-07-31

Language-enhanced object reasoning networks for video moment retrieval with text query

Gongmian Wang Xun Jiang Ning Liu Xing Xu

10.1016/j.compeleceng.2022.108137 article EN Computers & Electrical Engineering 2022-06-24

On the Programmatic Generation of Reproducible Documents

Michael J. Kane Xun Jiang Simon Urbanek

Reproducible document standards, like R Markdown, facilitate the programmatic creation of documents whose content is itself programmatically generated. While programmatic content alone may not be sufficient for a rendered document, since it does not include prose (content generated by an author to provide context, a narrative, etc.), programmatic generation can provide substantial efficiencies for structuring and constructing documents. This paper explores the programmatic generation of reproducible documents by distinguishing components that can be created by computational means from those requiring...

10.18637/jss.v103.i08 article EN cc-by Journal of Statistical Software 2022-01-01

GTLR: Graph-Based Transformer with Language Reconstruction for Video Paragraph Grounding

Xun Jiang Xing Xu Jingran Zhang Fumin Shen Zuo Cao and 1 more

Video Paragraph Grounding (VPG) aims at retrieving multiple relevant moments from an untrimmed video with a given natural language paragraph query. However, the complex paragraph query brings more challenges to multimodal fusion and context modeling, which limits the performance of existing VPG methods. To this end, we propose a novel framework for VPG in this paper, termed Graph-based Transformer with Language Reconstruction (GTLR). It consists of three components: (1) a Multimodal Graph Encoder for conducting graph reasoning...

10.1109/icme52920.2022.9859847 article EN 2022 IEEE International Conference on Multimedia and Expo (ICME) 2022-07-18

A User-Configurable and High-Reliability Transmission Design

Xun Jiang Jian Li Yi Shu Yan Song

10.1109/icpeca60615.2024.10471177 article EN 2024-01-26

PTAN: Principal Token-aware Adjacent Network for Compositional Temporal Grounding

Z. X. Wei Xun Jiang Zheng Wang Fumin Shen Xing Xu

Compositional temporal grounding (CTG) aims to localize the most relevant segment from an untrimmed video based on a given natural language sentence, and the test samples for this task contain novel components not seen in training. However, existing CTG methods suffer from two shortcomings: (1) Most adopt transformers to model global information only, thus failing to balance the long-range perception and regional representation of sequences; (2) Due to the lack of aligning videos and sentences at a fine-grained level, the model's...

10.1145/3652583.3658113 article EN 2024-05-30

Image Compression and Reconstruction Based on Quantum Network

Xun Jiang Qin Liu Shan Huang Andi Chen Shengjun Wang

10.1109/ipdpsw63119.2024.00184 article EN 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2024-05-27

Uncertainty-Debiased Multimodal Fusion: Learning Deterministic Joint Representation for Multimodal Sentiment Analysis

Zixian Gao Xun Jiang Hua Chen Yujie Li Yang Yang and 1 more

10.1109/icme57554.2024.10688376 article EN 2024 IEEE International Conference on Multimedia and Expo (ICME) 2024-07-15

Temporal Self-Paced Proposal Learning for Weakly-Supervised Video Moment Retrieval and Highlight Detection

Liqing Zhu Xun Jiang Fumin Shen Guoqing Wang Yang Yang and 1 more

10.1109/icme57554.2024.10687638 article EN 2024 IEEE International Conference on Multimedia and Expo (ICME) 2024-07-15
