NFDI4DS | UHH-SEMS - Publication Details

Jian Sun

ORCID: 0009-0006-9443-4046

About

Contact & Profiles

A5104089339

Research Areas

Advanced Neural Network Applications
Topic Modeling
Advanced Image and Video Retrieval Techniques
Domain Adaptation and Few-Shot Learning
Speech and dialogue systems
Natural Language Processing Techniques
Multimodal Machine Learning Applications
Robotics and Sensor-Based Localization
Distributed Control Multi-Agent Systems
Video Surveillance and Tracking Methods
Robotic Path Planning Algorithms
Human Pose and Action Recognition
AI in Service Interactions
Stochastic Gradient Optimization Techniques
Visual Attention and Saliency Detection
Advanced Graph Neural Networks
Sparse and Compressive Sensing Techniques
Brain Tumor Detection and Classification
Advanced Vision and Imaging
Optical Network Technologies
Quantum Information and Cryptography
Machine Learning and Data Classification
Anomaly Detection Techniques and Applications
Software Testing and Debugging Techniques
Digital Media Forensic Detection

Hohai University
2024-2025

Beijing Institute of Technology
2022-2025

Beihang University
2022-2023

Chongqing University of Technology
2023

China XD Group (China)
2023

Electric Power Research Institute
2023

QuantumCTek (China)
2023

Anhui University
2023

Megvii (China)
2022

Taizhou University
2022

Rethinking on Multi-Stage Networks for Human Pose Estimation

OPENALEX - Publications

Wenbo Li Zhicheng Wang Binyi Yin Qixiang Peng Yuming Du and 5 more

Existing pose estimation approaches fall into two categories: single-stage and multi-stage methods. While multi-stage methods are seemingly more suited for the task, their performance in current practice is not as good as single-stage methods. This work studies this issue. We argue that the current multi-stage methods' unsatisfactory performance comes from the insufficiency in various design choices. We propose several improvements, including the single-stage module design, cross stage feature aggregation, and coarse-to-fine supervision. The resulting method establishes the new state-of-the-art on...

10.48550/arxiv.1901.00148 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Unifying Voxel-based Representation with Transformer for 3D Object Detection

Yanwei Li Yilun Chen Xiaojuan Qi Zeming Li Jian Sun and 1 more

In this work, we present a unified framework for multi-modality 3D object detection, named UVTR. The proposed method aims to unify multi-modality representations in the voxel space for accurate and robust single- or cross-modality 3D detection. To this end, the modality-specific space is first designed to represent different inputs in the voxel feature space. Different from previous work, our approach preserves the voxel space without height compression to alleviate semantic ambiguity and enable spatial connections. To make full use of different sensors, the cross-modality interaction is then proposed,...

10.48550/arxiv.2206.00630 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Bundling features for large scale partial-duplicate web image search

Wu Zhong Qifa Ke M. Isard Jian Sun

In state-of-the-art image retrieval systems, an image is represented by a bag of visual words obtained by quantizing high-dimensional local image descriptors, and scalable schemes inspired by text retrieval are then applied for large scale image indexing and retrieval. Bag-of-words representations, however: 1) reduce the discriminative power of image features due to feature quantization; and 2) ignore geometric relationships among visual words. Exploiting such geometric constraints, by estimating a 2D affine transformation between a query image and each candidate image, has...

10.1109/cvprw.2009.5206566 article EN 2009 IEEE Conference on Computer Vision and Pattern Recognition 2009-06-01

GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection

Wanwei He Yinpei Dai Yinhe Zheng Yuchuan Wu Zheng Cao and 7 more

Pre-trained models have proved to be powerful in enhancing task-oriented dialog systems. However, current pre-training methods mainly focus on enhancing dialog understanding and generation tasks while neglecting the exploitation of dialog policy. In this paper, we propose GALAXY, a novel pre-trained dialog model that explicitly learns dialog policy from limited labeled dialogs and large-scale unlabeled dialog corpora via semi-supervised learning. Specifically, we introduce a dialog act prediction task for policy optimization during pre-training and employ a consistency...

10.48550/arxiv.2111.14592 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Research on the Digital Twin Polder Area System Driven by Integrating the Xin’anjiang Model and the N‐BEATS Model

Feng Ye Zishuo Jin Peng Zhang Dong Xu Lan Lin and 1 more

Digital twins are propelling the next generation of industrial revolution and serve as a key technology in enabling intelligent water conservancy. However, due to the diversity of objects within water conservancy scenarios and the complexity of related factors, the research and application of digital twins in the water conservancy field remain immature. There are still significant challenges in constructing a fine‐grained, high‐fidelity digital twin for water conservancy objects and their corresponding scenarios. In this context, taking polder areas as subjects, a digital twin polder area system is proposed, which includes...

10.1155/int/8899669 article EN cc-by International Journal of Intelligent Systems 2025-01-01

Adaptive Dynamic Programming for Optimal Control of Unknown LTI System via Interval Excitation

Yongsheng Ma Jian Sun Yong Xu Shisheng Cui Zheng‐Guang Wu

10.1109/tac.2025.3542328 article EN IEEE Transactions on Automatic Control 2025-01-01

Proton: Probing Schema Linking Information from Pre-trained Language Models for Text-to-SQL Parsing

Lihan Wang Bowen Qin Binyuan Hui Bowen Li Min Yang and 6 more

The importance of building text-to-SQL parsers which can be applied to new databases has long been acknowledged, and a critical step to achieve this goal is schema linking, i.e., properly recognizing mentions of unseen columns or tables when generating SQLs. In this work, we propose a novel framework to elicit relational structures from large-scale pre-trained language models (PLMs) via a probing procedure based on the Poincaré distance metric, and use the induced relations to augment current graph-based parsers for better...

10.1145/3534678.3539305 article EN Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022-08-12

A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions

Bowen Qin Binyuan Hui Lihan Wang Min Yang Jinyang Li and 7 more

Text-to-SQL parsing is an essential and challenging task. The goal of text-to-SQL parsing is to convert a natural language (NL) question to its corresponding structured query language (SQL) based on the evidences provided by relational databases. Early text-to-SQL parsing systems from the database community achieved noticeable progress at the cost of heavy human engineering and user interactions with the systems. In recent years, deep neural networks have significantly advanced this task by neural generation models, which automatically learn a mapping function from an input...

10.48550/arxiv.2208.13629 preprint EN other-oa arXiv (Cornell University) 2022-01-01
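
The task described in the survey abstract above, mapping a natural language question to a SQL query over a relational database, can be illustrated with a toy input/output pair. This is only a sketch of the task format: the `papers` table, its rows, the question, and the target SQL are all hypothetical and do not come from the paper.

```python
import sqlite3

# Hypothetical example pair: both the question and the SQL are made up.
QUESTION = "How many papers were published in 2022?"
TARGET_SQL = "SELECT COUNT(*) FROM papers WHERE year = 2022"


def build_database() -> sqlite3.Connection:
    """Create a tiny in-memory database with an assumed `papers` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE papers (title TEXT, year INTEGER)")
    conn.executemany(
        "INSERT INTO papers (title, year) VALUES (?, ?)",
        [("Paper A", 2021), ("Paper B", 2022), ("Paper C", 2022)],
    )
    return conn


def answer(conn: sqlite3.Connection, sql: str) -> int:
    """Execute the SQL a parser would emit and return the single count value."""
    return conn.execute(sql).fetchone()[0]


conn = build_database()
print(QUESTION, "->", answer(conn, TARGET_SQL))
```

A text-to-SQL parser would be expected to produce something equivalent to `TARGET_SQL` from `QUESTION` and the database schema; here the query is written by hand to show what a correct output looks like.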

Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs

Xiaohan Ding Xiangyu Zhang Yizhuang Zhou Jungong Han Guiguang Ding and 1 more

We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few large convolutional kernels instead of a stack of small kernels could be a more powerful paradigm. We suggested five guidelines, e.g., applying re-parameterized large depth-wise convolutions, to design efficient high-performance large-kernel CNNs. Following the guidelines, we propose RepLKNet, a pure CNN architecture whose kernel size is as large as 31x31, in contrast to commonly used 3x3. RepLKNet...

10.48550/arxiv.2203.06717 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation

Yingxiu Zhao Zhiliang Tian Huaxiu Yao Yinhe Zheng Dongkyu Lee and 3 more

Yingxiu Zhao, Zhiliang Tian, Huaxiu Yao, Yinhe Zheng, Dongkyu Lee, Yiping Song, Jian Sun, Nevin Zhang. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

10.18653/v1/2022.acl-long.44 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

Anchor DETR: Query Design for Transformer-Based Object Detection

Ying‐Ming Wang Xiangyu Zhang Tong Yang Jian Sun

In this paper, we propose a novel query design for the transformer-based object detection. In previous transformer-based detectors, the object queries are a set of learned embeddings. However, each learned embedding does not have an explicit physical meaning and we cannot explain where it will focus on. It is difficult to optimize as the prediction slot of each object query does not have a specific mode. In other words, each object query will not focus on a specific region. To solve these problems, in our query design, object queries are based on anchor points, which are widely used in CNN-based detectors. So each object query focuses on the objects near the anchor point. Moreover, our query design can...

10.48550/arxiv.2109.07107 preprint EN other-oa arXiv (Cornell University) 2021-01-01

PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

Yingfei Liu Junjie Yan Fan Jia Shuailin Li Qi Gao and 3 more

In this paper, we propose PETRv2, a unified framework for 3D perception from multi-view images. Based on PETR, PETRv2 explores the effectiveness of temporal modeling, which utilizes the temporal information of previous frames to boost 3D object detection. More specifically, we extend the 3D position embedding (3D PE) in PETR for temporal modeling. The 3D PE achieves the temporal alignment on object position of different frames. A feature-guided position encoder is further introduced to improve the data adaptability of 3D PE. To support multi-task learning (e.g., BEV segmentation and 3D lane...

10.48550/arxiv.2206.01256 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Data-driven control of consensus tracking for discrete-time multi-agent systems

Xiufeng Zhang Gang Wang Jian Sun

10.1016/j.jfranklin.2023.02.036 article EN Journal of the Franklin Institute 2023-03-10

Doc2Bot: Accessing Heterogeneous Documents via Conversational Bots

Haomin Fu Yeqin Zhang Haiyang Yu Jian Sun Fei Huang and 3 more

This paper introduces Doc2Bot, a novel dataset for building machines that help users seek information via conversations. This is of particular interest for companies and organizations that own a large number of manuals or instruction books. Despite its potential, the nature of our task poses several challenges: (1) documents contain various structures that hinder the ability of machines to comprehend, and (2) user information needs are often underspecified. Compared to prior datasets that either focus on a single structural type or overlook the role of questioning...

10.18653/v1/2022.findings-emnlp.131 article EN cc-by 2022-01-01

Efficient reversible data hiding via two layers of double-peak embedding

Fuhu Wu Jian Sun Shun Zhang Naixue Xiong Hong Zhong

10.1016/j.ins.2023.119264 article EN Information Sciences 2023-06-01

Scalable distributed least square algorithms for large-scale linear equations via an optimization approach

Yi Huang Ziyang Meng Jian Sun

10.1016/j.automatica.2022.110572 article EN Automatica 2022-09-06

Flash matting

Jian Sun Yin Li Sing Bing Kang Heung‐Yeung Shum

In this paper, we propose a novel approach to extract mattes using a pair of flash/no-flash images. Our approach, which we call flash matting, was inspired by the simple observation that the most noticeable difference between the flash and no-flash images is the foreground object if the background scene is sufficiently distant. We apply a new matting algorithm called joint Bayesian flash matting to robustly recover the matte from flash/no-flash images, even for scenes in which the foreground and the background are similar or the background is complex. Experimental results involving a variety of complex indoors and outdoors...

10.1145/1179352.1141954 article EN 2006-01-01

Single image haze removal using dark channel prior

Kaiming He Jian Sun Xiaoou Tang

In this paper, we propose a simple but effective image prior, the dark channel prior, to remove haze from a single input image. The dark channel prior is a kind of statistics of the haze-free outdoor images. It is based on a key observation: most local patches in haze-free outdoor images contain some pixels which have very low intensities in at least one color channel. Using this prior with the haze imaging model, we can directly estimate the thickness of the haze and recover a high quality haze-free image. Results on a variety of outdoor haze images demonstrate the power of the proposed prior. Moreover, a high quality depth map can also be obtained as a by-product of haze removal.

10.1109/cvprw.2009.5206515 article EN 2009 IEEE Conference on Computer Vision and Pattern Recognition 2009-06-01
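
The key observation in the haze removal abstract above (most local patches of a haze-free image contain some pixel with very low intensity in at least one color channel) suggests a simple per-pixel statistic: the minimum over color channels, followed by a minimum over a local patch. The sketch below computes that statistic on plain Python lists so that it runs without dependencies. The function name, the patch size, the border handling, and the toy images are assumptions made for illustration, not details taken from the paper.

```python
def dark_channel(image, patch=3):
    """Return the dark channel of an RGB image given as rows of (r, g, b) tuples.

    For each pixel: take the minimum over the color channels, then the minimum
    of those values over a patch x patch window centered on the pixel
    (windows are clipped at the image border, which is an assumed choice).
    """
    height, width = len(image), len(image[0])
    # Step 1: per-pixel minimum across the three color channels.
    min_rgb = [[min(pixel) for pixel in row] for row in image]
    radius = patch // 2
    dark = [[0.0] * width for _ in range(height)]
    # Step 2: minimum filter over the local patch.
    for y in range(height):
        for x in range(width):
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            dark[y][x] = min(min_rgb[j][i] for j in rows for i in cols)
    return dark


# Toy 2x2 images with intensities in [0, 1]; the values are made up.
clear = [[(0.9, 0.1, 0.8), (0.5, 0.6, 0.4)],
         [(0.3, 0.9, 0.9), (0.7, 0.7, 0.2)]]
hazy = [[(0.9, 0.8, 0.8), (0.8, 0.9, 0.7)],
        [(0.7, 0.9, 0.9), (0.8, 0.8, 0.9)]]
print(dark_channel(clear))
print(dark_channel(hazy))
```

On these toy inputs the dark channel of `clear` is low everywhere while that of `hazy` stays high, which is the contrast the prior relies on. Estimating haze thickness from this statistic additionally requires the haze imaging model mentioned in the abstract, which is not sketched here.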

Field test of quantum key distribution over aerial fiber based on simple and stable modulation

Yan-Lin Tang Zhi-Lin Xie Chun Zhou Dexiang Zhang Mu-Lan Xu and 15 more

We have developed a simple time-bin phase encoding quantum key distribution system, using the optical injection locking technique. This setup incorporates both the merits of simplicity and stability in encoding, and immunity to channel disturbance. We have demonstrated the field implementation of quantum key distribution over long-distance deployed aerial fiber automatically. During the 70-day field test, we achieved approximately 1.0 kbps of secure key rate with stable performance. Our work takes an important step toward widespread implementation of QKD systems in diverse...

10.1364/oe.494318 article EN cc-by Optics Express 2023-07-04

Simple Baselines for Image Restoration

Liangyu Chen Xiaojie Chu Xiangyu Zhang Jian Sun

Although there have been significant advances in the field of image restoration recently, the system complexity of the state-of-the-art (SOTA) methods is increasing as well, which may hinder the convenient analysis and comparison of methods. In this paper, we propose a simple baseline that exceeds the SOTA methods and is computationally efficient. To further simplify the baseline, we reveal that the nonlinear activation functions, e.g. Sigmoid, ReLU, GELU, Softmax, etc. are not necessary: they could be replaced by multiplication or removed....

10.48550/arxiv.2204.04676 preprint EN other-oa arXiv (Cornell University) 2022-01-01
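
The image restoration abstract above states that nonlinear activation functions could be replaced by multiplication, but it is cut off before giving details. One possible reading is a gate that splits a feature vector into two halves and multiplies them elementwise. The sketch below assumes that reading; the function name and the split-in-half scheme are illustrative assumptions, not something confirmed by the text above.

```python
def multiplicative_gate(features):
    """Split a feature vector into two halves and multiply them elementwise.

    Assumed illustration of "activation replaced by multiplication":
    the output has half as many entries as the input.
    """
    if len(features) % 2 != 0:
        raise ValueError("feature length must be even")
    half = len(features) // 2
    first, second = features[:half], features[half:]
    return [a * b for a, b in zip(first, second)]


print(multiplicative_gate([1.0, -2.0, 3.0, 0.5]))
```

The product of two feature halves is nonlinear in the input even though no Sigmoid, ReLU, or GELU is applied, which is why a gate of this kind can stand in for an activation function.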

Prompt Conditioned VAE: Enhancing Generative Replay for Lifelong Learning in Task-Oriented Dialogue

Yingxiu Zhao Yinhe Zheng Zhiliang Tian Chang Gao Jian Sun and 1 more

Lifelong learning (LL) is vital for advanced task-oriented dialogue (ToD) systems. To address the catastrophic forgetting issue of LL, generative replay methods are widely employed to consolidate past knowledge with generated pseudo samples. However, most existing generative replay methods use only a single task-specific token to control their models. This scheme is usually not strong enough to constrain the generative model due to insufficient information involved. In this paper, we propose a novel method, prompt conditioned VAE for lifelong...

10.18653/v1/2022.emnlp-main.766 article EN cc-by Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022-01-01

Expression recognition method combining convolutional features and Transformer

Xiaoning Zhu Zhongyi Li Jian Sun

Expression recognition has been an important research direction in the field of psychology, which can be used in traffic, medical, security, and criminal investigation by expressing human feelings through the muscles in the corners of the mouth, eyes, and face. Most of the existing work uses convolutional neural networks (CNN) to recognize face images and thus classify expressions, which does achieve good results, but CNN do not have enough ability to extract global features. The Transformer has advantages for global feature extraction, but the Transformer is more...

10.3934/mfc.2022018 article EN Mathematical Foundations of Computing 2022-07-04

SPACE-3: Unified Dialog Model Pre-training for Task-Oriented Dialog Understanding and Generation

Wanwei He Yinpei Dai Min Yang Jian Sun Fei Huang and 2 more

Recently, pre-training methods have shown remarkable success in task-oriented dialog (TOD) systems. However, most existing pre-trained models for TOD focus on either dialog understanding or dialog generation, but not both. In this paper, we propose SPACE-3, a novel unified semi-supervised pre-trained conversation model learning from large-scale dialog corpora with limited annotations, which can be effectively fine-tuned on a wide range of downstream dialog tasks. Specifically, SPACE-3 consists of four successive components in a single...

10.48550/arxiv.2209.06664 preprint EN cc-by arXiv (Cornell University) 2022-01-01
