NFDI4DS | UHH-SEMS - Publication Details

Dan Su

ORCID: 0000-0003-0072-0967

About

Contact & Profiles

A5101913303

Research Areas

Gaze Tracking and Assistive Technology
Visual Attention and Saliency Detection
Glaucoma and retinal disorders
Advanced Image and Video Retrieval Techniques
Advanced Neural Network Applications
Handwritten Text Recognition Techniques
E-commerce and Technology Innovations
Retinal Imaging and Analysis
Speech Recognition and Synthesis
Advanced Computing and Algorithms
COVID-19 diagnosis using AI
Infrared Target Detection Methodologies
Higher Education and Teaching Methods
Spectroscopy Techniques in Biomedical and Chemical Research
Image and Video Quality Assessment
Olfactory and Sensory Function Studies
Advanced Algorithms and Applications
Image and Object Detection Techniques
Multimodal Machine Learning Applications
Face and Expression Recognition
Sharing Economy and Platforms
Blasting Impact and Analysis
Video Surveillance and Tracking Methods
Advanced Electrical Measurement Techniques
EEG and Brain-Computer Interfaces

Hong Kong University of Science and Technology
2023

University of Hong Kong
2023

Lanzhou University
2023

Chinese Academy of Sciences
2020-2022

Guangzhou Regenerative Medicine and Health Guangdong Laboratory
2022

Guangzhou Institutes of Biomedicine and Health
2022

Shenzhen Institutes of Advanced Technology
2020-2021

Heihe University
2014-2020

City University of Hong Kong
2016-2019

Tencent (China)
2019

Multi-modal fusion network with multi-scale multi-path and cross-modal interactions for RGB-D salient object detection

OPENALEX - Publications

Hao Chen Youfu Li Dan Su

10.1016/j.patcog.2018.08.007 article EN Pattern Recognition 2018-08-13

Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training

Wenliang Dai Zihan Liu Ziwei Ji Dan Su Pascale Fung

Large-scale vision-language pre-trained (VLP) models are prone to hallucinate non-existent visual objects when generating text based on visual information. In this paper, we systematically study the object hallucination problem from three aspects. First, we examine recent state-of-the-art VLP models, showing that they still hallucinate frequently, and models achieving better scores on standard metrics (e.g., CIDEr) could be more unfaithful. Second, we investigate how different types of image encoding in VLP influence...

10.18653/v1/2023.eacl-main.156 article EN cc-by 2023-01-01

Discriminative Cross-Modal Transfer Learning and Densely Cross-Level Feedback Fusion for RGB-D Salient Object Detection

Hao Chen Youfu Li Dan Su

This article addresses two key issues in RGB-D salient object detection based on the convolutional neural network (CNN). 1) How to bridge the gap between the "data-hungry" nature of CNNs and the insufficient labeled training data in the depth modality? 2) How to take full advantage of the complementary information among the modalities? To solve the first problem, we model depth-induced saliency detection as a CNN-based cross-modal transfer learning problem. Instead of directly adopting the RGB CNN as initialization, we additionally train a modality...

10.1109/tcyb.2019.2934986 article EN IEEE Transactions on Cybernetics 2019-08-30

Attention-Aware Cross-Modal Cross-Level Fusion Network for RGB-D Salient Object Detection

Hao Chen Youfu Li Dan Su

Convolutional neural networks have achieved wide success in RGB saliency detection. Recently, the advent of RGB-D sensors such as Kinect has provided additional geometric cues. However, the key challenge for RGB-D salient object detection, namely how to fuse RGB and depth information sufficiently, is still under-studied. Traditional works mainly follow the two-stream architecture and combine the features/decisions at an early or late point. The multi-modal fusion stage is performed by directly concatenating the features from the two...

10.1109/iros.2018.8594373 article EN 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2018-10-01

Emerging Governance Approaches for Tourism in the Protected Areas of China

Dan Su Geoffrey Wall Paul F.J. Eagles

10.1007/s00267-006-0185-y article EN Environmental Management 2007-04-19

Cross-Validated Locally Polynomial Modeling for 2-D/3-D Gaze Tracking With Head-Worn Devices

Dan Su Youfu Li Hao Chen

In the context of wearable gaze tracking techniques, the problems of two-dimensional (2-D) and three-dimensional (3-D) gaze estimation can be viewed as inferring 2-D epipolar lines and 3-D visual axes from eye monitoring cameras. To this end, in this article, a simple local polynomial model is proposed to back-project the pupil center onto its corresponding visual axis. Based on approximation, homographylike relation derived manner, via Leave-One-Out cross-validation criterion, training samples at one certain depth...

10.1109/tii.2019.2933481 article EN IEEE Transactions on Industrial Informatics 2019-08-06

Toward Precise Gaze Estimation for Mobile Head-Mounted Gaze Tracking Systems

Dan Su Youfu Li Hao Chen

Gaze estimation in the mobile scenario often suffers from extrapolation and parallax errors. In this paper, we propose a novel calibration framework to achieve precise gaze estimation for head-mounted gaze trackers. Our proposed framework consists of two steps to learn the point-to-point and point-to-line relations, respectively. The aim of step I is to infer the relation between pupil centers and spatially constrained points of regard. By adopting the "CalibMe" data acquisition method, a sparse Gaussian Process using pseudo-inputs is used to capture the smooth...

10.1109/tii.2018.2867952 article EN IEEE Transactions on Industrial Informatics 2018-08-30

Constructing convolutional neural network by utilizing nematode connectome: A brain-inspired method

Dan Su Liangming Chen Xiaohao Du Mei Liu Long Jin

10.1016/j.asoc.2023.110992 article EN Applied Soft Computing 2023-10-30

M3Net: Multi-scale multi-path multi-modal fusion network and example application to RGB-D salient object detection

Hao Chen Youfu Li Dan Su

Fusing RGB and depth data is compelling in boosting performance for various robotic and computer vision tasks. Typically, the streams of RGB and depth information are merged into a single fusion point in an early or late stage to generate combined features or decisions. The single fusion point also means a single fusion path, which is congested and inflexible to fuse all the information from different modalities. As a result, the fusion process is brute-force and consequently insufficient. To address this problem, we propose a multi-scale multi-path multi-modal fusion network (M3Net)...

10.1109/iros.2017.8206370 article EN 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017-09-01

Bilateral Denoising Diffusion Models

Max W. Y. Lam Jun Wang Rongjie Huang Dan Su Dong Yu

Denoising diffusion probabilistic models (DDPMs) have emerged as competitive generative models yet brought challenges to efficient sampling. In this paper, we propose novel bilateral denoising diffusion models (BDDMs), which take significantly fewer steps to generate high-quality samples. From a bilateral modeling objective, BDDMs parameterize the forward and reverse processes with a score network and a scheduling network, respectively. We show that a new lower bound tighter than the standard evidence lower bound can be derived as a surrogate objective for...

10.48550/arxiv.2108.11514 preprint EN other-oa arXiv (Cornell University) 2021-01-01

DISR: Deep Infrared Spectral Restoration Algorithm for Robot Sensing and Intelligent Visual Tracking Systems

Hai Liu Youfu Li Dan Su Zhaoli Zhang Sannyuya Liu and 1 more

Infrared imaging spectrometer (IRIS) often suffers from overlapped bands and random noises, which limit the precision of subsequent processing in robot vision sensing. To address this problem, we propose a novel Gabor transform-based infrared spectrum restoration method by successfully exploring the intrinsic structure of the clean IR spectrum from the degraded one. At first, a total variation (TV) regularized coefficients adjustment descriptor is designed and incorporated into the model. Then, the proposed model is inferred via an...

10.1109/iros40897.2019.8967891 article EN 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2019-11-01

EISRP: Efficient infrared signal restoration processing for object tracking in human-robot interaction

Dan Su Rui Feng

10.1016/j.infrared.2020.103544 article EN Infrared Physics & Technology 2020-10-06

A Moving Target Detection and Localization Strategy Based on Optical Flow and Pin-hole Imaging Methods Using Monocular Vision

Shun Wang Qingqiang Guo Sheng Xu Dan Su

This paper proposes a new strategy for moving target detection and localization based on monocular vision. Firstly, to detect the moving target with large displacement and high speed accurately, two consecutive video images captured by the camera are preprocessed using the enhancement and denoising methods. Then, the optical flow representing the motion information is calculated iteratively by the modified Lucas-Kanade method. Secondly, an interest region extraction method is developed to overcome the negative impacts caused by noises in the background....

10.1109/rcar52367.2021.9517462 article EN 2021 IEEE International Conference on Real-time Computing and Robotics (RCAR) 2021-07-15

A frequency estimation algorithm based on cross information fusion

Dan Su Yaqing Tu Jian-yuan Luo Yanlin Shen Xiao Wei

To improve frequency estimation accuracy, a frequency estimation algorithm based on cross information fusion was proposed. The algorithm is suitable for signals of short duration and low signal-to-noise ratio (SNR), which are common in engineering. Firstly, several different signal groups were obtained by grouping multisegment signals according to the guidelines of combination. Secondly, rotation factors complementary each group. Thirdly, the average spectrum was achieved by the arithmetic mean value of all spectra, calculated by the rotation factors. Finally,...

10.1088/0957-0233/26/1/015004 article EN Measurement Science and Technology 2014-12-01

Toward flexible calibration of head-mounted gaze trackers with parallax error compensation

Dan Su Youfu Li

Although the mobile head-mounted gaze tracker (HMGT) has gained great success in human-machine interactions, the real implementation of HMGT still poses several significant challenges. The parallax error and tedious calibration procedure, as two of these challenges, will be addressed in our proposed two-step method. In the first step, instead of fixating at pre-defined points successively, the user is only required to change his or her head pose while gazing at one marker with the allowance of short-period...

10.1109/robio.2016.7866370 article EN 2016 IEEE International Conference on Robotics and Biomimetics (ROBIO) 2016-12-01

Fuse after Align: Improving Face-Voice Association Learning via Multimodal Encoder

Chong Peng Liqiang He Dan Su

Today, there have been many achievements in learning the association between voice and face. However, most previous work models rely on cosine similarity or L2 distance to evaluate the likeness of voices and faces following contrastive learning, subsequently applied to retrieval and matching tasks. This method only considers the embeddings as high-dimensional vectors, utilizing a minimal scope of the available information. This paper introduces a novel framework within an unsupervised setting for voice-face associations. By...

10.48550/arxiv.2404.09509 preprint EN arXiv (Cornell University) 2024-04-15

Intelligent acquisition model of traffic congestion information in the vehicle networking environment based on multi-sensor fusion

Kun Jiang Dan Su Yanfu Zheng

Aiming at the problems of low bandwidth, poor anti-interference ability and low detection accuracy of the traditional multi-path coherent vehicle network model, an intelligent acquisition model of traffic congestion information in the vehicle network environment based on multi-features is proposed. The clusters uses multi-sensor fusion identification method to mine flow. In networking environment, analysed by theory, cross-fusion, text information, location image, audio, video other information-aware technologies,...

10.1504/ijvics.2019.101512 article EN International Journal of Vehicle Information and Communication Systems 2019-01-01

Research on Intelligent Vehicle Data Fusion Technology Based on Multiple Sensors

Zhiqiang Gao Kaixin Tang Weitong Ji Dan Su

Edge intelligence is the development trend of the integration of ubiquitous computing and artificial intelligence, and autonomous systems represented by smart cars are playing an increasingly important role in edge intelligence architecture design, verification, application services, etc. This article takes the accurate indoor mapping of intelligent vehicles as the research object, and systematically designs a boundary point generation scheme that covers exploration, filtering, publishing, and other parts. A hybrid algorithm...

10.1109/icemi59194.2023.10270059 article EN 2023-08-09

Automatic threshold selection for valid points detection in digital fringe projection

Yi Xiao Youfu Li Dan Su

In digital fringe projection (DFP) techniques, invalid points such as shadows and background cause ambiguity to the measurement. Manually segmenting the object is time-wasting, and improper selection of the threshold makes errors in the measurement. In this paper, we propose an automatic threshold selection technique based on both the modulation histogram and the intensity histogram, which can segment the object from a complex background without losing useful information. The feasibility of the method is verified by experiments with binary defocusing at different defocus levels.

10.1109/icinfa.2017.8078905 article EN 2017-07-01

Precise gaze estimation for mobile gaze trackers based on hybrid two-view geometry

Dan Su Youfu Li Yao Guo

In this paper, we propose a novel calibration framework for the gaze estimation of mobile gaze tracking systems. In our method, the user's eye and the camera are modeled as a central catadioptric camera. Thus the epipolar geometry of the gaze tracker can be described by the hybrid two-view geometry. To calibrate the model, the user is asked to gaze at points distributed in 3-D space but not all located on one plane. In light of the binocular training data, we apply a 3×6 local hybrid-fundamental matrix to register the pupil centers with the epipolar lines in the scene image. image...

10.1109/robio.2017.8324434 article EN 2017 IEEE International Conference on Robotics and Biomimetics (ROBIO) 2017-12-01

Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training

Wenliang Dai Zihan Liu Ziwei Ji Dan Su Pascale Fung

Large-scale vision-language pre-trained (VLP) models are prone to hallucinate non-existent visual objects when generating text based on visual information. In this paper, we systematically study the object hallucination problem from three aspects. First, we examine recent state-of-the-art VLP models, showing that they still hallucinate frequently, and models achieving better scores on standard metrics (e.g., CIDEr) could be more unfaithful. Second, we investigate how different types of image encoding in VLP influence...

10.48550/arxiv.2210.07688 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Learning Discriminative Features in Sequence Training without Requiring Framewise Labelled Data

Jun Wang Dan Su Jie Chen Shulin Feng Dongpeng Ma and 2 more

In this work, we try to answer two questions: Can deeply learned features with discriminative power benefit an ASR system's robustness to acoustic variability? And how to learn them without requiring framewise labelled sequence training data? As existing methods usually require knowing where the labels occur in the input sequence, they have so far been limited in many real-world sequence learning tasks. We propose a novel method which simultaneously models both the sequence discriminative training and the feature discriminative learning within a single network architecture, so that...

10.1109/icassp.2019.8683088 preprint EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17