NFDI4DS | UHH-SEMS - Publication Details

CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

OPENALEX - Publications

Yuhong Li, Xiaofan Zhang, Deming Chen

We propose a network for Congested Scene Recognition called CSRNet to provide a data-driven and deep learning method that can understand highly congested scenes and perform accurate count estimation as well as present high-quality density maps. The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN for the back-end, which uses dilated kernels to deliver larger reception fields and to replace pooling operations. CSRNet is an easy-trained model because of its pure...

10.1109/cvpr.2018.00120 article EN 2018-06-01

Prompt-Enhanced Software Vulnerability Detection Using ChatGPT

Chenyuan Zhang, Hao Liu, Jiutian Zeng, K Yang, Yuhong Li, and 1 more

With the increase in software vulnerabilities that cause significant economic and social losses, automatic vulnerability detection has become essential in software development and maintenance. Recently, large language models (LLMs) have received considerable attention due to their stunning intelligence, and some studies consider using ChatGPT for vulnerability detection. However, they do not fully consider the characteristics of LLMs, since their designed questions are simple without a prompt design tailored for vulnerability detection. This paper launches a study on...

10.1145/3639478.3643065 article EN 2024-04-14

Applications of Artificial Neural Networks in Financial Economics: A Survey

Yuhong Li, MA Wei-hua

This paper is a survey on the application of artificial neural networks in forecasting financial market prices. The objective of this paper is to appraise the potential of using artificial neural networks to predict the financial system, as it is reflected in many relevant articles. It will provide some guidelines and references for the research and implementation. The paper begins with an introduction to the theory of artificial neural networks. Subsequently, the paper focuses on the forecast of stock prices and option pricing based on the non-linear ANN model. It proceeded with the presentation of predicting exchange rates. The paper then reviewed...

10.1109/iscid.2010.70 article EN International Symposium on Computational Intelligence and Design 2010-10-01

CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Yuhong Li, Xiaofan Zhang, Deming Chen

We propose a network for Congested Scene Recognition called CSRNet to provide a data-driven and deep learning method that can understand highly congested scenes and perform accurate count estimation as well as present high-quality density maps. The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN for the back-end, which uses dilated kernels to deliver larger reception fields and to replace pooling operations. CSRNet is an easy-trained model because of its pure...

10.48550/arxiv.1802.10062 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Self-Supervised Learning for Few-Shot Image Classification

Da Chen, Yuefeng Chen, Yuhong Li, Feng Mao, Yuan He, and 1 more

Few-shot image classification aims to classify unseen classes with limited labelled samples. Recent works benefit from the meta-learning process with episodic tasks and can fast adapt to class from training to testing. Due to the limited number of samples for each task, the initial embedding network for meta-learning becomes an essential component and can largely affect the performance in practice. To this end, most existing methods highly rely on the efficient embedding network. Due to the limited labelled data, the scale of embedding network is constrained under a supervised learning (SL) manner which becomes a bottleneck...

10.1109/icassp39728.2021.9413783 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

EDD: Efficient Differentiable DNN Architecture and Implementation Co-search for Embedded AI Solutions

Yuhong Li, Cong Hao, Xiaofan Zhang, Xinheng Liu, Yao Chen, and 3 more

High quality AI solutions require joint optimization of algorithms and their hardware implementations. In this work, we are the first to propose a fully simultaneous, Efficient Differentiable DNN (deep neural network) architecture and implementation co-search (EDD) methodology. We formulate the co-search problem by fusing DNN search variables and hardware implementation variables into one solution space, and maximize both algorithm accuracy and hardware implementation quality. The formulation is differentiable with respect to the fused variables, so that gradient descent can be applied...

10.1109/dac18072.2020.9218749 article EN 2020-07-01

Parameter optimization of nonlinear grey Bernoulli model using particle swarm optimization

Jianzhong Zhou, Rengcun Fang, Yuhong Li, Yongchuan Zhang, Bing Peng

10.1016/j.amc.2008.10.045 article EN Applied Mathematics and Computation 2008-11-09

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Tianle Cai, Yuhong Li, Zhengyang Geng, Hongwu Peng, Jason D. Lee, and 2 more

The inference process in Large Language Models (LLMs) is often limited due to the absence of parallelism in the auto-regressive decoding process, resulting in most operations being restricted by the memory bandwidth of accelerators. While methods such as speculative decoding have been suggested to address this issue, their implementation is impeded by the challenges associated with acquiring and maintaining a separate draft model. In this paper, we present Medusa, an efficient method that augments LLM inference by adding extra decoding heads to predict...

10.48550/arxiv.2401.10774 preprint EN cc-by arXiv (Cornell University) 2024-01-01

Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

Xichen Pan, Pengda Qin, Yuhong Li, Hui Xue, Wenhu Chen

Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis capacity. Recently, most works focus on synthesizing independent images, while for real-world applications it is common and necessary to generate a series of coherent images for storytelling. In this work, we mainly focus on story visualization and continuation tasks and propose AR-LDM, a latent diffusion model auto-regressively conditioned on history captions and generated images. Moreover, AR-LDM can generalize to new characters through...

10.1109/wacv57701.2024.00290 article EN 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024-01-03

SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems

Xiaofan Zhang, Haoming Lu, Cong Hao, Jiachen Li, Bowen Cheng, and 7 more

Object detection and tracking are challenging tasks for resource-constrained embedded systems. While these tasks are among the most compute-intensive tasks from the artificial intelligence domain, they are only allowed to use limited computation and memory resources on embedded devices. Meanwhile, such resource-constrained implementations are often required to satisfy additional demanding requirements such as real-time response, high-throughput performance, and reliable inference accuracy. To overcome these challenges, we propose SkyNet, a hardware-efficient neural...

10.48550/arxiv.1909.09709 preprint EN other-oa arXiv (Cornell University) 2019-01-01

SiamVGG: Visual Tracking using Deeper Siamese Networks

Yuhong Li, Xiaofan Zhang

Recently, we have seen a rapid development of Deep Neural Network (DNN) based visual tracking solutions. Some trackers combine the DNN-based solutions with Discriminative Correlation Filters (DCF) to extract semantic features and successfully deliver the state-of-the-art tracking accuracy. However, these solutions are highly compute-intensive, which require long processing time, resulting in unsecured real-time performance. To deliver both high accuracy and reliable real-time performance, we propose a novel tracker called...

10.48550/arxiv.1902.02804 preprint EN other-oa arXiv (Cornell University) 2019-01-01

GraphLSHC: Towards large scale spectral hypergraph clustering

Yiyang Yang, Sucheng Deng, Juan Lu, Yuhong Li, Zhiguo Gong, and 2 more

10.1016/j.ins.2020.07.018 article EN Information Sciences 2020-07-24

Attribute-Guided Collaborative Learning for Partial Person Re-Identification

Haoyu Zhang, Meng Liu, Yuhong Li, Ming Yan, Zan Gao, and 2 more

Partial person re-identification (ReID) aims to solve the problem of image spatial misalignment due to occlusions or out-of-views. Despite significant progress through the introduction of additional information, such as human pose landmarks, mask maps, and spatial information, partial ReID remains challenging due to noisy keypoints and impressionable pedestrian representations. To address these issues, we propose a unified attribute-guided collaborative learning scheme for partial ReID. Specifically, we introduce an adaptive threshold-guided...

10.1109/tpami.2023.3312302 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-09-05

Design and Process Analysis of the Injection Mold for Extrusion Valves

Yuhong Li, Jae‐Hyun Kang

By analyzing the product structure of the extrusion valve and the process problems existing in injection molding, and selecting the molding process parameters of polystyrene, the injection mold of this kind of plastic parts is determined to be one mold with four cavities. According to the average shrinkage rate, the size was calculated, and the conical fine striping mechanism and the arc side core-pulling mechanism were designed in the mold. The design and machining process of the forming parts, the pouring system and other structures are described in detail, and the working process of the mold is introduced. The mold has a successful trial, flexible...

10.1088/1742-6596/3004/1/012073 article EN Journal of Physics Conference Series 2025-05-01

A Credit Card Fraud Detection Approach Based on Ensemble Machine Learning Classifier with Hybrid Data Sampling

Kareem Ahmed, Stefan Axelsson, Yuhong Li, Ali Makki Sagheer

10.1016/j.mlwa.2025.100675 article EN cc-by-nc-nd Machine Learning with Applications 2025-05-01

A BiGRUSA-ResSE-KAN Hybrid Deep Learning Model for Day-Ahead Electricity Price Prediction

Nan Yang, Guihong Bi, Yuhong Li, Xiaoling Wang, Zhao Luo, and 1 more

In the context of clean and low-carbon transformation of power systems, addressing the challenge of day-ahead electricity market price prediction issues triggered by the strong stochastic volatility of power supply output due to high-penetration renewable energy integration, as well as problems such as limited dataset scales and short prediction cycles in test sets associated with existing prediction methods, this paper introduced an innovative prediction approach based on a multi-modal feature fusion and BiGRUSA-ResSE-KAN deep learning model. In the data...

10.3390/sym17060805 article EN Symmetry 2025-05-22

VideoModerator: A Risk-aware Framework for Multimodal Video Moderation in E-Commerce

Tan Tang, Yanhong Wu, Yingcai Wu, Lingyun Yu, Yuhong Li

Video moderation, which refers to removing deviant or explicit content from e-commerce livestreams, has become prevalent owing to social and engaging features. However, this task is tedious and time consuming due to the difficulties associated with watching and reviewing multimodal video content, including video frames and audio clips. To ensure effective video moderation, we propose VideoModerator, a risk-aware framework that seamlessly integrates human knowledge with machine insights. This framework incorporates a set of advanced machine learning models...

10.1109/tvcg.2021.3114781 article EN IEEE Transactions on Visualization and Computer Graphics 2021-09-29

What Makes Convolutional Models Great on Long Sequence Modeling?

Yuhong Li, Tianle Cai, Yi Zhang, Deming Chen, Debadeepta Dey

Convolutional models have been widely used in multiple domains. However, most existing models only use local convolution, making the model unable to handle long-range dependency efficiently. Attention overcomes this problem by aggregating global information but also makes the computational complexity quadratic to the sequence length. Recently, Gu et al. [2021] proposed a model called S4 inspired by the state space model. S4 can be efficiently implemented as a global convolutional model whose kernel size equals the input sequence length. S4 can model much longer sequences...

10.48550/arxiv.2210.09298 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Joint Generative-Contrastive Representation Learning for Anomalous Sound Detection

Xiaomin Zeng, Yan Song, Zhu Zhuo, Yu Zhou, Yuhong Li, and 3 more

In this paper, we propose a joint generative and contrastive representation learning method (GeCo) for anomalous sound detection (ASD). GeCo exploits a Predictive AutoEncoder (PAE) equipped with self-attention as a generative model to perform frame-level prediction. The output of the PAE, together with the original normal samples, is used for supervised contrastive representative learning in a multi-task framework. Besides cross-entropy loss between classes, contrastive loss is used to separate PAE output and original samples within each class. GeCo aims to better capture context information among...

10.1109/icassp49357.2023.10095568 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Self-Supervised Adversarial Training

Kejiang Chen, Yuefeng Chen, Hang Zhou, Xiaofeng Mao, Yuhong Li, and 4 more

Recent work has demonstrated that neural networks are vulnerable to adversarial examples. To escape from the predicament, many works try to harden the model in various ways, in which adversarial training is an effective way that learns robust feature representation so as to resist adversarial attacks. Meanwhile, self-supervised learning aims to learn robust and semantic embedding from data itself. With these views, we introduce self-supervised learning against adversarial examples in this paper. Specifically, the self-supervised representation coupled with k-Nearest Neighbour is proposed for classification. To further...

10.1109/icassp40776.2020.9054475 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

Decouple Before Interact: Multi-Modal Prompt Learning for Continual Visual Question Answering

Zi Qian, Xin Wang, Xuguang Duan, Pengda Qin, Yuhong Li, and 1 more

In the real world, a desirable Visual Question Answering model is expected to provide correct answers to new questions and images in a continual setting (recognized as CL-VQA). However, existing works formulate CL-VQA from a vision-only or language-only perspective, and straightforwardly apply the uni-modal continual learning (CL) strategies to this multi-modal task, which is improper and suboptimal. On the one hand, such a partial formulation may result in limited evaluations. On the other hand, neglecting the interactions between modalities will...

10.1109/iccv51070.2023.00276 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Bilinear Representation for Language-based Image Editing Using Conditional Generative Adversarial Networks

Xiaofeng Mao, Yuefeng Chen, Yuhong Li, Tao Xiong, Yuan He, and 1 more

The task of Language-Based Image Editing (LBIE) aims at generating a target image by editing the source image based on the given language description. The main challenge of LBIE is to disentangle the semantics in image and text and then combine them to generate realistic images. Therefore, the editing performance is heavily dependent on the learned representation. In this work, conditional generative adversarial network (cGAN) is utilized for LBIE. We find that existing conditioning methods in cGAN lack representation power as they cannot learn...

10.1109/icassp.2019.8683008 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17

Self-Supervised Learning For Few-Shot Image Classification

Da Chen, Yuefeng Chen, Yuhong Li, Feng Mao, Yuan He, and 1 more

Few-shot image classification aims to classify unseen classes with limited labelled samples. Recent works benefit from the meta-learning process with episodic tasks and can fast adapt to class from training to testing. Due to the limited number of samples for each task, the initial embedding network for meta-learning becomes an essential component and can largely affect the performance in practice. To this end, most existing methods highly rely on the efficient embedding network. Due to the limited labelled data, the scale of embedding network is constrained under a supervised learning (SL) manner which becomes a bottleneck...

10.48550/arxiv.1911.06045 preprint EN other-oa arXiv (Cornell University) 2019-01-01

SkyNet: A Champion Model for DAC-SDC on Low Power Object Detection

Xiaofan Zhang, Cong Hao, Haoming Lu, Jiachen Li, Yuhong Li, and 7 more

Developing artificial intelligence (AI) at the edge is always challenging, since edge devices have limited computation capability and memory resources but need to meet demanding requirements, such as real-time processing, high throughput performance, and high inference accuracy. To overcome these challenges, we propose SkyNet, an extremely lightweight DNN with 12 convolutional (Conv) layers and only 1.82 megabyte (MB) of parameters following a bottom-up DNN design approach. SkyNet is demonstrated in the 56th IEEE/ACM...

10.48550/arxiv.1906.10327 preprint EN other-oa arXiv (Cornell University) 2019-01-01

PPGloVe: Privacy-Preserving GloVe for Training Word Vectors in the Dark

Zhongyun Hua, Yan Tong, Yifeng Zheng, Yuhong Li, Yushu Zhang

Words are treated as atomic units in natural language processing tasks and it is a fundamental step to represent them as vectors for supporting subsequent computations. GloVe is a widely used machine learning model to train word vectors. Generally, a large corpus and high computation resources are required to train high-quality word vectors using GloVe, making it difficult for users to train their own word vectors by themselves. A choice nowadays is to outsource the training process to the cloud. However, coming with such cloud-based services are serious privacy concerns, which...

10.1109/tifs.2024.3364080 article EN IEEE Transactions on Information Forensics and Security 2024-01-01