Yuhong Li

ORCID: 0009-0009-9185-7133
Research Areas
  • Advanced Neural Network Applications
  • Anomaly Detection Techniques and Applications
  • Advanced Image and Video Retrieval Techniques
  • Adversarial Robustness in Machine Learning
  • Video Surveillance and Tracking Methods
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Human Pose and Action Recognition
  • Advanced Vision and Imaging
  • Topic Modeling
  • Speech Recognition and Synthesis
  • Neural Networks and Applications
  • Advanced Malware Detection Techniques
  • Complex Network Analysis Techniques
  • Image Enhancement Techniques
  • Grey System Theory Applications
  • Advanced Numerical Methods in Computational Mathematics
  • CCD and CMOS Imaging Sensors
  • Machine Learning and Data Classification
  • Music and Audio Processing
  • Generative Adversarial Networks and Image Synthesis
  • Internet Traffic Analysis and Secure E-voting
  • Meteorological Phenomena and Simulations
  • Advanced Graph Neural Networks
  • Natural Language Processing Techniques

Harbin Institute of Technology
2022-2024

Griffith University
2022-2024

Alibaba Group (United States)
2020-2024

Alibaba Group (China)
2017-2024

Liaoning Meteorological Bureau
2019-2024

China Meteorological Administration
2024

University of Illinois Urbana-Champaign
2018-2023

Beijing University of Posts and Telecommunications
2009-2023

Stockholm University
2023

Shenzhen University
2018-2019

We propose a network for Congested Scene Recognition called CSRNet to provide a data-driven and deep learning method that can understand highly congested scenes and perform accurate count estimation as well as present high-quality density maps. The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN for the back-end, which uses dilated kernels to deliver larger reception fields and to replace pooling operations. CSRNet is an easy-trained model because of its pure...

10.1109/cvpr.2018.00120 article EN 2018-06-01

With the increase in software vulnerabilities that cause significant economic and social losses, automatic vulnerability detection has become essential in software development and maintenance. Recently, large language models (LLMs) have received considerable attention due to their stunning intelligence, and some studies consider using ChatGPT for vulnerability detection. However, they do not fully consider the characteristics of LLMs, since the designed questions are simple, without a prompt design tailored to vulnerability detection. This paper launches a study on...

10.1145/3639478.3643065 article EN 2024-04-14

This paper is a survey on the application of artificial neural networks in forecasting financial market prices. The objective of this paper is to appraise the potential of using neural networks to predict the financial system, as it is reflected in many relevant articles. It will provide some guidelines and references for research and implementation. The paper begins with an introduction to the theory of neural networks. Subsequently, it focuses on the forecast of stock prices and option pricing based on the non-linear ANN model. It then proceeds to a presentation of predicting exchange rates. It then reviewed...

10.1109/iscid.2010.70 article EN International Symposium on Computational Intelligence and Design 2010-10-01

We propose a network for Congested Scene Recognition called CSRNet to provide a data-driven and deep learning method that can understand highly congested scenes and perform accurate count estimation as well as present high-quality density maps. The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN for the back-end, which uses dilated kernels to deliver larger reception fields and to replace pooling operations. CSRNet is an easy-trained model because of its pure...

10.48550/arxiv.1802.10062 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Few-shot image classification aims to classify unseen classes with limited labelled samples. Recent works benefit from the meta-learning process with episodic tasks and can adapt quickly to new classes from training to testing. Due to the limited number of samples for each task, the initial embedding network for meta-learning becomes an essential component and can largely affect the performance in practice. To this end, most of the existing methods rely heavily on an efficient embedding network. Due to the limited labelled data, the scale of the embedding network is constrained under a supervised learning (SL) manner, which becomes a bottleneck...

10.1109/icassp39728.2021.9413783 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

High quality AI solutions require joint optimization of AI algorithms and their hardware implementations. In this work, we are the first to propose a fully simultaneous, Efficient Differentiable DNN (deep neural network) architecture and implementation co-search (EDD) methodology. We formulate the co-search problem by fusing DNN search variables and hardware implementation variables into one solution space, and maximize both algorithm accuracy and hardware implementation quality. The formulation is differentiable with respect to the fused variables, so that gradient descent can be applied...

10.1109/dac18072.2020.9218749 article EN 2020-07-01

The inference process in Large Language Models (LLMs) is often limited due to the absence of parallelism in the auto-regressive decoding process, resulting in most operations being restricted by the memory bandwidth of accelerators. While methods such as speculative decoding have been suggested to address this issue, their implementation is impeded by the challenges associated with acquiring and maintaining a separate draft model. In this paper, we present Medusa, an efficient method that augments LLM inference by adding extra decoding heads to predict...

10.48550/arxiv.2401.10774 preprint EN cc-by arXiv (Cornell University) 2024-01-01

Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis capacity. Recently, most works focus on synthesizing independent images; however, for real-world applications, it is common and necessary to generate a series of coherent images for storytelling. In this work, we mainly focus on story visualization and continuation tasks and propose AR-LDM, a latent diffusion model auto-regressively conditioned on history captions and generated images. Moreover, AR-LDM can generalize to new characters through...

10.1109/wacv57701.2024.00290 article EN 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024-01-03

Object detection and tracking are challenging tasks for resource-constrained embedded systems. While these tasks are among the most compute-intensive tasks from the artificial intelligence domain, they are only allowed to use limited computation and memory resources on embedded devices. In the meanwhile, such resource-constrained implementations are often required to satisfy additional demanding requirements such as real-time response, high-throughput performance, and reliable inference accuracy. To overcome these challenges, we propose SkyNet, a hardware-efficient neural...

10.48550/arxiv.1909.09709 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Recently, we have seen a rapid development of Deep Neural Network (DNN) based visual tracking solutions. Some trackers combine the DNN-based solutions with Discriminative Correlation Filters (DCF) to extract semantic features and successfully deliver state-of-the-art tracking accuracy. However, these solutions are highly compute-intensive and require long processing time, resulting in unsecured real-time performance. To deliver both high accuracy and reliable real-time performance, we propose a novel tracker called...

10.48550/arxiv.1902.02804 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Partial person re-identification (ReID) aims to solve the problem of image spatial misalignment due to occlusions or out-of-views. Despite significant progress through the introduction of additional information, such as human pose landmarks and mask maps, partial ReID remains challenging due to noisy keypoints and impressionable pedestrian representations. To address these issues, we propose a unified attribute-guided collaborative learning scheme for partial ReID. Specifically, we introduce an adaptive threshold-guided...

10.1109/tpami.2023.3312302 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2023-09-05

By analyzing the product structure of the extrusion valve and the process problems existing in injection molding, and selecting the molding parameters of polystyrene, the mold for this kind of plastic part is determined to be one mold with four cavities. According to the average shrinkage rate, the size was calculated, and a conical fine striping mechanism and an arc side core-pulling mechanism were designed in the mold. The design and machining of the forming parts, pouring system and other structures are described in detail, and the working process of the mold is introduced. The mold has a successful trial, flexible...

10.1088/1742-6596/3004/1/012073 article EN Journal of Physics Conference Series 2025-05-01

In the context of the clean and low-carbon transformation of power systems, addressing the challenge of day-ahead electricity market price prediction issues triggered by the strong stochastic volatility of power supply output due to high-penetration renewable energy integration, as well as problems such as limited dataset scales and short prediction cycles in test sets associated with existing methods, this paper introduced an innovative approach based on multi-modal feature fusion and a BiGRUSA-ResSE-KAN deep learning model. In the data...

10.3390/sym17060805 article EN Symmetry 2025-05-22

Video moderation, which refers to removing deviant or explicit content from e-commerce livestreams, has become prevalent owing to their social and engaging features. However, this task is tedious and time consuming due to the difficulties associated with watching and reviewing multimodal video content, including video frames and audio clips. To ensure effective video moderation, we propose VideoModerator, a risk-aware framework that seamlessly integrates human knowledge with machine insights. This framework incorporates a set of advanced machine learning models...

10.1109/tvcg.2021.3114781 article EN IEEE Transactions on Visualization and Computer Graphics 2021-09-29

Convolutional models have been widely used in multiple domains. However, most existing models only use local convolution, making the model unable to handle long-range dependency efficiently. Attention overcomes this problem by aggregating global information but also makes the computational complexity quadratic in the sequence length. Recently, Gu et al. [2021] proposed a model called S4 inspired by the state space model. S4 can be efficiently implemented as a global convolutional model whose kernel size equals the input sequence length. S4 can model much longer sequences...

10.48550/arxiv.2210.09298 preprint EN other-oa arXiv (Cornell University) 2022-01-01

In this paper, we propose a joint generative and contrastive representation learning method (GeCo) for anomalous sound detection (ASD). GeCo exploits a Predictive AutoEncoder (PAE) equipped with self-attention as a generative model to perform frame-level prediction. The output of the PAE, together with the original normal samples, is used for supervised contrastive representation learning in a multi-task framework. Besides the cross-entropy loss between classes, a contrastive loss is used to separate PAE output and original samples within each class. GeCo aims to better capture context information among...

10.1109/icassp49357.2023.10095568 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Recent work has demonstrated that neural networks are vulnerable to adversarial examples. To escape from the predicament, many works try to harden the model in various ways, among which adversarial training is an effective way that learns a robust feature representation so as to resist adversarial attacks. Meanwhile, self-supervised learning aims to learn robust and semantic embedding from the data itself. With these views, we introduce self-supervised learning against adversarial examples in this paper. Specifically, the self-supervised representation coupled with k-Nearest Neighbour is proposed for classification. To further...

10.1109/icassp40776.2020.9054475 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

In the real world, a desirable Visual Question Answering model is expected to provide correct answers to new questions and images in a continual setting (recognized as CL-VQA). However, existing works formulate CL-VQA from a vision-only or language-only perspective, and straightforwardly apply uni-modal continual learning (CL) strategies to this multi-modal task, which is improper and suboptimal. On the one hand, such a partial formulation may result in limited evaluations. On the other hand, neglecting the interactions between modalities will...

10.1109/iccv51070.2023.00276 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

The task of Language-Based Image Editing (LBIE) aims at generating a target image by editing the source image based on the given language description. The main challenge of LBIE is to disentangle the semantics in image and text and then combine them to generate realistic images. Therefore, the editing performance is heavily dependent on the learned representation. In this work, a conditional generative adversarial network (cGAN) is utilized for LBIE. We find that existing conditioning methods in cGAN lack representation power as they cannot learn...

10.1109/icassp.2019.8683008 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17

Few-shot image classification aims to classify unseen classes with limited labelled samples. Recent works benefit from the meta-learning process with episodic tasks and can adapt quickly to new classes from training to testing. Due to the limited number of samples for each task, the initial embedding network for meta-learning becomes an essential component and can largely affect the performance in practice. To this end, most of the existing methods rely heavily on an efficient embedding network. Due to the limited labelled data, the scale of the embedding network is constrained under a supervised learning (SL) manner, which becomes a bottleneck...

10.48550/arxiv.1911.06045 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Developing artificial intelligence (AI) at the edge is always challenging, since edge devices have limited computation capability and memory resources but need to meet demanding requirements, such as real-time processing, high throughput performance, and high inference accuracy. To overcome these challenges, we propose SkyNet, an extremely lightweight DNN with 12 convolutional (Conv) layers and only 1.82 megabyte (MB) of parameters following a bottom-up DNN design approach. SkyNet is demonstrated in the 56th IEEE/ACM...

10.48550/arxiv.1906.10327 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Words are treated as atomic units in natural language processing tasks and it is a fundamental step to represent them as vectors for supporting subsequent computations. GloVe is a widely used machine learning model to train word vectors. Generally, a large corpus and high computation resources are required to train high-quality word vectors using GloVe, making it difficult for users to train their own word vectors by themselves. A choice nowadays is to outsource the training process to the cloud. However, coming with such cloud-based services are serious privacy concerns, which...

10.1109/tifs.2024.3364080 article EN IEEE Transactions on Information Forensics and Security 2024-01-01