NFDI4DS | UHH-SEMS - Publication Details

Yuhong Song

ORCID: 0000-0002-4310-2766

About

Contact & Profiles

A5041623482

Research Areas

Parallel Computing and Optimization Techniques
Advanced Data Storage Technologies
Quantum Computing Algorithms and Architecture
Advanced Neural Network Applications
Radiation Effects in Electronics
Advanced Memory and Neural Computing
Advanced Image and Video Retrieval Techniques
Quantum Information and Cryptography
Low-power high-performance VLSI design
Advancements in Semiconductor Devices and Circuit Design
Advanced Software Engineering Methodologies
Ferroelectric and Negative Capacitance Devices
Cloud Computing and Resource Management
Error Correcting Code Techniques
Green IT and Sustainability
Distributed systems and fault tolerance
Visual Attention and Saliency Detection
Quantum-Dot Cellular Automata
Machine Learning in Materials Science
Electronic and Structural Properties of Oxides
Interconnection Networks and Systems
Distributed and Parallel Computing Systems
Blockchain Technology Applications and Security
Speech Recognition and Synthesis
Quantum and electron transport phenomena

Xuzhou Medical College
2025

East China Normal University
2021-2024

Anhui University
2023

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization

OPENALEX - Publications

Panjie Qi Edwin H.‐M. Sha Qingfeng Zhuge Hongwu Peng Shaoyi Huang and 3 more

State-of-the-art Transformer-based models, with gigantic parameters, are difficult to accommodate on resource-constrained embedded devices. Moreover, with the development of technology, more and more devices are available to run a Transformer model. For a model with different constraints (tight or loose), it can be deployed onto devices with different computing power. However, in previous work, designers did not choose the best device among multiple devices. Instead, they just used an existing device to deploy the model, which was not necessarily the best fit and may lead...

10.1109/iccad51958.2021.9643586 article EN 2021 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) 2021-11-01

QuSplit: Achieving Both High Fidelity and Throughput via Job Splitting on Noisy Quantum Computers

OPENALEX - Publications

Jinyang Li Yuhong Song Yipei Liu Jianli Pan Lei Yang and 2 more

As we enter the quantum utility era, the computing paradigm shifts toward quantum-centric computing, where multiple quantum processors collaborate with classical computers, exemplified by platforms like IBM Quantum and Amazon Braket. In this paradigm, efficient resource management is crucial; however, unlike classical computing, quantum processors face significant challenges due to noise, which raises fidelity concerns in quantum applications. Compounding this issue, noise characteristics across different quantum processors are inherently heterogeneous, making optimization...

10.48550/arxiv.2501.12492 preprint EN arXiv (Cornell University) 2025-01-21

Escaping Barren Plateau: Co-Exploration of Quantum Circuit Parameters and Architectures

OPENALEX - Publications

Yipei Liu Yuhong Song Jinyang Li Qiang Guan Cheng‐Chang Lu and 2 more

Barren plateaus (BP), characterized by exponentially vanishing gradients that hinder the training of variational quantum circuits (VQC), present a pervasive and critical challenge in applying variational quantum algorithms to real-world applications. It is widely recognized that the BP problem becomes more pronounced with an increase in the number of parameters. This work demonstrates that the BP problem manifests at different scales depending on the specific application, highlighting the absence of a universal VQC ansatz capable of resolving the BP issue across all...

10.48550/arxiv.2501.13275 preprint EN arXiv (Cornell University) 2025-01-22

The impact of external oblique intercostal block on early postoperative pain and recovery in patients undergoing J-shaped incisions for upper abdominal surgery: a single-center prospective randomized controlled study

OPENALEX - Publications

Shuai Yi Xinlei Zhang Yuhong Song Xiaohui Wang Han Gao and 2 more

The aim of this study was to investigate the effects of external oblique intercostal nerve block (EOIB) on early postoperative pain and recovery in patients undergoing J-shaped incision surgery of the upper abdomen. Patients aged 18-85 years, classified as ASA I-III, undergoing elective open upper abdominal surgery under general anesthesia were included in the study. Patients were randomized into two groups: the EOIB group (Group E) and the control group (Group C). Following induction of anesthesia, Group E received 30 ml of 0.375% ropivacaine and 4 mg of dexamethasone for...

10.1186/s12871-025-03030-0 article EN cc-by-nc-nd BMC Anesthesiology 2025-04-05

Accommodating Transformer onto FPGA

OPENALEX - Publications

Panjie Qi Yuhong Song Hongwu Peng Shaoyi Huang Qingfeng Zhuge and 1 more

Recently, Transformers have gradually gained popularity and perform outstandingly on many Natural Language Processing (NLP) tasks. However, Transformers suffer from heavy computation and a large memory footprint, making them difficult to deploy on embedded devices. The field-programmable gate array (FPGA) is widely used to accelerate deep learning algorithms for its advantages. However, the trained Transformer models are too large to accommodate on an FPGA fabric. To accommodate Transformers onto FPGAs and achieve efficient execution, we propose an acceleration framework coupling...

10.1145/3453688.3461739 article EN Proceedings of the Great Lakes Symposium on VLSI 2021 2021-06-18

Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices

OPENALEX - Publications

Yuhong Song Weiwen Jiang Bingbing Li Panjie Qi Qingfeng Zhuge and 4 more

A pruning-based AutoML framework for run-time reconfigurability, namely RT3, is proposed in this work. This enables Transformer-based large Natural Language Processing (NLP) models to be efficiently executed on resource-constrained mobile devices and reconfigured (i.e., switching models for dynamic hardware conditions) at run-time. Such reconfigurability is the key to saving energy for battery-powered devices,...

10.1109/dac18074.2021.9586295 article EN 2021-11-08

Spatiotemporal knowledge teacher–student reinforcement learning to detect liver tumors without contrast agents

OPENALEX - Publications

Chenchu Xu Yuhong Song Dong Zhang Leonardo Kayat Bittencourt Sree Harsha Tirumani and 1 more

10.1016/j.media.2023.102980 article EN Medical Image Analysis 2023-09-25

Optimizing Data Placement for Hybrid SRAM+Racetrack Memory SPM in Embedded Systems

OPENALEX - Publications

Rui Xu Edwin H.‐M. Sha Qingfeng Zhuge Yuhong Song Han Wang and 1 more

Nonvolatile memory (NVM) has potential as the medium for scratchpad memory (SPM) in embedded devices. Racetrack memory (RM), in particular, is a developing technology that possesses high density and read latency comparable to SRAM. The RM's access operations, however, are based on shift operations. Multiple shift operations will lead to long latency and high energy. In this article, SRAM is borrowed to help with shift reduction. Thus, a novel hybrid SRAM+RM SPM is presented to make use of SRAM's random access and RM's high density. But there are some challenges in the proposed...

10.1109/tcad.2022.3185548 article EN IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2022-06-27

BSC: Block-based Stochastic Computing to Enable Accurate and Efficient TinyML

OPENALEX - Publications

Yuhong Song Edwin H.‐M. Sha Qingfeng Zhuge Rui Xu Yongzhuo Zhang and 2 more

Along with the progress of AI democratization, machine learning (ML) has been successfully applied to edge applications, such as smart phones and automated driving. Nowadays, more applications require ML on tiny devices with extremely limited resources, like the implantable cardioverter defibrillator (ICD), which is known as TinyML. Unlike ML on the edge, TinyML with a limited energy supply has higher demands on low-power execution. Stochastic computing (SC) using bitstreams for data representation is promising since it can perform...

10.1109/asp-dac52403.2022.9712585 article EN 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC) 2022-01-17

Optimizing Efficiency of Machine Learning Based Hard Disk Failure Prediction by Two-Layer Classification-Based Feature Selection

OPENALEX - Publications

Han Wang Qingfeng Zhuge Edwin H.‐M. Sha Rui Xu Yuhong Song

Predicting hard disk failure effectively and efficiently can prevent the high costs of data loss for storage systems. Disk failure prediction based on machine learning and artificial intelligence has gained notable attention because of its good capabilities. Improving the accuracy and performance of disk failure prediction, however, is still a challenging problem. When a disk failure is about to occur, time is limited for the prediction process, including building models and predicting. Faster training would promote the efficiency of model updates, and late predictions not only have...

10.3390/app13137544 article EN cc-by Applied Sciences 2023-06-26

Hardware-aware neural architecture search for stochastic computing-based neural networks on tiny devices

OPENALEX - Publications

Yuhong Song Edwin H.‐M. Sha Qingfeng Zhuge Rui Xu Xiaowei Xu and 2 more

10.1016/j.sysarc.2022.102810 article EN Journal of Systems Architecture 2022-12-22

Loop interchange and tiling for multi-dimensional loops to minimize write operations on NVMs

OPENALEX - Publications

Rui Xu Edwin H.‐M. Sha Qingfeng Zhuge Yuhong Song Han Wang

10.1016/j.sysarc.2022.102799 article EN Journal of Systems Architecture 2022-12-24

Mera: Memory Reduction and Acceleration for Quantum Circuit Simulation via Redundancy Exploration

OPENALEX - Publications

Yuhong Song Edwin H.‐M. Sha Longshan Xu Qingfeng Zhuge Zili Shao

With the development of quantum computing, the quantum processor demonstrates the potential of supremacy in specific applications, such as Grover's database search and popular quantum neural networks (QNNs). For better calibrating quantum algorithms and machines, quantum circuit simulation on classical computers becomes crucial. However, as the number of quantum bits (qubits) increases, the memory requirement grows exponentially. In order to reduce memory usage and accelerate simulation, we propose a multi-level optimization, namely Mera, by exploring computation...

10.48550/arxiv.2411.15332 preprint EN arXiv (Cornell University) 2024-11-22

Mera: Memory Reduction and Acceleration for Quantum Circuit Simulation via Redundancy Exploration

OPENALEX - Publications

Yuhong Song Edwin H.‐M. Sha Longshan Xu Qingfeng Zhuge Zili Shao

10.1109/iccd63220.2024.00087 article EN 2024-11-18

Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices

OPENALEX - Publications

Yuhong Song Weiwen Jiang Bingbing Li Panjie Qi Qingfeng Zhuge and 4 more

A pruning-based AutoML framework for run-time reconfigurability, namely RT3, is proposed in this work. This enables Transformer-based large Natural Language Processing (NLP) models to be efficiently executed on resource-constrained mobile devices and reconfigured (i.e., switching models for dynamic hardware conditions) at run-time. Such reconfigurability is the key to saving energy for battery-powered devices, which widely use the dynamic voltage and frequency scaling (DVFS) technique for hardware reconfiguration to prolong battery life. In...

10.48550/arxiv.2102.06336 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Optimal Loop Tiling for Minimizing Write Operations on NVMs with Complete Memory Latency Hiding

OPENALEX - Publications

Rui Xu Edwin Hsing-Mean Sha Qingfeng Zhuge Yuhong Song Jingzhi Lin

Non-volatile memory (NVM) is expected to be the second-level memory (named remote memory) in a two-level memory hierarchy in the future. However, NVM has limited write endurance, thus it is vital to reduce the number of write operations on NVM. Meanwhile, in the two-level memory hierarchy, prefetch is widely used for fetching certain data before it is actually required, to hide the remote memory access latency. In general, a large-scale nested loop is the performance bottleneck of a program due to the operations caused by the first local memory miss and reuse. Loop tiling is the key technique for grouping iterations so as...

10.1109/asp-dac52403.2022.9712532 article EN 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC) 2022-01-17

Efficient algorithm for full-state quantum circuit simulation with DD compression while maintaining accuracy

OPENALEX - Publications

Yuhong Song Edwin H.‐M. Sha Qingfeng Zhuge Rui Xu Han Wang

10.1007/s11128-023-04160-5 article EN Quantum Information Processing 2023-11-17

QuanPath: achieving one-step communication for distributed quantum circuit simulation

OPENALEX - Publications

Yuhong Song Edwin H.‐M. Sha Qingfeng Zhuge Wanyang Xiao Qijun Dai and 1 more

10.1007/s11128-023-04192-x article EN Quantum Information Processing 2023-12-21

Efficient Checkpoint under Unstable Power Supplies on NVM based Devices

OPENALEX - Publications

Jialin Liu Edwin H.‐M. Sha Qingfeng Zhuge Rui Xu Yuhong Song

In smart cities, the power supply of many embedded devices lacks stability, e.g., solar panels, since they are used outdoors. In these systems, power failures can cause data loss if the data is kept on volatile media like DRAM. Thus, using a backup mechanism to record program status is necessary for going forward. Non-volatile memory (NVM) provides these systems with a convenient method to persist programs' status. However, NVM is notorious for its poor write endurance. In this paper, we propose an automatic checkpoint that can reduce wear....

10.1109/hpcc-dss-smartcity-dependsys57074.2022.00278 article EN 2022-12-01

Pseudo-Log: Restore Global Data Facing Power Failures with Minimum NVM Write

OPENALEX - Publications

Edwin H.‐M. Sha Yeteng Liao Qingfeng Zhuge Rui Xu Yuhong Song and 1 more

Embedded devices and systems are commonly used in various scenarios, but they often face power failures because of the unstable power supply. So it is important to back up the running state and global data of the program under this architecture. Meanwhile, non-volatile memory (NVM) is widely used to store data because of its byte-addressability, low access latency, and persistency. But NVM has a limited write endurance, and the backup procedure may cause many writes to NVM. There are works that optimize the backup of the running state, but few consider the consistency between the running state and global data. In this paper, we...

10.1109/hpcc-dss-smartcity-dependsys57074.2022.00301 article EN 2022-12-01
