- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Quantum Computing Algorithms and Architecture
- Advanced Neural Network Applications
- Radiation Effects in Electronics
- Advanced Memory and Neural Computing
- Advanced Image and Video Retrieval Techniques
- Quantum Information and Cryptography
- Low-Power High-Performance VLSI Design
- Advancements in Semiconductor Devices and Circuit Design
- Advanced Software Engineering Methodologies
- Ferroelectric and Negative Capacitance Devices
- Cloud Computing and Resource Management
- Error Correcting Code Techniques
- Green IT and Sustainability
- Distributed Systems and Fault Tolerance
- Visual Attention and Saliency Detection
- Quantum-Dot Cellular Automata
- Machine Learning in Materials Science
- Electronic and Structural Properties of Oxides
- Interconnection Networks and Systems
- Distributed and Parallel Computing Systems
- Blockchain Technology Applications and Security
- Speech Recognition and Synthesis
- Quantum and Electron Transport Phenomena
Xuzhou Medical College
2025
East China Normal University
2021-2024
Anhui University
2023
State-of-the-art Transformer-based models, with their gigantic parameter counts, are difficult to accommodate on resource-constrained embedded devices. Moreover, with the development of technology, more and more embedded devices are available to run a Transformer model. For a Transformer model with different constraints (tight or loose), it can be deployed onto devices with different computing power. However, in previous work, designers did not choose the best device among multiple devices. Instead, they just used an existing device to deploy the model, which was not necessarily the best fit and may lead...
As we enter the quantum utility era, the computing paradigm shifts toward quantum-centric computing, where multiple quantum processors collaborate with classical computers, exemplified by platforms like IBM Quantum and Amazon Braket. In this paradigm, efficient resource management is crucial; however, unlike classical computers, quantum processors face significant challenges due to noise, which raises fidelity concerns in quantum applications. Compounding this issue, the noise characteristics across different quantum processors are inherently heterogeneous, making optimization...
Barren plateaus (BP), characterized by exponentially vanishing gradients that hinder the training of variational quantum circuits (VQC), present a pervasive and critical challenge in applying quantum algorithms to real-world applications. It is widely recognized that the BP problem becomes more pronounced with an increase in the number of parameters. This work demonstrates that the BP problem manifests at different scales depending on the specific application, highlighting the absence of a universal VQC ansatz capable of resolving the BP issue across all...
The aim of this study was to investigate the effects of external oblique intercostal nerve block (EOIB) on early postoperative pain and recovery in patients undergoing J-shaped incision surgery of the upper abdomen. Patients aged 18-85 years, classified as ASA I-III, and undergoing elective open upper abdominal surgery under general anesthesia were included in the study. Patients were randomized into two groups: the EOIB group (Group E) and the control group (Group C). Following the induction of anesthesia, Group E received 30 ml of 0.375% ropivacaine with 4 mg of dexamethasone for...
Recently, Transformers have gradually gained popularity and perform outstandingly on many Natural Language Processing (NLP) tasks. However, Transformers suffer from heavy computation and a large memory footprint, making them difficult to deploy on embedded devices. The field-programmable gate array (FPGA) is widely used to accelerate deep learning algorithms for its advantages. However, the trained Transformer models are too large to accommodate on an FPGA fabric. To accommodate Transformers onto FPGA and achieve efficient execution, we propose an acceleration framework coupling...
A pruning-based AutoML framework for run-time reconfigurability, namely RT3, is proposed in this work. This enables Transformer-based large Natural Language Processing (NLP) models to be efficiently executed on resource-constrained mobile devices and reconfigured (i.e., switching models for dynamic hardware conditions) at run-time. Such reconfigurability is the key to saving energy for battery-powered mobile devices,...
Nonvolatile memory (NVM) has potential as the medium for scratchpad memory (SPM) in embedded devices. Racetrack memory (RM), in particular, is a developing NVM technology that possesses high density and read latency comparable to SRAM. The RM's access operations, however, are based on shift operations. Multiple shift operations will lead to long latency and high energy. In this article, SRAM is borrowed to help with shift reduction. Thus, a novel hybrid SRAM+RM SPM is presented to make use of SRAM's random access and RM's high density. But there are some challenges in the proposed...
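To illustrate why shift operations dominate racetrack memory access cost, the following is a minimal sketch of a simplified cost model. It assumes a single track with one access port, where reaching a position costs one shift per position moved; the function name `total_shifts` and the example access sequences are hypothetical and are not taken from the article.

```python
def total_shifts(accesses, start=0):
    """Count shifts needed to serve a sequence of track positions.

    Assumed model: one access port, and moving the port alignment from
    position p to position q costs abs(q - p) shifts.
    """
    position = start
    shifts = 0
    for target in accesses:
        shifts += abs(target - position)
        position = target
    return shifts


# The same four positions, visited in two different orders.
scattered = [0, 7, 1, 6]
grouped = [0, 1, 6, 7]
print(total_shifts(scattered), total_shifts(grouped))
```

Under this assumed model, the order in which positions are visited changes the shift count considerably, which is the kind of cost that shift-reduction techniques target.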
Along with the progress of AI democratization, machine learning (ML) has been successfully applied to edge applications, such as smart phones and automated driving. Nowadays, more applications require ML on tiny devices with extremely limited resources, like the implantable cardioverter defibrillator (ICD), which is known as TinyML. Unlike ML on the edge, TinyML has a limited energy supply and higher demands on low-power execution. Stochastic computing (SC) using bitstreams for data representation is promising for TinyML since it can perform...
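As background on the bitstream representation mentioned above, here is a minimal sketch of a well-known property of unipolar stochastic computing: a value in [0, 1] is encoded as the fraction of 1s in a random bitstream, and a bitwise AND of two independent streams approximates the product of their values. This is a generic illustration, not the method of the paper; the helper names, stream length, and seed are assumptions.

```python
import random


def encode(value, length, rng):
    """Encode a value in [0, 1] as a random bitstream with P(bit = 1) = value."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def decode(bits):
    """Recover the encoded value as the fraction of 1s in the stream."""
    return sum(bits) / len(bits)


def sc_multiply(a_bits, b_bits):
    """Multiply two unipolar streams with a bitwise AND, one gate per bit."""
    return [x & y for x, y in zip(a_bits, b_bits)]


rng = random.Random(0)  # fixed seed so the run is repeatable
a = encode(0.5, 4096, rng)
b = encode(0.25, 4096, rng)
product = decode(sc_multiply(a, b))
print(f"approximate product: {product:.3f} (exact: {0.5 * 0.25})")
```

The result is only an estimate: its accuracy depends on the stream length and on the two streams being statistically independent.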
Predicting hard disk failure effectively and efficiently can prevent the high costs of data loss for storage systems. Disk failure prediction based on machine learning and artificial intelligence has gained notable attention because of its good prediction capabilities. Improving the accuracy and performance of disk failure prediction, however, is still a challenging problem. When a disk failure is about to occur, time is limited for the prediction process, including building models and predicting. Faster training would promote the efficiency of model updates, and late predictions not only have...
With the development of quantum computing, the quantum processor demonstrates the potential of supremacy in specific applications, such as Grover's database search and popular quantum neural networks (QNNs). For better calibrating the quantum algorithms and machines, quantum circuit simulation on classical computers becomes crucial. However, as the number of quantum bits (qubits) increases, the memory requirement grows exponentially. In order to reduce memory usage and accelerate simulation, we propose a multi-level optimization, namely Mera, by exploring computation...
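The exponential memory growth mentioned above can be made concrete with a short calculation. This sketch assumes dense state-vector simulation, which stores 2^n complex amplitudes for n qubits, and 16 bytes per amplitude (double-precision complex); both are assumptions for illustration, since the abstract does not state the simulator's representation. The function name `statevector_bytes` is hypothetical.

```python
def statevector_bytes(num_qubits, bytes_per_amplitude=16):
    """Bytes needed to hold a dense state vector of 2**num_qubits amplitudes."""
    if num_qubits < 0:
        raise ValueError("num_qubits must be non-negative")
    return (1 << num_qubits) * bytes_per_amplitude


for n in (10, 20, 30, 40):
    gib = statevector_bytes(n) / 2**30
    print(f"{n} qubits: {gib:g} GiB")
```

Each additional qubit doubles the requirement, so under these assumptions 30 qubits already need 16 GiB.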
A pruning-based AutoML framework for run-time reconfigurability, namely RT3, is proposed in this work. This enables Transformer-based large Natural Language Processing (NLP) models to be efficiently executed on resource-constrained mobile devices and reconfigured (i.e., switching models for dynamic hardware conditions) at run-time. Such reconfigurability is the key to saving energy for battery-powered mobile devices, which widely use the dynamic voltage and frequency scaling (DVFS) technique for hardware reconfiguration to prolong battery life. In...
Non-volatile memory (NVM) is expected to be the second-level memory (named remote memory) in the two-level memory hierarchy in the future. However, NVM has limited write endurance, thus it is vital to reduce the number of write operations on NVM. Meanwhile, in the two-level memory hierarchy, prefetch is widely used for fetching certain data before it is actually required, to hide the access latency. In general, a large-scale nested loop is the performance bottleneck in one program due to the operations caused by first local memory miss and reuse. Loop tiling is the key technique for grouping iterations so as...
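For readers unfamiliar with loop tiling, the following is a generic sketch of the transformation: a two-level nested loop is split so that iterations are executed in small blocks (tiles), which lets each block's data be reused while it is still in fast local memory. The transpose workload, the function name `tiled_transpose`, and the tile size are assumptions chosen for illustration; this is not the tiling scheme proposed in the paper.

```python
def tiled_transpose(matrix, tile=2):
    """Transpose a rectangular list-of-lists matrix, visiting it tile by tile."""
    rows, cols = len(matrix), len(matrix[0])
    out = [[None] * rows for _ in range(cols)]
    # Outer loops step over tile origins; inner loops cover one tile.
    for ii in range(0, rows, tile):
        for jj in range(0, cols, tile):
            for i in range(ii, min(ii + tile, rows)):
                for j in range(jj, min(jj + tile, cols)):
                    out[j][i] = matrix[i][j]
    return out


print(tiled_transpose([[1, 2, 3], [4, 5, 6]], tile=2))
```

The tiled loop performs the same iterations as the untiled one, only in a different order, so the result is unchanged while the access pattern becomes block-local.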
In smart cities, the power supply of many embedded devices, e.g., solar panels, lacks stability, since they are used outdoors. In these systems, power failures can cause data to be lost if the data is kept on volatile media like DRAM. Thus, using a backup mechanism to record program status is necessary for going forward. Non-volatile memory (NVM) provides these systems with a convenient method to persist programs' status. However, NVM is notorious for its poor write endurance. In this paper, we propose an automatic checkpoint mechanism that can reduce wear....
Embedded devices and systems are commonly used in various scenarios. But they often face power failures because of the unstable power supply. So it is important to back up the running state and the global data of the program under this architecture. Meanwhile, non-volatile memory (NVM) is widely used to store data for its byte-addressability, low access latency, and persistency. NVM has a limited write endurance, and the backup procedure may cause many writes to NVM. There are works that optimize the backup of the running state, but few consider the consistency between the running state and the global data. In this paper, we...