Kai Li

ORCID: 0000-0003-3251-931X
Research Areas
  • Advanced Memory and Neural Computing
  • Advanced Neural Network Applications
  • Ferroelectric and Negative Capacitance Devices
  • Numerical Methods and Algorithms
  • Parallel Computing and Optimization Techniques
  • Low-power high-performance VLSI design
  • Neural Networks and Applications
  • CCD and CMOS Imaging Sensors
  • Advanced Data Storage Technologies
  • Advancements in Semiconductor Devices and Circuit Design
  • VLSI and Analog Circuit Testing
  • Topic Modeling
  • Time Series Analysis and Forecasting
  • Integrated Circuits and Semiconductor Failure Analysis
  • Anomaly Detection Techniques and Applications
  • Machine Learning and ELM

Southern University of Science and Technology
2022-2025

Multi-bit-width convolutional neural networks (CNNs) maintain a balance between accuracy and hardware efficiency, offering a promising route to accurate yet energy-efficient edge computing. In this work, we develop a state-of-the-art multi-bit-width accelerator for NAS-optimized deep learning networks. To process inference efficiently, multi-level optimizations are proposed. Firstly, differentiable Neural Architecture Search (NAS) is adopted for network generation. Secondly, hybrid...

10.1109/tcsi.2022.3178474 article EN IEEE Transactions on Circuits and Systems I Regular Papers 2022-06-10
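The core idea behind multi-bit-width (mixed-precision) networks is that each layer can carry a different quantization width. A minimal sketch of uniform per-layer quantization, with hypothetical layer names and bit-widths standing in for whatever a NAS search would actually assign:

```python
def quantize(x, bits):
    """Uniformly quantize x in [-1, 1) to a signed `bits`-wide code,
    then return the value that code represents (dequantized)."""
    scale = 2 ** (bits - 1)
    q = max(-scale, min(scale - 1, round(x * scale)))  # round and clamp
    return q / scale

# Hypothetical per-layer bit-widths a mixed-precision search might assign.
layer_bits = {"conv1": 8, "conv2": 4, "conv3": 2}
w = 0.3141
for name, bits in layer_bits.items():
    print(name, bits, quantize(w, bits))
```

The printed values show the same weight losing precision as the width shrinks, which is the accuracy/efficiency trade the abstract refers to.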

There is an emerging need to design multi-precision floating-point (FP) accelerators for high-performance-computing (HPC) applications. The commonly used methods are based on high-precision-split (HPS) and low-precision-combination (LPC) structures, which suffer from low hardware utilization ratios and multiple clock-cycle processing periods. In this brief, a new FP processing element (PE) is developed with a proposed bit-partitioning method. Minimized redundant bits in the operands are achieved. The PE supports...

10.1109/tcsii.2022.3183007 article EN IEEE Transactions on Circuits & Systems II Express Briefs 2022-06-14
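The general identity behind bit-partitioning a wide multiply (independent of this brief's specific partitioning scheme) is that a 2k-bit product decomposes into four k-bit sub-products, shifted and summed:

```python
def mul_partitioned(a, b, k):
    """Multiply two 2k-bit unsigned operands using only k-bit sub-multiplies."""
    mask = (1 << k) - 1
    ah, al = a >> k, a & mask   # split each operand into high/low k-bit halves
    bh, bl = b >> k, b & mask
    # Four k-bit partial products, aligned by shifts and summed.
    return (ah * bh << 2 * k) + ((ah * bl + al * bh) << k) + al * bl

# e.g. a 24-bit mantissa product assembled from 12-bit multiplier hardware
assert mul_partitioned(0xABCDEF, 0x123456, 12) == 0xABCDEF * 0x123456
```

A multi-precision PE exploits this by keeping only the k-bit multipliers in hardware and reconfiguring how the partial products are combined.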

Optimized deep neural network (DNN) models and energy-efficient hardware designs are of great importance in edge-computing applications. Neural architecture search (NAS) methods are employed for DNN model optimization with mixed-bitwidth networks. To satisfy the computation requirements, convolution accelerators with low-power, high-throughput performance are highly desired. Several methods exist to support multiply-accumulate (MAC) operations in accelerator designs. The low-bitwidth-combination (LBC) method improves...

10.1109/tvlsi.2022.3210069 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2022-11-03
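The point of a low-bitwidth-combination datapath is hardware reuse: one array of narrow multipliers serves the narrow mode directly and the wide mode by combination. A toy sketch (not the paper's circuit) with a 4x4 primitive serving both 4-bit and 8-bit MACs:

```python
def mul4(a, b):
    """The 4x4-bit multiplier primitive the array is built from."""
    assert 0 <= a < 16 and 0 <= b < 16
    return a * b

def mac_lbc(acts, wts, bits):
    """MAC over 4-bit or 8-bit unsigned operands using only mul4 (LBC idea)."""
    acc = 0
    for a, w in zip(acts, wts):
        if bits == 4:
            acc += mul4(a, w)
        else:  # 8-bit mode: combine four 4-bit products with shifts
            ah, al = a >> 4, a & 0xF
            wh, wl = w >> 4, w & 0xF
            acc += (mul4(ah, wh) << 8) + ((mul4(ah, wl) + mul4(al, wh)) << 4) \
                   + mul4(al, wl)
    return acc

assert mac_lbc([200, 17], [99, 3], 8) == 200 * 99 + 17 * 3
```

In silicon the trade-off is that 8-bit mode consumes four primitives (and extra adders) per MAC, which is where the throughput/precision balance comes from.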

Multi-bit-width neural networks offer a promising route to high-performance yet energy-efficient edge computing due to their balance between software algorithm accuracy and hardware efficiency. To date, the FPGA has been one of the core platforms for deploying various networks. However, it is still difficult to make full use of the dedicated digital signal processing (DSP) blocks when accelerating a multi-bit-width network. In this work, we develop a state-of-the-art convolutional accelerator with a novel...

10.1145/3543622.3573209 article EN 2023-02-10
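The standard trick for exploiting a wide FPGA DSP multiplier with narrow operands is to pack two activations into one multiplier input, separated by enough guard bits that the two products never overlap. A minimal unsigned sketch (real DSP packing of signed operands needs a correction step this omits):

```python
def packed_two_products(a0, a1, w):
    """Compute a0*w and a1*w (8-bit unsigned operands) with a single wide
    multiplication, by packing a0 and a1 into one multiplier input."""
    S = 16  # a0*w < 2**16, so the low product cannot spill into the high one
    packed = (a1 << S) | a0
    p = packed * w               # one multiplication on the wide datapath
    return p & ((1 << S) - 1), p >> S   # extract the two products

assert packed_two_products(200, 45, 113) == (200 * 113, 45 * 113)
```

This doubles the MAC throughput per DSP block for low-bit-width layers, which is the utilization problem the abstract describes.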

Computing-in-memory (CIM) accelerators integrate storage and computing, which can effectively improve the efficiency of convolutional neural networks (CNNs). To improve throughput and computational energy efficiency while maintaining accuracy, this paper proposes an SRAM CIM accelerator with a capacitor-coupling method. The charge-domain accumulation scheme reduces the impact of multiply-accumulate (MAC) unit variations, making it possible to operate in a fully-parallel manner. Furthermore, the array size...

10.1109/aicas57966.2023.10168630 article EN 2022 IEEE 4th International Conference on Artificial Intelligence Circuits and Systems (AICAS) 2023-06-11
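A toy behavioral model of a capacitor-coupled CIM column, under simplifying assumptions (binary inputs and weights, ideal unit capacitors, ideal ADC) that the actual circuit does not need: each cell drives its capacitor high when the bitwise product is 1, charge sharing averages the cell voltages, and an ADC digitizes the shared-node voltage.

```python
def cim_column_mac(inputs, weights, adc_bits=4, vdd=1.0):
    """Behavioral model of one capacitor-coupled SRAM CIM column.
    inputs/weights are 0/1 lists of equal length (one entry per cell)."""
    n = len(inputs)
    products = [i & w for i, w in zip(inputs, weights)]  # in-cell AND "multiply"
    v_out = sum(products) / n * vdd                      # charge sharing averages
    levels = 2 ** adc_bits - 1
    return round(v_out / vdd * levels)                   # ideal ADC code

inputs  = [1, 0, 1, 1, 1, 0, 1, 1]
weights = [1, 1, 1, 0, 1, 0, 0, 1]
print(cim_column_mac(inputs, weights))  # code proportional to the dot product
```

Because the accumulation happens as charge on one shared node, all cells contribute in a single step, which is the fully-parallel operation the abstract mentions.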

Approximate computing is an emerging and effective method for reducing energy consumption in digital circuits, which is critical to the energy-efficient performance improvement of edge-computing devices. In this paper, we propose a low-power DNN accelerator with a novel signed approximate multiplier based on a probability-optimized compressor and error compensation. A customized partial product matrix (PPM) is built for the operands, and the optimal logic circuit is obtained after probabilistic analysis and optimization. At the same time,...

10.1109/ojcas.2023.3279251 article EN cc-by IEEE Open Journal of Circuits and Systems 2024-01-01
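One simple way to see the partial-product-matrix trade-off (a cruder approximation than the paper's compressor-based design): drop the PPM bits in the lowest columns, which removes adder hardware at the cost of a small, bounded, always-negative error.

```python
def approx_mul(a, b, trunc=4):
    """8x8 unsigned multiplier that zeroes partial-product bits in the lowest
    `trunc` columns, a basic truncation-style approximate multiplier."""
    acc = 0
    for i in range(8):
        if (b >> i) & 1:
            pp = a << i
            acc += pp & ~((1 << trunc) - 1)  # drop the truncated columns
    return acc

exact = 173 * 219
approx = approx_mul(173, 219)
assert 0 <= exact - approx < 8 * (2 ** 4)  # error bound: 8 rows, 4 columns each
```

Compressor-based designs like the one in the abstract instead keep all columns but replace exact compressors with cheaper approximate ones, then shape and compensate the resulting error statistically.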

High-performance computing (HPC) can facilitate deep neural network (DNN) training and inference. Previous works have proposed multiple-precision floating-point and fixed-point designs, but most handle only one of the two independently. This brief proposes a novel reconfigurable processing element (PE) supporting both, with energy-efficient floating-point and fixed-point multiply-accumulate (MAC) operations. The PE supports...

10.1109/tcsii.2023.3322259 article EN IEEE Transactions on Circuits & Systems II Express Briefs 2023-10-05
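The behavior of a dual-mode PE can be mimicked in software (this is an illustration of the mode split, not the brief's datapath): integer arithmetic for the fixed-point mode, and rounding every intermediate through IEEE-754 half precision, via the standard library's `struct` format `'e'`, for the FP mode.

```python
import struct

def to_fp16(x):
    """Round a Python float through IEEE-754 binary16 (half precision)."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def pe_mac(acc, a, b, mode):
    """One MAC step of a dual-mode PE: 'fixed' accumulates integers exactly,
    'float' rounds the product and the sum to fp16, as a half-precision
    MAC unit would."""
    if mode == 'fixed':
        return acc + a * b
    return to_fp16(acc + to_fp16(a * b))

assert pe_mac(10, 3, 4, 'fixed') == 22
assert pe_mac(0.0, 0.5, 0.5, 'float') == 0.25  # 0.25 is exact in fp16
```

In a real reconfigurable PE the two modes share multiplier and adder hardware rather than taking separate branches, which is where the area and energy savings come from.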