Yasong Cao

ORCID: 0009-0003-7683-989X
Research Areas
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Error Correcting Code Techniques
  • Interconnection Networks and Systems
  • Advanced Memory and Neural Computing
  • Cellular Automata and Applications
  • Ferroelectric and Negative Capacitance Devices
  • Analytical Chemistry and Sensors
  • Advanced Neural Network Applications
  • Graph Theory and Algorithms
  • E-commerce and Technology Innovations
  • VLSI and Analog Circuit Testing
  • Conducting polymers and applications
  • Luminescence and Fluorescent Materials
  • Porphyrin and Phthalocyanine Chemistry
  • Algorithms and Data Compression
  • Molecular Sensors and Ion Detection
  • Tensor decomposition and applications

National University of Defense Technology
2021-2024

Northwest Normal University
2024

Guilin University of Electronic Technology
2023

Sparse matrix-vector multiplication (SpMV) computes the product of a sparse matrix and a dense vector; the sparsity of the matrix is often greater than 90%. The matrix is usually compressed to save storage resources, but compression causes irregular vector accesses in the algorithm, which take a lot of time and degrade SpMV performance on the system. In this study, we design a dedicated-channel DMA to implement an indirect memory access process and speed up the operation. On this basis, we propose six algorithm schemes and map them to optimize SpMV. The results show that the M processor's...

10.3390/electronics11223699 article EN Electronics 2022-11-11
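The irregular vector accesses the abstract mentions come from the column indices of the compressed matrix. A minimal plain-Python sketch of SpMV over the common CSR (Compressed Sparse Row) format illustrates this; it is not the paper's DMA-assisted implementation, and all names here are our own:

```python
# Minimal CSR sparse matrix-vector multiply (illustrative sketch only,
# not the paper's DMA-assisted design).
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for A stored in CSR format."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Nonzeros of row i live in values[row_ptr[i]:row_ptr[i+1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # col_idx[k] produces the irregular (indirect) access into x
            # that a dedicated DMA channel can prefetch.
            y[i] += values[k] * x[col_idx[k]]
    return y

# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]]
values  = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 2, 0, 1]
row_ptr = [0, 2, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [3.0, 3.0, 9.0]
```

The gather through `col_idx` is exactly the indirect memory process that is hard to keep regular in hardware.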

State-of-the-art systolic array-based accelerators adopt the traditional im2col algorithm to accelerate the inference of convolutional layers. However, they cannot efficiently support backpropagation in AI training. Backpropagation in convolutional layers involves performing transposed convolution and dilated convolution, which usually introduce plenty of zero-spaces into the feature map or kernel. The zero-space data reorganization interferes with the continuity of training and incurs additional non-negligible overhead in terms of off- and on-chip...

10.1109/iccd56317.2022.00068 article EN 2022 IEEE 40th International Conference on Computer Design (ICCD) 2022-10-01
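The zero-spaces in question come from the dilation step of transposed convolution: to backpropagate through a stride-s convolution, the gradient map is expanded by inserting s-1 zeros between elements before an ordinary convolution is applied. A small sketch (our own illustration, not the paper's mechanism) shows how quickly these zeros dominate:

```python
# Zero-insertion at the heart of transposed convolution (illustrative).
# For stride s, s-1 zeros are inserted between neighboring gradient
# elements; these are the "zero-spaces" that waste systolic-array work.
def dilate_with_zeros(grad, stride):
    H, W = len(grad), len(grad[0])
    out_h = (H - 1) * stride + 1
    out_w = (W - 1) * stride + 1
    out = [[0] * out_w for _ in range(out_h)]
    for r in range(H):
        for c in range(W):
            out[r * stride][c * stride] = grad[r][c]
    return out

g = [[1, 2],
     [3, 4]]
print(dilate_with_zeros(g, 2))
# [[1, 0, 2], [0, 0, 0], [3, 0, 4]] -- already over half the entries are zeros
```

With stride 2, only 4 of 9 output entries carry data, and the fraction of zeros grows with stride.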

Generalized Sparse Matrix-Matrix Multiplication (SpGEMM) is a critical kernel in domains like graph analytics and scientific computation. As a kind of classical special-purpose architecture, systolic arrays were first used for complex computing problems, e.g., matrix multiplication. However, they are not efficient enough when handling sparse matrices, due to the fact that PEs containing zero-valued entries perform unnecessary operations that do not contribute to the result. Accordingly, in this paper, we propose...

10.1145/3545008.3545053 article EN 2022-08-29
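The wasted work on zero entries can be seen by contrast with a row-wise (Gustavson-style) SpGEMM formulation, which touches only nonzeros. This is a generic textbook sketch, not the architecture the paper proposes; the dict-of-rows representation is our own simplification:

```python
# Row-wise SpGEMM sketch: C = A @ B, skipping zero entries entirely --
# the operations a dense systolic array would waste. Illustrative only.
def spgemm(A, B):
    """A, B: {row: {col: val}} sparse maps. Returns C in the same form."""
    C = {}
    for i, a_row in A.items():
        acc = {}
        for k, a_ik in a_row.items():             # nonzeros of A's row i
            for j, b_kj in B.get(k, {}).items():  # nonzeros of B's row k
                acc[j] = acc.get(j, 0) + a_ik * b_kj
        C[i] = acc
    return C

A = {0: {1: 2}, 1: {0: 3, 2: 1}}
B = {0: {0: 4}, 1: {2: 5}, 2: {1: 6}}
print(spgemm(A, B))  # {0: {2: 10}, 1: {0: 12, 1: 6}}
```

Every multiply here contributes to the result, which is precisely what zero-holding PEs in a dense array fail to guarantee.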

Cationic fluorophores (CFs) with highly twisted conformations are a very important kind of functional material in the field of optical sensing and imaging. In this paper, isomeric pyridinium-type CFs...

10.1039/d4qm00578c article EN Materials Chemistry Frontiers 2024-01-01

On-chip memory is one of the core components of deep learning accelerators. In general, the area used by on-chip memory accounts for around 30% of the total chip area. With the increasing complexity of algorithms, it will become a challenge for accelerators to integrate the much larger memory that algorithms need, whereas multiprecision computation requires different-precision (such as FP32, FP16) computations in training and inference. To solve this, this paper explores the use of single-port memory (SPM) in systolic-array-based accelerators. We propose...

10.3390/electronics11101587 article EN Electronics 2022-05-16

The systolic array provides extremely high efficiency for running matrix multiplication, and is one of the mainstream architectures in today's deep learning accelerators. In order to develop efficient accelerators, people usually employ simulators to make design trade-offs. However, current simulators suffer from coarse-grained modeling methods and ideal assumptions, which limit their ability to describe the structural characteristics of systolic arrays. In addition, they do not support exploration of the microarchitecture. This paper...

10.1109/ispass55109.2022.00016 article EN 2022-05-01

The systolic array provides extremely high efficiency for running matrix multiplication and is one of the mainstream architectures in today's deep learning accelerators. In order to develop efficient accelerators, people usually employ simulators to make design trade-offs. However, current simulators suffer from coarse-grained modeling methods and ideal assumptions, which limit their ability to describe the structural characteristics of systolic arrays. In addition, they do not support exploration of the microarchitecture. This paper...

10.3390/electronics11182928 article EN Electronics 2022-09-15

The convergence of High-Performance Computing (HPC) and Artificial Intelligence (AI) has become a promising trend. Due to the different computation patterns of HPC and AI applications, it is challenging to design an appropriate architecture that balances their demands. To address this, we propose Matrix Zone (MZ), an enhanced systolic array-based matrix engine that accelerates General Matrix Multiplication (GEMM) for both kinds of applications. We develop a semi-memory hierarchy to reduce on-chip area consumption and data stitching...

10.1109/hpcc-dss-smartcity-dependsys57074.2022.00050 article EN 2022-12-01

On-chip memory is one of the core components of deep learning accelerators. In general, the area overhead of on-chip memory accounts for over 25% of the total chip area. With the increasing complexity of algorithms, it will become a challenge for accelerators to integrate the much larger memory that algorithms need. To solve this, this paper explores the use of single-port memory (SPM) in systolic array-based accelerators. We propose an efficient address transformation method to avoid conflicts between simultaneous read and write requests on the SPM. In addition,...

10.1109/hpcc-dss-smartcity-dependsys53884.2021.00044 article EN 2021-12-01
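The abstract's address transformation is not spelled out here, but one common way to keep simultaneous reads and writes from colliding on single-port memory is to split a logical memory into two single-port banks by address parity: if reads and writes sweep through addresses in lockstep but offset by one, they always land in different banks in any cycle. This is a generic illustration of the conflict-avoidance idea, not necessarily the paper's method:

```python
# Bank-by-parity sketch (NOT necessarily the paper's transformation):
# even addresses map to bank 0, odd to bank 1, so a read at address t
# and a write at address t+1 never hit the same single-port bank.
def bank_of(addr):
    return addr & 1

# In every cycle t, the read (addr t) and write (addr t+1) use
# different banks, so each single-port bank sees at most one request.
for t in range(8):
    assert bank_of(t) != bank_of(t + 1)
print("no read/write bank conflict across 8 cycles")
```

The general principle is the same: transform addresses so concurrent requests are guaranteed to target disjoint ports.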

With the development of the social economy, consumers' demand for quality vegetables is increasing, and the sales of vegetable goods change over time. In this paper, we address the issue of stocking volume and pricing strategy for vegetables in superstores. We adopt statistical methods and a programming language for data preprocessing, including data searching, cleaning, transformation, integration, and reduction, aiming to optimize vegetable stocking. First, the study analyzes the relationship between category-level and single-product sales over time through the Pearson coefficient to reveal...

10.25236/ajcis.2023.061315 article EN Academic Journal of Computing & Information Science 2023-01-01

State-of-the-art systolic array-based accelerators adopt the traditional im2col algorithm to accelerate the inference of convolutional layers. However, they cannot efficiently support backpropagation in AI training. Backpropagation in convolutional layers involves performing transposed convolution and dilated convolution, which usually introduce plenty of zero-spaces into the feature map or kernel. The zero-space data reorganization interferes with the continuity of training and incurs additional non-negligible overhead in terms of off- and on-chip...

10.48550/arxiv.2209.09434 preprint EN other-oa arXiv (Cornell University) 2022-01-01

As NN accelerators emerge, many analytical models have been presented to help designers carry out hardware design space exploration. However, these models cannot accurately simulate systolic array-based accelerators due to their pervasiveness or abstraction. In this paper, we propose a compute-centric simulator driven by the execution of events from the tile-mapping matrix, which can accurately model the accelerator. The simulator focuses on the conflicts that arise when a tile is used for data access, and on the various interruptions caused by resource...

10.1109/iscas48785.2022.9937624 article EN 2022 IEEE International Symposium on Circuits and Systems (ISCAS) 2022-05-28