Ziyu Hao

ORCID: 0000-0003-0277-1776
Research Areas
  • Acoustic Wave Phenomena Research
  • Parallel Computing and Optimization Techniques
  • Distributed and Parallel Computing Systems
  • Vibration and Dynamic Analysis
  • Simulation Techniques and Applications
  • Advanced Data Storage Technologies
  • Digital Filter Design and Implementation
  • Interconnection Networks and Systems
  • Image and Signal Denoising Methods
  • Neural Networks and Applications
  • High voltage insulation and dielectric phenomena
  • Composite Structure Analysis and Optimization
  • Magnetic Bearings and Levitation Dynamics
  • Network Packet Processing and Optimization
  • Structural Health Monitoring Techniques
  • Advanced Sensor and Control Systems
  • Smart Materials for Construction
  • Vibration Control and Rheological Fluids
  • Advanced Neural Network Applications
  • Advanced Adaptive Filtering Techniques
  • Rock Mechanics and Modeling
  • Embedded Systems Design Techniques
  • Advanced Algorithms and Applications
  • Soil and Unsaturated Flow
  • Real-time simulation and control systems

Shenyang University of Technology
2022-2024

Chongqing University
2022-2023

Institute of Computing Technology
2009-2021

National University of Defense Technology
2019

Molecular dynamics (MD) simulation is a common tool to study the physical movements of atoms and molecules in many research fields. However, it is an extremely time-consuming application, which takes researchers weeks or months to run a single simulation when the size scales up and computing demands keep growing. In this paper, an improved MD implementation on the Sunway TaihuLight supercomputer is developed to solve the above-mentioned issues. The new implementation is extended from an existing, widely used MD application (i.e., LAMMPS). heterogeneous with...

10.1109/hpcc-smartcity-dss.2016.0070 article EN 2016-12-01

In the domains of VLSI and the rising SoC, system design depends heavily on simulation and modeling. Conventional HDLs have some weaknesses, including excessive precision and slow speed. On the other hand, system-level modeling, such as SystemC, has been widely used in all kinds of projects and has achieved favorable results. When the target system scales up excessively, the simulation speed also drops severely. It is therefore very important to speed up the simulation by parallelizing the modeling over an HPC cluster. This paper introduces a novel parallel SystemC...

10.1109/icpads.2009.28 article EN 2009-01-01

Deep learning models have shown great potential in classification and recognition over the last decade. Deep Belief Networks (DBNs) have been applied in the visual and voice fields due to their feature presentation capability. However, there are a vast number of time-consuming calculations in training DBNs. Many studies have accelerated DBNs with good speedups on CPU, GPU, FPGA, etc. At the same time, the latest published Sunway (SW) many-core processor has high computing performance and a dedicated heterogeneous architecture....

10.1109/hpcc-smartcity-dss.2016.0044 article EN 2016-12-01

By integrating a large number of simple cores on the chip to provide the desired performance and throughput, the microprocessor has entered the many-core era. In order to fully extract the ability of the many-core processor, we propose speedup models for the many-core architecture in this paper. Under the assumptions of the Hill-Marty model, we deduce our formulas based on Gustafson's Law and Sun-Ni's Law. Then we compare the models and theoretically analyze the best allocation under given resources. Furthermore, we apply the conclusions to evaluate current many-core processors and predict concrete...

10.1109/iscc-c.2013.146 article EN 2013-12-01
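
The speedup laws named in the abstract above can be compared numerically. The following is a minimal sketch using the textbook forms of Amdahl's, Gustafson's and Sun-Ni's laws and the symmetric-multicore formula from Hill and Marty; it does not reproduce the formulas derived in the paper, which the truncated abstract does not give. The function names, the parallel fraction `f`, the memory-scaling function `g` and the default `perf(r) = sqrt(r)` are illustrative assumptions.

```python
import math


def amdahl(f, n):
    """Fixed-size speedup: f is the parallelizable fraction, n the core count."""
    return 1.0 / ((1.0 - f) + f / n)


def gustafson(f, n):
    """Fixed-time (scaled) speedup with parallel fraction f on n cores."""
    return (1.0 - f) + f * n


def sun_ni(f, n, g):
    """Memory-bounded speedup; g(n) is the workload growth allowed by memory.

    g(n) = 1 reduces to Amdahl's law and g(n) = n to Gustafson's law.
    """
    scaled = f * g(n)
    return ((1.0 - f) + scaled) / ((1.0 - f) + scaled / n)


def hill_marty_symmetric(f, n, r, perf=math.sqrt):
    """Symmetric multicore speedup from Hill and Marty.

    The chip has n base-core equivalents (BCEs), grouped into n / r cores
    of r BCEs each; perf(r) is the sequential performance of one such core
    (sqrt(r) is the assumption commonly used with this model).
    """
    p = perf(r)
    return 1.0 / ((1.0 - f) / p + f * r / (p * n))


if __name__ == "__main__":
    f, n = 0.9, 256
    for r in (1, 4, 16, 64, 256):
        print(f"r={r:3d}  symmetric speedup={hill_marty_symmetric(f, n, r):7.2f}")
    print(f"Amdahl={amdahl(f, n):.2f}  Gustafson={gustafson(f, n):.2f}")
```

Sweeping `r` for a fixed budget `n` shows the kind of resource-allocation trade-off the abstract refers to: larger cores speed up the serial part but leave fewer cores for the parallel part.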

10.2139/ssrn.4689744 preprint EN 2024-01-01

The performance of finite impulse response (FIR) filtering is an important index of the real-time processing ability of a digital signal processing system. In this paper, various FIR filtering algorithms are analyzed, and based on the fast FIR algorithm (FFA), a block FFA (BFFA) is proposed. A 128 order configurable filter circuit with 16 as a designed. Analysis experimental data shows that bit integer reaches 881.78 Gop/s.

10.1088/1742-6596/1827/1/012004 article EN Journal of Physics Conference Series 2021-03-01
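
As background for the fast FIR algorithm (FFA) mentioned in the abstract above, here is a minimal sketch of the standard 2-parallel FFA, checked against direct convolution. It illustrates the general technique only; it is not the block FFA (BFFA) or the circuit proposed in the paper, and the function names are hypothetical. Inputs are assumed to be non-empty sequences.

```python
def convolve(a, b):
    """Direct (full) linear convolution of two non-empty sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def ffa_2_parallel(x, h):
    """FIR filtering of x by taps h using the 2-parallel fast FIR algorithm.

    With even/odd polyphase parts X0, X1 and H0, H1:
        Y0 = H0*X0 + z^-1 (H1*X1)
        Y1 = (H0 + H1)*(X0 + X1) - H0*X0 - H1*X1
    so three half-length subfilters replace the four of the direct
    polyphase form.
    """
    n_out = len(x) + len(h) - 1
    # Zero-pad both sequences to even length so they split cleanly.
    x = list(x) + [0] * (len(x) % 2)
    h = list(h) + [0] * (len(h) % 2)
    x0, x1 = x[0::2], x[1::2]
    h0, h1 = h[0::2], h[1::2]

    a = convolve(h0, x0)                      # H0*X0
    b = convolve(h1, x1)                      # H1*X1
    c = convolve([p + q for p, q in zip(h0, h1)],
                 [p + q for p, q in zip(x0, x1)])

    sub_len = len(a)
    y = [0] * (2 * sub_len + 1)
    for k in range(sub_len):
        y[2 * k] += a[k]                      # even outputs: H0*X0 ...
        y[2 * k + 2] += b[k]                  # ... plus H1*X1 delayed by one
        y[2 * k + 1] = c[k] - a[k] - b[k]     # odd outputs
    return y[:n_out]                          # drop zeros caused by padding


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    h = [1, -1, 2]
    print(ffa_2_parallel(x, h))
    print(convolve(x, h))
```

Both calls should print the same sequence, since the FFA only rearranges the arithmetic of the convolution.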

10.7544/issn1000-1239.2015.20140004 article EN Journal of Computer Research and Development 2015-05-01

We present a parallel logic simulation framework aimed at more than one language, not only a specific one. After investigating a number of research issues in parallel simulation, we study key techniques including a general parallelization method, code partitioning, and a practical synchronization algorithm. Based on ArchSim (a system-level simulation platform), the framework can run in heterogeneous computing environments. By using the framework, we parallelize two languages: SystemC and Verilog. Then, we design a pipelined multi-stage...

10.1145/1878537.1878694 article EN 2010-04-11

In recent times, the demand for computational capability of artificial intelligence (AI) has been increasing rapidly. It is well known that the high parallelism of AI algorithms and the strong reusability of data provide more design space for processor architecture design. The manycore processor has huge development potential for AI, with its on-chip computing power, flexible architecture, efficient communication, and optimized storage. Based on the development history of manycore processors, this paper summarizes the main technical routes and focuses on the requirements of AI applications critical...

10.1360/n112018-00283 article EN Scientia Sinica Informationis 2019-03-01

Large-scale matrix multiplication is a fundamental kernel in science and engineering applications. However, existing computing platforms such as CPU, GPU and FPGA suffer from limited performance or excessive power consumption. This paper presents a high-performance and efficient accelerator named MMA for floating-point matrix multiplication, based on scaled-out multi-array systolic arrays. A scheduling method is proposed for efficiently performing large-scale matrix multiplication. Besides, an analytical model is built related...

10.1109/eitce47263.2019.9095069 article EN 2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE) 2019-10-01

The convolution operation takes up a large proportion of the computation in a neural network and is a computation-resource-consuming operation. In this paper, domain transformation is introduced into the convolution operation, which reduces the computation. The acceleration is obvious in the inference of a convolutional neural network.

10.1088/1742-6596/1748/3/032021 article EN Journal of Physics Conference Series 2021-01-01

The Fast Fourier Transform (FFT) plays a key role in digital signal processing. With the rapid development of signal processing technology, a high-performance ultra-long-point FFT is required. This paper analyzes the characteristics of the FFT algorithm, generalizes the two-dimensional decomposition algorithm to a multi-dimensional one, and proposes a high-performance hardware implementation architecture. The architecture implements the three-dimensional transpose operation by using collision-free bank addressing technology based on prime...

10.1109/ecie52353.2021.00069 article EN 2021-01-01
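
The two-dimensional decomposition that the abstract above generalizes can be illustrated with the classic Cooley-Tukey factorization of a length N = N1 * N2 transform into short transforms, a twiddle multiplication and a transpose. This is a minimal software sketch of that textbook decomposition only; it does not model the paper's multi-dimensional generalization, its hardware architecture or its bank addressing scheme. The function names and the index convention are assumptions made for illustration.

```python
import cmath


def dft(x):
    """Reference O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def fft_2d_decomposition(x, n1, n2):
    """Length n1*n2 DFT computed as short DFTs over a 2-D view of the input.

    Input index:  n = n2 * a + b   (a in 0..n1-1, b in 0..n2-1)
    Output index: k = k1 + n1 * k2 (k1 in 0..n1-1, k2 in 0..n2-1)
    """
    n = n1 * n2
    assert len(x) == n, "input length must equal n1 * n2"

    # Step 1: for each b, a length-n1 DFT over the samples x[n2*a + b].
    stage = [dft([x[n2 * a + b] for a in range(n1)]) for b in range(n2)]

    # Step 2: multiply by the twiddle factors W_N^(b * k1).
    for b in range(n2):
        for k1 in range(n1):
            stage[b][k1] *= cmath.exp(-2j * cmath.pi * b * k1 / n)

    # Step 3: transpose, then a length-n2 DFT along b for each k1.
    out = [0j] * n
    for k1 in range(n1):
        row = dft([stage[b][k1] for b in range(n2)])
        for k2 in range(n2):
            out[k1 + n1 * k2] = row[k2]
    return out


if __name__ == "__main__":
    signal = [complex(v) for v in range(12)]
    fast = fft_2d_decomposition(signal, 3, 4)
    ref = dft(signal)
    print(max(abs(p - q) for p, q in zip(fast, ref)))
```

The printed value is the largest deviation from the reference DFT and should be at floating-point rounding level. In a real FFT the short transforms would themselves be computed recursively or by dedicated butterfly hardware rather than by the O(N^2) reference used here.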