Ling Zhuo

ORCID: 0000-0001-9244-0115
Research Areas
  • Parallel Computing and Optimization Techniques
  • Numerical Methods and Algorithms
  • Embedded Systems Design Techniques
  • Low-power high-performance VLSI design
  • Interconnection Networks and Systems
  • Matrix Theory and Algorithms
  • Renal Diseases and Glomerulopathies
  • Advanced Data Storage Technologies
  • Particle accelerators and beam dynamics
  • AI and Big Data Applications
  • Digital Filter Design and Implementation
  • Vehicle License Plate Recognition
  • Tuberous Sclerosis Complex Research
  • Atrial Fibrillation Management and Outcomes
  • Ion Transport and Channel Regulation
  • Genetics, Aging, and Longevity in Model Organisms
  • Water Systems and Optimization
  • Human Resource and Talent Management
  • High-Voltage Power Transmission Systems
  • Soil, Finite Element Methods
  • Computational Drug Discovery Methods
  • Transplantation: Methods and Outcomes
  • Cerebrovascular and Carotid Artery Diseases
  • Geotechnical Engineering and Soil Mechanics
  • Collaboration in agile enterprises

Ludwig-Maximilians-Universität München
2021-2024

State Grid Corporation of China (China)
2019-2024

Zhangzhou Normal University
2023

Anhui Sanlian University
2022

Sun Yat-sen University
2022

Sun Yat-sen University Cancer Center
2022

Yangzhou University
2021

Xi'an Shiyou University
2021

Chongqing University
2021

Liming Vocational University
2020

Floating-point Sparse Matrix-Vector Multiplication (SpMXV) is a key computational kernel in scientific and engineering applications. The poor data locality of sparse matrices significantly reduces the performance of SpMXV on general-purpose processors, which rely heavily on the cache hierarchy to achieve high performance. The abundant hardware resources on current FPGAs provide new opportunities to improve the performance of SpMXV. In this paper, we propose an FPGA-based design for SpMXV. Our design accepts sparse matrices in Compressed Row Storage format, makes...

10.1145/1046192.1046202 article EN 2005-02-20
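
For readers unfamiliar with the Compressed Row Storage format mentioned in the abstract above, the following is a minimal software sketch of SpMXV over a CRS matrix. It is not the FPGA design from the paper; the function name `spmxv_crs` and the array names `values`, `col_idx`, and `row_ptr` are illustrative assumptions.

```python
def spmxv_crs(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A held in Compressed Row Storage.

    values  -- nonzero entries of A, stored row by row
    col_idx -- column index of each entry in `values`
    row_ptr -- row_ptr[i]..row_ptr[i+1] delimits row i inside `values`
    (All names are hypothetical; they do not come from the paper.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Only the nonzeros of row i are visited. The indirect access
        # x[col_idx[k]] is the source of the poor data locality.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# A = [[4, 0, 1],
#      [0, 0, 2],
#      [3, 5, 0]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 2, 0, 1]
row_ptr = [0, 2, 3, 5]
y = spmxv_crs(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

The inner loop is a per-row accumulation, which is why reduction of a value stream recurs as a theme in the related entries on this page.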

Summary form only given. FPGAs are increasingly being used in the high performance and scientific computing community to implement floating-point based hardware accelerators. We analyze floating-point multiplier and adder/subtractor units by considering the number of pipeline stages as a parameter and use throughput/area as the metric. We achieve throughput rates of more than 240 MHz (200 MHz) for single (double) precision operations by deeply pipelining the units. To illustrate the impact on a kernel, we use a matrix multiplication kernel with our units and show...

10.1109/ipdps.2004.1303135 article EN 2004-06-10

The abundant hardware resources on current reconfigurable computing systems provide new opportunities for high-performance parallel implementations of scientific computations. In this paper, we study designs for floating-point matrix multiplication, a fundamental kernel in a number of scientific applications, on reconfigurable computing systems. We first analyze the design trade-offs in implementing this kernel. These trade-offs are caused by the inherent parallelism of matrix multiplication and the resource constraints, including the number of configurable slices, the size of on-chip memory,...

10.1109/tpds.2007.1001 article EN IEEE Transactions on Parallel and Distributed Systems 2007-03-14

Numerical linear algebra operations are key primitives in scientific computing. Performance optimizations of such operations have been extensively investigated. With the rapid advances in technology, hardware acceleration of linear algebra applications using FPGAs (Field Programmable Gate Arrays) has become feasible. In this paper, we propose FPGA-based designs for several basic linear algebra operations, including dot product, matrix-vector multiplication, matrix multiplication, and matrix factorization. By identifying the parameters for each...

10.1109/tc.2008.55 article EN IEEE Transactions on Computers 2008-06-24

Field-programmable gate arrays (FPGAs) have become an attractive option for accelerating scientific applications. Many scientific operations such as matrix-vector multiplication and dot product involve the reduction of a sequentially produced stream of values. Unfortunately, because of the pipelining in FPGA-based floating-point units, data hazards may occur during these sequential reduction operations. Improperly designed reduction circuits can adversely impact performance, impose unrealistic buffer requirements, and consume...

10.1109/tpds.2007.1068 article EN IEEE Transactions on Parallel and Distributed Systems 2007-09-17
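
The data hazard described in the abstract above arises because a pipelined adder with a latency of several cycles cannot accept a new operand that depends on a sum still in flight. One commonly used workaround, shown here as a software model and not as the circuit proposed in the paper, is to keep as many interleaved partial sums as the adder has pipeline stages and combine them at the end. The function name `interleaved_accumulate` and the `latency` parameter are assumptions for illustration.

```python
def interleaved_accumulate(stream, latency):
    """Model accumulation through an adder with `latency` pipeline stages.

    The value arriving in cycle t is added to partial sum t % latency,
    the sum that was issued `latency` cycles earlier and has therefore
    already left the pipeline. No add waits on an in-flight result.
    (Hypothetical helper; not the reduction circuit from the paper.)
    """
    if latency < 1:
        raise ValueError("latency must be at least 1")
    partial = [0.0] * latency
    for cycle, value in enumerate(stream):
        partial[cycle % latency] += value
    # The partial sums still need a final reduction. In hardware this
    # tail costs extra cycles and buffering, the overhead the abstract
    # attributes to improperly designed reduction circuits.
    total = 0.0
    for p in partial:
        total += p
    return total


total = interleaved_accumulate([1.0, 2.0, 3.0, 4.0, 5.0], latency=3)
```

Because floating-point addition is not associative, the interleaved order can round differently from a strictly sequential sum on general inputs.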

Summary form only given. The abundant hardware resources on current FPGAs provide new opportunities to improve the performance of hardware implementations of scientific computations. We propose two FPGA-based algorithms for floating-point matrix multiplication, a fundamental kernel in a number of scientific applications. We analyze the design tradeoffs in implementing this kernel on FPGAs. Our algorithms employ a linear array architecture with small control logic. This architecture effectively utilizes the entire FPGA and reduces routing complexity. The processing...

10.1109/ipdps.2004.1303036 article EN 2004-06-10

Field-Programmable Gate Arrays (FPGAs) have become an attractive option for scientific computing. Several vendors have developed high performance reconfigurable systems which employ FPGAs for application acceleration. In this paper, we propose a BLAS (Basic Linear Algebra Subprograms) library for state-of-the-art reconfigurable systems. We study three data-intensive operations: dot product, matrix-vector multiply, and dense matrix multiply. The first two operations are I/O bound, and our designs efficiently utilize the...

10.1109/sc.2005.31 article EN 2005-12-22

Thrombospondin type 1 domain containing 7A (THSD7A) was recently identified as a target autoantigen in membranous nephropathy (MN). However, patients with positive THSD7A expression were prone to have malignancies. THSD7A was found to be expressed in a variety of malignant tumors. In this study, we investigated the histologic expression of THSD7A in colorectal or breast cancers, as well as the relationship between THSD7A expression and proteinuria in patients with these cancers. A total of 101 patients were enrolled; 81 of them had colorectal cancer and 20 had breast cancer. THSD7A was detected by immunohistochemical staining in tumor tissues....

10.1186/s12882-019-1489-5 article EN cc-by BMC Nephrology 2019-08-23

The use of pipelined floating-point arithmetic cores to create high-performance FPGA-based computational kernels has introduced a new class of problems that do not exist when using single-cycle arithmetic cores. In particular, the data hazards associated with pipelined floating-point reduction circuits can limit the scalability or severely reduce the performance of an otherwise high-performance computational kernel. The inability to efficiently execute the reduction in hardware, coupled with memory bandwidth issues, may even negate the performance gains derived from hardware acceleration. In this paper we introduce a method for...

10.1109/ipdps.2005.165 article EN 2005-04-19

Numerical linear algebra operations are key primitives in scientific computing. Performance optimizations of such operations have been extensively investigated, and some basic operations have been implemented as software libraries. With the rapid advances in technology, hardware acceleration of linear algebra applications using FPGAs (field programmable gate arrays) has become feasible. In this paper, we propose FPGA-based designs for several BLAS operations, including vector product, matrix-vector multiply, and matrix multiply. By identifying...

10.1109/icpp.2005.31 article EN 2005-08-03

FPGAs have become an attractive choice for scientific computing. In this paper, we propose a high performance design for LU decomposition, a key kernel in many scientific and engineering applications. Our design achieves the optimal performance for LU decomposition using the available hardware resources. The design is parameterized. Thus, it can be easily adapted to various hardware constraints. Experimental results show that our design offers good scalability. Our implementation on a Xilinx Virtex-II Pro XC2VP100 achieves superior sustained floating-point performance over existing...

10.1109/fpl.2006.311238 article EN 2006-01-01
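
As background for the LU decomposition kernel named in the abstract above, here is a minimal software sketch of the textbook Doolittle factorization without pivoting. It only illustrates the computation being accelerated and says nothing about the paper's architecture; the function name `lu_decompose` is an assumption.

```python
def lu_decompose(a):
    """Factor a square matrix as A = L @ U (Doolittle, no pivoting).

    L is unit lower triangular and U is upper triangular.
    (Hypothetical helper for illustration; not the paper's design.)
    """
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [row[:] for row in a]  # work on a copy of the input
    for k in range(n):
        pivot = upper[k][k]
        if pivot == 0.0:
            raise ZeroDivisionError("zero pivot: this sketch does no pivoting")
        for i in range(k + 1, n):
            factor = upper[i][k] / pivot
            lower[i][k] = factor
            # Eliminate column k from row i. The row updates for different
            # i are independent of one another within a step k.
            for j in range(k, n):
                upper[i][j] -= factor * upper[k][j]
    return lower, upper


lower, upper = lu_decompose([[4.0, 3.0], [6.0, 3.0]])
```

A production implementation would add partial pivoting for numerical stability; it is omitted here to keep the sketch short.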

Focal segmental glomerulosclerosis (FSGS) is still one of the common causes of refractory nephrotic syndrome. Nephrin, encoded by the podocyte-specific NPHS1 gene, participates in the pathogenesis of FSGS. The sites of NPHS1 mutations in FSGS are not well clarified. In this study, we investigated the specific mutations of the NPHS1 gene in Chinese patients with sporadic FSGS. A total of 309 patients with sporadic FSGS were collected and screened for NPHS1 mutations by second-generation sequencing. The variants were compared with those extracted from 2504 healthy controls in the 1000 Genomes Project. The possible pathogenic...

10.1186/s12881-019-0845-4 article EN cc-by BMC Medical Genetics 2019-06-19

Co‐processors offer attractive acceleration opportunities to waveform‐based imaging and inversion applications in challenging exploration and production environments. Unlike seismic forward modeling, the large amount of data involved in imaging can pose a significant challenge to scalable acceleration. We provide and compare several computational schemes to perform anisotropic reverse‐time migration on two co‐processor platforms: FPGAs and GPUs. Our ongoing experiments so far indicate that both platforms potentially...

10.1190/1.3255485 article EN 2009-01-01

Field-programmable gate arrays (FPGAs) have become an attractive option for scientific applications. However, due to the pipelining in FPGA-based floating-point units, data hazards may occur during the reduction of a series of values. A typical example is the accumulation of sets of floating-point values, which is needed in many scientific operations such as dot product and matrix-vector multiplication. Reduction circuits can significantly impact the overall performance, impose unrealistic buffer requirements, or occupy a large area on the FPGA. In...

10.1109/cahpc.2005.28 article EN 2006-02-15

Recently, it has become possible to implement floating-point cores on field-programmable gate arrays (FPGAs) to provide acceleration for the myriad applications that require high-performance floating-point arithmetic. To achieve high clock rates, floating-point cores for FPGAs must be deeply pipelined. This deep pipelining makes it difficult to reuse the same floating-point core for a series of dependent computations. However, floating-point cores use a great deal of area, so it is important to use as few of them in an architecture as possible. In this paper, we describe area-efficient architectures and...

10.1109/tvlsi.2007.912038 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2008-01-16

FPGA-based floating-point kernels must exploit algorithmic parallelism and use deeply pipelined cores to gain a performance advantage over general-purpose processors. Inability to hide the latency of lengthy pipelines can significantly reduce performance or impose unrealistic buffer requirements. Designs requiring reduction operations such as accumulation are particularly susceptible. In this paper we introduce two high-performance methods for reducing multiple sets of sequentially delivered floating-point values in optimal...

10.1109/fccm.2005.42 article EN 2005-10-18

Natural compounds that either increase or decrease polymerization of actin into filaments have become indispensable tools for cell biology. However, to date, it was not possible to use them as therapeutics due to their overall cytotoxicity and unfavorable pharmacokinetics. Furthermore, their synthesis is in general quite complicated. In an attempt to find simplified analogues of miuraenamide, an actin nucleating compound, we identified derivatives with a paradoxical inversion in the mode of action: instead of increased...

10.1021/acsomega.1c02838 article EN cc-by-nc-nd ACS Omega 2021-08-18

Recently, reconfigurable computing systems have been built which employ field-programmable gate arrays (FPGAs) as hardware accelerators for general-purpose processors. These systems provide new opportunities for high-performance computing. In this paper, we investigate hybrid designs that effectively utilize both the FPGAs and the processors in reconfigurable computing systems. Based on a high-level computational model, we propose designs for floating-point matrix multiplication and block LU decomposition. In our designs, the workload of an application is...

10.1109/icpads.2006.95 article EN 2006-01-01

Recently, high-end reconfigurable computing systems have been built that employ Field Programmable Gate Arrays (FPGAs) as hardware accelerators for general-purpose processors. These systems not only provide new opportunities for high-performance computing, but also pose new challenges to application developers. In this paper, we build a design model for hybrid designs that utilize both the processors and the FPGAs. The model characterizes a reconfigurable computing system using various system parameters. Based on the model, we propose a design methodology for hardware/software...

10.1109/tc.2008.84 article EN IEEE Transactions on Computers 2008-09-18

Aiming at the weaknesses of the current meter reading mode, this paper proposes an intelligent ammeter recognition method based on deep learning. A camera is first used to collect image information of the digital dial, then the collected images are pre-processed, and finally the pre-processed images are automatically recognized using a multi-layer convolutional neural network. This method can be used for recognizing the meter automatically, reducing the workload of manual meter reading and improving the precision of meter reading.

10.1109/itaic.2019.8785764 article EN 2022 IEEE 10th Joint International Information Technology and Artificial Intelligence Conference (ITAIC) 2019-05-01

Recently, reconfigurable computing systems have been built which employ field-programmable gate arrays (FPGAs) as hardware accelerators for general-purpose processors. These systems provide new opportunities for scientific computations. However, the co-existence of the processors and the FPGAs in such systems also poses new challenges to application developers. In this paper, we investigate a design model for hybrid designs, that is, designs that utilize both the processors and the FPGAs. The model characterizes a reconfigurable computing system using various system parameters, including...

10.1109/ipdps.2007.370268 article EN 2007-01-01