Yue Zhao

ORCID: 0000-0003-4676-3612
Research Areas
  • Distributed and Parallel Computing Systems
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Semantic Web and Ontologies
  • Algorithms and Data Compression
  • Cloud Computing and Resource Management
  • Scientific Computing and Data Management
  • Software System Performance and Reliability
  • Advanced Data Processing Techniques
  • Mining and Resource Management
  • Technology Assessment and Management
  • IPv6, Mobility, Handover, Networks, Security
  • Business Strategies and Innovation
  • Hydraulic and Pneumatic Systems
  • Underwater Vehicles and Communication Systems
  • Green IT and Sustainability
  • Security in Wireless Sensor Networks
  • Web Applications and Data Management
  • Maritime Navigation and Safety
  • Refrigeration and Air Conditioning Technologies
  • Statistical Distribution Estimation and Applications
  • Model-Driven Software Engineering Techniques
  • Probabilistic and Robust Engineering Design
  • Simulation and Modeling Applications
  • Education and Work Dynamics

Industrial and Commercial Bank of China
2024

University of Chinese Academy of Sciences
2013-2022

Brown University
2022

Academy of Mathematics and Systems Science
2022

Chinese Academy of Sciences
2022

North Carolina State University
2015-2020

Meta (Israel)
2020

Xi'an University of Science and Technology
2013

This work presents a systematic exploration of the promise and special challenges of deep learning for sparse matrix format selection, the problem of determining the best storage format to maximize the performance of Sparse Matrix Vector Multiplication (SpMV). It describes how to effectively bridge the gap between the needs of this pillar of HPC and deep learning through a set of techniques on input representations, network structure, and cross-architecture model migrations. The new solution cuts format selection errors by two thirds and improves SpMV performance by 1.73X on average over the state of the art.

10.1145/3178487.3178495 article EN 2018-02-06
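
One practical hurdle the abstract alludes to is feeding arbitrarily sized sparse matrices into a fixed-input neural network. The sketch below shows one plausible input representation, a normalized density map of nonzero positions; the function name, bin count, and normalization are illustrative assumptions, not the paper's exact pipeline.

```python
# A minimal sketch (not the paper's exact pipeline): map an arbitrarily sized
# sparse matrix to a fixed-size density map that a CNN classifier could consume.
import numpy as np
import scipy.sparse as sp

BINS = 32  # assumed fixed input resolution for the network

def density_map(m, bins=BINS):
    """Histogram the nonzero positions of `m` into a bins x bins grid."""
    coo = m.tocoo()
    rows = (coo.row * bins // m.shape[0]).astype(np.int64)
    cols = (coo.col * bins // m.shape[1]).astype(np.int64)
    grid = np.zeros((bins, bins), dtype=np.float32)
    np.add.at(grid, (rows, cols), 1.0)
    return grid / max(coo.nnz, 1)  # normalize so matrices of any size are comparable

# Example: a random sparse matrix becomes one fixed-size training sample.
sample = density_map(sp.random(10000, 8000, density=1e-3, format="csr"))
print(sample.shape)  # (32, 32)
```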

Modern GPUs are broadly adopted in many multitasking environments, including data centers and smartphones. However, the current support for scheduling of multiple GPU kernels (from different applications) is limited, forming a major barrier to meeting practical needs. This work for the first time demonstrates that on existing GPUs, efficient preemptive scheduling is possible even without special hardware support. Specifically, it presents EffiSha, a pure software framework that enables preemptive scheduling with very low overhead. The enabled...

10.1145/3018743.3018748 article EN 2017-01-26
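
The core idea of software-only preemption is cooperative: kernels are restructured so they can voluntarily yield at well-defined points when another kernel requests the device. The sketch below is only a CPU-side simulation of that pattern with a shared flag checked between chunks of work; the flag, chunking, and names are illustrative and not EffiSha's actual API or kernel transformation.

```python
# Conceptual sketch: a "kernel" that checks a preemption flag between chunks of
# work and yields early, reporting progress so it can be resumed later.
import threading

preempt_flag = threading.Event()

def chunked_kernel(data, chunk=1000):
    """Process `data` in chunks, yielding early if preemption is requested."""
    done = 0
    for start in range(0, len(data), chunk):
        for i in range(start, min(start + chunk, len(data))):
            data[i] *= 2                    # stand-in for the kernel's real work
        done = min(start + chunk, len(data))
        if preempt_flag.is_set():           # voluntary preemption point
            return done                      # progress made so far
    return len(data)

work = list(range(10_000))
t = threading.Timer(0.001, preempt_flag.set)  # another "kernel" asks for the device
t.start()
print("elements processed before yielding:", chunked_kernel(work))
```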

This work presents a systematic exploration of the promise and special challenges of deep learning for sparse matrix format selection, the problem of determining the best storage format to maximize the performance of Sparse Matrix Vector Multiplication (SpMV). It describes how to effectively bridge the gap between the needs of this pillar of HPC and deep learning through a set of techniques on input representations, network structure, and cross-architecture model migrations. The new solution cuts format selection errors by two thirds and improves SpMV performance by 1.73X on average over the state of the art.

10.1145/3200691.3178495 article EN ACM SIGPLAN Notices 2018-02-10

Sparse matrix vector multiplication (SpMV) is an important kernel in many applications and often the major performance bottleneck. The storage format of sparse matrices critically affects SpMV performance. Although there have been previous studies on selecting an appropriate format for a given matrix, they ignored the influence of runtime prediction overhead and format conversion overhead. For common uses of SpMV, such overheads may outweigh the benefits of the new formats. Ignoring them makes the predictions from prior solutions frequently...

10.1109/ipdps.2018.00104 article EN 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2018-05-01
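
The abstract's key observation, that prediction and conversion overheads can outweigh the gain from a faster format, comes down to a simple amortization argument. The sketch below illustrates it with invented timing numbers; the function and parameter names are assumptions, not the paper's model.

```python
# Back-of-the-envelope illustration: switching formats only pays off when the
# one-time prediction + conversion cost is amortized over enough SpMV calls.
def end_to_end(n_spmv_calls, t_spmv_old, t_spmv_new, t_predict, t_convert):
    keep_old = n_spmv_calls * t_spmv_old
    switch   = t_predict + t_convert + n_spmv_calls * t_spmv_new
    return "switch" if switch < keep_old else "keep current format"

# 50 SpMV calls: conversion cost dominates; 5000 calls: the faster format wins.
print(end_to_end(50,   t_spmv_old=1.0, t_spmv_new=0.8, t_predict=3.0, t_convert=40.0))
print(end_to_end(5000, t_spmv_old=1.0, t_spmv_new=0.8, t_predict=3.0, t_convert=40.0))
```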

Sparse matrix-vector multiplication (SpMV) is an important kernel and its performance is critical for many applications. Storage format selection chooses the best format to store a sparse matrix; it is essential for SpMV performance. Prior studies have focused on predicting the format that helps SpMV run fastest, but ignored runtime prediction overhead and format conversion overhead. This work shows that such overhead makes the predictions from previous solutions frequently sub-optimal and sometimes inferior in terms of end-to-end time. It proposes a new paradigm...

10.1109/tpds.2019.2932931 article EN IEEE Transactions on Parallel and Distributed Systems 2019-08-05
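
A complementary way to state the end-to-end argument is as a break-even point: how many SpMV iterations are needed before a format switch amortizes its one-time overheads. The formula and names below are illustrative assumptions, not taken from the paper.

```python
# Break-even iteration count for a format switch, with made-up timings.
import math

def break_even_iters(t_predict, t_convert, t_spmv_old, t_spmv_new):
    gain_per_iter = t_spmv_old - t_spmv_new
    if gain_per_iter <= 0:
        return math.inf  # the "better" format is not actually faster per call
    return math.ceil((t_predict + t_convert) / gain_per_iter)

print(break_even_iters(t_predict=3.0, t_convert=40.0, t_spmv_old=1.0, t_spmv_new=0.8))  # 215
```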

Modern GPUs are broadly adopted in many multitasking environments, including data centers and smartphones. However, the current support for scheduling of multiple GPU kernels (from different applications) is limited, forming a major barrier to meet practical needs. This work first time demonstrates that on existing GPUs, efficient preemptive possible even without special hardware support. Specifically, it presents EffiSha, pure software framework enables with very low overhead. The enabled...

10.1145/3155284.3018748 article EN ACM SIGPLAN Notices 2017-01-26

Program analysis is fundamental for program optimizations, debugging, and many other tasks. But developing program analyses has been a challenging and error-prone process for general users. Declarative program analysis has shown the promise to dramatically improve productivity in the development of analyses. Current declarative approaches are however subject to some major limitations in supporting cooperation among analysis tools, and guiding program analyses often requires much effort and repeated program preprocessing. In this work, we advocate...

10.4230/lipics.ecoop.2016.26 article EN European Conference on Object-Oriented Programming 2016-01-01
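
For readers unfamiliar with the declarative style the abstract refers to, the sketch below writes a tiny analysis as a single rule ("a value reaches wherever its holder is copied") and evaluates it by fixpoint iteration. The rule and facts are invented examples, not the paper's system.

```python
# Declarative-style analysis evaluated by fixpoint iteration.
copies = {("a", "b"), ("b", "c"), ("d", "e")}   # facts: copy(src, dst)
reaches = {("v1", "a"), ("v2", "d")}            # facts: reaches(value, var)

def solve(facts, copies):
    # rule: reaches(V, Y) :- reaches(V, X), copy(X, Y)
    result = set(facts)
    changed = True
    while changed:
        changed = False
        for (v, x) in list(result):
            for (src, dst) in copies:
                if src == x and (v, dst) not in result:
                    result.add((v, dst))
                    changed = True
    return result

print(sorted(solve(reaches, copies)))
# [('v1', 'a'), ('v1', 'b'), ('v1', 'c'), ('v2', 'd'), ('v2', 'e')]
```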

Modern machine learning programs are often written in Python, with the main computations specified through calls to some highly optimized libraries (e.g., TensorFlow, PyTorch). How to maximize the computing efficiency of such programs is essential for many application domains, and it has drawn lots of recent attention. This work points out a common limitation of existing efforts: they focus their views only on the static computation graphs specified by library APIs, but leave the influence from the hosting Python code largely...

10.1145/3377811.3380434 article EN 2020-06-27
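
A toy example of the kind of host-code influence the abstract points at: a library call whose result never changes inside a Python loop can be hoisted out, shrinking repeated tensor work that a graph-only view would not see. The example uses NumPy and invented shapes; it is not the paper's analysis or transformation.

```python
# Hoisting a loop-invariant library call out of host Python code.
import numpy as np

w = np.random.rand(512, 512)
xs = [np.random.rand(512) for _ in range(100)]

# Before: w.T @ w is recomputed on every iteration although it is loop-invariant.
out_slow = [w.T @ w @ x for x in xs]

# After: compute the invariant product once, outside the loop.
wtw = w.T @ w
out_fast = [wtw @ x for x in xs]

assert all(np.allclose(a, b) for a, b in zip(out_slow, out_fast))
```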

Domain specific languages (DSLs) offer an attractive path to programming large-scale, heterogeneous parallel computers, since application developers can leverage high-level annotations defined by DSLs to efficiently express algorithms without being distracted by low-level hardware details. However, the performance of DSL programs heavily relies on how well a DSL implementation, including its compilers and runtime systems, exploits knowledge across multiple layers of the software/hardware environments for optimizations...

10.1145/2830018.2830022 article EN 2015-11-05

In this work, we conduct a systematic exploration of the promise and challenges of deep learning for sparse matrix format selection. We propose a set of novel techniques to solve the special challenges of this learning problem, including input representations, a late-merging neural network structure design, and the use of transfer learning to alleviate cross-architecture portability issues.

10.1109/pact.2017.33 article EN 2017-09-01
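
The "late-merging" structure mentioned above can be pictured as separate branches over different views of the matrix that are combined only near the output. The PyTorch sketch below is a rough illustration; the layer sizes, the number of format classes, and the two-view split are assumptions, not the paper's exact architecture.

```python
# Rough sketch of a late-merging classifier: two input views, separate conv
# branches, merged just before the format prediction.
import torch
import torch.nn as nn

class LateMergingNet(nn.Module):
    def __init__(self, num_formats=4):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            )
        self.branch_a = branch()   # e.g., a row-density view
        self.branch_b = branch()   # e.g., a column-density view
        self.head = nn.Linear(2 * 8 * 4 * 4, num_formats)  # merge happens here

    def forward(self, view_a, view_b):
        merged = torch.cat([self.branch_a(view_a), self.branch_b(view_b)], dim=1)
        return self.head(merged)

net = LateMergingNet()
logits = net(torch.rand(2, 1, 32, 32), torch.rand(2, 1, 32, 32))
print(logits.shape)  # torch.Size([2, 4])
```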

There are many technical tools and models to assist management in the strategic planning process to better realise a business's strategy. Some classical ones are widely used in practice. They bring benefits, but their limitations should not be overlooked. This essay discusses five of them: PEST analysis, SWOT analysis, Scenario analysis, Porter's Five Forces Model, and the Growth-share Matrix. To apply these methods in the fast-changing IT environment, some of the limitations, be it due to the narrow application each was initially designed for, or changes in the macro...

10.54254/2754-1169/85/20240905 article EN cc-by Advances in Economics Management and Political Sciences 2024-05-27

Australia Post has seen increasing demands for its service, and in response to this, an automated sorting station and an artificial intelligence-based prioritization system have been installed separately at Tullamarine and Mascot in 2019. This essay aims to answer the question: can technology effectively improve the processing capability of Australia Post? Four hypotheses will be raised. Selected data for each hypothesis are first described and visualized, then tested by the Shapiro-Wilk test for normal distribution. According to the result...

10.54254/2753-8818/39/20240607 article EN Theoretical and Natural Science 2024-07-26
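
The normality check described above is straightforward to reproduce with SciPy's Shapiro-Wilk implementation. The sample below is made-up daily processing counts, not the essay's data.

```python
# Shapiro-Wilk normality test on a hypothetical sample of daily counts.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
daily_counts = rng.normal(loc=120_000, scale=8_000, size=60)

stat, p_value = stats.shapiro(daily_counts)
print(f"W = {stat:.3f}, p = {p_value:.3f}")
# A p-value above the chosen significance level (commonly 0.05) means normality
# is not rejected, so a parametric test could follow.
```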

This paper presents a prototype infrastructure for addressing the barriers to effective accumulation, sharing, and reuse of various types of knowledge in high performance parallel computing.

10.1145/3018743.3019023 article EN 2017-01-26

Recently, HPC in the Cloud has emerged as a new paradigm in the field of parallel computing. Most cloud systems deploy virtual machines for provisioning resources. However, in the virtual machine environment there is still no mature method to analyze the performance of MPI programs. In this paper, we propose a series of innovative methods for the analysis of MPI programs on Xen virtual machines, including data collection through instrumentation and sampling, bottleneck diagnosis using the PAM and DBSCAN clustering algorithms, and root cause analysis based on rough set...

10.1109/hpcc.and.euc.2013.215 article EN 2013-11-01
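
As a flavor of the clustering step mentioned above, the sketch below groups MPI ranks by simple timing features with DBSCAN so an outlier rank stands out as a potential bottleneck. The feature choice and numbers are invented for illustration; the paper also uses PAM and rough set analysis, which are not shown here.

```python
# Clustering per-rank timing features with DBSCAN to flag an outlier rank.
import numpy as np
from sklearn.cluster import DBSCAN

# rows: one MPI rank each; columns: [compute_seconds, mpi_wait_seconds]
features = np.array([
    [10.1, 1.9], [10.3, 2.0], [10.0, 2.1], [9.9, 1.8],   # well-behaved ranks
    [10.2, 8.7],                                          # one rank stuck waiting
])

labels = DBSCAN(eps=0.8, min_samples=2).fit_predict(features)
print(labels)  # the waiting rank is labeled -1 (noise), flagging it for diagnosis
```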

This paper presents a prototype infrastructure for addressing the barriers to effective accumulation, sharing, and reuse of various types of knowledge in high performance parallel computing.

10.1145/3155284.3019023 article EN ACM SIGPLAN Notices 2017-01-26