NFDI4DS | UHH-SEMS - Publication Details

Bridging the gap between deep learning and sparse matrix format selection

OPENALEX - Publications

Yue Zhao, Jiajia Li, Chunhua Liao, Xipeng Shen

This work presents a systematic exploration of the promise and special challenges of deep learning for sparse matrix format selection, the problem of determining the best storage format for a matrix to maximize the performance of Sparse Matrix Vector Multiplication (SpMV). It describes how to effectively bridge the gap between deep learning and the special needs of this pillar HPC problem through a set of techniques on matrix representations, deep learning structure, and cross-architecture model migrations. The new solution cuts format selection errors by two thirds and improves SpMV performance by 1.73X on average over the state of the art.

10.1145/3178487.3178495 article EN 2018-02-06

EffiSha

Guoyang Chen, Yue Zhao, Xipeng Shen, Huiyang Zhou

Modern GPUs are broadly adopted in many multitasking environments, including data centers and smartphones. However, the current support for scheduling of multiple GPU kernels (from different applications) is limited, forming a major barrier to meeting practical needs. This work demonstrates for the first time that on existing GPUs, efficient preemptive scheduling is possible even without special hardware support. Specifically, it presents EffiSha, a pure software framework that enables preemptive scheduling with very low overhead. The enabled...

10.1145/3018743.3018748 article EN 2017-01-26

Bridging the gap between deep learning and sparse matrix format selection

Yue Zhao, Jiajia Li, Chunhua Liao, Xipeng Shen

This work presents a systematic exploration of the promise and special challenges of deep learning for sparse matrix format selection, the problem of determining the best storage format for a matrix to maximize the performance of Sparse Matrix Vector Multiplication (SpMV). It describes how to effectively bridge the gap between deep learning and the special needs of this pillar HPC problem through a set of techniques on matrix representations, deep learning structure, and cross-architecture model migrations. The new solution cuts format selection errors by two thirds and improves SpMV performance by 1.73X on average over the state of the art.

10.1145/3200691.3178495 article EN ACM SIGPLAN Notices 2018-02-10

Overhead-Conscious Format Selection for SpMV-Based Applications

Yue Zhao, Weijie Zhou, Xipeng Shen, Graham Yiu

Sparse matrix vector multiplication (SpMV) is an important kernel in many applications and is often the major performance bottleneck. The storage format of sparse matrices critically affects the performance of SpMV. Although there have been previous studies on selecting the appropriate format for a given matrix, they ignored the influence of runtime prediction overhead and format conversion overhead. For common uses of SpMV, such overhead is part of the execution time and may outweigh the benefits of new formats. Ignoring it makes the predictions from previous solutions frequently...

10.1109/ipdps.2018.00104 article EN 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2018-05-01

Enabling Runtime SpMV Format Selection through an Overhead Conscious Method

Weijie Zhou, Yue Zhao, Xipeng Shen, Wang Chen

Sparse matrix-vector multiplication (SpMV) is an important kernel, and its performance is critical for many applications. Storage format selection means selecting the best format in which to store a sparse matrix; it is essential for SpMV performance. Prior studies have focused on predicting the format that helps SpMV run fastest, but have ignored the runtime prediction and format conversion overhead. This work shows that the overhead makes the predictions from previous solutions frequently sub-optimal and sometimes inferior regarding the end-to-end time. It proposes a new paradigm...

10.1109/tpds.2019.2932931 article EN IEEE Transactions on Parallel and Distributed Systems 2019-08-05

EffiSha

Guoyang Chen, Yue Zhao, Xipeng Shen, Huiyang Zhou

Modern GPUs are broadly adopted in many multitasking environments, including data centers and smartphones. However, the current support for scheduling of multiple GPU kernels (from different applications) is limited, forming a major barrier to meeting practical needs. This work demonstrates for the first time that on existing GPUs, efficient preemptive scheduling is possible even without special hardware support. Specifically, it presents EffiSha, a pure software framework that enables preemptive scheduling with very low overhead. The enabled...

10.1145/3155284.3018748 article EN ACM SIGPLAN Notices 2017-01-26

Towards ontology-based program analysis

Yue Zhao, Guoyang Chen, Chunhua Liao, Xipeng Shen

Program analysis is fundamental for program optimizations, debugging, and many other tasks. But developing program analyses has been a challenging and error-prone process for general users. Declarative program analysis has shown the promise to dramatically improve productivity in the development of analyses. Current declarative program analysis is, however, subject to some major limitations in supporting cooperation among analysis tools and guiding program optimizations, and often requires much effort for repeated program preprocessing. In this work, we advocate...

10.4230/lipics.ecoop.2016.26 article EN European Conference on Object-Oriented Programming 2016-01-01

HARP

Weijie Zhou, Yue Zhao, Guo‐Qiang Zhang, Xipeng Shen

Modern machine learning programs are often written in Python, with the main computations specified through calls to some highly optimized libraries (e.g., TensorFlow, PyTorch). How to maximize the computing efficiency of such programs is essential for many application domains and has drawn a lot of recent attention. This work points out a common limitation of existing efforts: they focus their views only on the static computation graphs specified by library APIs, but leave the influence from the hosting Python code largely...

10.1145/3377811.3380434 article EN 2020-06-27

Enhancing domain specific language implementations through ontology

Chunhua Liao, Pei‐Hung Lin, Daniel J. Quinlan, Yue Zhao, Xipeng Shen

Domain specific languages (DSLs) offer an attractive path to program large-scale, heterogeneous parallel computers, since application developers can leverage high-level annotations defined by DSLs to efficiently express algorithms without being distracted by low-level hardware details. However, the performance of DSL programs heavily relies on how well a DSL implementation, including compilers and runtime systems, can exploit knowledge across multiple layers of software/hardware environments for optimizations....

10.1145/2830018.2830022 article EN 2015-11-05

A spectral method for stochastic fractional PDEs using dynamically-orthogonal/bi-orthogonal decomposition

Yue Zhao, Zhiping Mao, Ling Guo, Yifa Tang, George Em Karniadakis

10.1016/j.jcp.2022.111213 article EN Journal of Computational Physics 2022-04-07

POSTER: Bridging the Gap Between Deep Learning and Sparse Matrix Format Selection

Yue Zhao, Jiajia Li, Chunhua Liao, Xipeng Shen

In this work, we conduct a systematic exploration of the promise and challenges of deep learning for sparse matrix format selection. We propose a set of novel techniques to solve the special challenges of deep learning, including input representations, a late-merging neural network structure design, and the use of transfer learning to alleviate cross-architecture portability issues.

10.1109/pact.2017.33 article EN 2017-09-01

Classical Strategic Planning May not Be Adequate for a Fast-changing IT Environment

Yue Zhao

There are many technical tools and models to assist management in the strategic planning process and to better realise a business's strategy. Some classical models are widely used in practice. They bring benefits, but their limitations should not be overlooked. This essay discusses five of them: PEST analysis, SWOT analysis, Scenario analysis, Porter's Five Forces Model, and the Growth-share Matrix. To apply these methods to a fast-changing IT environment, some of the limitations, may it be due to the narrow application it initially was designed for, or changes in macro...

10.54254/2754-1169/85/20240905 article EN cc-by Advances in Economics Management and Political Sciences 2024-05-27

Practical quantitative research on technological impact on Australia post facilities' processing capacity

Yue Zhao

Australia Post has seen increasing demand for its service, and in response, an automated sorting station and an artificial intelligence-based prioritization system were installed separately in Tullamarine and Mascot in 2019. This essay aims to answer the question: Can technology effectively improve the processing capability of Australia Post? Four hypotheses will be raised. Selected data for each hypothesis is first described and visualized, then tested by the Shapiro-Wilk test for normal distribution. According to the result...

10.54254/2753-8818/39/20240607 article EN Theoretical and Natural Science 2024-07-26

POSTER

Yue Zhao, Chunhua Liao, Xipeng Shen

This paper presents a prototype infrastructure for addressing the barriers to effective accumulation, sharing, and reuse of the various types of knowledge for high performance parallel computing.

10.1145/3018743.3019023 article EN 2017-01-26

Performance Analysis of MPI Parallel Programs on Xen Virtual Machines

Jungang Xu, Yue Zhao, Kunlin Zhan, Hui Li, Xiuqi Han

Recently, HPC in the Cloud has emerged as a new paradigm in the field of parallel computing. Most cloud systems deploy virtual machines for provisioning resources. However, in the virtual machine environment, there is still no mature method to analyze the performance of MPI parallel programs. In this paper, we propose a series of innovative methods for performance analysis of MPI parallel programs on Xen virtual machines, including data collection through instrumentation and sampling, bottleneck diagnosis using the PAM and DBSCAN clustering algorithms, and root cause analysis using rough set...

10.1109/hpcc.and.euc.2013.215 article EN 2013-11-01

POSTER

Yue Zhao, Chunhua Liao, Xipeng Shen

This paper presents a prototype infrastructure for addressing the barriers to effective accumulation, sharing, and reuse of the various types of knowledge for high performance parallel computing.

10.1145/3155284.3019023 article EN ACM SIGPLAN Notices 2017-01-26