- Distributed and Parallel Computing Systems
- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Semantic Web and Ontologies
- Algorithms and Data Compression
- Cloud Computing and Resource Management
- Scientific Computing and Data Management
- Software System Performance and Reliability
- Advanced Data Processing Techniques
- Mining and Resource Management
- Technology Assessment and Management
- IPv6, Mobility, Handover, Networks, Security
- Business Strategies and Innovation
- Hydraulic and Pneumatic Systems
- Underwater Vehicles and Communication Systems
- Green IT and Sustainability
- Security in Wireless Sensor Networks
- Web Applications and Data Management
- Maritime Navigation and Safety
- Refrigeration and Air Conditioning Technologies
- Statistical Distribution Estimation and Applications
- Model-Driven Software Engineering Techniques
- Probabilistic and Robust Engineering Design
- Simulation and Modeling Applications
- Education and Work Dynamics
Industrial and Commercial Bank of China (2024)
University of Chinese Academy of Sciences (2013-2022)
Brown University (2022)
Academy of Mathematics and Systems Science (2022)
Chinese Academy of Sciences (2022)
North Carolina State University (2015-2020)
Meta (Israel) (2020)
Xi'an University of Science and Technology (2013)
This work presents a systematic exploration of the promise and special challenges of deep learning for sparse matrix format selection, the problem of determining the best storage format for a matrix to maximize the performance of Sparse Matrix Vector Multiplication (SpMV). It describes how to effectively bridge the gap between deep learning and the special needs of this pillar HPC problem through a set of techniques on matrix representations, deep learning structure, and cross-architecture model migrations. The new solution cuts format selection errors by two thirds and improves SpMV performance by 1.73X on average over the state of the art.
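To make the notion of a storage format concrete, here is a minimal sketch, assuming two textbook formats (COO and CSR) and plain Python lists. It is not the code from the work above, and the function names are hypothetical. The same matrix is stored in both formats, and SpMV produces the same result with a different memory access pattern.

```python
from typing import List, Tuple

Entry = Tuple[int, int, float]  # (row, column, value) of one nonzero


def spmv_coo(entries: List[Entry], x: List[float], n_rows: int) -> List[float]:
    """SpMV over COO storage: one (row, col, value) triple per nonzero."""
    y = [0.0] * n_rows
    for row, col, val in entries:
        y[row] += val * x[col]
    return y


def coo_to_csr(entries: List[Entry], n_rows: int):
    """Convert COO triples to CSR arrays (row_ptr, col_idx, vals)."""
    # Count the nonzeros in each row, then prefix-sum into row pointers.
    row_ptr = [0] * (n_rows + 1)
    for row, _, _ in entries:
        row_ptr[row + 1] += 1
    for i in range(n_rows):
        row_ptr[i + 1] += row_ptr[i]

    col_idx = [0] * len(entries)
    vals = [0.0] * len(entries)
    next_slot = row_ptr[:-1]  # slicing copies: next free slot of each row
    for row, col, val in entries:
        k = next_slot[row]
        col_idx[k] = col
        vals[k] = val
        next_slot[row] += 1
    return row_ptr, col_idx, vals


def spmv_csr(row_ptr: List[int], col_idx: List[int], vals: List[float],
             x: List[float]) -> List[float]:
    """SpMV over CSR storage: the nonzeros of row i sit in row_ptr[i]:row_ptr[i+1]."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Illustrative 3x3 matrix (made-up values):
# [[2, 0, 1],
#  [0, 3, 0],
#  [4, 0, 5]]
entries = [(0, 0, 2.0), (0, 2, 1.0), (1, 1, 3.0), (2, 0, 4.0), (2, 2, 5.0)]
x = [1.0, 2.0, 3.0]
row_ptr, col_idx, vals = coo_to_csr(entries, 3)
y_coo = spmv_coo(entries, x, 3)
y_csr = spmv_csr(row_ptr, col_idx, vals, x)
```

Both calls compute the same product. Which layout runs faster depends on the sparsity pattern and the hardware, which is why format selection is treated as a prediction problem.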
Modern GPUs are broadly adopted in many multitasking environments, including data centers and smartphones. However, the current support for scheduling multiple GPU kernels (from different applications) is limited, forming a major barrier to meeting practical needs. This work demonstrates for the first time that, on existing GPUs, efficient preemptive scheduling of GPU kernels is possible even without special hardware support. Specifically, it presents EffiSha, a pure software framework that enables preemptive scheduling of GPU kernels with very low overhead. The enabled...
Sparse matrix vector multiplication (SpMV) is an important kernel in many applications and is often the major performance bottleneck. The storage format of sparse matrices critically affects the performance of SpMV. Although there have been previous studies on selecting the appropriate format for a given matrix, they ignored the influence of runtime prediction overhead and format conversion overhead. For common uses of SpMV, such overhead is part of the execution time and may outweigh the benefits of new formats. Ignoring them makes the predictions from previous solutions frequently...
Sparse matrix-vector multiplication (SpMV) is an important kernel, and its performance is critical for many applications. Storage format selection chooses the best format in which to store a sparse matrix; it is essential for SpMV performance. Prior studies have focused on predicting the format that helps SpMV run fastest, but have ignored the runtime prediction and format conversion overhead. This work shows that the runtime overhead makes the predictions from previous solutions frequently sub-optimal and sometimes inferior regarding the end-to-end time. It proposes a new paradigm...
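The end-to-end argument can be illustrated with a small cost model. This is a sketch under stated assumptions, not the paradigm proposed in the work above: the names `FormatEstimate`, `end_to_end_time` and `choose_format` are hypothetical, the timings are made-up numbers in arbitrary units, and the model simply adds the one-off prediction and conversion costs to the cost of the repeated SpMV calls.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class FormatEstimate:
    spmv_time: float        # predicted time of one SpMV call in this format
    conversion_time: float  # one-off cost of converting into this format


def end_to_end_time(est: FormatEstimate, n_iterations: int,
                    prediction_time: float = 0.0) -> float:
    """Total time: prediction + conversion + n_iterations SpMV calls."""
    return prediction_time + est.conversion_time + n_iterations * est.spmv_time


def choose_format(estimates: Dict[str, FormatEstimate], n_iterations: int) -> str:
    """Pick the format with the lowest end-to-end time.

    The prediction overhead is the same for every candidate, so it does not
    change the ranking and is left out of the comparison.
    """
    return min(estimates, key=lambda name: end_to_end_time(estimates[name], n_iterations))


# Hypothetical numbers: the matrix is already in CSR (no conversion needed),
# ELL is twice as fast per call but costs 20 units to convert into.
estimates = {
    "CSR": FormatEstimate(spmv_time=1.0, conversion_time=0.0),
    "ELL": FormatEstimate(spmv_time=0.5, conversion_time=20.0),
}

few_calls = choose_format(estimates, n_iterations=10)    # 10.0 vs 25.0
many_calls = choose_format(estimates, n_iterations=100)  # 100.0 vs 70.0
```

With these made-up numbers, the format with the fastest single SpMV call only pays off once the conversion cost is amortized over enough iterations. A selector that ranks formats by per-call time alone would pick ELL in both cases.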
Program analysis is fundamental for program optimizations, debugging, and many other tasks. But developing program analyses has been a challenging and error-prone process for general users. Declarative program analysis has shown the promise to dramatically improve productivity in the development of program analyses. Current declarative program analysis is, however, subject to some major limitations in supporting cooperation among analysis tools and guiding program optimizations, and it often requires much effort for repeated program preprocessing. In this work, we advocate...
Modern machine learning programs are often written in Python, with the main computations specified through calls to some highly optimized libraries (e.g., TensorFlow, PyTorch). How to maximize the computing efficiency of such programs is essential for many application domains and has drawn much recent attention. This work points out a common limitation in existing efforts: they focus only on the static computation graphs specified by library APIs, but leave the influence from the hosting Python code largely...
Domain specific languages (DSLs) offer an attractive path to program large-scale, heterogeneous parallel computers, since application developers can leverage high-level annotations defined by DSLs to efficiently express algorithms without being distracted by low-level hardware details. However, the performance of DSL programs heavily relies on how well a DSL implementation, including compilers and runtime systems, can exploit knowledge across multiple layers of software/hardware environments for optimizations....
In this work, we conduct a systematic exploration of the promise and challenges of deep learning for sparse matrix format selection. We propose a set of novel techniques to solve the special challenges to deep learning, including input matrix representations, a late-merging deep neural network structure design, and the use of transfer learning to alleviate cross-architecture portability issues.
There are many technical tools and models to assist management in the strategic planning process and to better realise a business's strategy. Some classical models are widely used in practice. They bring benefits, but their limitations should not be overlooked. This essay discusses five of them: PEST analysis, SWOT analysis, Scenario planning, Porter's Five Forces Model, and the Growth-share Matrix. To apply these methods in the fast-changing IT environment, some of their limitations, whether due to the narrow application each was initially designed for or to changes in the macro...
Australia Post has seen increasing demand for its service and, in response, an automated sorting station and an artificial intelligence-based prioritization system were installed separately at Tullamarine and Mascot in 2019. This essay aims to answer the question: Can technology effectively improve the processing capability of Australia Post? Four hypotheses will be raised. Selected data for each hypothesis are first described and visualized, then tested for normal distribution with the Shapiro-Wilk test. According to the result...
This paper presents a prototype infrastructure for addressing the barriers to effective accumulation, sharing, and reuse of the various types of knowledge for high performance parallel computing.
Recently, HPC in the Cloud has emerged as a new paradigm in the field of parallel computing. Most cloud systems deploy virtual machines for provisioning resources. However, in the virtual machine environment, there is still no mature method to analyze the performance of MPI programs. In this paper, we propose a series of innovative methods for the performance analysis of MPI programs on Xen virtual machines, including data collection through instrumentation and sampling, bottleneck diagnosis using the PAM and DBSCAN clustering algorithms, and root cause analysis using rough set...