- Cloud Computing and Resource Management
- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Advanced Computational Techniques and Applications
- Service-Oriented Architecture and Web Services
- Geographic Information Systems Studies
- Distributed and Parallel Computing Systems
- Caching and Content Delivery
- Data Management and Algorithms
- Web Applications and Data Management
- Cloud Computing and Remote Desktop Technologies
- Mobile Agent-Based Network Management
- Semantic Web and Ontologies
- Banking Systems and Strategies
- Simulation and Modeling Applications
- Software System Performance and Reliability
- Opportunistic and Delay-Tolerant Networks
- Interconnection Networks and Systems
- Multi-Agent Systems and Negotiation
- Scientific Computing and Data Management
- Advanced Computing and Algorithms
- Distributed Systems and Fault Tolerance
- Software Engineering Research
- Graph Theory and Algorithms
- Topic Modeling
Peking University
2016-2025
East China Normal University
2024-2025
King University
2025
Shanghai Key Laboratory of Trustworthy Computing
2024
Tsinghua University
2017-2023
Stomatology Hospital
2023
Kunming Medical University
2023
Peng Cheng Laboratory
2018-2022
Beijing Institute of Petrochemical Technology
2015
Michigan Technological University
2014
In this paper, we describe a whole-system live migration scheme, which transfers the whole system run-time state, including CPU state, memory data, and local disk storage, of the virtual machine (VM). To minimize the downtime caused by migrating large storage data and to keep data integrity and consistency, we propose a three-phase migration (TPM) algorithm. To facilitate migrating back to the initial source machine, we use an incremental migration (IM) algorithm to reduce the amount of data to be migrated. A block-bitmap is used to track all write accesses to the local disk storage during the migration...
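A minimal sketch of the block-bitmap idea described above, under assumed names and a hypothetical block size (neither taken from the paper): each write to local storage marks the covering blocks dirty, and the migration loop retransmits only the dirty blocks.

```python
class BlockBitmap:
    """Illustrative sketch: track which disk blocks are written during migration."""

    def __init__(self, disk_size, block_size=1 << 20):  # assumed 1 MiB blocks
        self.block_size = block_size
        self.bits = bytearray((disk_size + block_size - 1) // block_size)

    def mark_write(self, offset, length):
        # Mark every block touched by this write as dirty.
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        for b in range(first, last + 1):
            self.bits[b] = 1

    def drain_dirty(self):
        # Return and clear the dirty block indices so they can be retransmitted.
        dirty = [i for i, d in enumerate(self.bits) if d]
        for i in dirty:
            self.bits[i] = 0
        return dirty
```

A migration loop would call drain_dirty() repeatedly until the set of dirty blocks is small enough to copy during the final, brief suspension.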
Virtualization essentially enables multiple operating systems and applications to run on one physical computer by multiplexing the hardware resources. A key motivation for applying virtualization is to improve resource utilization while maintaining reasonable quality of service. However, such a goal cannot be achieved without efficient resource management. Though most resources, such as processor cores and I/O devices, are shared among virtual machines using time slicing and can be scheduled flexibly based on priority,...
When a cache is shared by multiple cores, its space may be allocated either by sharing, partitioning, or both. We call the last case partition-sharing. This paper studies partition-sharing as the general solution, and presents a theory and a technique for optimizing partition-sharing. The theory shows that the problem of optimal sharing is reducible to optimal partitioning. The technique uses dynamic programming to optimize partitioning for the overall miss ratio and for two different kinds of fairness. Finally, the paper evaluates the effect of optimal sharing and compares it with...
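A hedged sketch of the dynamic-programming step mentioned above, with hypothetical inputs (per-program miss-ratio curves over abstract "cache units" are assumptions, not the paper's data): choose how many units each program receives so that the summed miss ratio is minimal.

```python
def optimal_partition(mrc, total_units):
    """Illustrative DP. mrc[i][c] = miss ratio of program i with c cache units
    (c = 0..total_units). Returns (best total miss ratio, per-program allocation)."""
    n = len(mrc)
    INF = float("inf")
    # best[i][c] = minimal summed miss ratio for the first i programs using c units.
    best = [[INF] * (total_units + 1) for _ in range(n + 1)]
    choice = [[0] * (total_units + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(1, n + 1):
        for c in range(total_units + 1):
            for give in range(c + 1):
                cand = best[i - 1][c - give] + mrc[i - 1][give]
                if cand < best[i][c]:
                    best[i][c] = cand
                    choice[i][c] = give
    # Walk back to recover the allocation.
    alloc, c = [], total_units
    for i in range(n, 0, -1):
        alloc.append(choice[i][c])
        c -= choice[i][c]
    return best[n][total_units], alloc[::-1]
```

The same recurrence works for fairness objectives by replacing the sum with, for example, the maximum per-program miss ratio.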
In a multicore system, effective management of the shared last level cache (LLC), such as hardware/software partitioning, has attracted significant research attention. Some eminent progress is that Intel recently introduced Cache Allocation Technology (CAT) into its commodity processors. CAT implements way partitioning and provides a software interface to control cache allocation. Unfortunately, CAT can only allocate cache at the way level, which does not scale well for a large thread or program count to serve their various...
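For context, a hedged sketch of how software typically drives CAT on Linux through the resctrl filesystem (the group name, cache ID, way bitmask, and PID below are example values; the exact schemata format is platform-dependent and the paper may use a different interface).

```python
import os

RESCTRL = "/sys/fs/resctrl"  # requires a mounted resctrl filesystem and root


def create_cat_group(name, way_mask_hex, pid):
    """Restrict a process to a subset of LLC ways via resctrl (illustrative)."""
    group = os.path.join(RESCTRL, name)
    os.makedirs(group, exist_ok=True)
    # Limit this group to the LLC ways in the hex bitmask on cache ID 0.
    with open(os.path.join(group, "schemata"), "w") as f:
        f.write(f"L3:0={way_mask_hex}\n")
    # Move the target process into the group.
    with open(os.path.join(group, "tasks"), "w") as f:
        f.write(str(pid))


# Example (hypothetical values): give PID 1234 only the four lowest ways.
# create_cat_group("latency_critical", "f", 1234)
```

Because each way is a coarse slice of the LLC, the number of distinct allocations is limited by the associativity, which is the scalability issue the abstract points out.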
As virtualization becomes a key technique for supporting cloud computing, much effort has been made to reduce virtualization overhead, so a virtualized system can match its native performance. One major overhead is due to memory or page table virtualization. Conventional virtual machines rely on the shadow mechanism to manage page tables, where a shadow page table maintained by the VMM (Virtual Machine Monitor) maps virtual addresses to machine addresses, while the guest maintains its own virtual-to-physical page table. This will result in expensive VM exits whenever there is a page fault that...
Virtualization is often used in systems for the purpose of offering isolation among applications running in separate virtual machines (VMs). Current virtual machine monitors (VMMs) have done a decent job of isolating resources such as memory, CPU, and I/O devices. However, when looking further into the usage of the lower-level shared cache, we notice that one virtual machine's cache behavior may interfere with another's due to uncontrolled sharing. In this situation, performance isolation cannot be guaranteed. This paper presents a cache partitioning approach...
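A hedged sketch of page coloring, the common software technique for this kind of cache partitioning (the paper's exact mechanism may differ, and the cache geometry below is an assumed example, not measured from any specific processor).

```python
# Assumed example geometry: 8 MiB, 16-way LLC with 4 KiB pages.
CACHE_SIZE = 8 * 1024 * 1024
WAYS = 16
PAGE_SIZE = 4096

# Number of page "colors": how many page-sized strides fit in one cache way.
NUM_COLORS = (CACHE_SIZE // WAYS) // PAGE_SIZE


def page_color(phys_addr):
    # Pages with the same color map to the same group of cache sets, so giving
    # each VM a disjoint set of colors partitions the shared cache in software.
    return (phys_addr // PAGE_SIZE) % NUM_COLORS
```

The VMM then allocates machine pages to each VM only from that VM's assigned colors, which bounds cache interference without hardware support.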
Deep learning (DL) shows its prosperity in a wide variety of fields. The development of a DL model is a time-consuming and resource-intensive procedure. Hence, dedicated GPU accelerators have been collectively constructed into the datacenter. An efficient scheduler design for such a GPU datacenter is crucially important to reduce the operational cost and improve resource utilization. However, traditional approaches designed for big data or high performance computing workloads can not support DL workloads to fully utilize the GPU resources...
The reuse distance (least recently used (LRU) stack distance) is an essential metric for performance prediction and optimization of storage and cache memory. Over the past four decades, there have been steady improvements in the algorithmic efficiency of reuse distance measurement. This progress is accelerating in recent years, both in theory and practical implementation. In this article, we present a kinetic model of LRU cache memory, based on the average eviction time (AET) of the cached data. The AET model enables fast measurement and the use of low-cost sampling. It...
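A hedged sketch of the AET idea (not the authors' code): derive an LRU miss-ratio curve from a reuse-time histogram by solving the AET condition, the integral of P(t) from 0 to AET(c) equals c, where P(t) is the probability that a reuse time exceeds t, and the miss ratio at cache size c is P(AET(c)). The input histogram and the iteration cap are assumptions for illustration.

```python
def mrc_from_reuse_times(reuse_hist, cold_misses, max_cache_size):
    """reuse_hist[t] = number of accesses whose reuse time equals t.
    Cold misses are treated as infinite reuse time. Returns {cache size: miss ratio}."""
    n = sum(reuse_hist) + cold_misses
    max_t = len(reuse_hist)
    # ge[t] = number of accesses with reuse time >= t (cold misses included).
    ge = [cold_misses] * (max_t + 1)
    for t in range(max_t - 1, -1, -1):
        ge[t] = ge[t + 1] + reuse_hist[t]

    def P(t):  # probability that a reuse time is greater than t
        return ge[min(t + 1, max_t)] / n

    mrc, filled, travel, t = {}, 1, 0.0, 0
    while filled <= max_cache_size and t < 100 * max_t:  # cap to avoid looping forever
        travel += P(t)                # discrete integral of P(t) up to time t
        while filled <= max_cache_size and travel >= filled:
            mrc[filled] = P(t)        # AET(filled) ~= t, so miss ratio = P(t)
            filled += 1
        t += 1
    return mrc
```

Because only a reuse-time histogram is needed, the histogram itself can be built from sampled accesses, which is where the low-cost sampling mentioned above comes in.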
The tiered-memory system can effectively expand the memory capacity for virtual machines (VMs). However, virtualization introduces new challenges for memory tiering, specifically in enforcing performance isolation, minimizing context switching, and providing resource overcommit. None of the state-of-the-art designs consider virtualization or address these challenges; we observe that a VM with tiered memory incurs up to a 2× slowdown compared to a DRAM-only VM. We propose vTMM, a hardware-software collaborative tiered-memory management framework for virtualization...
Nowadays, there are many similar services available on the internet, making Quality of Service (QoS) a key concern for users. Since collecting QoS values of all services through user invocations is impractical, predicting them is a more feasible approach. Matrix factorization is considered an effective prediction method. However, most existing matrix factorization algorithms focus on capturing the global similarities between users and services, overlooking the local similarities between users and their neighbors, as well as the non-interactive effects between users and services. This paper...
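A hedged sketch of the baseline being extended here, plain matrix factorization for QoS prediction (the paper builds local-similarity and non-interactive terms on top of this; the function name, rank, and hyperparameters are illustrative assumptions).

```python
import numpy as np


def factorize_qos(R, mask, rank=8, lr=0.01, reg=0.05, epochs=200, seed=0):
    """R: user x service QoS matrix; mask: 1 where a QoS value was observed.
    Learns latent user/service factors by SGD and predicts the missing entries."""
    rng = np.random.default_rng(seed)
    n_users, n_services = R.shape
    U = 0.1 * rng.standard_normal((n_users, rank))
    S = 0.1 * rng.standard_normal((n_services, rank))
    users, services = np.nonzero(mask)
    for _ in range(epochs):
        for u, s in zip(users, services):
            uu = U[u].copy()
            err = R[u, s] - uu @ S[s]          # prediction error on an observed entry
            U[u] += lr * (err * S[s] - reg * uu)
            S[s] += lr * (err * uu - reg * S[s])
    return U @ S.T   # predicted QoS for every (user, service) pair
```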
GPU memory systems adopt a multi-dimensional hardware structure to provide the bandwidth necessary to support 100s to 1000s of concurrent threads. On the software side, GPU-compute workloads also use multi-dimensional structures to organize data. We observe that these structures can combine unfavorably and create significant resource imbalance in the memory subsystem, causing low performance and poor power-efficiency. The key issue is that it is highly application-dependent which address bits exhibit high variability. To solve this problem, we first present an entropy...
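A hedged sketch of the per-bit entropy measurement the abstract alludes to (illustrative only; the paper's actual analysis and metrics may differ): for each address bit, measure how evenly it toggles across a stream of memory addresses. High-entropy bits are good candidates for spreading accesses across channels and banks; low-entropy bits cause imbalance.

```python
import math


def bit_entropy(addresses, n_bits=32):
    """Return the binary entropy of each address bit over a trace of addresses."""
    n = len(addresses)
    entropies = []
    for bit in range(n_bits):
        ones = sum((a >> bit) & 1 for a in addresses)
        p1 = ones / n
        p0 = 1.0 - p1
        h = 0.0
        for p in (p0, p1):
            if p > 0:
                h -= p * math.log2(p)
        entropies.append(h)   # 1.0 = perfectly balanced bit, 0.0 = constant bit
    return entropies
```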
On multi-core processors, contention on shared resources such as the last level cache (LLC) and memory bandwidth may cause serious performance degradation, which makes efficient resource allocation a critical issue in data centers. Intel recently introduced Memory Bandwidth Allocation (MBA) technology in its Xeon scalable processors, making it possible to allocate memory bandwidth in a real system. However, how to make the most of MBA to improve system performance remains an open question. In this work, (1) we formulate the quantitative relationship between...
The memory demand of virtual machines (VMs) is increasing, while the traditional DRAM-only memory system has limited capacity and high power consumption. A tiered memory system can effectively expand the memory capacity and increase cost efficiency. Virtualization introduces new challenges for memory tiering, specifically enforcing performance isolation, minimizing context switching, and providing resource overcommit. However, none of the state-of-the-art designs consider virtualization and thus fail to address these challenges; we observe that a VM with tiered memory incurs up to...
This paper surveys the virtualization of I/O devices, which is one of the most difficult parts in system virtualization. Current technologies for virtualizing I/O devices include full virtualization, paravirtualization, software emulation, and VMM-bypass direct I/O. Optimizations are also done to improve the performance of each technology. Most optimizations take the paravirtualization technology as the reference. Direct I/O performs best in performance, but a VM using direct-access I/O can hardly be migrated, since it is hard to capture the device states without VMM...
Virtual Machine (VM) cloning is to create a replica of a source virtual machine (the parent machine); the replica, also called the child machine, owns exactly the same executing status as the parent machine. Fast live cloning guarantees that, during the period of cloning, services running on the parent machine observe no performance degradation. There are three important goals for fast live cloning: reducing the total cloning time, minimizing the suspension time, and maximizing resource sharing between the parent and child machines. This paper exploits the Copy-on-Write (CoW) mechanism to fully...
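A minimal sketch of the Copy-on-Write idea behind fast cloning, as an illustrative data structure rather than the paper's implementation (class and method names are assumptions): after cloning, parent and child share every memory page, and a page is copied only when one side first writes to it.

```python
class CowMemory:
    """Illustrative page-granularity Copy-on-Write memory."""

    def __init__(self, pages):
        # pages: page number -> page contents; page objects are shared by reference.
        self.pages = dict(pages)
        self.private = set()          # pages this instance has already copied

    def clone(self):
        # After cloning, both parent and child treat every page as shared again.
        self.private = set()
        return CowMemory(self.pages)

    def write(self, page_no, offset, data):
        if page_no not in self.private:
            # First write after cloning: copy the page before modifying it,
            # so the other machine keeps seeing the original contents.
            self.pages[page_no] = bytearray(self.pages[page_no])
            self.private.add(page_no)
        self.pages[page_no][offset:offset + len(data)] = data
```

Sharing until the first write is what keeps both the total cloning time and the parent's suspension time short.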