Xiangyao Yu

ORCID: 0009-0001-0785-2519
Publications
Citations
Views
---
Saved
---
About
Contact & Profiles
Research Areas
  • Advanced Data Storage Technologies
  • Distributed systems and fault tolerance
  • Cloud Computing and Resource Management
  • Parallel Computing and Optimization Techniques
  • Cryptography and Data Security
  • Interconnection Networks and Systems
  • Security and Verification in Computing
  • Advanced Database Systems and Queries
  • Cloud Data Security Solutions
  • Data Management and Algorithms
  • Software System Performance and Reliability
  • Caching and Content Delivery
  • Distributed and Parallel Computing Systems
  • Complexity and Algorithms in Graphs
  • Advanced Memory and Neural Computing
  • IoT and Edge/Fog Computing
  • Graph Theory and Algorithms
  • Low-power high-performance VLSI design
  • Software-Defined Networks and 5G
  • Graphene research and applications
  • Internet Traffic Analysis and Secure E-voting
  • Stochastic Gradient Optimization Techniques
  • Supercapacitor Materials and Fabrication
  • VLSI and FPGA Design Techniques
  • Advanced Graph Neural Networks

University of Wisconsin–Madison
2020-2025

Massachusetts Institute of Technology
2013-2019

University of Michigan
2016

Michigan United
2016

IIT@MIT
2013

Nanyang Technological University
2013

Tsinghua University
2012

Carnegie Mellon University
2012

We present Path ORAM, an extremely simple Oblivious RAM protocol with a small amount of client storage. Partly due to its simplicity, ORAM is the most practical scheme for storage known date. formally prove that requires log^2 N / log X bandwidth overhead block size B = N. For sizes bigger than Omega(log^2 N), asymptotically better best Due practicality, has been adopted in design secure processors since proposal.

10.1145/2508859.2516660 article EN 2013-01-01

Computer architectures are moving towards an era dominated by many-core machines with dozens or even hundreds of cores on a single chip. This unprecedented level on-chip parallelism introduces new dimension to scalability that current database management systems (DBMSs) were not designed for. In particular, as the number increases, problem concurrency control becomes extremely challenging. With threads running in parallel, complexity coordinating competing accesses data will likely diminish...

10.14778/2735508.2735511 article EN Proceedings of the VLDB Endowment 2014-11-01

Concurrency control for on-line transaction processing (OLTP) database management systems (DBMSs) is a nasty game. Achieving higher performance on emerging many-core difficult. Previous research has shown that timestamp the key scalability bottleneck in concurrency algorithms. This prevents system from scaling to large numbers of cores. In this paper we present TicToc, new optimistic algorithm avoids and bottlenecks prior T/O schemes. TicToc relies novel provably correct data-driven...

10.1145/2882903.2882935 article EN Proceedings of the 2022 International Conference on Management of Data 2016-06-16

Machine learning, graph analytics and sparse linear algebra-based applications are dominated by irregular memory accesses resulting from following edges in a or non-zero elements matrix. These have little temporal spatial locality, thus incur long stalls large bandwidth requirements. A traditional streaming striding prefetcher cannot capture these access patterns.

10.1145/2830772.2830807 article EN 2015-12-05

Keeping user data private is a huge problem both in cloud computing and computation outsourcing. One paradigm to achieve privacy use tamper-resistant processors, inside which users' decrypted computed upon. These processors need interact with untrusted external memory. Even if we encrypt all that leaves the trusted processor, however, address sequence goes off-chip may still leak information. To prevent this leakage, security community has proposed ORAM (Oblivious RAM). mainly been explored...

10.1145/2485922.2485971 article EN 2013-06-23

We present Path ORAM, an extremely simple Oblivious RAM protocol with a small amount of client storage. Partly due to its simplicity, ORAM is the most practical scheme known date formally prove that has O (log N ) bandwidth cost for blocks size B = Ω 2 bits. For such block sizes, asymptotically better than best-known schemes Due practicality, been adopted in design secure processors since proposal.

10.1145/3177872 article EN Journal of the ACM 2018-04-12

A conventional Network-on-Chip (NoC) router uses input buffers to store in-flight packets. These improve performance, but consume significant power. It is possible bypass these when they are empty, reducing dynamic power, static buffer and power utilized, remain. To energy efficiency, less deflection routing removes buffers, instead (misrouting) resolve contention. However, at high network load, deflections cause unnecessary hops, wasting performance. In this work, we propose a new NoC...

10.1109/nocs.2012.8 article EN 2012-05-01

Oblivious RAM (ORAM) is an established cryptographic technique to hide a program's address pattern untrusted storage system. More recently, ORAM schemes have been proposed replace conventional memory controllers in secure processor settings protect against information leakage external and the I/O bus.

10.1109/hpca.2014.6835932 article EN 2014-02-01

We present Path ORAM, an extremely simple Oblivious RAM protocol with a small amount of client storage. Partly due to its simplicity, ORAM is the most practical scheme known date formally prove that has O(log N) bandwidth cost for blocks size B = Omega(log^2 bits. For such block sizes, asymptotically better than best schemes Due practicality, been adopted in design secure processors since proposal.

10.48550/arxiv.1202.5150 preprint EN other-oa arXiv (Cornell University) 2012-01-01

There has been significant amount of excitement and recent work on GPU-based database systems. Previous claimed that these systems can perform orders magnitude better than CPU-based analytical workloads such as those found in decision support business intelligence applications. A hardware expert would view claims with suspicion. Given the general notion operators are memory-bandwidth bound, one expect maximum gain to be roughly equal ratio memory bandwidth GPU CPU. In this paper, we adopt a...

10.1145/3318464.3380595 article EN 2020-05-29

Deterministic databases are able to efficiently run transactions across different replicas without coordination. However, existing state-of-the-art deterministic require that transaction read/write sets known before execution, making such systems impractical in many OLTP applications. In this paper, we present Aria, a new distributed and database does not have limitation. The key idea behind Aria is it first executes batch of against the same snapshot an execution phase , then...

10.14778/3407790.3407808 article EN Proceedings of the VLDB Endowment 2020-08-01

Placing the DRAM in same package as a processor enables several times higher memory bandwidth than conventional off-package DRAM. Yet, latency of in-package is not appreciably lower that A promising use large cache. Unfortunately, most previous cache designs optimize mainly for hit and do consider efficiency first-class design constraint. Hence, we show this paper, these are suboptimal with

10.1145/3123939.3124555 article EN 2017-10-04

Highly available database systems rely on data replication to tolerate machine failures. Both classes of existing algorithms, active-passive and active-active, were designed in a time when network was the dominant performance bottleneck. In essence, these techniques aim minimize communication between replicas at cost incurring more processing redundancy; trade-off that suitably fitted conventional wisdom distributed design. However, emergence next-generation networks with high throughput low...

10.14778/3342263.3342639 article EN Proceedings of the VLDB Endowment 2019-07-01

This paper revisits the longstanding challenge of coordinating database systems with general-purpose OS interfaces, such as POSIX, which often lack tailored support for DB requirements. Existing approaches to this DB-OS co-design struggle limited design space, security risks, and compatibility issues. To overcome these hurdles, we propose a new approach leveraging virtualization elevate privilege level processes. Our method enables fully exploit hardware capabilities via virtualization,...

10.1145/3709714 article EN cc-by-nd Proceedings of the ACM on Management of Data 2025-02-10

Hierarchical ring networks, which hierarchically connect multiple levels of rings, have been proposed in the past to improve scalability interconnects, but hierarchical designs sacrifice some key benefits rings by reintroducing more complex in-ring buffering and buffered flow control. Our goal this paper is design a new interconnect that can maintain most simplicity traditional (i.e., no or control) while achieving high as designs. To end, we revisit concept hierarchical-ring networkon-chip....

10.1109/sbac-pad.2014.31 article EN 2014-10-01

Oblivious RAM (ORAM) is an established technique to hide the access pattern untrusted storage system. With ORAM, a curious adversary cannot tell what address user accessing when observing bits moving between and All existing ORAM schemes achieve obliviousness by adding redundancy system, i.e., each turned into multiple random accesses. Such incurs large performance overhead.

10.1145/2749469.2750413 article EN 2015-05-26

Many modern data-oriented applications are built on top of distributed OLTP databases for both scalability and high availability. Such enforce atomicity, durability, consistency through two-phase commit (2PC) synchronous replication at the granularity every single transaction. In this paper, we present COCO, a new database that supports epoch-based replication. The key idea behind COCO is it separates transactions into epochs treats whole epoch as unit. way, overhead 2PC significantly...

10.14778/3446095.3446098 article EN Proceedings of the VLDB Endowment 2021-01-01

Modern cloud databases adopt a storage-disaggregation architecture that separates the management of computation and storage. A major bottleneck in such an is network connecting storage layers. Two solutions have been explored to mitigate bottleneck: caching pushdown. While both techniques can significantly reduce traffic, existing DBMSs consider them as orthogonal support only one or other, leaving potential performance benefits unexploited. In this paper we present FlexPushdownDB (FPDB) ,...

10.14778/3476249.3476265 article EN Proceedings of the VLDB Endowment 2021-07-01

GPUs are increasingly used for high-performance and interactive data analytics workloads due to their capability accelerate computation using massive parallelism. A key constraint of GPU-based today is the limited memory capacity in GPU devices.

10.1145/3514221.3526132 article EN Proceedings of the 2022 International Conference on Management of Data 2022-06-10

Oblivious-RAMs (ORAM) are used to hide memory access patterns. Path ORAM has gained popularity due its efficiency and simplicity. In this paper, we propose an efficient integrity verification layer for ORAM, which only imposes 17% latency overhead. We also show that is vital maintaining privacy recursive ORAMs under active adversaries.

10.1109/hpec.2013.6670339 article EN 2013-09-01

Distributed transactions suffer from poor performance due to two major limiting factors. First, distributed high latency because each of their accesses remote data incurs a long network delay. Second, this increases the likelihood contention among transactions, leading abort rates and low performance. We present Sundial , an in-memory optimistic concurrency control protocol that addresses these limitations. reduce transaction rate, dynamically determines logical order at runtime, based on...

10.14778/3231751.3231763 article EN Proceedings of the VLDB Endowment 2018-06-01

Hotspots, a small set of tuples frequently read/written by large number transactions, cause contention in concurrency control protocol. While hotspot may comprise only fraction transaction's execution time, conventional strict two-phase locking allows transaction to release lock after the completes, which leaves significant parallelism unexploited. Ideally, protocol serializes transactions for duration hotspots, rather than transactions.

10.1145/3448016.3457294 article EN Proceedings of the 2022 International Conference on Management of Data 2021-06-09

There has been a growing interest in using GPU to accelerate data analytics due its massive parallelism and high memory bandwidth. The main constraint of for is the limited capacity memory. Heterogeneous CPU-GPU query execution compelling approach mitigate PCIe However, design space heterogeneous not fully explored. We aim improve state-of-the-art engine by optimizing placement execution. First, we introduce semantic-aware fine-grained caching policy which takes into account various aspects...

10.14778/3551793.3551809 article EN Proceedings of the VLDB Endowment 2022-07-01

We present pessimistic locking and optimistic reading (PLOR), a hybrid concurrency control protocol for in-memory transaction systems that delivers high throughput low tail latency. PLOR is especially designed high-contention workloads: throughput, transactions are allowed to access records without being blocked by lock conflicts in the read phase; latency, conflict detection delayed commit phase, where old always committed first using timestamps lock. demonstrate efficacy of this approach...

10.1145/3514221.3517879 article EN Proceedings of the 2022 International Conference on Management of Data 2022-06-10
Coming Soon ...