NFDI4DS | UHH-SEMS - Publication Details

Flipping bits in memory without accessing them

OPENALEX - Publications

Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, and 4 more

Memory isolation is a key property of a reliable and secure computing system: an access to one memory address should not have unintended side effects on data stored in other addresses. However, as DRAM process technology scales down to smaller dimensions, it becomes more difficult to prevent DRAM cells from electrically interacting with each other. In this paper, we expose the vulnerability of commodity DRAM chips to disturbance errors. By reading from the same address in DRAM, we show that it is possible to corrupt data in nearby addresses. More specifically,...

10.1145/2678373.2665726 article EN ACM SIGARCH Computer Architecture News 2014-06-14

Memory power management via dynamic voltage/frequency scaling

Howard David, Chris Fallin, Eugene Gorbatov, Ulf R. Hanebutte, Onur Mutlu

Energy efficiency and energy-proportional computing have become a central focus in enterprise server architecture. As thermal and electrical constraints limit system power, and datacenter operators become more conscious of energy costs, energy efficiency becomes important across the whole system. There are many proposals to scale energy at the datacenter and server level. However, one significant component of server power, the memory system, remains largely unaddressed. We propose memory dynamic voltage/frequency scaling (DVFS) to address this problem, and evaluate a simple algorithm in a real system...

10.1145/1998582.1998590 article EN 2011-06-14

Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors

Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, and 4 more

Memory isolation is a key property of a reliable and secure computing system: an access to one memory address should not have unintended side effects on data stored in other addresses. However, as DRAM process technology scales down to smaller dimensions, it becomes more difficult to prevent DRAM cells from electrically interacting with each other. In this paper, we expose the vulnerability of commodity DRAM chips to disturbance errors. By reading from the same address in DRAM, we show that it is possible to corrupt data in nearby addresses. More specifically,...

10.1109/isca.2014.6853210 article EN 2014-06-01

RowClone

Vivek Seshadri, Yoongu Kim, Chris Fallin, Donghyuk Lee, Rachata Ausavarungnirun, and 6 more

Several system-level operations trigger bulk data copy or initialization. Even though these bulk data operations do not require any computation, current systems transfer a large quantity of data back and forth on the memory channel to perform such operations. As a result, bulk data operations consume high latency, bandwidth, and energy, degrading both system performance and energy efficiency.

10.1145/2540708.2540725 article EN 2013-12-07

CHIPPER: A low-complexity bufferless deflection router

Chris Fallin, Chris Craik, Onur Mutlu

As Chip Multiprocessors (CMPs) scale to tens or hundreds of nodes, the interconnect becomes a significant factor in cost, energy consumption and performance. Recent work has explored many design tradeoffs for networks-on-chip (NoCs) with novel router architectures to reduce hardware cost. In particular, recent work proposes bufferless deflection routing to eliminate router buffers. The high cost of buffers makes this choice potentially appealing, especially for low-to-medium network loads. However, current bufferless designs...

10.1109/hpca.2011.5749724 article EN 2011-02-01

Parallel application memory scheduling

Eiman Ebrahimi, Rustam Miftakhutdinov, Chris Fallin, Chang Joo Lee, José A. Joao, and 2 more

A primary use of chip-multiprocessor (CMP) systems is to speed up a single application by exploiting thread-level parallelism. In such systems, threads may slow each other down by issuing memory requests that interfere in the shared memory subsystem. This inter-thread memory system interference can significantly degrade parallel application performance. Better memory request scheduling may mitigate such performance degradation. However, previously proposed memory scheduling algorithms for CMPs are designed for multi-programmed workloads where each core runs an...

10.1145/2155620.2155663 article EN 2011-12-03

MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect

Chris Fallin, Greg Nazario, Xiangyao Yu, Kevin K. Chang, Rachata Ausavarungnirun, and 1 more

A conventional Network-on-Chip (NoC) router uses input buffers to store in-flight packets. These buffers improve performance, but consume significant power. It is possible to bypass these buffers when they are empty, reducing dynamic power, but static buffer power, and dynamic power when buffers are utilized, remain. To improve energy efficiency, bufferless deflection routing removes input buffers, and instead uses deflection (misrouting) to resolve contention. However, at high network load, deflections cause unnecessary network hops, wasting power and reducing performance. In this work, we propose a new NoC...

10.1109/nocs.2012.8 article EN 2012-05-01

On-chip networks from a networking perspective

George Nychis, Chris Fallin, Thomas Moscibroda, Onur Mutlu, Srinivasan Seshan

In this paper, we present network-on-chip (NoC) design and contrast it to traditional network design, highlighting similarities and differences between the two. As an initial case study, we examine network congestion in bufferless NoCs. We show that congestion manifests itself differently in a NoC than in traditional networks. Network congestion reduces system throughput in congested workloads for smaller NoCs (16 and 64 nodes), and limits the scalability of larger bufferless NoCs (256 to 4096 nodes) even when traffic has locality (e.g., when an application's required data is mapped...

10.1145/2342356.2342436 article EN 2012-08-13

Next generation on-chip networks

George Nychis, Chris Fallin, Thomas Moscibroda, Onur Mutlu

In this paper, we present network-on-chip (NoC) design and contrast it to traditional network design, highlighting core differences between NoCs and traditional networks. As an initial case study, we examine network congestion in bufferless NoCs. We show that congestion manifests itself differently in a NoC than in a traditional network, and with application-level awareness in the network to make proper throttling decisions we improve system performance by up to 28%. It is our hope that the unique and interesting challenges of on-chip networks can be met by novel and effective solutions from...

10.1145/1868447.1868459 article EN 2010-10-20

HAT: Heterogeneous Adaptive Throttling for On-Chip Networks

Kevin K. Chang, Rachata Ausavarungnirun, Chris Fallin, Onur Mutlu

The network-on-chip (NoC) is a primary shared resource in a chip multiprocessor (CMP) system. As core counts continue to increase and applications become increasingly data-intensive, the network load will also increase, leading to more congestion in the network. This congestion can degrade system performance if not appropriately controlled. Prior works have proposed source-throttling congestion control, which limits the rate at which new traffic (packets) enters the NoC in order to reduce congestion and improve performance. These prior congestion control mechanisms...

10.1109/sbac-pad.2012.44 article EN 2012-10-01

Design and Evaluation of Hierarchical Rings with Deflection Routing

Rachata Ausavarungnirun, Chris Fallin, Xiangyao Yu, Kevin K. Chang, Greg Nazario, and 3 more

Hierarchical ring networks, which hierarchically connect multiple levels of rings, have been proposed in the past to improve the scalability of ring interconnects, but past hierarchical ring designs sacrifice some of the key benefits of rings by reintroducing more complex in-ring buffering and buffered flow control. Our goal in this paper is to design a new hierarchical ring interconnect that can maintain most of the simplicity of traditional ring designs (i.e., no in-ring buffering or buffered flow control) while achieving high scalability as more complex buffered hierarchical ring designs. To this end, we revisit the concept of a hierarchical-ring network-on-chip...

10.1109/sbac-pad.2014.31 article EN 2014-10-01

Going beyond the Limits of SFI: Flexible and Secure Hardware-Assisted In-Process Isolation with HFI

Shravan Narayan, Tal Garfinkel, Mohammadkazem Taram, Joey Rudek, Daniel Moghimi, and 7 more

We introduce Hardware-assisted Fault Isolation (HFI), a simple extension to existing processors to support secure, flexible, and efficient in-process isolation. HFI addresses the limitations of existing software-based isolation (SFI) systems including: runtime overheads, limited scalability, vulnerability to Spectre attacks, and limited compatibility with existing code. HFI can seamlessly integrate with current SFI systems (e.g., WebAssembly), or directly sandbox unmodified native binaries. To ease adoption, HFI relies only on incremental changes...

10.1145/3582016.3582023 article EN 2023-03-20

On-chip networks from a networking perspective

George Nychis, Chris Fallin, Thomas Moscibroda, Onur Mutlu, Srinivasan Seshan

In this paper, we present network-on-chip (NoC) design and contrast it to traditional network design, highlighting similarities and differences between the two. As an initial case study, we examine network congestion in bufferless NoCs. We show that congestion manifests itself differently in a NoC than in traditional networks. Network congestion reduces system throughput in congested workloads for smaller NoCs (16 and 64 nodes), and limits the scalability of larger bufferless NoCs (256 to 4096 nodes) even when traffic has locality (e.g., when an application's required data is mapped...

10.1145/2377677.2377757 article EN ACM SIGCOMM Computer Communication Review 2012-08-13

A High-Performance Hierarchical Ring On-Chip Interconnect with Low-Cost Routers

Chris Fallin, Xiangyao Yu, Gregory Nazario, Onur Mutlu

Energy consumption of routers in commonly used mesh-based on-chip networks for chip multiprocessors is an increasingly important concern: these routers consist of a crossbar and complex control logic and can require significant buffers, hence high energy and area consumption. In contrast, an alternative design uses ring-based networks to connect network nodes with small and simple routers. Rings have been used in recent commercial designs, and are well-suited for smaller core counts. However, rings do not scale as efficiently as meshes. In this...

10.1184/r1/6468200.v1 article EN 2011-09-06

The heterogeneous block architecture

Chris Fallin, Chris Wilkerson, Onur Mutlu

This paper makes two observations that lead to a new heterogeneous core design. First, we observe that most serial code exhibits fine-grained heterogeneity: at the scale of tens or hundreds of instructions, regions of code fit different microarchitectures better (at the same point or at different points in time). Second, we observe that by grouping contiguous regions of instructions into blocks that are executed atomically, a core can exploit this fine-grained heterogeneity: atomicity allows each block to be executed independently on its own execution backend that fits its characteristics best. Based on these...

10.1109/iccd.2014.6974710 article EN 2014-10-01

A case for hierarchical rings with deflection routing: An energy-efficient on-chip communication substrate

Rachata Ausavarungnirun, Chris Fallin, Xiangyao Yu, Kevin K. Chang, Greg Nazario, and 3 more

10.1016/j.parco.2016.01.009 article EN publisher-specific-oa Parallel Computing 2016-02-12

RowHammer: Reliability Analysis and Security Implications

Yoongu Kim, Ross Daly, Jeremie S. Kim, Chris Fallin, Jihye Lee, and 4 more

As process technology scales down to smaller dimensions, DRAM chips become more vulnerable to disturbance, a phenomenon in which different DRAM cells interfere with each other's operation. For the first time in academic literature, our ISCA paper exposes the existence of disturbance errors in commodity DRAM chips that are sold and used today. We show that repeatedly reading from the same address could corrupt data in nearby addresses. More specifically: When a DRAM row is opened (i.e., activated) and closed (i.e., precharged) repeatedly (i.e., hammered), it can...

10.48550/arxiv.1603.00747 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Lightweight, Modular Verification for WebAssembly-to-Native Instruction Selection

Alexa VanHattum, Monica Pardeshi, Chris Fallin, Adrian Sampson, Fraser Brown

Language-level guarantees, like module runtime isolation for WebAssembly (Wasm), are only as strong as the compiler that produces a final, native-machine-specific executable. The process of lowering language-level constructions to ISA-specific instructions can introduce subtle bugs that violate security guarantees. In this paper, we present Crocus, a system for lightweight, modular verification of instruction-lowering rules within Cranelift, a production retargetable Wasm native code generator. We use Crocus...

10.1145/3617232.3624862 article EN cc-by 2024-04-17

Going beyond the Limits of SFI: Flexible and Secure Hardware-Assisted In-Process Isolation with HFI

Shravan Narayan, Tal Garfinkel, Mohammadkazem Taram, Joey Rudek, Daniel Moghimi, and 7 more

10.1109/mm.2024.3422977 article EN IEEE Micro 2024-07-01

RowClone: Accelerating Data Movement and Initialization Using DRAM

Vivek Seshadri, Yoongu Kim, Chris Fallin, Donghyuk Lee, Rachata Ausavarungnirun, and 6 more

In existing systems, to perform any bulk data movement operation (copy or initialization), the data has to first be read into the on-chip processor, all the way into the L1 cache, and the result of the operation must be written back to main memory. This is despite the fact that these operations do not involve any actual computation. RowClone exploits the organization and operation of commodity DRAM to perform these bulk data movement operations completely inside the main memory using two mechanisms. The first mechanism, Fast Parallel Mode, copies data between two rows inside the same DRAM subarray by issuing back-to-back activate commands to the source and the destination...

10.48550/arxiv.1805.03502 preprint EN other-oa arXiv (Cornell University) 2018-01-01