NFDI4DS | UHH-SEMS - Publication Details

Enzian: an open, general, CPU/FPGA platform for systems software research

OPENALEX - Publications

David Cock Abishek Ramdas Daniel Schwyn Michael Giardino Adam Turowski and 8 more

Hybrid computing platforms, comprising CPU cores and FPGA logic, are increasingly used for accelerating data-intensive workloads in cloud deployments, and are a growing topic of interest in systems research. However, from a research perspective, existing hardware platforms are limited: they are often optimized for concrete, narrow use-cases and, therefore, lack the flexibility needed to explore other applications and configurations.

10.1145/3503222.3507742 article EN 2022-02-22

Mitosis: Transparently Self-Replicating Page-Tables for Large-Memory Machines

Reto Achermann Ashish Panwar Abhishek Bhattacharjee Timothy Roscoe Jayneel Gandhi

Multi-socket machines with 1-100 TBs of physical memory are becoming prevalent. Applications running on such multi-socket machines suffer non-uniform bandwidth and latency when accessing physical memory. Decades of research have focused on data allocation and placement policies in NUMA settings, but there have been no studies on the question of how to place page-tables amongst sockets. We make the case for explicit page-table allocation policies and show that page-table placement is crucial to overall performance. We propose Mitosis to mitigate NUMA effects on page-table walks by transparently replicating...

10.1145/3373376.3378468 article EN 2020-03-09

Fast Sparse Decision Tree Optimization via Reference Ensembles

Hayden McTavish Chudi Zhong Reto Achermann Ilias Karimalis Jacques Chen and 2 more

Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960's, breakthroughs have been made on the problem only within the past few years, primarily on the problem of finding optimal sparse decision trees. However, current state-of-the-art algorithms often require impractical amounts of computation time and memory to find optimal or near-optimal trees for some real-world datasets, particularly...

10.1609/aaai.v36i9.21194 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

SpaceJMP

Izzat El Hajj Alexander Merritt Gerd Zellweger Dejan Milojičić Reto Achermann and 4 more

Memory-centric computing demands careful organization of the virtual address space, but traditional methods for doing so are inflexible and inefficient. If an application wishes to address larger physical memory than virtual address bits allow, if it wishes to maintain pointer-based data structures beyond process lifetimes, or if it wishes to share large amounts of memory across simultaneously executing processes, legacy interfaces for managing the address space are cumbersome and often incur excessive overheads. We propose a new operating system design that promotes virtual address spaces...

10.1145/2872362.2872366 article EN 2016-03-25

Velosiraptor: Code Synthesis for Memory Translation

Reto Achermann Eric Chu Ryan Mehri Ilias Karimalis Margo Seltzer

10.1145/3676641.3711998 article EN 2025-03-27

Fast local page-tables for virtualized NUMA servers with vMitosis

Ashish Panwar Reto Achermann Arkaprava Basu Abhishek Bhattacharjee K. Gopinath and 1 more

Increasing heterogeneity in the memory system mandates careful data placement to hide the non-uniform memory access (NUMA) effects on applications. However, NUMA optimizations have predominantly focused on application data in the past decades, largely ignoring the placement of kernel data structures due to their small memory footprint; this is evident in typical OS designs that pin kernel objects in memory. In this paper, we show that careful placement of kernel data structures is gaining importance in the context of page-tables: sub-optimal placement of page-tables causes severe slowdown (up to 3.1x) on virtualized NUMA servers.

10.1145/3445814.3446709 article EN 2021-04-11

Verus: A Practical Foundation for Systems Verification

Andrea Lattuada Travis Hance Jay Bosamiya Matthias Brun Chanhee Cho and 9 more

Formal verification is a promising approach to eliminate bugs at compile time, before they ship. Indeed, our community has verified a wide variety of system software. However, much of this success has required heroic developer effort, relied on bespoke logics for individual domains, or sacrificed expressiveness for powerful proof automation.

10.1145/3694715.3695952 article EN cc-by 2024-11-04

Machine-aware atomic broadcast trees for multicores

Stefan Kaestle Reto Achermann Roni Haecki Moritz Hoffmann Sabela Ramos and 1 more

The performance of parallel programs on multicore machines often critically depends on group communication operations like barriers and reductions being highly tuned to hardware, a task requiring considerable developer skill. Smelt is a library that automatically builds efficient inter-core broadcast trees tuned to individual machines, using a machine model derived from hardware registers plus micro-benchmarks capturing the low-level machine characteristics missing from vendor specifications. Experiments on a wide variety of multicore machines show...

10.5555/3026877.3026881 article EN Operating Systems Design and Implementation 2016-11-02

Cache-coherent accelerators for persistent memory crash consistency

Ankit Bhardwaj Todd Thornley Vinita Pawar Reto Achermann Gerd Zellweger and 1 more

Building persistent memory (PM) data structures is difficult because crashes interrupt operations, leaving data structures in an inconsistent state. Solving this requires augmenting code that modifies PM state to ensure that interrupted operations can be completed or undone. Today, this is done using careful, hand-crafted code, a compiler pass, or page faults. We propose a new, easy way to transform a volatile data structure to work with PM that uses a cache-coherent accelerator to do this augmentation, and we show that it may outperform existing approaches...

10.1145/3538643.3539752 article EN 2022-06-23

Separating Translation from Protection in Address Spaces with Dynamic Remapping

Reto Achermann Chris Dalton Paolo Faraboschi Moritz Hoffmann Dejan Milojičić and 5 more

It is time to reconsider memory protection. The emergence of large non-volatile main memories, scalable interconnects, and rack-scale computers running large numbers of small "micro services" creates significant challenges for memory protection based solely on MMU mechanisms. Central to this is a tension between protection and translation: optimizing for translation performance often comes with a cost in protection flexibility.

10.1145/3102980.3103000 article EN 2017-05-07

Formalizing Memory Accesses and Interrupts

Reto Achermann Lukas Humbel David Cock Timothy Roscoe

The hardware/software boundary in modern heterogeneous multicore computers is increasingly complex, and diverse across different platforms. A single memory access by a core or DMA engine traverses multiple hardware translation and caching steps, and the destination memory cell or register often appears at different physical addresses for different cores. Interrupts pass through a complex topology of interrupt controllers and remappers before delivery to one or more cores, each with specific constraints on their configurations. System...

10.4204/eptcs.244.4 article EN cc-by-nc-nd arXiv (Cornell University) 2017-03-15

Beyond isolation: OS verification as a foundation for correct applications

Matthias Brun Reto Achermann Tej Chajed Jon Howell Gerd Zellweger and 1 more

Verified systems software has generally had to assume the correctness of the operating system and its provided services (like networking and the file system). Even though there exist verified operating systems and file systems, the specifications for these components do not compose with applications to produce a fully verified high-performance software stack.

10.1145/3593856.3595899 article EN 2023-06-22

mmapx

Reto Achermann David Cock Roni Haecki Nora Hossle Lukas Humbel and 2 more

Modern Systems-on-Chip (SoCs) are networks of heterogeneous cores, intelligent devices, and memory, connected through multiple configurable address translation and protection units like IOMMUs and System MMUs.

10.1145/3458336.3465273 article EN 2021-06-01

SpaceJMP

Izzat El Hajj Alexander Merritt Gerd Zellweger Dejan Milojičić Reto Achermann and 4 more

Memory-centric computing demands careful organization of the virtual address space, but traditional methods for doing so are inflexible and inefficient. If an application wishes to address larger physical memory than virtual address bits allow, if it wishes to maintain pointer-based data structures beyond process lifetimes, or if it wishes to share large amounts of memory across simultaneously executing processes, legacy interfaces for managing the address space are cumbersome and often incur excessive overheads. We propose a new operating system design that promotes virtual address spaces...

10.1145/2954679.2872366 article EN ACM SIGPLAN Notices 2016-03-25

Why write address translation OS code yourself when you can synthesize it?

Reto Achermann Ilias Karimalis Margo Seltzer

Address translation hardware is at the cornerstone of modern computer systems. It provides a wide range of security-relevant features and abstractions such as memory partitioning, address space isolation, and virtual memory. Hardware designers have developed different translation and protection schemes with varying means of configuration.

10.1145/3593856.3595895 article EN 2023-06-22

Synthesizing Device Drivers with Ghost Writer

Bingyao Wang Sepehr Noorafshan Reto Achermann Margo Seltzer

Device drivers are components that enable operating systems to interact with devices. Unfortunately, they are the main source of bugs in operating systems, because writing a driver is an intricate and error-prone process that requires extensive knowledge of devices and operating systems. Furthermore, supporting new devices and accommodating kernel revisions require significant development effort. To facilitate the development of device drivers, we present Ghost Writer, an end-to-end toolchain that allows developers to synthesize correct-by-construction device drivers from high-level...

10.1145/3623759.3624545 article EN cc-by-nc-sa 2023-10-14

SpaceJMP

Izzat El Hajj Alexander Merritt Gerd Zellweger Dejan Milojičić Reto Achermann and 4 more

Memory-centric computing demands careful organization of the virtual address space, but traditional methods for doing so are inflexible and inefficient. If an application wishes to address larger physical memory than virtual address bits allow, if it wishes to maintain pointer-based data structures beyond process lifetimes, or if it wishes to share large amounts of memory across simultaneously executing processes, legacy interfaces for managing the address space are cumbersome and often incur excessive overheads. We propose a new operating system design that promotes virtual address spaces...

10.1145/2980024.2872366 article EN ACM SIGARCH Computer Architecture News 2016-03-25

SpaceJMP

Izzat El Hajj Alexander Merritt Gerd Zellweger Dejan Milojičić Reto Achermann and 4 more

Memory-centric computing demands careful organization of the virtual address space, but traditional methods for doing so are inflexible and inefficient. If an application wishes to address larger physical memory than virtual address bits allow, if it wishes to maintain pointer-based data structures beyond process lifetimes, or if it wishes to share large amounts of memory across simultaneously executing processes, legacy interfaces for managing the address space are cumbersome and often incur excessive overheads. We propose a new operating system design that promotes virtual address spaces...

10.1145/2954680.2872366 article EN ACM SIGOPS Operating Systems Review 2016-03-25

Generating correct initial page tables from formal hardware descriptions

Reto Achermann David Cock Roni Haecki Nora Hossle Lukas Humbel and 2 more

Modern hardware platforms are increasingly complex and heterogeneous. System software uses a hodgepodge of different mechanisms and representations to express the memory topology of the target platform. Considerable maintenance effort is required to keep them in sync, while sharing is often impossible due to hard-coded values. Incorrect platform-specific values in the initialization sequence can lead to security-critical and hard-to-find bugs because of misconfigured translation hardware, inaccessible devices, or the use of bad pointers.

10.1145/3477113.3487270 article EN 2021-10-11

Memory-Side Protection With a Capability Enforcement Co-Processor

Leonid Azriel Lukas Humbel Reto Achermann Alexander Richardson Moritz Hoffmann and 5 more

Byte-addressable nonvolatile memory (NVM) blends the concepts of storage and memory and can radically improve data-centric applications, from in-memory databases to graph processing. By enabling large-capacity memory devices to be shared across multiple computing elements, fabric-attached NVM changes the nature of rack-scale systems and enables short-latency direct memory access while retaining data persistence properties and simplifying the software stack. An adequate protection scheme is paramount when addressing persistent memory, but...

10.1145/3302257 article EN ACM Transactions on Architecture and Code Optimization 2019-03-08