- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Cloud Computing and Resource Management
- Caching and Content Delivery
- Security and Verification in Computing
- Distributed and Parallel Computing Systems
- Peer-to-Peer Network Technologies
- Distributed Systems and Fault Tolerance
- Radiation Effects in Electronics
- Interconnection Networks and Systems
- IoT and Edge/Fog Computing
- Algorithms and Data Compression
- Software System Performance and Reliability
- Software-Defined Networks and 5G
- Advanced Optical Network Technologies
- Ferroelectric and Negative Capacitance Devices
- Advanced Neural Network Applications
- Advanced Memory and Neural Computing
- Advanced Malware Detection Techniques
- Video Analysis and Summarization
- Multimedia Communication and Technology
- Insect Symbiosis and Bacterial Influences
- Low-Power High-Performance VLSI Design
- Green IT and Sustainability
- Personal Information Management and User Behavior
Intel (United States)
2012-2024
Carnegie Mellon University
2010-2023
IBM Research - Thomas J. Watson Research Center
2020-2021
University of Bologna
2020
Fondazione Bruno Kessler
2020
Hasso Plattner Institute
2020
University of Potsdam
2020
Texas Tech University
2020
KTH Royal Institute of Technology
2020
Intel (United Kingdom)
2003-2017
To better understand the challenges in developing effective cloud-based resource schedulers, we analyze the first publicly available trace data from a sizable multi-purpose cluster. The most notable workload characteristic is heterogeneity: in resource types (e.g., cores:RAM per machine) and their usage (e.g., duration and resources needed). Such heterogeneity reduces the effectiveness of traditional slot- and core-based scheduling. Furthermore, some tasks are constrained as to the kind of machine they can use, increasing the complexity...
Data-intensive applications that operate on large volumes of data have motivated a fresh look at the design of data center networks. The first wave of proposals focused on designing pure packet-switched networks that provide full bisection bandwidth. However, these proposals significantly increase network complexity in terms of the number of links and switches required and the restricted rules to wire them up. On the other hand, optical circuit switching technology holds a very large bandwidth advantage over packet switching technology. This fact motivates us...
Many important applications trigger bulk bitwise operations, i.e., bitwise operations on large bit vectors. In fact, recent works design techniques that exploit fast bulk bitwise operations to accelerate databases (bitmap indices, BitWeaving) and web search (BitFunnel). Unfortunately, in existing architectures, the throughput of bulk bitwise operations is limited by the memory bandwidth available to the processing unit (e.g., CPU, GPU, FPGA, or processing-in-memory).
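As a minimal sketch of why such operations matter, consider a bitmap index answering a conjunctive query with a single bulk AND over large bit vectors; the names and data below are illustrative, not from the paper:

```python
# Sketch: a bitmap index answers "rows matching all predicates" with one
# bulk bitwise AND. Python ints stand in for long bit vectors.

def bitmap_and(*bitmaps: int) -> int:
    """AND together per-predicate bitmaps; bit i set => row i matches all."""
    result = ~0
    for bm in bitmaps:
        result &= bm
    return result

# Example: rows matching (age_30_39 AND city_pgh) over 8 rows.
age_30_39 = 0b10110100
city_pgh  = 0b11010101
matches = bitmap_and(age_30_39, city_pgh)
print(bin(matches & 0xFF))  # 0b10010100
```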
Cache compression is a promising technique to increase on-chip cache capacity and to decrease off-chip bandwidth usage. Unfortunately, directly applying well-known compression algorithms (usually implemented in software) leads to high hardware complexity and unacceptable decompression/compression latencies, which in turn can negatively affect performance. Hence, there is a need for a simple yet efficient compression technique that effectively compresses common in-cache data patterns and has minimal effect on cache access latency.
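One simple pattern-based idea of the kind the abstract calls for is base-plus-delta encoding: store one base value and small per-word deltas. This is a hedged illustration of the general approach, not the paper's exact algorithm:

```python
# Sketch: compress a cache line as (base, small deltas) when all words
# are numerically close, e.g., nearby pointers or array indices.

def try_base_delta(words, delta_bytes=1):
    """Return (base, deltas) if every word fits in a small signed delta,
    else None (the line stays uncompressed)."""
    base = words[0]
    limit = 1 << (8 * delta_bytes - 1)
    deltas = [w - base for w in words]
    if all(-limit <= d < limit for d in deltas):
        return base, deltas  # 8B base + len(words) * delta_bytes total
    return None

line = [0x7000_0000 + i * 4 for i in range(8)]  # pointer-like values
print(try_base_delta(line))  # compresses: 8B + 8x1B vs. 64B raw
```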
Several system-level operations trigger bulk data copy or initialization. Even though these operations do not require any computation, current systems transfer a large quantity of data back and forth on the memory channel to perform them. As a result, these operations consume high latency, bandwidth, and energy, degrading both system performance and energy efficiency.
Energy costs for data centers continue to rise, already exceeding $15 billion yearly. Sadly, much of this power is wasted. Servers are only busy 10-30% of the time on average, but they are often left on while idle, utilizing 60% or more of peak power in the idle state. We introduce a dynamic capacity management policy, AutoScale, that greatly reduces the number of servers needed in data centers driven by unpredictable, time-varying load, while meeting response time SLAs. AutoScale scales the data center capacity, adding or removing servers as needed. AutoScale has two key...
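A hedged sketch of one ingredient of such a policy: scale up immediately when load rises, but wait out a timeout before releasing an idle server, so setup costs are not paid for transient dips. The parameter names and values are illustrative assumptions, not the paper's:

```python
import time

class CapacityManager:
    def __init__(self, jobs_per_server=10, idle_timeout_s=120.0):
        self.jobs_per_server = jobs_per_server  # assumed per-server capacity
        self.idle_timeout_s = idle_timeout_s    # wait before powering down
        self.active = 1                         # servers currently on
        self.idle_since = None                  # first time we saw excess

    def update(self, current_load, now=None):
        now = time.monotonic() if now is None else now
        needed = max(1, -(-current_load // self.jobs_per_server))  # ceil div
        if needed > self.active:          # scale up right away
            self.active, self.idle_since = needed, None
        elif needed < self.active:        # scale down only after a wait
            if self.idle_since is None:
                self.idle_since = now
            elif now - self.idle_since >= self.idle_timeout_s:
                self.active -= 1          # release one server at a time
                self.idle_since = now
        else:
            self.idle_since = None
        return self.active
```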
Bitwise operations are an important component of modern-day programming, and are used in a variety of applications such as databases. In this work, we propose a new and simple mechanism to implement bulk bitwise AND and OR operations in DRAM, which is faster and more efficient than existing mechanisms. Our mechanism exploits existing DRAM operation to perform the AND/OR of two rows completely within DRAM. The key idea is to simultaneously connect three cells to a bitline before sense amplification. By controlling the value of one of the cells, the sense amplifier forces...
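The logic behind triple-row activation is that the bitline resolves to the majority of the three connected cell values, so fixing the third row to all zeros yields AND of the other two, and all ones yields OR. A pure-Python bit-vector model of that behavior, for illustration only:

```python
# Sketch: activating three DRAM rows at once makes each sense amplifier
# settle to the majority of the three cell values on its bitline.

def triple_row_activate(row_a: int, row_b: int, row_c: int) -> int:
    """Bitwise majority of three rows, as the bitlines would resolve."""
    return (row_a & row_b) | (row_b & row_c) | (row_a & row_c)

A, B = 0b1100, 0b1010
assert triple_row_activate(A, B, 0b0000) == A & B  # control row = 0 -> AND
assert triple_row_activate(A, B, 0b1111) == A | B  # control row = 1 -> OR
```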
We identify a new capability for mobile computing that mimics the opening and closing of a laptop, but avoids the physical transport of hardware. Through rapid and easy personalization and depersonalization of anonymous hardware, a user is able to suspend work at one machine and resume it at another. Our key insight is that this capability can be achieved by layering virtual machine technology on a distributed file system. We report an initial implementation and describe our plans for improving efficiency, portability, and security.
Power-proportional cluster-based storage is an important component of overall cloud computing infrastructure. With it, substantial subsets of nodes in the storage cluster can be turned off to save power during periods of low utilization. Rabbit is a distributed file system that arranges its data layout to provide ideal power-proportionality down to a very minimum number of powered-up nodes (enough to store a primary replica of available datasets). Rabbit also addresses the node failure rates of large-scale clusters with data layouts that minimize the number of nodes that must be powered up if...
In chip multiprocessors (CMPs), limiting the number of off-chip cache misses is crucial for good performance. Many multithreaded programs provide opportunities for constructive cache sharing, in which concurrently scheduled threads share a largely overlapping working set. In this paper, we compare the performance of two state-of-the-art schedulers proposed for fine-grained multithreaded programs: Parallel Depth First (PDF), which is specifically designed for constructive sharing, and Work Stealing (WS), which is a more traditional design. Our experimental results indicate...
TetriSched is a scheduler that works in tandem with a calendaring reservation system to continuously re-evaluate the immediate-term scheduling plan for all pending jobs (including those with reservations and best-effort jobs) on each scheduling cycle. TetriSched leverages information supplied by the reservation system about jobs' deadlines and estimated runtimes to plan ahead in deciding whether to wait for a busy preferred resource type (e.g., a machine with a GPU) or fall back to less preferred placement options. Plan-ahead affords significant flexibility in handling mis-estimates of job...
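A hedged sketch of the wait-versus-fallback trade-off the abstract describes: given a deadline and an estimated runtime, compare finishing times under both options. The slowdown factor and field names are illustrative assumptions, not TetriSched's actual model:

```python
# Sketch: should a job wait for its preferred (e.g., GPU) machine, or
# start now on a slower fallback? Compare projected finish times.

def choose_placement(now, deadline, est_runtime_preferred,
                     preferred_free_at, fallback_slowdown=2.0):
    finish_wait = max(now, preferred_free_at) + est_runtime_preferred
    finish_fallback = now + est_runtime_preferred * fallback_slowdown
    if finish_wait <= deadline and finish_wait <= finish_fallback:
        return "wait_for_preferred"
    if finish_fallback <= deadline:
        return "fallback_now"
    return "best_effort"  # neither option meets the deadline

print(choose_placement(now=0, deadline=100,
                       est_runtime_preferred=30, preferred_free_at=20))
# -> "wait_for_preferred": waiting finishes at 50, fallback at 60
```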
Open Cirrus is a cloud computing testbed that, unlike existing alternatives, federates distributed data centers. It aims to spur innovation in systems and applications research and to catalyze the development of an open source service stack for the cloud.
Many data structures (e.g., matrices) are typically accessed with multiple access patterns. Depending on the layout of the data structure in the physical address space, some access patterns result in non-unit strides. In existing systems, which are optimized to store and access cache lines, such strided accesses exhibit low spatial locality. Therefore, they incur high latency and waste memory bandwidth and cache space.
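A small illustration of the locality problem: walking a column of a row-major matrix touches a new cache line per element, while a row walk reuses every word of each line. The sizes below are assumptions:

```python
# Sketch: distinct cache lines touched by unit-stride vs. strided access.

LINE_BYTES, WORD_BYTES = 64, 8
N = 512  # N x N matrix of 8-byte values, row-major

def lines_touched(stride_words, count):
    """Distinct cache lines touched by `count` accesses at a fixed stride."""
    return len({(i * stride_words * WORD_BYTES) // LINE_BYTES
                for i in range(count)})

print("row walk:   ", lines_touched(1, N))  # 64 lines, all 8 words used
print("column walk:", lines_touched(N, N))  # 512 lines, 1 useful word each
```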
Data compression is a promising approach for meeting the increasing memory capacity demands expected in future systems. Unfortunately, existing compression algorithms do not translate well when directly applied to main memory because they require the memory controller to perform non-trivial computation to locate a cache line within a compressed memory page, thereby increasing access latency and degrading system performance. Prior proposals for addressing this performance degradation problem are either costly or energy inefficient.
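One way to keep cache-line lookup trivial, sketched here as an illustration rather than the paper's design, is to give every line in a page the same compressed slot size, so the controller locates a line with one multiply instead of walking per-line metadata; the slot sizes and exception region below are assumptions:

```python
# Sketch: fixed-size compressed slots make line addresses linear in the
# line index; incompressible lines spill to an exception region.

PAGE_LINES, SLOT_BYTES = 64, 16  # 64 lines/page, 16B compressed slots

def compressed_line_addr(page_base, line_index, exceptions):
    """Linear address if the line compressed; else its exception slot."""
    if line_index in exceptions:  # incompressible line, stored raw (64B)
        return page_base + PAGE_LINES * SLOT_BYTES + exceptions[line_index] * 64
    return page_base + line_index * SLOT_BYTES  # one multiply, no walk

print(hex(compressed_line_addr(0x10000, 5, exceptions={9: 0})))  # 0x10050
```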
Off-chip main memory has long been a bottleneck for system performance. With increasing memory pressure due to multiple on-chip cores, effective cache utilization is important. In a system with limited cache space, we would ideally like to prevent 1) cache pollution, i.e., blocks with low reuse evicting blocks with high reuse from the cache, and 2) cache thrashing, i.e., blocks with high reuse evicting each other from the cache.
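A hedged sketch of one way to attack both problems: remember recently evicted addresses, insert a block that returns soon after eviction at high priority (it has proven reuse), and insert first-time blocks near the eviction end. This illustrates the general idea, not the paper's mechanism:

```python
# Sketch: reuse-aware insertion guided by a small set of recently
# evicted block addresses.

from collections import OrderedDict

class ReuseAwareCache:
    def __init__(self, capacity=4, track=8):
        self.capacity, self.track = capacity, track
        self.cache = OrderedDict()    # order: LRU first, MRU last
        self.evicted = OrderedDict()  # addresses evicted recently

    def access(self, addr):
        if addr in self.cache:
            self.cache.move_to_end(addr)          # hit: promote to MRU
            return True
        proven_reuse = self.evicted.pop(addr, None) is not None
        if len(self.cache) >= self.capacity:
            victim, _ = self.cache.popitem(last=False)
            self.evicted[victim] = True
            while len(self.evicted) > self.track:
                self.evicted.popitem(last=False)
        self.cache[addr] = True                   # insert at MRU...
        if not proven_reuse:
            self.cache.move_to_end(addr, last=False)  # ...or demote to LRU
        return False
```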
Instruction-grain program monitoring tools, which check and analyze executing programs at the granularity of individual instructions, are invaluable for quickly detecting bugs and security attacks and then limiting their damage (via containment and/or recovery). Unfortunately, their fine-grain nature implies very high overheads for software-only tools, which are typically based on dynamic binary instrumentation. Previous hardware proposals either focus on mechanisms that target specific monitoring tasks or address only part of the cost. In this paper, we...
The Internet Suspend/Resume model of mobile computing cuts the tight binding between PC state and PC hardware. By layering a virtual machine on distributed storage, ISR lets the VM encapsulate execution and user customization state; distributed storage then transports that state across space and time. This article explores the implications of ISR for an infrastructure-based approach to mobile computing. It reports experiences with three versions of ISR and describes work in progress toward the OpenISR version.
While sleep states have existed for mobile devices and workstations for some time, they have not been incorporated into most of the servers in today's data centers. High setup times make data center administrators fearful of any form of dynamic power management, whereby servers are suspended or shut down when load drops. This general reluctance has stalled research into whether there might be a feasible sleep state (with sufficiently low setup overhead and/or power) that would actually be beneficial. This paper investigates the regime in which sleep states are advantageous...
Meeting service level objectives (SLOs) for tail latency is an important and challenging open problem in cloud computing infrastructures. The challenges are exacerbated by the burstiness of the workloads. This paper describes PriorityMeister, a system that employs a combination of per-workload priorities and rate limits to provide tail latency QoS for shared networked storage, even with bursty workloads. PriorityMeister automatically and proactively configures workload priorities and rate limits across multiple stages (e.g., a storage stage followed by a network stage) to meet...
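A hedged sketch of the two knobs the abstract combines: a strict priority order between workloads, plus a per-workload token-bucket rate limit that bounds how much burst a high-priority workload can impose on the stages behind it. Rates, bucket sizes, and function names are illustrative:

```python
# Sketch: strict priority scheduling gated by per-workload token buckets.

class TokenBucket:
    def __init__(self, rate_per_s, burst):
        self.rate, self.burst = rate_per_s, burst
        self.tokens, self.last = burst, 0.0

    def allow(self, now, cost=1.0):
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

def schedule(pending, now):
    """Pick the highest-priority workload still within its rate limit.
    pending: list of (priority, workload_name, bucket); lower = higher."""
    for _prio, workload, bucket in sorted(pending, key=lambda p: p[0]):
        if bucket.allow(now):
            return workload
    return None  # everyone is over limit; idle this slot
```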
We introduce a set of new Compression-Aware Management Policies (CAMP) for on-chip caches that employ data compression. Our management policies are based on two key ideas. First, we show that it is possible to build a more efficient management policy for compressed caches if the compressed block size is directly used in calculating the value (importance) of a block to the cache. This leads to Minimal-Value Eviction (MVE), a policy that evicts the cache blocks with the least value, based on both their size and their expected future reuse. Second, we show that, in some cases, compressed block size can be used as an indicator of the reuse behavior of a block. We use...
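A minimal sketch of the first idea: score each compressed block by expected future reuse per byte it occupies and evict the lowest-scoring block. The reuse estimates below are stand-ins for whatever predictor supplies them:

```python
# Sketch: Minimal-Value Eviction style victim selection, where a block's
# value to the cache is its expected reuse divided by its compressed size.

def mve_victim(blocks):
    """blocks: list of (tag, compressed_size_bytes, expected_reuse).
    Returns the tag of the block with the least value to the cache."""
    def value(block):
        _tag, size, reuse = block
        return reuse / size  # large blocks must earn their keep
    return min(blocks, key=value)[0]

blocks = [("A", 8, 0.2),   # tiny, rarely reused: cheap to keep
          ("B", 64, 0.9),  # large but hot: worth its space
          ("C", 64, 0.3)]  # large and lukewarm: evicted first
print(mve_victim(blocks))  # -> "C"
```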