Timothy M. Jones

ORCID: 0000-0002-4114-7661
Research Areas
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Interconnection Networks and Systems
  • Radiation Effects in Electronics
  • Embedded Systems Design Techniques
  • Distributed and Parallel Computing Systems
  • Security and Verification in Computing
  • Low-power high-performance VLSI design
  • Cloud Computing and Resource Management
  • Logic, programming, and type systems
  • Distributed systems and fault tolerance
  • Model-Driven Software Engineering Techniques
  • Formal Methods in Verification
  • Advanced Malware Detection Techniques
  • Semiconductor materials and devices
  • Software Engineering Research
  • Optical Network Technologies
  • Advanced Software Engineering Methodologies
  • Risk and Safety Analysis
  • Scientific Computing and Data Management
  • Photonic and Optical Devices
  • Superconducting Materials and Applications
  • Business Process Modeling and Analysis
  • Advanced Memory and Neural Computing
  • Radioactive element chemistry and processing

Affiliations

University of Cambridge
2016-2025

Universidad de Murcia
2025

Barking, Havering And Redbridge University Hospitals NHS Trust
2024

Victoria University of Wellington
2013-2017

Harvard University Press
2014

Louisiana State University
2013-2014

Louisiana State University Agricultural Center
2014

University of Edinburgh
2005-2012

University of Siena
2011

Data61
2010

Publications

The microarchitectural design space of a new processor is too large for an architect to evaluate in its entirety. Even with the use of statistical simulation, evaluation of a single configuration can take excessive time due to the need to run a set of benchmarks with realistic workloads. This paper proposes a novel machine learning model that can quickly and accurately predict the performance and energy consumption of any program on any configuration. Our architecture-centric approach uses prior knowledge from off-line training and applies it...

10.1109/micro.2007.12 article EN 2007-01-01

Building an optimising compiler is a difficult and time-consuming task which must be repeated for each generation of a microprocessor. As the underlying microarchitecture changes from one generation to the next, the compiler must be retuned to optimise specifically for that new system. It may take several releases to effectively exploit the processor's performance potential, by which time a new microarchitecture has appeared and the process starts again.

10.1145/1669112.1669124 article EN 2009-12-12

We describe and evaluate HELIX, a new technique for automatic loop parallelization that assigns successive iterations of a loop to separate threads. We show that the inter-thread communication costs forced by loop-carried data dependences can be mitigated by code optimization, by using an effective heuristic for selecting loops to parallelize, and by using helper threads to prefetch synchronization signals. We have implemented HELIX as part of an optimizing compiler framework that automatically selects and parallelizes loops from general sequential programs....

10.1145/2259016.2259028 article EN 2012-03-31

The disclosure of the Spectre speculative-execution attacks in January 2018 has left a severe vulnerability that systems are still struggling with how to patch. The solutions that currently exist tend to have incomplete coverage, perform badly, or have highly undesirable edge cases that cause application domains to break. MuonTrap allows processors to continue to speculate, avoiding significant reductions in performance, without impacting security. We instead prevent the propagation of any state based on speculative execution, by...

10.1109/isca45697.2020.00022 preprint EN 2020-05-01

Searches on large graphs are heavily memory latency bound, as a result of many high-latency DRAM accesses. Due to the highly irregular nature of the access patterns involved, caches and prefetchers, both hardware and software, perform poorly on graph workloads. This leads to the CPU stalling for the majority of the time. However, in many cases the data access pattern is well defined and predictable in advance, falling into a small set of simple patterns. Although existing implicit prefetchers cannot bring significant benefit, a prefetcher armed with knowledge...

10.1145/2925426.2926254 article EN 2016-06-01

Adaptive microarchitectures are a promising solution for designing high-performance, power-efficient microprocessors. They offer the ability to tailor computational resources to the specific requirements of different programs or program phases. They have the potential to adapt the hardware cost-effectively at runtime to any application's needs. However, one of the key challenges is how to dynamically determine the best architecture configuration at any given time, for any new workload. This paper proposes a novel control mechanism based on...

10.1109/micro.2010.14 article EN 2010-12-01

10.1016/0017-9310(63)90101-7 article EN International Journal of Heat and Mass Transfer 1963-05-01

A lack of temporal safety in low-level languages has led to an epidemic of use-after-free exploits. These have surpassed in number and severity even the infamous buffer-overflow exploits violating spatial safety. Capability addressing can directly enforce spatial safety for the C language by enforcing bounds on pointers and rendering them unforgeable. Nevertheless, an efficient solution for strong temporal memory safety remains elusive.

10.1145/3352460.3358288 article EN 2019-10-11

Intelligently partitioning the last-level cache within a chip multiprocessor can bring significant performance improvements. Resources are given to the applications that can benefit most from them, restricting each core to a number of logical cache ways. However, although overall performance is increased, existing schemes fail to consider energy saving when making their decisions. This paper presents Cooperative Partitioning, a runtime partitioning scheme that reduces both dynamic and static energy while maintaining high performance. It works by...

10.1109/hpca.2012.6169036 article EN 2012-02-01

Use-after-free violations of temporal memory safety continue to plague software systems, underpinning many high-impact exploits. The CHERI capability system shows great promise in achieving C and C++ language spatial memory safety, preventing out-of-bounds accesses. Enforcing language-level temporal safety on CHERI requires capability revocation, traditionally achieved either via table lookups (avoided for performance reasons in the CHERI design) or by identifying dangling capabilities in order to revoke them (similar to a garbage-collector sweep). CHERIvoke, a prior...

10.1109/sp40000.2020.00098 article EN 2020 IEEE Symposium on Security and Privacy (SP) 2020-05-01

Data dependences in sequential programs limit parallelization because extracted threads cannot run independently. Although thread-level speculation can avoid the need for precise dependence analysis, the communication overheads required to synchronize actual dependences counteract the benefits of parallelization. To address these challenges, we propose a lightweight architectural enhancement co-designed with a parallelizing compiler, which together decouple communication from thread execution. Simulations of these approaches, applied...

10.1145/2678373.2665705 article EN ACM SIGARCH Computer Architecture News 2014-06-14

Many modern workloads compute on large amounts of data, often with irregular memory accesses. Current architectures perform poorly for these workloads, as existing prefetching techniques cannot capture the memory access patterns; the applications end up heavily memory-bound as a result. Although a number of techniques exist to explicitly configure a prefetcher with traversal patterns, gaining significant speedups, they do not generalise beyond their target data structures. Instead, we propose an event-triggered programmable...

10.1145/3173162.3173189 article EN 2018-03-19

Use-after-free vulnerabilities have plagued software written in low-level languages, such as C and C++, becoming one of the most frequent classes of exploited bugs. Attackers identify code paths where data is manually freed by the programmer, but later incorrectly reused, and take advantage by reallocating the data to themselves. They then alter the data behind the program's back, using the erroneous reuse to gain control of the application and, potentially, the system. While a variety of techniques have been developed to deal with these vulnerabilities,...

10.1109/sp40000.2020.00058 article EN 2020 IEEE Symposium on Security and Privacy (SP) 2020-05-01

The demand for low-power embedded systems requires designers to tune processor parameters to avoid excessive energy wastage. Tuning on a per-application or per-application-phase basis allows a greater saving in energy consumption without a noticeable degradation in performance. On-chip caches often consume a significant fraction of the total energy budget and are therefore prime candidates for adaptation. Fixed-configuration caches must be designed to deliver low average memory access times across a wide range of potential...

10.1109/samos.2011.6045443 article EN International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation 2011-07-01

SIMD vectors are widely adopted in modern general purpose processors as they can boost performance and energy efficiency for certain applications. Compiler-based automatic vectorization is one approach to generating code that makes efficient use of the vector units, and has the benefit of avoiding hand development of platform-specific optimizations. The Superword-Level Parallelism (SLP) algorithm is the most well-known implementation of automatic vectorization when starting from straight-line scalar code, and is implemented in several major compilers....

10.1109/pact.2015.32 article EN 2015-10-01

Many modern data processing and HPC workloads are heavily memory-latency bound. A tempting proposition to solve this is software prefetching, where special non-blocking loads are used to bring data into the cache hierarchy just before being required. However, these are difficult to insert so as to effectively improve performance, and techniques for automatic insertion are currently limited. This paper develops a novel compiler pass to automatically generate software prefetches for indirect memory accesses, a class of irregular memory accesses often...

10.1109/cgo.2017.7863749 article EN 2017-02-01

The need to increase performance and power efficiency in modern processors has led to a wide adoption of SIMD vector units. All major vendors support vector instructions and the trend is pushing them to become wider and more powerful. However, writing code that makes efficient use of these units is hard and leads to platform-specific implementations. Compiler-based automatic vectorization is one solution for this problem. In particular the Superword-Level Parallelism (SLP) algorithm is the primary way to automatically generate vector code starting from...

10.5555/2738600.2738625 article EN Symposium on Code Generation and Optimization 2015-02-07

10.5555/3049832.3049865 article EN Symposium on Code Generation and Optimization 2017-02-04

10.1109/cgo.2015.7054199 article EN 2015-02-01

This paper presents a novel compiler directed technique to reduce the register pressure and power of the register file by releasing registers early. The compiler identifies registers that will only be read once and renames them to different logical registers. Upon issuing an instruction with one of these as a source, the processor knows there will be no more uses and can release the register through checkpointing. This reduces the occupancy of our banked register file, allowing banks to be turned off for energy savings. Our scheme is faster, simpler and requires less hardware than recently...

10.1109/pact.2005.14 article EN 2005-01-01