NFDI4DS | UHH-SEMS - Publication Details

Jesús Alastruey-Benedé

ORCID: 0000-0003-4164-5078

About

Contact & Profiles

A5073821260

Research Areas

Parallel Computing and Optimization Techniques
Interconnection Networks and Systems
Advanced Data Storage Technologies
Low-power high-performance VLSI design
Genomics and Phylogenetic Studies
Algorithms and Data Compression
Distributed and Parallel Computing Systems
Embedded Systems Design Techniques
Cloud Computing and Resource Management
Distributed systems and fault tolerance
Advanced Memory and Neural Computing
Enzyme Structure and Function
Protein Structure and Dynamics
Radiation Effects in Electronics
Caching and Content Delivery
CCD and CMOS Imaging Sensors
Evolutionary Algorithms and Applications
Chromosomal and Genetic Variations
Quantum Mechanics and Applications
Advanced Optimization Algorithms Research
Innovations in Educational Methods
RNA and protein synthesis mechanisms
DNA and Biological Computing
Educational Technology in Learning
Real-Time Systems Scheduling

Universidad de Zaragoza
2014-2025

Instituto de Investigación Sanitaria Aragón
2020

Hispanics in Philanthropy
2015

Berti: an Accurate Local-Delta Data Prefetcher

OPENALEX - Publications

Agustín Navarro-Torres Biswabandan Panda Jesús Alastruey-Benedé Pablo Ibáñez Víctor Viñals and 1 more

Data prefetching is a technique that plays a crucial role in modern high-performance processors by hiding long-latency memory accesses. Several state-of-the-art hardware prefetchers exploit the concept of deltas, defined as the difference between the cache line addresses of two demand accesses. Existing delta prefetchers, such as best offset prefetching (BOP) and multi-lookahead prefetching (MLOP), train and predict future accesses based on global deltas. We observed that the use of global deltas results in missed opportunities to anticipate memory accesses. In this paper,...

10.1109/micro56248.2022.00072 article EN 2022-10-01

A Complexity-Effective Local Delta Prefetcher

Agustín Navarro-Torres Biswabandan Panda Jesús Alastruey-Benedé Pablo Ibáñez Víctor Viñals-Yúfera and 1 more

10.1109/tc.2025.3533086 article EN cc-by IEEE Transactions on Computers 2025-01-01

Developing an AI IoT application with open software on a RISC-V SoC

Enrique Torres-Sanchez Jesús Alastruey-Benedé Enrique F. Torres Moreno

RISC-V is an emergent architecture that is gaining strength in low-power IoT applications. The stabilization of the architectural extensions and the start of commercialization of RISC-V based SoCs, like the Kendryte K210, raises the question of whether this open standard will facilitate the development of applications in specific markets or not. In this paper we evaluate the development environments, the toolchain, and the debugging processes related to the Sipeed MAIX Go board, as well as the standalone SDK and the MicroPython port for the K210. The training pipeline for the built-in convolutional...

10.1109/dcis51330.2020.9268645 article EN 2020-11-18

GenArchBench: A genomics benchmark suite for arm HPC processors

Lorién López‐Villellas Rubén Langarita Asaf Badouh Víctor Soria-Pardos Quim Aguado-Puig and 10 more

Arm usage has substantially grown in the High-Performance Computing (HPC) community. The Japanese supercomputer Fugaku, powered by Arm-based A64FX processors, held the top position on the Top500 list between June 2020 and June 2022, currently sitting in the fourth position. The recently released 7th generation of Amazon EC2 instances for compute-intensive workloads (C7g) is also powered by Arm-based Graviton3 processors. Projects like the European Mont-Blanc and the U.S. DOE/NNSA Astra are further examples of the irruption of Arm in HPC. In parallel, over the last...

10.1016/j.future.2024.03.050 article EN cc-by-nc Future Generation Computer Systems 2024-04-02

Concertina: Squeezing in Cache Content to Operate at Near-Threshold Voltage

Alexandra Ferrerón Darío Suárez Gracia Jesús Alastruey-Benedé Teresa Monreal Pablo Ibáñez

Scaling supply voltage to values near the threshold allows a dramatic decrease in the power consumption of processors; however, the lower the voltage, the higher the sensitivity to process variation, and, hence, the lower the reliability. Large SRAM structures, like the last-level cache (LLC), are extremely vulnerable to process variation because they are aggressively sized to satisfy high density requirements. In this paper, we propose Concertina, an LLC designed to enable reliable operation at low voltages with conventional SRAM cells. Based on...

10.1109/tc.2015.2479585 article EN IEEE Transactions on Computers 2015-09-18

Memory hierarchy characterization of SPEC CPU2006 and SPEC CPU2017 on the Intel Xeon Skylake-SP

Agustín Navarro-Torres Jesús Alastruey-Benedé Pablo Ibáñez Víctor Viñals

SPEC CPU is one of the most common benchmark suites used in computer architecture research. CPU2017 has recently been released to replace CPU2006. In this paper we present a detailed evaluation of the memory hierarchy performance for both the CPU2006 and CPU2017 single-threaded benchmarks. The experiments were executed on an Intel Xeon Skylake-SP, which is the first Intel processor to implement a mostly non-inclusive last-level cache (LLC). We present a classification of the benchmarks according to their memory pressure and analyze the impact of different LLC...

10.1371/journal.pone.0220135 article EN cc-by PLoS ONE 2019-08-01

Porting and Optimizing BWA-MEM2 Using the Fujitsu A64FX Processor

Rubén Langarita Adrià Armejach Pablo Ibáñez Jesús Alastruey-Benedé Miquel Moretó

Sequence alignment pipelines for human genomes are an emerging workload that will dominate in the precision medicine field. BWA-MEM2 is a tool widely used in the scientific community to perform read mapping studies. In this paper, we port BWA-MEM2 to the AArch64 architecture using the ARMv8-A specification, and we compare the resulting version against an Intel Skylake system both in performance and in energy-to-solution. The porting effort entails numerous code modifications, since BWA-MEM2 implements certain kernels using x86_64 specific intrinsics,...

10.1109/tcbb.2023.3264514 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023-04-05

Accelerating Sequence Alignments Based on FM-Index Using the Intel KNL Processor

Jose M. Herruzo Sonia González-Navarro Pablo Ibáñez Víctor Viñals Jesús Alastruey-Benedé and 1 more

FM-index is a compact data structure suitable for fast matches of short reads to large reference genomes. The matching algorithm using this index exhibits irregular memory access patterns that cause frequent cache misses, resulting in a memory-bound problem. This paper analyzes different FM-index versions presented in the literature, focusing on those computing aspects related to data access. As a result of the analysis, we propose a new organization of the FM-index that minimizes the demand for memory bandwidth, allowing a great improvement of performance on processors...

10.1109/tcbb.2018.2884701 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018-12-07

Expanding the Limits of Computer-Assisted Sperm Analysis through the Development of Open Software

J.L. Yániz Carlos Alquézar-Baeta Jorge Yagüe-Martínez Jesús Alastruey-Benedé I. Palacín and 4 more

Computer assisted sperm analysis (CASA) systems can reduce errors occurring in manual analysis. However, commercial CASA systems are frequently not applicable at the forefront of challenging research endeavors. The development of open source CASA software may offer important solutions for researchers working in related areas. Here, we present an example of this, with three new modules for OpenCASA (hosted on GitHub). The first is the Chemotactic Sperm Accumulation Module, a powerful tool for studying chemotactic behavior, analyzing...

10.3390/biology9080207 article EN cc-by Biology 2020-08-05

Selection of the Register File Size and the Resource Allocation Policy on SMT Processors

Jesús Alastruey-Benedé Teresa Monreal Francisco J. Cazorla Víctor Viñals Mateo Valero

The performance impact of the Physical Register File (PRF) size on Simultaneous Multithreading processors has not been extensively studied, in spite of being a critical shared resource. In this paper we analyze the effect of the PRF size for a broad set of resource allocation policies (Icount, Stall, Flush, Flush++, Static, Dcra and Hill-climbing) and evaluate them under two metrics: instructions per second (IPS) for throughput and harmonic mean of weighted IPCs (Hmean-wIPC) for fairness. We have found that policy should be considered...

10.1109/sbac-pad.2008.17 article EN 2008-10-01

Compressed Sparse FM-Index: Fast Sequence Alignment Using Large K-Steps

Rubén Langarita Adrià Armejach Javier Setoaín Pablo Ibáñez Jesús Alastruey-Benedé and 1 more

The FM-index is a data structure used in genomics for exact search of input sequences over large reference genomes. Algorithms based on the FM-index show an irregular memory access pattern, resulting in a memory-bound problem. We analyze a recent FM-index implementation and highlight the existing throughput-memory trade-offs, showing that memory requirements limit k-steps. We propose COFI, a COmpressed FM-Index with K-steps. COFI enables a 15-step FM-index using less than 16 GB for a human genome of 3 giga base pairs. An algorithm for this new layout is evaluated both...

10.1109/tcbb.2020.3000253 article EN IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020-06-05

Accurate and efficient constrained molecular dynamics of polymers using Newton's method and special purpose code

Lorién López‐Villellas Carl Christian Kjelgaard Mikkelsen Juan José Galano‐Frutos Santiago Marco‐Sola Jesús Alastruey-Benedé and 4 more

In molecular dynamics simulations we can often increase the time step by imposing constraints on bond lengths and angles. This allows us to extend the length of the time interval and therefore the range of physical phenomena that we can afford to simulate. We examine the existing algorithms and software for solving nonlinear constraint equations in parallel and we explain why it is necessary to advance the state-of-the-art. We present ILVES-PC, a new algorithm for imposing constraints on proteins accurately and efficiently. It solves the same system of differential algebraic equations as...

10.1016/j.cpc.2023.108742 article EN cc-by-nc-nd Computer Physics Communications 2023-03-29

Microarchitectural Support for Speculative Register Renaming

Jesús Alastruey-Benedé Teresa Monreal Víctor Viñals Mateo Valero

This paper proposes and evaluates a new microarchitecture for out-of-order processors that supports speculative renaming. We call speculative renaming the omission of physical register allocation along with the early release of physical registers. These policies may cause an operand not to be kept in the physical register file (PRF). Thus, we add a low-ported auxiliary register file (XRF), located outside the processor core, that keeps the values absent from the PRF and supplies them at a higher latency. To support the location of operands being either the PRF or the XRF, we use virtual registers. We consider speculative renaming directed by...

10.1109/ipdps.2007.370237 article EN 2007-01-01

Block Disabling Characterization and Improvements in CMPs Operating at Ultra-low Voltages

Alexandra Ferrerón Darío Suárez Gracia Jesús Alastruey-Benedé Teresa Monreal Víctor Viñals

Power density has become the limiting factor in technology scaling, as the power budget restricts the amount of hardware that can be active at the same time. Reducing the supply voltage to ultra-low ranges close to the threshold region promises great energy savings. However, the potential savings are limited by the correct operation of SRAM cells, which is not guaranteed below Vddmin, the minimum voltage at which cache structures can operate reliably. Understanding the effects of operating below Vddmin requires complex modelling, so we introduce an updated...

10.1109/sbac-pad.2014.12 article EN 2014-10-01

A fault-tolerant last level cache for CMPs operating at ultra-low voltage

Alexandra Ferrerón Jesús Alastruey-Benedé Darío Suárez Gracia Teresa Monreal Pablo Ibáñez and 1 more

10.1016/j.jpdc.2018.10.010 article EN Journal of Parallel and Distributed Computing 2018-11-07

BALANCER: bandwidth allocation and cache partitioning for multicore processors

Agustín Navarro-Torres Jesús Alastruey-Benedé Pablo Ibáñez Víctor Viñals

The management of shared resources in multicore processors is an open problem due to the continuous evolution of these systems. The trend toward increasing the number of cores and organizing them in clusters sets out new challenges not considered in previous works. In this paper, we characterize the use of the shared cache and memory bandwidth in an AMD Rome processor executing multiprogrammed workloads and propose several mechanisms that control the use of these resources to improve system performance and fairness. Our mechanisms require no hardware or operating system modifications....

10.1007/s11227-023-05070-0 article EN cc-by The Journal of Supercomputing 2023-02-04

GenArchBench: A Genomics Benchmark Suite for Arm HPC Processors

Lorién López‐Villellas Rubén Langarita Asaf Badouh Víctor Soria-Pardos Quim Aguado-Puig and 10 more

Arm® usage has substantially grown in the High-Performance Computing (HPC) community. The Japanese supercomputer Fugaku, powered by Arm®-based A64FX processors, held the top position on the Top500 list between June 2020 and June 2022, currently sitting in the second position. The recently released 7th generation of Amazon EC2 instances for compute-intensive workloads (C7g) is also powered by Arm®-based Graviton3 processors. Projects like the European Mont-Blanc and the U.S. DOE/NNSA Astra are further examples of the irruption of Arm® in HPC. In parallel, over the last...

10.2139/ssrn.4632220 preprint EN 2023-01-01

Speculative early register release

Jesús Alastruey-Benedé Teresa Monreal Víctor Viñals Mateo Valero

The late release policy of conventional renaming keeps many registers in the register file assigned in spite of containing values that will never be read in the future. In this work, we study the potential of a novel scheme that speculatively releases a physical register as soon as it has been read by the predicted last instruction that references its value. An auxiliary register file placed outside the critical paths of the processor pipeline holds the early released values just in case they are unexpectedly referenced by some instruction. In addition to demonstrating the feasibility of last-use...

10.1145/1128022.1128061 article EN 2006-05-03

Software Demand, Hardware Supply

Jesús Alastruey-Benedé José Luis Briz Pablo Ibáñez Víctor Viñals

Do the demands of new software outpace developments in hardware? Experiments with the behavior of SPEC CPU in on-chip caches and data collection from a wide range of processors over time address this question and illuminate trends in hardware evolution.

10.1109/mm.2006.80 article EN IEEE Micro 2006-07-01

AISC: Approximate Instruction Set Computer

Alexandra Ferrerón Jesús Alastruey-Benedé Darío Suárez Gracia Ulya R. Karpuzcu

This paper makes the case for a single-ISA heterogeneous computing platform, AISC, where each compute engine (be it a core or an accelerator) supports a different subset of the very same ISA. An ISA subset may not be functionally complete, but the union of the (per compute engine) subsets renders a platform-wide single ISA. Tailoring the microarchitecture to that subset can easily reduce hardware complexity. At the same time, energy efficiency can improve by exploiting algorithmic noise tolerance: by mapping code sequences that can tolerate (any potential inaccuracy...

10.48550/arxiv.1803.06955 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Exposing Abstraction-Level Interactions with a Parallel Ray Tracer

Alejandro Valero Darío Suárez Gracia Rubén Gran Tejero Luis M. Ramos Agustín Navarro-Torres and 12 more

For students of any Computer Engineering program, attaining an integrated vision of the different abstraction levels is paramount to fully understand and exploit a computer system, especially when tough topics such as parallelism, concurrency, consistency, or atomicity are involved at the hardware-software frontiers. However, the structure of typical engineering programs leads to the creation of self-contained courses, where a single abstraction level is studied and the overall picture is lost.

10.1145/3338698.3338886 article EN 2019-06-22
