Víctor Soria-Pardos

ORCID: 0000-0001-8337-6326
Research Areas
  • Parallel Computing and Optimization Techniques
  • Embedded Systems Design Techniques
  • Low-power high-performance VLSI design
  • Evolutionary Algorithms and Applications
  • Interconnection Networks and Systems
  • VLSI and FPGA Design Techniques
  • Advanced Data Storage Technologies
  • Distributed and Parallel Computing Systems
  • Distributed systems and fault tolerance
  • Tensor decomposition and applications
  • Genomics and Phylogenetic Studies
  • VLSI and Analog Circuit Testing
  • Advanced Memory and Neural Computing
  • RNA modifications and cancer

Barcelona Supercomputing Center
2020-2024

Universitat Politècnica de Catalunya
2020-2024

Arm usage has substantially grown in the High-Performance Computing (HPC) community. The Japanese supercomputer Fugaku, powered by Arm-based A64FX processors, held the top position on the Top500 list between June 2020 and 2022, currently sitting in fourth position. The recently released 7th generation of Amazon EC2 instances for compute-intensive workloads (C7g) is also powered by Graviton3 processors. Projects like the European Mont-Blanc and the U.S. DOE/NNSA Astra are further examples of the Arm irruption in HPC. In parallel, over the last...

10.1016/j.future.2024.03.050 article EN cc-by-nc Future Generation Computer Systems 2024-04-02

The design presented in this paper, called preDRAC, is a RISC-V general purpose processor capable of booting Linux, jointly developed by BSC, CIC-IPN, IMB-CNM (CSIC), and UPC. preDRAC is the first RISC-V processor designed and fabricated by a Spanish or Mexican academic institution, and will be the basis of future designs by these institutions. This paper summarizes the design tasks, first for FPGA and later for SoC, from high architectural level descriptions down to RTL, and then going through logic synthesis and physical design to get the layout ready for its final tapeout in CMOS 65nm...

10.1109/dcis51330.2020.9268664 article EN 2020-11-18

The RISC-V open Instruction Set Architecture (ISA) has proven to be a solid alternative to licensed ISAs. In the past 5 years, a plethora of industrial and academic cores and accelerators have been developed implementing this ISA. In this paper, we present Sargantana, a 64-bit processor based on RISC-V that implements the RV64G ISA, a subset of the vector instructions extension (RVV 0.7.1), and custom application-specific instructions. Sargantana features a highly optimized 7-stage pipeline with out-of-order write-back, register renaming,...

10.1109/dsd57027.2022.00042 article EN 2022 25th Euromicro Conference on Digital System Design (DSD) 2022-08-01

This paper describes the design, verification, implementation and fabrication of the Drac Vector IN-Order (DVINO) processor, a RISC-V vector processor capable of booting Linux, jointly developed by BSC, CIC-IPN, IMB-CNM (CSIC), and UPC. The DVINO processor includes an internally developed two-lane vector unit as well as a Phase Locked Loop (PLL) and an Analog-to-Digital Converter (ADC). The paper summarizes the design from the architectural level through logic synthesis and physical design in CMOS 65nm technology.

10.1109/dcis55711.2022.9970128 article EN 2022-11-16

Arm® usage has substantially grown in the High-Performance Computing (HPC) community. The Japanese supercomputer Fugaku, powered by Arm®-based A64FX processors, held the top position on the Top500 list between June 2020 and 2022, currently sitting in second position. The recently released 7th generation of Amazon EC2 instances for compute-intensive workloads (C7g) is also powered by Graviton3 processors. Projects like the European Mont-Blanc and the U.S. DOE/NNSA Astra are further examples of the Arm® irruption in HPC. In parallel, over the last...

10.2139/ssrn.4632220 preprint EN 2023-01-01

With increasing core counts in modern multi-core designs, the overhead of synchronization jeopardizes the scalability and efficiency of parallel applications. To mitigate these overheads, cache-coherent protocols offer support for Atomic Memory Operations (AMOs) that can be executed near-core (near) or remotely in the on-chip memory hierarchy (far).

10.1145/3579371.3589065 article EN 2023-06-16

This paper proposes the Tensor Marshaling Unit (TMU), a near-core programmable dataflow engine for multicore architectures that accelerates tensor traversals and merging, the most critical operations of sparse tensor workloads running on today's computing infrastructures. The TMU leverages a novel multi-lane design that enables parallel tensor loading and merging, which naturally produces vector operands that are marshaled into the core for efficient SIMD computation. The TMU supports all the necessary primitives to be tensor-format and tensor-algebra...

10.1145/3613424.3614284 article EN 2023-10-28