- Parallel Computing and Optimization Techniques
- Embedded Systems Design Techniques
- Distributed and Parallel Computing Systems
- Advanced Data Storage Technologies
- Real-Time Systems Scheduling
- Distributed systems and fault tolerance
- Cloud Computing and Resource Management
- Architecture, Art, Education
- Modular Robots and Swarm Intelligence
- Digital Transformation in Industry
- Higher Education Teaching and Evaluation
- Numerical Methods and Algorithms
- CCD and CMOS Imaging Sensors
- Radiation Effects in Electronics
- Advanced Memory and Neural Computing
- X-ray Diffraction in Crystallography
- Model-Driven Software Engineering Techniques
- IoT and Edge/Fog Computing
- Enzyme Structure and Function
- Educational Technology in Learning
- Low-power high-performance VLSI design
- Graphite, nuclear technology, radiation studies
Universitat Politècnica de Catalunya
2015-2024
Barcelona Supercomputing Center
2014-2023
OmpSs@FPGA is the flavor of OmpSs that allows offloading application functionality to FPGAs. Similarly to OpenMP, it is based on compiler directives. While the OpenMP specification also includes support for heterogeneous execution, we use OmpSs and OmpSs@FPGA as a prototype implementation to develop new ideas for OpenMP. OmpSs@FPGA implements the tasking model with runtime support to automatically exploit all SMP and FPGA resources available in the execution platform. In this paper, we present the OmpSs@FPGA ecosystem, based on the Mercurium compiler and the Nanos++ runtime system. We show how applications are...
OmpSs is an OpenMP-like directive-based programming model that includes heterogeneous execution (MIC, GPU, SMP, etc.) and runtime task dependencies management. Indeed, OmpSs has largely influenced the recently appeared OpenMP 4.0 specification. The Zynq All-Programmable SoC combines the features of an SMP and an FPGA and benefits from DLP, ILP and TLP parallelisms in order to efficiently exploit the new technology improvements and chip resource capacities. In this paper, we focus on programmability support, presenting successful...
This article presents the new features of the OmpSs@FPGA framework. OmpSs is a data-flow programming model that supports task nesting and dependencies to target asynchronous parallelism and heterogeneity. OmpSs@FPGA is the extension addressed specifically to FPGAs. The environment is built on top of the Mercurium source-to-source compiler and the Nanos++ runtime system. To address FPGA specifics, the framework implements several FPGA-related features, such as local variable caching, wide memory accesses or accelerator replication. In addition, part of the runtime has been ported to hardware. Driven by...
People and objects will soon share the same digital network for information exchange in a world named as the age of cyber-physical systems. The general expectation is that people and systems will interact in real-time. This poses pressure onto systems design to support increasing demands on computational power, while keeping a low power envelope. Additionally, modular scaling and easy programmability are also important to ensure these systems become widespread. The whole set of expectations imposes scientific and technological challenges that need...
The AXIOM project (Agile, eXtensible, fast I/O Module) aims at researching new software/hardware architectures for the future Cyber-Physical Systems (CPSs). These systems are expected to react in real-time, provide enough computational power for the assigned tasks, consume the least possible energy for such tasks (energy efficiency), scale up through modularity, allow an easy programmability across performance scaling, and exploit at best the existing standards at minimal costs.
Heterogeneous computing is emerging as a mandatory requirement for power-efficient system design. With this aim, modern heterogeneous platforms like the Zynq All-Programmable SoC, which integrates an ARM-based SMP and programmable logic, have been designed. However, those platforms introduce large design cycles consisting of hardware/software partitioning, decisions on the granularity and number of hardware accelerators, hardware/software integration, bitstream generation, etc. This paper presents performance parallel estimation systems...
OmpSs is a directive-based programming model that uses OpenMP-like directives to execute the annotated tasks both on SMPs and as FPGA kernels on modern SoC processors, such as the Xilinx Zynq platform. OmpSs includes support for accelerators (MIC, GPUs, FPGAs) and task dependencies, which OpenMP 4.0 will also support. In this paper, we present our approach to supporting FPGAs on the SoC, the current status of the implementation, its analysis and performance evaluation.
This paper presents the OmpSs approach to deal with heterogeneous programming on GPU and FPGA accelerators. The OmpSs model is based on the Mercurium compiler and the Nanos++ runtime. Applications are annotated with compiler directives specifying task-based parallelism. The compiler transforms the code to exploit the parallelism in the SMP host cores, and also to spawn work on CUDA/OpenCL devices. For these devices, the programmer needs only to insert the annotations and provide the kernel function to be compiled by the native compiler. In the case of FPGAs, OmpSs uses High-Level Synthesis tools from vendors...
Editor's notes: IoT constitutes an important area of cyber–physical systems, whose design and programming involve interactions between multiple abstraction layers. This article describes a new IoT node, its hardware architecture, programming environment, and two application scenarios where it may be used. —Samarjit Chakraborty, University of North Carolina at Chapel Hill
Cyber-Physical Systems (CPSs) are widely necessary for many applications that require interactions with humans and the physical environment. A CPS integrates a set of hardware-software components to distribute, execute and manage its operations. The AXIOM project (Agile, eXtensible, fast I/O Module) aims at developing a platform such that i) it can use an easy parallel programming model and ii) it can easily scale up the performance by adding multiple boards (e.g., 1 to 10 boards can run in parallel). AXIOM supports a task-based programming model based on...
This paper proposes to enhance current task-based programming models by breaking their master-slave approach between the main processor and its hardware accelerators. As a proof-of-concept, it presents an extension of the OmpSs@FPGA toolchain that allows tasks offloaded into the FPGA to create and synchronize nested tasks on their own without involving the host. Those spawned tasks may target the host, to execute code not suitable for the FPGA, like system calls or I/O operations, or other kernel accelerators inside the same FPGA. In...
In state-of-the-art FPGAs, especially in chiplet-based devices, place and route has become an important challenge due to the increase in device size and complexity. In the same way, off-chip memory resources have grown in number of modules. Making efficient use of them is a difficult task.
To achieve high performance and energy efficiency on near-future exascale computing systems, three key technology gaps need to be bridged. These include: thermal control; extreme computation via HW acceleration and new arithmetics; and methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA aims at tackling these gaps through a co-design approach to solutions, supported by the extension of SW IPs and programming models derived from European research.
X-ray crystallography is a powerful method that has significantly contributed to our understanding of the biological function of proteins and other molecules. This method relies on the production of crystals that, however, are usually a bottleneck in the process. For some molecules, no crystallization has been achieved or insufficient crystals were obtained. Some systems do not crystallize at all, such as nanoparticles which, because of their dimensions, cannot be treated by the usual crystallographic methods. To solve this, whole...
In modern FPGA devices, place and route has become an increasingly difficult task due to the increase in resources and device complexity. This results in an exponential increase of implementation possibilities. Such a huge search space causes tools to have a hard time providing a good solution. This is even more challenging in chiplet-based devices due to their topology. In the same way, off-chip memory has grown in both size and number of modules. These are presented to the user as raw interfaces, requiring the user to manage how accelerator kernels access them to make effective...