NFDI4DS | UHH-SEMS - Publication Details

Hiago Mayk G. de A. Rocha

ORCID: 0000-0002-0827-0131

About

Contact & Profiles

OpenAlex ID: A5019705944

Research Areas

Parallel Computing and Optimization Techniques
Cloud Computing and Resource Management
Interconnection Networks and Systems
Graph Theory and Algorithms
Embedded Systems Design Techniques
Advanced Graph Neural Networks
Advanced Data Storage Technologies
VLSI and FPGA Design Techniques
Energy Harvesting in Wireless Networks
Distributed and Parallel Computing Systems
Advanced Memory and Neural Computing

Universidade Federal da Bahia
2025

Universidade Federal do Rio Grande do Sul
2020-2023

Exploiting Design Flexibility in Multi-Tenant Multi-FPGA Edge Systems

OPENALEX - Publications

Ian Kersz, Michael Guilherme Jordan, Hiago Mayk G. de A. Rocha, Felipe Kalinski Ferreira, José Rodrigo Azambuja, and 2 more

10.1109/lascas64004.2025.10966290 article EN 2025-02-25

Boosting Graph Analytics by Tuning Threads and Data Affinity on NUMA Systems

Hiago Mayk G. de A. Rocha, Janaína Schwarzrock, Arthur F. Lorenzon, Antonio Carlos Schneider Beck

The execution of large real-world graphs, such as web searches and social networks, has been boosted by modern HPC systems. However, their irregular communication patterns and poor data locality impose many challenges, mainly when executed on NUMA systems. As we show in this paper, there is no one-fits-all configuration for threads/data mapping: the best combination will vary according to the system, the graph algorithm, and the input at hand. Based on that, we propose Graphith: a framework that automatically enhances...

10.1109/pdp52278.2021.00033 article EN 2021-03-01

Using machine learning to optimize graph execution on NUMA machines

Hiago Mayk G. de A. Rocha, Janaína Schwarzrock, Arthur F. Lorenzon, Antonio Carlos Schneider Beck

This paper proposes PredG, a Machine Learning framework to enhance graph processing performance by finding the ideal thread and data mapping on NUMA systems. PredG is agnostic to the input graph: it uses the available graphs' features to train an ANN to perform predictions as new graphs arrive, without any application execution after being trained. When evaluated over representative algorithms on three systems, its solutions are up to 41% faster than the Linux OS Default and the Best Static, and on average 2% far from the Oracle...

10.1145/3489517.3530581 article EN Proceedings of the 59th ACM/IEEE Design Automation Conference 2022-07-10

Effective Exploration of Thread Throttling and Thread/Page Mapping on NUMA Systems

Janaína Schwarzrock, Hiago Mayk G. de A. Rocha, Antonio Carlos Schneider Beck, Arthur F. Lorenzon

NUMA systems have become commonly used in HPC. However, to fully take advantage of these systems, the right thread-to-core allocation and page placement are essential. On top of that, considering that many parallel applications have limited scalability, applying thread throttling (i.e., artificially reducing the number of active threads) will most times further improve energy and/or performance. Because it involves many variables, previous research has not considered the aforementioned approaches altogether...

10.1109/hpcc-smartcity-dss50907.2020.00030 article EN 2020-12-01

Improving the efficiency of graph algorithm executions on high‐performance computing

Marcelo K. Moori, Hiago Mayk G. de A. Rocha, Janaína Schwarzrock, Arthur F. Lorenzon, Antonio Carlos Schneider Beck

The growing need for extracting information from large graphs has been pushing the development of parallel graph algorithms. However, the highly irregular structure of real-world graphs limits the performance and energy improvements of these applications. In this paper, we show that, in most cases, using all the available cores of a multiprocessor is not the best option in terms of the aforementioned non-functional requirements. Based on that, we propose GraphKat, a framework that enables the simultaneous processing of several algorithms/graphs...

10.1002/cpe.7419 article EN Concurrency and Computation Practice and Experience 2022-11-01

Using evolutionary metaheuristics to solve the mapping and routing problem in networks on chip

Hiago Mayk G. de A. Rocha, Antonio Carlos Schneider Beck, Márcio Kreutz, Sílvia Maria Diniz Monteiro Maia, Monica Pereira

10.1007/s10617-023-09269-5 article EN Design Automation for Embedded Systems 2023-03-10

Automatic CPU-GPU Allocation for Graph Execution

Marcelo K. Moori, Hiago Mayk G. de A. Rocha, M.A.A.P. Silva, Janaína Schwarzrock, Arthur F. Lorenzon, and 1 more

Although advances in modern GPUs have accelerated the execution of heavy data processing applications, speeding up graph applications on these systems is not a trivial task: graph applications are characterized by a high volume of irregular memory accesses that vary with the graph structure, so they often do not reach peak performance when executing on GPUs. In such cases, the CPU is more suitable. Given that graph structures can be identified through high-level metrics (e.g., diameter and average clustering coefficient), these metrics may assist the designer...

10.1109/pdp59025.2023.00013 article EN 2023-03-01

A Routing based Genetic Algorithm for Task Mapping on MPSoC

Hiago Mayk G. de A. Rocha, Antonio Carlos Schneider Beck, Sílvia Maria Diniz Monteiro Maia, Márcio Kreutz, Monica Pereira

This work proposes an optimized task mapping solution called Routing Model-based Genetic Algorithm (RMGA) that combines the mapping and routing problems using an Integer Linear Programming (ILP) model as a fitness function. We compared our proposed RMGA with other Genetic Algorithms (GAs) that address the mapping problem using the classical flow x distance function for fitness evaluation. Experimental results evaluating communication latency demonstrate that the algorithm outperforms two GAs from the literature. It presents up to 30% lower delay when simulating...

10.1109/sbesc51047.2020.9277843 article EN 2020-11-24
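
For readers unfamiliar with the classical flow x distance evaluation mentioned in the abstract above, the following is a minimal illustrative sketch, not the RMGA fitness function or code from the paper. It assumes a 2D mesh where distance is the Manhattan hop count between tiles. The task names, traffic volumes, and function names are hypothetical.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]  # (row, column) of a tile in an assumed 2D mesh


def manhattan(a: Coord, b: Coord) -> int:
    """Hop count between two tiles, assuming minimal-path mesh routing."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def flow_distance_cost(
    flows: Dict[Tuple[str, str], float],
    mapping: Dict[str, Coord],
) -> float:
    """Sum of (traffic volume x hop distance) over all communicating task pairs.

    Lower is better: heavily communicating tasks should sit close together.
    """
    return sum(
        volume * manhattan(mapping[src], mapping[dst])
        for (src, dst), volume in flows.items()
    )


# Hypothetical task graph: traffic volume between pairs of tasks.
flows = {("t0", "t1"): 10.0, ("t1", "t2"): 4.0, ("t0", "t2"): 1.0}

# Two candidate placements of the three tasks on a 2x2 mesh.
mapping_a = {"t0": (0, 0), "t1": (0, 1), "t2": (1, 1)}
mapping_b = {"t0": (0, 0), "t1": (1, 1), "t2": (0, 1)}

cost_a = flow_distance_cost(flows, mapping_a)
cost_b = flow_distance_cost(flows, mapping_b)
print(cost_a, cost_b)
```

A genetic algorithm using this kind of evaluation would prefer `mapping_a`, since it places the heaviest flow (`t0` to `t1`) on adjacent tiles. Note that this cost ignores link contention, which is one reason a routing-aware model may be used instead.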

Optimizing Single-Source Graph Execution on NUMA Machines

Hiago Mayk G. de A. Rocha, Vicenç Beltrán, Arthur F. Lorenzon, Antonio Carlos Schneider Beck

Graphs are data structures capable of representing problems from different domains, such as logistics and social networks. However, the processing of these massive graphs, stored in high-performance computing (HPC) servers, starts from distinct source vertices (i.e., single-source: a user or message in a social network). Therefore, the amount and structure of the sub-graphs to be processed will also change depending on the source, highly influencing the graph algorithm's behavior and performance. In this paper, we propose GraphNroll,...

10.1109/sbesc60926.2023.10324068 article EN 2023-11-21

Smoothing on Dynamic Concurrency Throttling

Janaína Schwarzrock, Hiago Mayk G. de A. Rocha, Arthur F. Lorenzon, Antonio Carlos Schneider Beck

Technology scaling has been allowing a growing number of cores in processors to satisfy the increasing demand of new applications, which need to process huge amounts of data in High-Performance Computing (HPC). However, considering that many parallel applications have limited scalability, activating the maximum number of available cores to execute an application will not always provide the best outcome in energy and performance (represented by the Energy-Delay Product, or EDP). Because of that, works have already proposed different Dynamic...

10.1109/ipdpsw55747.2022.00154 article EN 2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2022-05-01
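
The Energy-Delay Product named in the abstract above is the product of the energy consumed and the execution time, so a lower value is better. The sketch below shows how a thread count could be chosen by EDP from a table of measurements. It is only an illustration under stated assumptions: the function names are hypothetical and the numbers are invented, not results from the paper, and it is a static selection rather than the dynamic throttling the paper studies.

```python
from typing import Dict, Tuple


def edp(energy_joules: float, time_seconds: float) -> float:
    """Energy-Delay Product: energy consumed multiplied by execution time."""
    return energy_joules * time_seconds


def best_thread_count(measurements: Dict[int, Tuple[float, float]]) -> int:
    """Return the thread count whose (energy, time) pair gives the lowest EDP."""
    return min(measurements, key=lambda n: edp(*measurements[n]))


# Invented measurements: thread count -> (energy in joules, time in seconds).
# The application stops scaling beyond 4 threads while energy keeps growing.
measurements = {
    1: (100.0, 10.0),
    2: (110.0, 5.5),
    4: (130.0, 3.5),
    8: (200.0, 3.2),
}

chosen = best_thread_count(measurements)
print(chosen)
```

With these invented numbers, 8 threads gives the shortest execution time but 4 threads gives the lowest EDP, which is the kind of situation that motivates concurrency throttling.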

AtTune: A Heuristic based Framework for Parallel Applications Autotuning

Hiago Mayk G. de A. Rocha, Janaína Schwarzrock, Monica Pereira, Lucas Mello Schnorr, Philippe O. A. Navaux, and 2 more

Several aspects limit the scalability of parallel applications, e.g., off-chip bus saturation and data synchronization. Moreover, the high cost of cooling HPC systems, which can outweigh the cost of developing the system itself, has pushed the application's execution to another level of requirements in terms of performance and energy. In this work, we propose AtTune: a heuristic-based framework for tuning the number of processes/threads and the CPU frequency to optimize the applications' execution. AtTune is transparent to the user, independent of the input...

10.5753/sbesc_estendido.2020.13105 article EN 2020-11-23

Análise da Execução Concorrente de Aplicações Paralelas em Arquiteturas Multicore [Analysis of the Concurrent Execution of Parallel Applications on Multicore Architectures]

Vinicius Da Silva, Thiarles Medeiros, Hiago Mayk G. de A. Rocha, Marcelo Caggiani Luizelli, Fábio Rossi, and 2 more

Thread-level parallelism (TLP) has been widely used to optimize the use of computational resources (e.g., cache memories and CPU functional units) in high-performance systems. However, since some applications do not scale with the number of threads, resources will sit idle when the application is executed with its ideal number of threads. In this sense, the concurrent execution of parallel applications can be used to provide better resource utilization without impacting the performance and energy consumption of the system as a whole. That said, we...

10.5753/wscad.2020.14058 article PT 2020-10-21

Searching for the Ideal Number of Threads on Asymmetric Multiprocessors

Marcelo K. Moori, Hiago Mayk G. de A. Rocha, Arthur F. Lorenzon, Antonio Carlos Schneider Beck

Asymmetric multicore processors (AMPs) combine high-performance cores with more energy-efficient ones, capitalizing on the diverse performance demands of modern devices (e.g., smartphones and tablets). Although hardware players have been designing powerful AMPs for desktop and server computers, such as the Apple M1 and the Intel Alder Lake family, these impose new challenges for parallel computing researchers on how to properly use them to their fullest. As we show in this paper, the best number of threads and which combination...

10.1109/sbesc60926.2023.10324167 article EN 2023-11-21

Firefly: An Open-source Rocket-based Intermittent Framework

Hiago Mayk G. de A. Rocha, Guilherme Korol, Michael Guilherme Jordan, Arthur M. Krause, Ronaldo Silveira, and 5 more

Intermittent systems are ultra-low-power batteryless devices that are increasing in popularity. These systems operate with energy extracted entirely from the environment. Since most environments cannot ensure sufficient and steady power supply conditions, intermittent systems suffer frequent power outages, where computation is interrupted due to lack of energy. While numerous works have enabled intermittent computation via many different techniques, there are no versatile and configurable tools for functionality, rapid development/design space...

10.1109/sbcci50935.2020.9189926 article EN 2020-08-01