NFDI4DS | UHH-SEMS - Publication Details

Leonel Sousa

ORCID: 0000-0002-8066-221X

Contact & Profiles

A5077537777

Research Areas

Parallel Computing and Optimization Techniques
Video Coding and Compression Technologies
Cryptography and Residue Arithmetic
Coding theory and cryptography
Cryptographic Implementations and Security
Advanced Vision and Imaging
Advanced Data Compression Techniques
Distributed and Parallel Computing Systems
Cryptography and Data Security
Advanced Data Storage Technologies
Interconnection Networks and Systems
Embedded Systems Design Techniques
Cloud Computing and Resource Management
Low-power high-performance VLSI design
Image and Video Quality Assessment
Error Correcting Code Techniques
Advanced Wireless Communication Techniques
Genomics and Phylogenetic Studies
Advanced Memory and Neural Computing
Evolutionary Algorithms and Applications
CCD and CMOS Imaging Sensors
Microfluidic and Bio-sensing Technologies
Numerical Methods and Algorithms
Neuroscience and Neural Engineering
Machine Learning in Bioinformatics

Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento
2016-2025

University of Lisbon
2015-2024

Instituto Superior Técnico
2010-2024

Universidade de Brasília
2024

Instituto Politécnico de Lisboa
2009-2023

Institut National des Sciences Appliquées de Lyon
2022

Nvidia (United States)
2022

Institut national de recherche en informatique et en automatique
2022

Centre National de la Recherche Scientifique
2022

École Polytechnique
2022

Communication contention in task scheduling

OPENALEX - Publications

Oliver Sinnen Leonel Sousa

Task scheduling is an essential aspect of parallel programming. Most heuristics for this NP-hard problem are based on a simple system model that assumes fully connected processors and concurrent interprocessor communication. Hence, contention for communication resources is not considered in task scheduling, yet it has a strong influence on the execution time of a program. This paper investigates the incorporation of contention awareness into task scheduling. A new system model is proposed, allowing us to capture both end-point and network contention....

10.1109/tpds.2005.64 article EN IEEE Transactions on Parallel and Distributed Systems 2005-05-03

Cache-aware Roofline model: Upgrading the loft

Aleksandar Ilić Frederico Pratas Leonel Sousa

The Roofline model graphically represents the attainable upper bound performance of a computer architecture. This paper analyzes the original Roofline model and proposes a novel approach to provide more insightful performance modeling of modern architectures by introducing cache-awareness, thus significantly improving the guidelines for application optimization. The proposed model was experimentally verified on different architectures by taking advantage of built-in hardware counters, with a curve fitness above 90%.

10.1109/l-ca.2013.6 article EN IEEE Computer Architecture Letters 2013-04-22
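
As a rough illustration of the bound the Roofline abstract above refers to, the sketch below computes attainable performance as the minimum of peak compute throughput and bandwidth times arithmetic intensity. The per-memory-level variant is only an assumed reading of "cache-awareness" (one roof per level of the hierarchy), not the paper's exact formulation, and every function name and number here is hypothetical.

```python
def roofline(peak_flops, bandwidth, intensity):
    """Classic Roofline bound: attainable FLOP/s for a kernel.

    peak_flops -- peak compute throughput (FLOP/s)
    bandwidth  -- sustained memory bandwidth (bytes/s)
    intensity  -- arithmetic intensity of the kernel (FLOP/byte)
    """
    return min(peak_flops, bandwidth * intensity)


def roofs_per_level(peak_flops, level_bandwidths, intensity):
    """Assumed cache-aware reading: one roof per memory level.

    level_bandwidths maps a level name (e.g. "L1", "DRAM") to the
    bandwidth, in bytes/s, at which that level can feed the core.
    """
    return {
        level: roofline(peak_flops, bw, intensity)
        for level, bw in level_bandwidths.items()
    }


if __name__ == "__main__":
    # Hypothetical machine: 100 GFLOP/s peak, three memory levels.
    levels = {"L1": 400e9, "L2": 150e9, "DRAM": 20e9}
    for level, perf in roofs_per_level(100e9, levels, 0.5).items():
        print(f"{level}: {perf / 1e9:.1f} GFLOP/s attainable")
```

A kernel whose measured performance sits well below the roof of the level that holds its working set is a candidate for optimization; which roof applies depends on where the data actually resides.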

Femtomolar limit of detection with a magnetoresistive biochip

V. C. Martins Filipe A. Cardoso J. Germano Susana Cardoso Leonel Sousa and 3 more

10.1016/j.bios.2009.01.040 article EN Biosensors and Bioelectronics 2009-02-07

Massively LDPC Decoding on Multicore Architectures

Gabriel Falcão Leonel Sousa Vítor Silva

Unlike the usual VLSI approaches necessary for the computation of intensive Low-Density Parity-Check (LDPC) code decoders, this paper presents flexible software-based LDPC decoders. Algorithms and data structures suitable for parallel computing are proposed in this paper to perform LDPC decoding on multicore architectures. To evaluate the efficiency of the proposed algorithms, LDPC decoders were developed on recent multicores, such as off-the-shelf general-purpose x86 processors, Graphics Processing Units (GPUs), and the CELL Broadband Engine...

10.1109/tpds.2010.66 article EN IEEE Transactions on Parallel and Distributed Systems 2010-04-09

Deep Learning Architectures for Accurate Millimeter Wave Positioning in 5G

João Gante Gabriel Falcão Leonel Sousa

10.1007/s11063-019-10073-1 article EN Neural Processing Letters 2019-08-13

A genetic-based approach for service placement in fog computing

Nazanin Sarrafzade Reza Entezari‐Maleki Leonel Sousa

10.1007/s11227-021-04254-w article EN The Journal of Supercomputing 2022-01-28

NTT Architecture for a Linux-Ready RISC-V Fully-Homomorphic Encryption Accelerator

Rogério Paludo Leonel Sousa

This paper proposes two architectures for the acceleration of Number Theoretic Transforms (NTTs) using a novel Montgomery-based butterfly. We first design a custom NTT hardware accelerator for Field-Programmable Gate Arrays (FPGAs). The butterfly architecture is expanded to a Modular Arithmetic Logic Unit (MALU) and, for greater reuse and easier programmability, integrated into the six-stage pipeline of a Linux-ready RISC-V core extended with instructions. The performance of the proposed architectures is assessed on a Xilinx Ultrascale+ FPGA an...

10.1109/tcsi.2022.3166550 article EN IEEE Transactions on Circuits and Systems I Regular Papers 2022-04-27

A Portable and Autonomous Magnetic Detection Platform for Biosensing

J. Germano Verónica C. Martins Filipe A. Cardoso T. M. Almeida Leonel Sousa and 2 more

This paper presents a prototype of a platform for biomolecular recognition detection. The system is based on a magnetoresistive biochip that performs biorecognition assays by detecting magnetically tagged targets. All the electronic circuitry for addressing, driving and reading out signals from spin-valve or magnetic tunnel junction sensors is implemented using off-the-shelf components. Taking advantage of digital signal processing techniques, the acquired signals are processed in real time and transmitted to an analyzer...

10.3390/s90604119 article EN cc-by Sensors 2009-05-27

Combining Residue Arithmetic to Design Efficient Cryptographic Circuits and Systems

Leonel Sousa Samuel Antão Paulo Martins

Cryptography plays a major role in assuring security in computation and communication. In particular, public-key cryptography enables the asymmetrical ciphering of data along with the authentication of the parties that are attempting to share data. The encryption is costly, which has motivated extensive research to efficiently accelerate the execution of the most relevant algorithms and to improve their resistance against Side-Channel Attacks (SCAs), which leverage features exposed by cryptographic systems, such as power...

10.1109/mcas.2016.2614714 article EN IEEE Circuits and Systems Magazine 2016-01-01

Dethroning GPS: Low-Power Accurate 5G Positioning Systems Using Machine Learning

João Gante Leonel Sousa Gabriel Falcão

Over the last years, positioning systems have become increasingly pervasive, covering most of the planet's surface. Although they are accurate enough for a large number of uses, their precision, power consumption, and hardware requirements establish limits on their adoption in mobile devices. In this paper, the energy consumption of a proposed deep learning-based millimeter wave positioning method is assessed, being subsequently compared to the state-of-the-art on outdoor positioning systems. Requiring as low as 0.4 mJ per position fix, when...

10.1109/jetcas.2020.2991024 article EN IEEE Journal on Emerging and Selected Topics in Circuits and Systems 2020-04-28

Nonconventional Computer Arithmetic Circuits, Systems and Applications

Leonel Sousa

Arithmetic plays a major role in a computer's performance and efficiency. Building new computing platforms supported by the traditional binary arithmetic and silicon-based technologies to meet the requirements of today's applications is becoming increasingly more challenging, regardless of whether we consider embedded devices or high-performance computers. As a result, a significant amount of research effort has been devoted to the study of nonconventional number systems to investigate more efficient arithmetic circuits and improved computer...

10.1109/mcas.2020.3027425 article EN IEEE Circuits and Systems Magazine 2021-01-01

General method for eliminating redundant computations in video coding

Leonel Sousa

A new simple and efficient method for avoiding useless computations in the video coding process is proposed. Experimental results show the practical interest of the method in reducing the computation in software coders and the power consumption in hardware coders.

10.1049/el:20000272 article EN Electronics Letters 2000-02-17

Toward a realistic task scheduling model

Oliver Sinnen Leonel Sousa Frode Eika Sandnes

Task scheduling is an important aspect of parallel programming. Most of the heuristics for this NP-hard problem are based on a very simple system model of the target system. Experiments revealed the inappropriateness of this classic model to obtain accurate and efficient schedules for real systems. In order to overcome this shortcoming, a new model was proposed that considers the contention for communication resources. Even though the accuracy and efficiency improved with the consideration of contention, it is still not good enough. The crucial involvement of the processor...

10.1109/tpds.2006.40 article EN IEEE Transactions on Parallel and Distributed Systems 2006-03-01

Cost-Efficient SHA Hardware Accelerators

Ricardo Chaves Georgi Kuzmanov Leonel Sousa S. Vassiliadis

This paper presents a new set of techniques for hardware implementations of secure hash algorithm (SHA) functions. These techniques consist mostly in operation rescheduling and hardware reutilization, therefore significantly decreasing the critical path and the required area. Throughputs from 1.3 Gbit/s to 1.8 Gbit/s were obtained for the SHA implementations on a Xilinx VIRTEX II Pro. Compared to commercial cores and previously published research, these figures correspond to an improvement in throughput/slice in the range of 29% to 59% for SHA-1 and 54% to 100% for SHA-2. Experimental results...

10.1109/tvlsi.2008.2000450 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2008-07-29

QCA-LG: A tool for the automatic layout generation of QCA combinational circuits

Tiago Teodosio Leonel Sousa

Quantum-dot Cellular Automata (QCA) is a promising successor for CMOS transistor technology. While it allows the implementation of logic circuits using quantum devices, such as quantum dots or single domain nano magnets, a new set of tools must be developed to assist the design and implementation process. Examples are QCADesigner, for handmade layout and physical simulation, and also tools for majority optimization. Since no tool is available for assisting QCA layout generation, we propose a tool to automatically generate the layout of QCA circuits. This tool, designated by QCA-Layout...

10.1109/norchp.2007.4481078 article EN NORCHIP 2007-11-01

Challenges and trends in the development of a magnetoresistive biochip portable platform

Verónica C. Martins J. Germano Filipe A. Cardoso Joana Loureiro Susana Cardoso and 4 more

10.1016/j.jmmm.2009.02.141 article EN Journal of Magnetism and Magnetic Materials 2009-03-15

How GPUs can outperform ASICs for fast LDPC decoding

Gabriel Falcão Vítor Silva Leonel Sousa

Due to huge computational requirements, powerful Low-Density Parity-Check (LDPC) error correcting codes, discovered in the early 1960s, have only recently been adopted by emerging communication standards. LDPC decoders are supported by VLSI technology, which delivers good parallel computational power with excellent throughputs, but at the expense of significant costs.

10.1145/1542275.1542330 article EN 2009-06-08

Improving residue number system multiplication with more balanced moduli sets and enhanced modular arithmetic structures

Ricardo Chaves Leonel Sousa

Residue number systems (RNS) are non-weighted number systems that allow addition, subtraction and multiplication operations to be performed concurrently and independently on each residue. The triple moduli set $\{2^{n} - 1, 2^{n}, 2^{n} + 1\}$ and its respective extensions have gained unprecedented importance in RNS, mainly because of the simplicity of the arithmetic units for the individual channels and also of the converters to and from RNS. However, there is neither a perfect balance between the various elements of this moduli set nor an exact equivalence in the complexity of the individual arithmetic units. Two...

10.1049/iet-cdt:20060059 article EN IET Computers & Digital Techniques 2007-09-04
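
To make the channel-wise independence described in the abstract above concrete, here is a minimal sketch of RNS addition and multiplication over the triple moduli set with n = 4, followed by reconstruction through the textbook Chinese Remainder Theorem. It does not reproduce the paper's balanced moduli sets or its arithmetic structures; all function names are hypothetical, and the code assumes Python 3.8 or later (for `math.prod` and modular inverses via `pow`).

```python
from math import prod


def moduli_set(n):
    """The triple moduli set {2^n - 1, 2^n, 2^n + 1} (pairwise coprime)."""
    return (2**n - 1, 2**n, 2**n + 1)


def to_rns(x, moduli):
    """Forward conversion: one residue per channel."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli):
    """Channel-wise addition; no carries cross between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def rns_mul(a, b, moduli):
    """Channel-wise multiplication; each channel is independent."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli):
    """Reverse conversion using the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(.., -1, m): inverse mod m
    return total % M


if __name__ == "__main__":
    moduli = moduli_set(4)            # (15, 16, 17), dynamic range 4080
    a, b = to_rns(37, moduli), to_rns(91, moduli)
    print(from_rns(rns_add(a, b, moduli), moduli))  # 37 + 91
    print(from_rns(rns_mul(a, b, moduli), moduli))  # 37 * 91
```

Results are only meaningful while they stay below the product of the moduli; beyond that range they wrap around modulo that product.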

Fine-grain Parallelism Using Multi-core, Cell/BE, and GPU Systems: Accelerating the Phylogenetic Likelihood Function

Frederico Pratas Pedro Trancoso Alexandros Stamatakis Leonel Sousa

We are currently faced with the situation where applications have increasing computational demands and there is a wide selection of parallel processor systems. In this paper we focus on exploiting fine-grain parallelism for a demanding bioinformatics application, MrBayes, and its phylogenetic likelihood functions (PLF), using different architectures. Our experiments compare side-by-side the scalability and performance achieved using general-purpose multi-core processors, the Cell/BE, and graphics processing units (GPU). The...

10.1109/icpp.2009.30 article EN International Conference on Parallel Processing 2009-09-01

RNS-Based Elliptic Curve Point Multiplication for Massive Parallel Architectures

Samuel Antão Jean-Claude Bajard Leonel Sousa

Acceleration of cryptographic applications on massive parallel computing platforms, such as Graphic Processing Units (GPUs), becomes a real challenge concerning practical implementations. In this paper, we propose an algorithm for Elliptic Curve (EC) point multiplication in order to compute EC cryptography on these platforms. The proposed approach relies on the usage of the Residue Number System (RNS) to extract parallelism on high-precision integer arithmetic. Results suggest a maximum throughput of 9827...

10.1093/comjnl/bxr119 article EN The Computer Journal 2011-11-30

MRC-Based RNS Reverse Converters for the Four-Moduli Sets $\{2^{n} + 1, 2^{n} - 1, 2^{n}, 2^{2n + 1} - 1\}$ and $ \{2^{n} + 1, 2^{n} - 1, 2^{2n}, 2^{2n + 1} - 1\}$

Leonel Sousa Samuel Antão

The moduli set $\{2^{n} + 1, 2^{n} - 1, 2^{n}, 2^{2n + 1} - 1\}$ has been recently proposed for supporting residue number systems with dynamic ranges of 5n bits. In this brief, we suggest modifying this set to $\{2^{n} + 1, 2^{n} - 1, 2^{2n}, 2^{2n + 1} - 1\}$, in order to enlarge the dynamic range to 6n bits. We propose a method that unifies the design of efficient reverse converters for the original and modified moduli sets. A unified architecture was derived...

10.1109/tcsii.2012.2188456 article EN IEEE Transactions on Circuits & Systems II Express Briefs 2012-03-22
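
The 5n-bit and 6n-bit dynamic ranges quoted in the abstract above can be checked numerically. The sketch below builds both four-moduli sets from the title, confirms that their elements are pairwise coprime, and reports the largest bit width whose values all fit below the product of the moduli. The helper names are hypothetical and the check is only an arithmetic illustration, not part of the converter design.

```python
from itertools import combinations
from math import gcd, prod


def original_set(n):
    """{2^n + 1, 2^n - 1, 2^n, 2^(2n+1) - 1}"""
    return (2**n + 1, 2**n - 1, 2**n, 2**(2 * n + 1) - 1)


def modified_set(n):
    """{2^n + 1, 2^n - 1, 2^(2n), 2^(2n+1) - 1}"""
    return (2**n + 1, 2**n - 1, 2**(2 * n), 2**(2 * n + 1) - 1)


def pairwise_coprime(moduli):
    """True if every pair of moduli has a greatest common divisor of 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))


def dynamic_range_bits(moduli):
    """Largest k such that every k-bit value is below the moduli product."""
    return prod(moduli).bit_length() - 1


if __name__ == "__main__":
    for n in range(2, 9):
        for name, build in (("original", original_set),
                            ("modified", modified_set)):
            moduli = build(n)
            assert pairwise_coprime(moduli)
            print(n, name, dynamic_range_bits(moduli), "bits")
```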

Real-time implementation of remotely sensed hyperspectral image unmixing on GPUs

Sergio Sánchez Ricardo S. Ramalho Leonel Sousa Antonio Plaza

10.1007/s11554-012-0269-2 article EN Journal of Real-Time Image Processing 2012-09-15

Energy-aware QoS-based dynamic virtual machine consolidation approach based on RL and ANN

Mahshid Rezakhani Nazanin Sarrafzadeh-Ghadimi Reza Entezari‐Maleki Leonel Sousa Ali Movaghar

10.1007/s10586-023-03983-2 article EN Cluster Computing 2023-02-22

Deadline-aware task offloading in vehicular networks using deep reinforcement learning

Mina Khoshbazm Farimani Soroush Karimian-Aliabadi Reza Entezari‐Maleki Bernhard Egger Leonel Sousa

10.1016/j.eswa.2024.123622 article EN Expert Systems with Applications 2024-03-11
