Leonardo Ecco

ORCID: 0000-0003-2348-0759
Research Areas
  • Interconnection Networks and Systems
  • Parallel Computing and Optimization Techniques
  • Real-Time Systems Scheduling
  • Embedded Systems Design Techniques
  • Distributed systems and fault tolerance
  • Advanced Data Storage Technologies
  • CCD and CMOS Imaging Sensors
  • Advanced Memory and Neural Computing
  • Advanced Neural Network Applications
  • Education and Digital Technologies
  • Distributed and Parallel Computing Systems
  • Adversarial Robustness in Machine Learning
  • Ferroelectric and Negative Capacitance Devices
  • Supercapacitor Materials and Fabrication

Robert Bosch (Germany)
2019-2021

Technische Universität Braunschweig
2014-2017

Universidade Estadual de Campinas (UNICAMP)
2009-2015

Mixed critical platforms are those in which applications that have different criticalities, i.e. different levels of importance for system safety, coexist and share resources. Such platforms require a memory controller capable of providing sufficient timing independence for critical applications. Existing real-time memory controllers, however, either do not support mixed criticality or still allow a certain degree of interference between applications. The former issue leads to overly constrained, and hence more expensive, systems. The latter forces designers...

10.1109/rtcsa.2014.6910550 article EN 2014-08-01

As DRAMs become faster, the penalty to reverse the direction of their data buses increases. Yet, existing real-time memory controllers do not reorder read and write commands. Hence, timing bounds are computed by assuming an alternating pattern of reads and writes, thus accounting for several bus reversals, and consequently leading to suboptimal results. Therefore, in this paper, we propose a memory controller that reorders read and write commands, which minimizes bus reversals. Moreover, we prove through detailed analysis the effect...

10.1109/rtss.2015.13 article EN 2015-12-01
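
The motivation in the abstract above can be illustrated with a toy count of data-bus reversals. This is only a sketch under simplifying assumptions: the command names `"R"`/`"W"`, the helper functions, and the "all reads first" policy are hypothetical and are not the controller proposed in the paper, which must also respect DRAM timing constraints and request ordering.

```python
def count_reversals(commands):
    """Count how often two consecutive commands use opposite bus directions."""
    return sum(1 for prev, cur in zip(commands, commands[1:]) if prev != cur)


def group_by_direction(commands):
    """Hypothetical reordering: serve all reads, then all writes.

    sorted() is stable, so the relative order within reads and within
    writes is preserved. Dependencies between requests are ignored.
    """
    return sorted(commands, key=lambda c: c != "R")


# Worst-case pattern assumed by analyses that do not reorder commands.
alternating = ["R", "W", "R", "W", "R", "W"]
reordered = group_by_direction(alternating)

print(count_reversals(alternating))  # reversals without reordering
print(count_reversals(reordered))    # reversals after grouping
```

For six alternating commands the bus turns around five times, while the grouped sequence needs a single reversal.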

Networks-on-Chip (NoCs) for real-time systems require solutions for safe and predictable sharing of network resources between transmissions with different quality-of-service requirements. In this work, we present a mechanism for global dynamic admission control in NoCs designed for real-time systems. It introduces an overlay network to synchronize transmissions using arbitration units called Resource Managers (RMs), which allows work-conserving scheduling. We present a formal worst-case timing analysis of the proposed mechanism and demonstrate that...

10.1109/aspdac.2016.7428096 article EN 2016-01-01

Multi-rank DRAM modules have been identified as a flexible option for accommodating large mixed critical workloads. However, because all ranks in a module share the same multi-drop data bus, a penalty in the form of idle bus cycles is necessary when alternating data transfers between different ranks. Moreover, as the bus clock frequency becomes higher, such a penalty increases significantly and can no longer be neglected. Therefore, in this paper, we propose a real-time controller for multi-rank DRAM modules that minimizes rank switches. Our controller works...

10.1109/ecrts.2016.8 article EN 2016-07-01

Synchronous dynamic random access memories (SDRAMs) are widely employed in multi- and many-core platforms due to their high density and low cost. Nevertheless, their benefits come at the price of a complex two-stage access protocol, which reflects their bank-based structure and an internal level of explicitly managed caching. In scenarios in which requestors demand real-time guarantees, these features pose a predictability challenge and, in order to tackle it, several real-time SDRAM controllers have been proposed. In this context, recent research...

10.1109/tc.2017.2714672 article EN IEEE Transactions on Computers 2017-06-12

For mixed-criticality systems, safety standards (e.g. ISO 26262) require sufficient independence among different criticality levels, unless the entire system is certified according to the highest applicable level. We present a resource arbitration scheme that provides sufficient independence among criticality levels w.r.t. timing properties. We exploit the throughput and latency slack of critical applications by prioritizing non-critical over critical accesses and only switching priorities when necessary. By using an accurate representation of the access patterns...

10.1145/2656075.2656105 article EN 2014-10-07

Processing-in-Memory (PIM) is an emerging approach to bridge the memory-computation gap. One of the key challenges of PIM architectures in the scope of neural network inference is the deployment of traditional area-intensive arithmetic multipliers in memory technology, especially for DRAM-based PIM architectures. Hence, existing DRAM PIM architectures are either confined to binary networks or exploit the analog property of the sub-array bitlines to perform bulk bit-wise logic operations. The former reduces the accuracy of predictions, i.e. Quality-of-results,...

10.1145/3394885.3431522 article EN Proceedings of the 26th Asia and South Pacific Design Automation Conference 2021-01-18

RISC processors can be used to face the ever increasing demand for performance required by embedded systems. Nevertheless, this solution comes with the cost of poor code density. Alternative encodings for instruction sets, such as MIPS16 and Thumb, represent an effective approach to deal with this drawback. This article proposes to apply a new encoding to the SPARCv8 architecture. Through extensive analysis of the program mix from the Mibench and Mediabench benchmark suites, we suggest a 16-bit instruction set, easily translated to its 32-bit...

10.1109/sbac-pad.2009.22 article EN 2009-10-01

Time-division multiplexing (TDM) is the commonly used and well established solution to the problem of sharing resources in real-time Networks-on-Chip (NoCs). TDM is timing predictable, simplifies the worst-case timing analysis and is easy to implement. However, it introduces a constant, periodic and non-work-conserving resource sharing scheme. This challenges the efficiency whenever applications expose dynamics in execution time and communication volume, and the system is not highly loaded. In this work, we present a flexible approach for NoCs where...

10.1145/2834848.2834851 article EN 2015-11-04

Modern Networks-on-Chip (NoCs) must accommodate a diversity of temporal requirements, e.g. provide guarantees for real-time senders with the minimum impact on performance-sensitive best-effort (BE) traffic. In this work, we propose a protocol-based adaptive load distribution which, by selectively detouring BE traffic, i.e. load balancing, allows us to significantly improve the NoC's performance without costly hardware extensions. The introduced method offers, during runtime, safe and efficient integration of mixed-critical...

10.1109/aspdac.2017.7858411 article EN 2017-01-01

End-to-end performance estimation and measurement of deep neural network (DNN) systems become more important with the increasing complexity of DNN systems consisting of hardware and software components. The methodology proposed in this paper aims at a reduced turn-around time for evaluating different design choices of hardware and software components of DNN systems. This reduction is achieved by moving the performance estimation from the implementation phase to the concept phase by employing virtual hardware models instead of gathering measurement results from physical prototypes. Deep learning compilers...

10.1145/3372394.3372396 preprint EN 2019-10-13

The trend towards integration is leading to the design of multi- and many-core platforms that accommodate processing tiles (requestors) with different memory requirements. Such platforms require a memory controller capable of providing a low-latency best-effort (BE) service for some requestors and a guaranteed throughput (GT) service for others. Although there are real-time memory controllers that support the concept of traffic classes, they do not efficiently handle scenarios with multiple BE and GT requestors. We propose a memory controller that tackles this problem, providing low latency...

10.1109/sies.2015.7185038 article EN 2015-06-01

As SDRAM modules get faster and their data buses wider, researchers have proposed the use of the open-row policy in command schedulers for real-time SDRAM controllers. While the properties of such schedulers have been thoroughly investigated, their hardware implementation was not. Hence, in this paper, we propose a highly-parallel and multi-stage architecture that implements a state-of-the-art command scheduler. Moreover, we evaluate the architecture from the hardware overhead and performance perspectives.

10.23919/date.2017.7927063 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017 2017-03-01

In the last couple of years, the goal of deploying Deep Neural Networks (DNNs) in the embedded domain has led to the development of several dedicated DNN Hardware Accelerators (HWAs), many of which rely on a Dot-Product Engine (DPE)-based architecture. A DPE is a hardware block that receives two input vectors of the same size and produces one scalar value. Nevertheless, when the actual vector size does not match the DPE's native input size, the DPE becomes underutilized. This is particularly observed for Convolutional Neural Networks (CNNs). In this article, we...

10.1109/ipdpsw50202.2020.00142 article EN 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2020-05-01
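
The DPE underutilization described in the abstract above can be made concrete with a small model. This is a minimal sketch under stated assumptions: the function names, the zero-padding scheme, and the example sizes (a 27-element vector, as from a 3x3x3 convolution kernel, on a 16-input DPE) are illustrative and do not come from the article.

```python
import math


def dot_product_engine(a, b):
    """Functional model of a DPE: two equal-length vectors in, one scalar out."""
    if len(a) != len(b):
        raise ValueError("input vectors must have the same size")
    return sum(x * y for x, y in zip(a, b))


def dpe_utilization(vector_len, dpe_width):
    """Fraction of DPE inputs doing useful work for one long dot product.

    Assumption: the vector is split into ceil(vector_len / dpe_width)
    passes and the last pass is zero-padded up to the native width.
    """
    passes = math.ceil(vector_len / dpe_width)
    return vector_len / (passes * dpe_width)


print(dot_product_engine([1, 2, 3], [4, 5, 6]))
print(dpe_utilization(27, 16))  # vector size does not match the native width
print(dpe_utilization(32, 16))  # vector size is a multiple of the native width
```

In this model a 27-element vector needs two passes on a 16-input DPE, so 5 of the 32 input slots are padding; a 32-element vector fills both passes completely.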