- Parallel Computing and Optimization Techniques
- Advanced Neural Network Applications
- Topic Modeling
- Distributed and Parallel Computing Systems
- Robotics and Sensor-Based Localization
- Graph Theory and Algorithms
- Network Packet Processing and Optimization
- UAV Applications and Optimization
- Robotic Path Planning Algorithms
- Advanced Memory and Neural Computing
- CCD and CMOS Imaging Sensors
University of Bologna
2023-2024
Laboratori Guglielmo Marconi (Italy)
2023
Marconi University
2023
With the rise of embodied foundation models (EFMs), most notably small language models (SLMs), adapting Transformers for edge applications has become a very active field of research. However, achieving end-to-end deployment of SLMs on microcontroller (MCU)-class chips without high-bandwidth off-chip main memory access is still an open challenge. In this article, we demonstrate high-efficiency SLM deployment on a multicore RISC-V (RV32) MCU augmented with ML instruction extensions and a hardware neural processing unit...
Nano-sized unmanned aerial vehicles (UAVs) are ideal candidates for flying Internet-of-Things smart sensors to collect information in narrow spaces. This requires ultra-fast navigation under very tight memory/computation constraints. The PULP-Dronet convolutional neural network (CNN) enables autonomous navigation running aboard a nano-UAV at 19 frame/s, at the cost of a large memory footprint of 320 kB, and with drone control in complex scenarios hindered by the disjoint training of collision avoidance and steering capabilities. In...
One of the challenges for Tiny Machine Learning (tinyML) is keeping up with the evolution of models from Convolutional Neural Networks to Transformers. We address this by leveraging a heterogeneous architectural template coupling RISC-V processors with hardwired accelerators, supported by an automated deployment flow. We demonstrate an Attention-based model in a tinyML power envelope with an octa-core cluster coupled with an accelerator for quantized Attention. Our flow enables end-to-end 8-bit MobileBERT, achieving leading-edge...
With the rise of Embodied Foundation Models (EFMs), most notably Small Language Models (SLMs), adapting Transformers for edge applications has become a very active field of research. However, achieving end-to-end deployment of SLMs on microcontroller (MCU)-class chips without high-bandwidth off-chip main memory access is still an open challenge. In this paper, we demonstrate high-efficiency SLM deployment on a multicore RISC-V (RV32) MCU augmented with ML instruction extensions and a hardware neural processing unit (NPU)....
Nano-sized unmanned aerial vehicles (UAVs) are ideal candidates for flying Internet-of-Things smart sensors to collect information in narrow spaces. This requires ultra-fast navigation under very tight memory/computation constraints. The PULP-Dronet convolutional neural network (CNN) enables autonomous navigation running aboard a nano-UAV at 19 frame/s, at the cost of a large memory footprint of 320 kB, and with drone control in complex scenarios hindered by the disjoint training of collision avoidance and steering...
Emerging Artificial-Intelligence-enabled System-on-Chips (AI-SoCs) combine a flexible microcontroller with parallel Digital Signal Processors (DSP) and heterogeneous acceleration capabilities. In this Work-in-Progress paper, we focus on the GAP9 RISC-V SoC as a case study to show how the open-source DORY Deep Neural Network (DNN) tool flow can be extended for heterogeneous acceleration by fine-grained interleaving of a dedicated Engine and the cluster cores. Our results show that up to 91% of the peak accelerator throughput can be extracted in end-to-end...