Francesco Conti

ORCID: 0000-0002-7924-933X
Research Areas
  • Advanced Neural Network Applications
  • Advanced Memory and Neural Computing
  • Parallel Computing and Optimization Techniques
  • CCD and CMOS Imaging Sensors
  • Ferroelectric and Negative Capacitance Devices
  • Interconnection Networks and Systems
  • Embedded Systems Design Techniques
  • Robotics and Sensor-Based Localization
  • Machine Learning and ELM
  • Neural Networks and Applications
  • Low-power high-performance VLSI design
  • UAV Applications and Optimization
  • Advanced Image and Video Retrieval Techniques
  • Domain Adaptation and Few-Shot Learning
  • Brain Tumor Detection and Classification
  • EEG and Brain-Computer Interfaces
  • Radiation Effects in Electronics
  • Robotic Path Planning Algorithms
  • Distributed and Parallel Computing Systems
  • Advancements in Semiconductor Devices and Circuit Design
  • IoT and Edge/Fog Computing
  • Multimodal Machine Learning Applications
  • 3D IC and TSV technologies
  • Semiconductor materials and devices
  • Muscle activation and electromyography studies

University of Bologna
2016-2025

ETH Zurich
2016-2023

National University of Singapore
2023

Sapienza University of Rome
2019-2023

Innovation Cluster (Canada)
2023

Laboratori Guglielmo Marconi (Italy)
2023

Marconi University
2023

Dalle Molle Institute for Artificial Intelligence Research
2023

University of Applied Sciences and Arts of Southern Switzerland
2023

University of Catania
2023

Current ultra-low power smart sensing edge devices, operating for years on small batteries, are limited to low-bandwidth sensors, such as temperature or pressure. Enabling the next generation of devices process data from richer sensors image, video, audio, multi-axial motion/vibration has huge application potential. However, processing data-rich poses extreme challenge squeezing computational requirements advanced, machine-learning-based near-sensor analysis algorithms (such Convolutional...

10.1109/asap.2018.8445101 article EN 2018-07-01

The deployment of Deep Neural Networks (DNNs) on end-nodes at the extreme edge Internet-of-Things is a critical enabler to support pervasive Learning-enhanced applications. Low-Cost MCU-based have limited on-chip memory and often replace caches with scratchpads, reduce area overheads increase energy efficiency -- requiring explicit DMA-based transfers between different levels hierarchy. Mapping modern DNNs these systems requires aggressive topology-dependent tiling double-buffering. In this...

10.1109/tc.2021.3066883 article EN IEEE Transactions on Computers 2021-03-18

Achieving a power envelope of few milliwatts combined with tight performance constraints is emerging as one the key challenges for battery-powered and low cost Internet-of-things (IoT) end-nodes. IoT devices have to cope highly time-varying workloads, characterized by intermittent "race-to-sleep" bursts compute-intensive operations mingled long periods activity. Architectural heterogeneity provides possible solution harmonize these competing constraints; availability diverse cores optimized...

10.1109/patmos.2017.8106976 article EN 2017-09-01

Hand movement classification via surface electromyographic (sEMG) signal is a well-established approach for advanced Human-Computer Interaction. However, sEMG recognition has to deal with the long-term reliability of sEMG-based control, limited by variability affecting signal. Embedded solutions are affected accuracy drop over time that makes them unsuitable reliable gesture controller design. In this paper, we present complete wearable-class embedded system robust recognition, based on...

10.1109/tbcas.2019.2959160 article EN IEEE Transactions on Biomedical Circuits and Systems 2019-12-13

Binary Neural Networks (BNNs) are promising to deliver accuracy comparable conventional deep neural networks at a fraction of the cost in terms memory and energy. In this paper, we introduce XNOR Engine (XNE), fully digital configurable hardware accelerator IP for BNNs, integrated within microcontroller unit (MCU) equipped with an autonomous I/O subsystem hybrid SRAM / standard cell memory. The XNE is able compute convolutional dense layers autonomy or cooperation core MCU realize more...

10.1109/tcad.2018.2857019 article EN IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 2018-07-18

Fully-autonomous miniaturized robots (e.g., drones), with artificial intelligence (AI) based visual navigation capabilities are extremely challenging drivers of Internet-of-Things edge capabilities. Visual on AI approaches, such as deep neural networks (DNNs) becoming pervasive for standard-size drones, but considered out reach nanodrones size a few cm². In this work, we present the first (to best our knowledge) demonstration engine autonomous nano-drones capable closed-loop...

10.1109/jiot.2019.2917066 article EN IEEE Internet of Things Journal 2019-05-15

The Internet-of-Things requires end-nodes with ultra-low-power always-on capability for a long battery lifetime, as well high performance, energy efficiency, and extreme flexibility to deal complex fast-evolving near-sensor analytics algorithms (NSAAs). We present Vega, an IoT end-node SoC capable of scaling from 1.7 μW fully retentive cognitive sleep mode up 32.2 GOPS (@ 49.4 mW) peak performance on NSAAs, including mobile DNN inference, exploiting 1.6 MB state-retentive SRAM,...

10.1109/jssc.2021.3114881 article EN IEEE Journal of Solid-State Circuits 2021-10-07

In the last few years, research and development on Deep Learning models techniques for ultra-low-power devices in a word, TinyML has mainly focused train-then-deploy assumption, with static that cannot be adapted to newly collected data without cloud-based collection fine-tuning. Latent Replay-based Continual (CL) techniques[1] enable online, serverless adaptation principle, but so far they have still been too computation memory-hungry devices, which are typically based microcontrollers. this...

10.1109/jetcas.2021.3121554 article EN publisher-specific-oa IEEE Journal on Emerging and Selected Topics in Circuits and Systems 2021-10-20

State-of-art brain-inspired computer vision algorithms such as Convolutional Neural Networks (CNNs) are reaching accuracy and performance rivaling that of humans; however, the gap in terms energy consumption is still many degrees magnitude wide. Many-core architectures using shared-memory clusters power-optimized RISC processors have been proposed a possible solution to help close this gap. In work, we propose augment these with Hardware Convolution Engines (HWCEs): ultra-low coprocessors...

10.5555/2755753.2755910 article EN Design, Automation, and Test in Europe 2015-03-09

We present PULP-NN, an optimized computing library for a parallel ultra-low-power tightly coupled cluster of RISC-V processors. The key innovation in PULP-NN is set kernels quantized neural network inference, targeting byte and sub-byte data types, down to INT-1, tuned the recent trend toward aggressive quantization deep inference. proposed exploits both digital signal processing extensions available PULP processors cluster’s parallelism, achieving up 15.5 MACs/cycle on INT-8 improving...

10.1098/rsta.2019.0155 article EN Philosophical Transactions of the Royal Society A Mathematical Physical and Engineering Sciences 2019-12-23

Near-sensor data analytics is a promising direction for internet-of-things endpoints, as it minimizes energy spent on communication and reduces network load - but also poses security concerns, valuable are stored or sent over the at various stages of pipeline. Using encryption to protect sensitive boundary on-chip engine way address issues. To cope with combined workload in tight power envelope, we propose Fulmine, system-on-chip (SoC) based tightly-coupled multi-core cluster augmented...

10.1109/tcsi.2017.2698019 article EN IEEE Transactions on Circuits and Systems I Regular Papers 2017-05-15

Deep convolutional neural networks (CNNs) obtain outstanding results in tasks that require human-level understanding of data, like image or speech recognition. However, their computational load is significant, motivating the development CNN-specialized accelerators. This work presents NEURAghe, a flexible and efficient hardware/software solution for acceleration CNNs on Zynq SoCs. leverages synergistic usage ARM cores powerful Convolution-Specific Processor deployed reconfigurable logic....

10.1145/3284357 article EN ACM Transactions on Reconfigurable Technology and Systems 2018-09-30

The End-Nodes of the Internet Things (IoT) require extreme energy efficiency coupled with wide power-performance operating range. Fully-depleted SOI (FD-SOI) is an attractive technology for ultra-low power and wide-range operation as it offers compelling options to tune power, performance, area (PPA) at design time well run time. This paper describes Quentin: MCU-class (32bit) open-source RISC-V SoC featuring autonomous I/O subsystem optimized deal variety sensors available in IoT end-nodes,...

10.1109/s3s.2018.8640145 article EN 2018-10-01

Many emerging applications of nano-sized unmanned aerial vehicles (UAVs), with a few cm² form-factor, revolve around safely interacting humans in complex scenarios, for example, monitoring their activities or looking after people needing care. Such sophisticated autonomous functionality must be achieved while dealing severe constraints payload, battery, and power budget (~100mW). In this work, we attack task going from perception to control: estimate maintain the...

10.1109/jiot.2021.3091643 article EN IEEE Internet of Things Journal 2021-06-22

Recent trends in deep learning (DL) imposed hardware accelerators as the most viable solution for several classes of high-performance computing (HPC) applications such image classification, computer vision, and speech recognition. This survey summarizes classifies recent advances designing DL suitable to reach performance requirements HPC applications. In particular, it highlights advanced approaches support accelerations including not only GPU TPU-based but also design-specific FPGA-based...

10.48550/arxiv.2306.15552 preprint EN cc-by-nc-nd arXiv (Cornell University) 2023-01-01

An exhaustive comparison among different spatial interpolation algorithms was carried out in order to derive annual and monthly air temperature maps for Sicily (Italy). Deterministic, data-driven geostatistics were used, some cases adding the elevation information other physiographic variables improve performance of techniques reconstruction field. The dataset is given by data coming from 84 stations spread around island Sicily. optimized using a subset available dataset, while remaining...

10.3390/w7051866 article EN Water 2015-04-27

State-of-art brain-inspired computer vision algorithms such as Convolutional Neural Networks (CNNs) are reaching accuracy and performance rivaling that of humans; however, the gap in terms energy consumption is still many degrees magnitude wide. Many-core architectures using shared-memory clusters power-optimized RISC processors have been proposed a possible solution to help close this gap. In work, we propose augment these with Hardware Convolution Engines (HWCEs): ultra-low coprocessors...

10.7873/date.2015.0404 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015 2015-01-01

Nano-size unmanned aerial vehicles (UAVs), with few centimeters of diameter and sub-10 Watts total power budget, have so far been considered incapable running sophisticated visual-based autonomous navigation software without external aid from base-stations, ad-hoc local positioning infrastructure, powerful computation servers. In this work, we present what is, to the best our knowledge, first 27g nano-UAV system able run aboard an end-to-end, closed-loop visual pipeline for based on a...

10.1109/dcoss.2019.00111 preprint EN 2019-05-01

Strongly quantized fixed-point arithmetic is considered the key direction to enable inference of CNNs on low-power, resource-constrained edge devices. However, deployment highly Neural Networks at extreme IoT, fully programmable MCUs, currently limited by lack support, Instruction Set Architecture (ISA) level, for sub-byte data types, making it necessary add numerous instructions packing and unpacking when running low-bitwidth (i.e. 2- 4-bit) QNN kernels, creating a bottleneck performance...

10.23919/date48585.2020.9116529 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015 2020-03-01

The Internet-of-Things requires end-nodes with ultra-low-power always-on capability for long battery lifetime, as well high performance, energy efficiency, and extreme flexibility to deal complex fast-evolving near-sensor analytics algorithms (NSAAs). We present Vega, an IoT end-node SoC capable of scaling from a 1.7 μW fully retentive COGNITIVE sleep mode up 32.2GOPS (@49.4mW) peak performance on NSAAs, including mobile DNN inference, exploiting 1.6MB state-retentive SRAM, 4MB non-volatile MRAM. To...

10.1109/isscc42613.2021.9365939 article EN 2022 IEEE International Solid-State Circuits Conference (ISSCC) 2021-02-13

Deployment of modern TinyML tasks on small battery-constrained IoT devices requires high computational energy efficiency. Analog In-Memory Computing (IMC) using non-volatile memory (NVM) promises major efficiency improvements in deep neural network (DNN) inference and serves as on-chip storage for DNN weights. However, IMC's functional flexibility limitations their impact performance, energy, area are not yet fully understood at the system level. To target practical end-to-end applications,...

10.1109/jetcas.2022.3170152 article EN IEEE Journal on Emerging and Selected Topics in Circuits and Systems 2022-04-28

Autonomous drone racing competitions are a proxy to improve unmanned aerial vehicles' perception, planning, and control skills. The recent emergence of autonomous nano-sized imposes new challenges, as their ~10 cm form factor heavily restricts the resources...

10.1109/lra.2024.3349814 article EN IEEE Robotics and Automation Letters 2024-01-04