NFDI4DS | UHH-SEMS - Publication Details

Florian Kelber

ORCID: 0000-0001-7663-5211

About

Contact & Profiles

A5036095997

Research Areas

Advanced Memory and Neural Computing
Ferroelectric and Negative Capacitance Devices
Neural dynamics and brain function
Neural Networks and Reservoir Computing
Neural Networks and Applications
Advanced SAR Imaging Techniques
Neuroscience and Neural Engineering
Digital Transformation in Industry
Physical Unclonable Functions (PUFs) and Hardware Security
IoT and Edge/Fog Computing
Modular Robots and Swarm Intelligence
Radar Systems and Signal Processing
Flow Measurement and Analysis
Anomaly Detection Techniques and Applications
Radio Frequency Integrated Circuit Design
Time Series Analysis and Forecasting
Infrared Target Detection Methodologies
Low-power high-performance VLSI design
CCD and CMOS Imaging Sensors
Digital Filter Design and Implementation
Numerical Methods and Algorithms

Technische Universität Dresden
2020-2025

The SpiNNaker 2 Processing Element Architecture for Hybrid Digital Neuromorphic Computing

OPENALEX - Publications

Sebastian Höppner, Yexin Yan, Andreas Dixius, Stefan Scholze, Johannes Partzsch, and 13 more

This paper introduces the processing element architecture of the second-generation SpiNNaker chip, implemented in 22nm FDSOI. On circuit level, the chip features adaptive body biasing for near-threshold operation, and dynamic voltage-and-frequency scaling driven by spiking activity. The system is centered around an ARM M4 core, similar to the processor-centric first SpiNNaker. To speed up the operation of subtasks, we have added accelerators for the numerical operations of both spiking (SNN) and rate-based (deep) neural networks (DNN). PEs...

10.48550/arxiv.2103.08392 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Comparing Loihi with a SpiNNaker 2 prototype on low-latency keyword spotting and adaptive robotic control

Yexin Yan, Terrence C. Stewart, Xuan Choo, Bernhard Vogginger, Johannes Partzsch, and 5 more

We implemented two neural network based benchmark tasks on a prototype chip of the second-generation SpiNNaker (SpiNNaker 2) neuromorphic system: keyword spotting and adaptive robotic control. Keyword spotting is commonly used in smart speakers to listen for wake words, and adaptive control is used in robotic applications to adapt to unknown dynamics in an online fashion. We highlight the benefit of a multiply-accumulate (MAC) array in the SpiNNaker 2 prototype, which is ordinarily used in rate-based machine learning networks, when employed in a neuromorphic, spiking context. In...

10.1088/2634-4386/abf150 article EN cc-by Neuromorphic Computing and Engineering 2021-03-24

A 16-Channel Fully Configurable Neural SoC With 1.52 μW/Ch Signal Acquisition, 2.79 μW/Ch Real-Time Spike Classifier, and 1.79 TOPS/W Deep Neural Network Accelerator in 22 nm FDSOI

Seyed Mohammad Ali Zeinolabedin, Franz Marcus Schüffny, Richard George, Florian Kelber, Heiner Bauer, and 8 more

With the advent of high-density micro-electrode arrays, developing neural probes satisfying real-time and stringent power-efficiency requirements becomes more challenging. A smart probe is an essential device in future neuroscientific research and medical applications. To realize such devices, we present a 22 nm FDSOI SoC with complex on-chip data processing and training for signal analysis. It consists of a digitally-assisted 16-channel analog front-end with 1.52 μW/Ch, dedicated bio-processing accelerators...

10.1109/tbcas.2022.3142987 article EN IEEE Transactions on Biomedical Circuits and Systems 2022-01-13

SpiNNaker2: A Large-Scale Neuromorphic System for Event-Based and Asynchronous Machine Learning

Hector A. Gonzalez, Jiaxin Huang, Florian Kelber, Khaleelulla Khan Nazeer, Tim Langer, and 9 more

The joint progress of artificial neural networks (ANNs) and domain-specific hardware accelerators such as GPUs and TPUs took over many domains of machine learning research. This development is accompanied by a rapid growth of the required computational demands for larger models and more data. Concurrently, emerging properties of foundation models such as in-context learning drive new opportunities for machine learning applications. However, the computational cost of such applications is a limiting factor of the technology in data centers, and more importantly in mobile devices and edge systems. To mediate...

10.48550/arxiv.2401.04491 preprint EN other-oa arXiv (Cornell University) 2024-01-01

NLU: An Adaptive, Small-Footprint, Low-Power Neural Learning Unit for Edge and IoT Applications

Ali Rostami, Seyed Mohammad Ali Zeinolabedin, Liyuan Guo, Florian Kelber, Heiner Bauer, and 7 more

10.1109/ojcas.2025.3546067 article EN cc-by IEEE Open Journal of Circuits and Systems 2025-01-01

Hardware Implementation of an OPC UA Server for Industrial Field Devices

Heiner Bauer, Sebastian Höppner, Chris Paul Iatrou, Zohra Charania, Stephan Hartmann, and 11 more

Industrial plants suffer from a high degree of complexity and incompatibility in their communication infrastructure, caused by a wild mix of proprietary technologies. This prevents the transformation towards Industry 4.0 and the Internet of Things. Open Platform Communications Unified Architecture (OPC UA) is a standardized protocol that addresses these problems with uniform and semantic communication across all levels of the hierarchy. However, its adoption in embedded field devices, such as sensors and actors, is still lacking due to...

10.1109/tvlsi.2021.3117401 article EN IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2021-10-14

Real-time Radar Gesture Classification with Spiking Neural Network on SpiNNaker 2 Prototype

Jiaxin Huang, Bernhard Vogginger, Pascal Gerhards, Felix Kreutz, Florian Kelber, and 3 more

Neuromorphic hardware has been emerging in recent years, seeking various applications to explore its uniqueness, limitations, and possibilities. As a representative application research area, gesture recognition is gaining wider popularity, while the conflict between spiking neural network (SNN) size and the available memory of neuromorphic edge-AI can be a thorny issue, which is even intensified by the demand for continuously processing the input data stream from the sensor in a real-world scenario, since certain amount required...

10.1109/aicas54282.2022.9869987 article EN 2022 IEEE 4th International Conference on Artificial Intelligence Circuits and Systems (AICAS) 2022-06-13

Mapping Deep Neural Networks on SpiNNaker2

Florian Kelber, Binyi Wu, Bernhard Vogginger, Johannes Partzsch, Chen Liu, and 2 more

SpiNNaker is an efficient many-core architecture for the real-time simulation of spiking neural networks. To also speed up deep neural networks (DNNs), the 2nd generation SpiNNaker2 will contain dedicated DNN accelerators in each processing element. When realizing large CNNs on SpiNNaker2, layers have to be split, mapped and scheduled onto 144 processing elements. We describe the underlying mapping procedure with optimized data reuse to achieve inference of VGG-16 and ResNet-50 models in tens of milliseconds.

10.1145/3381755.3381778 article EN 2020-03-17
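
The abstract above notes that large CNN layers have to be split and mapped onto 144 processing elements. As a rough illustration only, and not the authors' mapping procedure, the following Python sketch divides the output channels of one layer as evenly as possible across a fixed number of PEs. The helper name `split_output_channels` and the choice to split along output channels are assumptions made for this example.

```python
def split_output_channels(num_channels: int, num_pes: int = 144) -> list[range]:
    """Assign contiguous output-channel ranges to processing elements (PEs).

    Hypothetical helper: spreads channels as evenly as possible and leaves
    surplus PEs unused when there are fewer channels than PEs.
    """
    if num_channels <= 0 or num_pes <= 0:
        raise ValueError("num_channels and num_pes must be positive")
    used = min(num_channels, num_pes)          # PEs that receive work
    base, extra = divmod(num_channels, used)   # first `extra` PEs get one more
    ranges, start = [], 0
    for pe in range(used):
        size = base + (1 if pe < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


# Example: a layer with 512 output channels mapped onto 144 PEs.
parts = split_output_channels(512)
print(len(parts), len(parts[0]), len(parts[-1]))
```

A real mapping would also have to account for memory limits, data reuse, and scheduling, which this sketch deliberately ignores.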

Efficient Algorithms for Accelerating Spiking Neural Networks on MAC Array of SpiNNaker 2

Jiaxin Huang, Florian Kelber, Bernhard Vogginger, Binyi Wu, Felix Kreutz, and 4 more

The CPU-based system is widely used for simulating brain-inspired spiking neural networks (SNN) by taking the benefit of flexibility, while processing high input rates caused by an immature coding mechanism costs many CPU cycles, and the introduction of additional information required by serial execution needs a time-consuming pre- and post-neuron matching algorithm. To address these issues, we propose an algorithm set leveraging the multiply-accumulate (MAC) array to accelerate SNN inference. By rearranging...

10.1109/aicas57966.2023.10168559 article EN 2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS) 2023-06-11

Ultra-High Compression of Twiddle Factor ROMs in Multi-Core DSP for FMCW Radars

Hector A. Gonzalez, Florian Kelber, Marco Stolba, Chen Liu, Bernhard Vogginger, and 4 more

The increasing density of Multiple-Input Multiple-Output (MIMO) arrays in imaging radars for the automotive industry demands highly parallel systems with low-footprint accelerators, which would enable the concurrent processing of a high number of virtual channels with low latency, and without area overhead. In this paper, we design, implement, and test multiple handcrafted compression schemes for Twiddle Factor (TF) Read-Only Memories (ROM), aiming to reduce the footprint of variable-length dual-radix Fast Fourier...

10.1109/iscas51556.2021.9401547 article EN 2021 IEEE International Symposium on Circuits and Systems (ISCAS) 2021-04-27
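
The abstract above concerns compressing twiddle factor ROMs for FFTs. The paper's handcrafted schemes are not described in this listing, so the Python sketch below shows only the well-known quarter-wave symmetry trick as a generic illustration: it stores cosine values for the first quarter of the unit circle and reconstructs every twiddle factor from them. The function names `build_quarter_table` and `twiddle` are assumptions for this example.

```python
import cmath
import math


def build_quarter_table(n: int) -> list[float]:
    """Store cos(2*pi*i/n) for i = 0 .. n/4 only (n must be a multiple of 4)."""
    if n <= 0 or n % 4 != 0:
        raise ValueError("n must be a positive multiple of 4")
    return [math.cos(2.0 * math.pi * i / n) for i in range(n // 4 + 1)]


def twiddle(table: list[float], n: int, k: int) -> complex:
    """Reconstruct W_n^k = exp(-2j*pi*k/n) from the quarter-wave table."""
    quarter = n // 4
    q, r = divmod(k % n, quarter)          # quadrant index and residual step
    c, s = table[r], table[quarter - r]    # cos and sin of the residual angle
    if q == 0:
        cos_v, sin_v = c, s
    elif q == 1:
        cos_v, sin_v = -s, c
    elif q == 2:
        cos_v, sin_v = -c, -s
    else:
        cos_v, sin_v = s, -c
    return complex(cos_v, -sin_v)


# Example: a 64-point FFT needs 17 stored reals instead of 64 complex values.
n = 64
table = build_quarter_table(n)
max_err = max(abs(twiddle(table, n, k) - cmath.exp(-2j * math.pi * k / n))
              for k in range(n))
print(len(table), max_err < 1e-12)
```

In hardware the quadrant selection would typically reduce to a few address and sign bits; the floating-point table here stands in for a fixed-point ROM.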

Fast Switching Serial and Parallel Paradigms of SNN Inference on Multi-core Heterogeneous Neuromorphic Platform SpiNNaker2

Jiaxin Huang, Bernhard Vogginger, Florian Kelber, Hector A. Gonzalez, Klaus Knobloch, and 1 more

As serial and parallel processors are introduced into Spiking Neural Network (SNN) execution, more and more researchers are dedicated to improving the performance of the computing paradigms by taking full advantage of the strengths of the available processor. In this paper, we compare and integrate serial and parallel paradigms into one SNN compiling system. For faster switching between them at layer granularity, we train a classifier to prejudge the better paradigm before compiling instead of making the decision afterwards, saving a great amount of compiling time and RAM space on the host PC. The...

10.48550/arxiv.2406.17049 preprint EN arXiv (Cornell University) 2024-06-24

CA-CFAR is Convolution: Fast Target Detection with Machine Learning Accelerator

Chen Liu, Florian Kelber, Bernhard Vogginger, Christian Mayr

10.1109/meco62516.2024.10577789 article EN 2024 13th Mediterranean Conference on Embedded Computing (MECO) 2024-06-11
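
No abstract is listed for this entry, but the title's claim can be illustrated with textbook cell-averaging CFAR: the noise estimate for each cell is a weighted sum over neighbouring training cells, which is a convolution with a fixed kernel. The Python sketch below is a generic 1-D example under that assumption, not the paper's accelerator mapping. The function name `ca_cfar`, its parameters, and the zero-padded edges are choices made for this example.

```python
def ca_cfar(power: list[float], num_train: int, num_guard: int,
            scale: float) -> list[bool]:
    """1-D cell-averaging CFAR written as a convolution with a fixed kernel.

    Kernel layout: [train | guard | cell under test | guard | train], with
    weight 1/(2*num_train) on training cells and 0 elsewhere. Edges are
    zero-padded, which is an assumption of this sketch.
    """
    if num_train < 1 or num_guard < 0:
        raise ValueError("need num_train >= 1 and num_guard >= 0")
    half = num_train + num_guard
    kernel = [0.0] * (2 * half + 1)
    for i in range(num_train):
        kernel[i] = kernel[-1 - i] = 1.0 / (2 * num_train)

    detections = []
    for n in range(len(power)):
        # The kernel is symmetric, so correlation and convolution coincide.
        noise = 0.0
        for j, w in enumerate(kernel):
            idx = n + j - half
            if w and 0 <= idx < len(power):
                noise += w * power[idx]
        detections.append(power[n] > scale * noise)
    return detections


# Example: flat noise floor with a single strong target at index 10.
signal = [1.0] * 21
signal[10] = 20.0
hits = ca_cfar(signal, num_train=4, num_guard=1, scale=5.0)
print([i for i, hit in enumerate(hits) if hit])
```

Because the kernel is fixed, the inner loop is the same multiply-accumulate pattern as a convolution layer.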

Fast Switching Serial and Parallel Paradigms of SNN Inference on Multi-Core Heterogeneous Neuromorphic Platform SpiNNaker2

Jiaxin Huang, Bernhard Vogginger, Florian Kelber, Hector A. Gonzalez, Klaus Knobloch, and 1 more

10.1109/icons62911.2024.00025 article EN 2024-07-30

A 12-ADC 25-Core Smart MPSoC Using ABB in 22FDX for 77GHz MIMO Radars at 52.6mW Average Power

Hector A. Gonzalez, Bernhard Vogginger, Chen Liu, Marco Stolba, Florian Kelber, and 12 more

Industry leaders in automotive radars are moving towards highly dense MIMO radars (i.e., 4D radars), as they provide robust detection at a high angular resolution. However, these systems come at the expense of parallel processing requirements, higher off-chip communication data rates, and higher power consumption as a result of the denser arrays to process. To date, no work in the open literature addresses the low-power requirements of DSPs for such FMCW radars, their scalability, and on-chip Machine Learning (ML) in the context of those azimuth,...

10.1109/cicc57935.2023.10121258 article EN 2023 IEEE Custom Integrated Circuits Conference (CICC) 2023-04-01

Spiking Neural Network based Real-time Radar Gesture Recognition Live Demonstration

Jiaxin Huang, Pascal Gerhards, Felix Kreutz, Bernhard Vogginger, Florian Kelber, and 3 more

This live demo aims at continuous real-time classification of radar gesture signals from the real world with the neuromorphic hardware SpiNNaker 2 prototype to play a game. With a 10 MHz operation frequency on FPGA, the closed-loop setup realizes around 35 ms delay from the PC sending input data to receiving the classification output, and there is nearly no feeling of apparent delay when testers are playing the game. The energy cost per frame is 3.29 µJ, and the cycle cost is less than 8 k. Even if our current middleware has not considered balanced work...

10.1109/aicas54282.2022.9869943 article EN 2022 IEEE 4th International Conference on Artificial Intelligence Circuits and Systems (AICAS) 2022-06-13

Efficient SNN multi-cores MAC array acceleration on SpiNNaker 2

Jiaxin Huang, Florian Kelber, Bernhard Vogginger, Chen Liu, Felix Kreutz, and 4 more

The potential low-energy feature of the spiking neural network (SNN) engages the attention of the AI community. Only CPU-involved SNN processing inevitably results in an inherently long temporal span in the cases of large models and massive datasets. This study introduces the MAC array, a parallel architecture on each processing element (PE) of SpiNNaker 2, into the computational process of SNN inference. Based on the work on single-core optimization algorithms, we investigate acceleration algorithms for collaborating with multi-core MAC arrays. The proposed...

10.3389/fnins.2023.1223262 article EN cc-by Frontiers in Neuroscience 2023-08-07

Performance models and energy-optimal scheduling of DNNs on many-core hardware with dynamic power management

Bernhard Vogginger, Florian Kelber, Shambhavi Balamuthu Sampath, Johannes Partzsch, Christian Mayr

10.1145/3615338.3618127 article EN 2023-09-21

Hardware-Efficient Ultrasonic Entrance Counting: Comparing Different Machine Learning Approaches

Tim Langer, Bernd Waschneck, Johannes Partzsch, Florian Kelber, Christian Mayr

In this work, the classification of walking direction based on ultrasonic signals has been examined for entrance counting. Feed-forward and recurrent neural network architectures as well as simpler machine learning techniques have been investigated and compared with classical signal processing techniques. Using only a single receiver, the focus was set on the development of a hardware-efficient system concept. Different measurement methods in the time and frequency domain have been compared from the perspective of holistic energy optimization. The analysis...

10.1109/icpr56361.2022.9955643 article EN 2022 26th International Conference on Pattern Recognition (ICPR) 2022-08-21
