Vladimir Lončar

ORCID: 0000-0003-3651-0232
Research Areas
  • Particle Detector Development and Performance
  • Particle physics theoretical and experimental studies
  • Advanced Neural Network Applications
  • Parallel Computing and Optimization Techniques
  • Radiation Detection and Scintillator Technologies
  • Computational Physics and Python Applications
  • Medical Imaging Techniques and Applications
  • Neural Networks and Applications
  • Adversarial Robustness in Machine Learning
  • Model Reduction and Neural Networks
  • Advanced Data Storage Technologies
  • Cold Atom Physics and Bose-Einstein Condensates
  • Machine Learning and Data Classification
  • CCD and CMOS Imaging Sensors
  • Numerical Methods and Algorithms
  • Topic Modeling
  • Distributed and Parallel Computing Systems
  • Physics of Superconductivity and Magnetism
  • Particle accelerators and beam dynamics
  • Radiation Effects in Electronics
  • Advanced Database Systems and Queries
  • Atomic and Subatomic Physics Research
  • Superconducting Materials and Applications
  • Laser-Plasma Interactions and Diagnostics
  • Scientific Computing and Data Management

European Organization for Nuclear Research
2019-2025

Massachusetts Institute of Technology
2023-2025

University of Michigan
2025

Rensselaer Polytechnic Institute
2025

Brookhaven National Laboratory
2025

Oak Ridge National Laboratory
2025

Central China Normal University
2025

Los Alamos National Laboratory
2025

New Jersey Institute of Technology
2025

Georgia Institute of Technology
2025

Abstract Compact symbolic expressions have been shown to be more efficient than neural network (NN) models in terms of resource consumption and inference speed when implemented on custom hardware such as field-programmable gate arrays (FPGAs), while maintaining comparable accuracy (Tsoi et al 2024 EPJ Web Conf. 295 09036). These capabilities are highly valuable in environments with stringent computational constraints, such as high-energy physics experiments at the CERN Large Hadron Collider. However,...

10.1088/2632-2153/adaad8 article EN cc-by Machine Learning Science and Technology 2025-01-15

We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on FPGAs. By extending the hls4ml library, we demonstrate an inference latency of $5\,\mu$s using convolutional architectures, targeting microsecond-latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods of model compression in order to fit the computational constraints of a typical FPGA device used in trigger and...

10.1088/2632-2153/ac0ea1 article EN cc-by Machine Learning Science and Technology 2021-06-25
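
The conversion flow described above can be illustrated with a short, hypothetical sketch: a toy Keras CNN is passed through hls4ml's configuration and converter utilities to produce an HLS project. The model shape, FPGA part number, and output directory are illustrative assumptions, not the benchmark setup from the paper.

```python
# Hypothetical sketch: converting a small Keras CNN to an HLS project with hls4ml.
# Model architecture, part number, and output directory are illustrative assumptions.
import hls4ml
from tensorflow.keras import layers, models

# A toy convolutional classifier standing in for an SVHN-style benchmark model.
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(16, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation='softmax'),
])

# Derive an hls4ml configuration from the Keras model and convert it.
config = hls4ml.utils.config_from_keras_model(model, granularity='model')
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='hls4ml_cnn_prj',   # illustrative path
    part='xcvu9p-flga2104-2L-e',   # illustrative FPGA part
)
hls_model.compile()                # builds a C-simulation library for quick validation
# hls_model.build(csim=False)      # full HLS synthesis (long-running; commented out)
```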

We present the implementation of binary and ternary neural networks in the hls4ml library, designed to automatically convert deep neural network models into digital circuits with FPGA firmware. Starting from benchmark models trained with floating point precision, we investigate different strategies to reduce the network's resource consumption by reducing the numerical precision of the network parameters to binary or ternary. We discuss the trade-off between model accuracy and resource consumption. In addition, we show how to balance latency and accuracy by retaining full precision on a selected subset...

10.1088/2632-2153/aba042 article EN cc-by Machine Learning Science and Technology 2020-06-26
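
As a hedged illustration of the binarization/ternarization idea, the sketch below builds a small quantized MLP with QKeras, whose binary and ternary quantizers hls4ml can ingest. The layer widths and the mixed binary/ternary choice are assumptions for demonstration only, not the benchmark models of the paper.

```python
# Hypothetical sketch: a binary/ternary MLP built with QKeras, the kind of
# quantized model convertible to firmware with hls4ml. Sizes are illustrative.
from tensorflow.keras import layers, models
from qkeras import QDense, QActivation, binary, ternary

model = models.Sequential([
    layers.Input(shape=(16,)),
    QDense(64, kernel_quantizer=binary(), bias_quantizer=binary(), name='fc1'),
    QActivation(activation=binary(), name='act1'),
    QDense(32, kernel_quantizer=ternary(), bias_quantizer=ternary(), name='fc2'),
    QActivation(activation=ternary(), name='act2'),
    layers.Dense(5, activation='softmax', name='output'),  # output kept at full precision
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.summary()
```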

Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than 1 $\mu\mathrm{s}$ on an FPGA. To do so, we consider a representative...

10.3389/fdata.2020.598927 article EN cc-by Frontiers in Big Data 2021-01-12
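
The distance-weighted idea can be illustrated with a minimal NumPy sketch (not the paper's actual layer, e.g. GarNet): node features are aggregated with weights that decay with pairwise distance in a learned coordinate space. All shapes and the Gaussian weighting are illustrative assumptions.

```python
# Illustrative sketch of distance-weighted aggregation, the core idea behind
# distance-weighted graph networks. Shapes and weighting choice are assumptions.
import numpy as np

def distance_weighted_aggregate(features, coords):
    """Aggregate node features weighted by a Gaussian of pairwise distance.

    features: (N, F) array of node features
    coords:   (N, D) array of (learned) coordinates per node
    returns:  (N, F) array of aggregated features
    """
    diff = coords[:, None, :] - coords[None, :, :]     # (N, N, D) pairwise differences
    dist2 = np.sum(diff ** 2, axis=-1)                 # squared pairwise distances
    weights = np.exp(-dist2)                           # closer nodes contribute more
    weights /= weights.sum(axis=1, keepdims=True)      # normalize per receiving node
    return weights @ features                          # weighted sum of neighbor features

# Example: 8 detector hits with 4 features and 2-D learned coordinates.
rng = np.random.default_rng(0)
agg = distance_weighted_aggregate(rng.normal(size=(8, 4)), rng.normal(size=(8, 2)))
print(agg.shape)  # (8, 4)
```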

Despite advances in the programmable logic capabilities of modern trigger systems, a significant bottleneck remains in the amount of data to be transported from the detector to off-detector logic, where trigger decisions are made. We demonstrate that a neural network autoencoder model can be implemented in a radiation-tolerant ASIC to perform lossy data compression, alleviating the data transmission problem while preserving critical information of the energy profile. For our application, we consider the high-granularity calorimeter from the CMS experiment at the CERN Large...

10.1109/tns.2021.3087100 article EN IEEE Transactions on Nuclear Science 2021-06-08
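
A minimal sketch of the autoencoder-based lossy compression concept is shown below, assuming a simple dense encoder/decoder pair; the input and latent sizes are illustrative and do not reflect the actual on-ASIC architecture.

```python
# Hypothetical sketch: a small dense autoencoder performing lossy compression of a
# calorimeter-like energy profile. The 48-input / 16-code sizes are assumptions.
from tensorflow.keras import layers, models

n_inputs, n_code = 48, 16

encoder = models.Sequential([
    layers.Input(shape=(n_inputs,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(n_code, activation='relu', name='compressed'),
], name='encoder')

decoder = models.Sequential([
    layers.Input(shape=(n_code,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(n_inputs, activation='relu'),
], name='decoder')

autoencoder = models.Sequential([encoder, decoder])
autoencoder.compile(optimizer='adam', loss='mse')
# Only the trained encoder would run on-detector; decoding happens off-detector.
```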

The high-energy physics community is investigating the potential of deploying machine-learning-based solutions on Field-Programmable Gate Arrays (FPGAs) to enhance physics sensitivity while still meeting data processing time constraints. In this contribution, we introduce a novel end-to-end procedure that utilizes a machine learning technique called symbolic regression (SR), which searches the equation space to discover algebraic relations approximating a dataset. We use PySR (a software to uncover these expressions...

10.1051/epjconf/202429509036 article EN cc-by EPJ Web of Conferences 2024-01-01
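
A minimal sketch of the PySR workflow mentioned above, assuming a toy target function and a small operator set; neither corresponds to the physics use case in the paper.

```python
# Hypothetical sketch: symbolic regression with PySR, recovering a closed-form
# expression from sampled data. Target function and operators are illustrative.
import numpy as np
from pysr import PySRRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2        # "unknown" relation to rediscover

model = PySRRegressor(
    niterations=40,
    binary_operators=['+', '-', '*'],
    unary_operators=['sin'],
)
model.fit(X, y)
print(model)   # best expressions found, ranked by accuracy/complexity trade-off
```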

We develop and study FPGA implementations of algorithms for charged particle tracking based on graph neural networks. The two complementary designs are based on OpenCL, a framework for writing programs that execute across heterogeneous platforms, and hls4ml, a high-level-synthesis-based compiler for neural network to firmware conversion. We evaluate and compare the resource usage, latency, and tracking performance of our implementations on a benchmark dataset. We find that a considerable speedup over CPU-based execution is possible, potentially enabling such algorithms to be used...

10.48550/arxiv.2012.01563 preprint EN cc-by arXiv (Cornell University) 2020-01-01

This R&D project, initiated by the DOE Nuclear Physics AI-Machine Learning initiative in 2022, leverages AI to address data processing challenges in high-energy nuclear experiments (RHIC, LHC, and the future EIC). Our focus is on developing a demonstrator for real-time processing of high-rate data streams from the sPHENIX experiment tracking detectors. The limitation of the 15 kHz maximum trigger rate imposed by the calorimeters can be negated by intelligent use of streaming technology in the tracking system. The approach efficiently identifies low...

10.22323/1.476.1033 article EN cc-by-nc-nd 2025-01-07

This R&D project, initiated by the DOE Nuclear Physics AI-Machine Learning initiative in 2022, leverages AI to address data processing challenges in high-energy nuclear experiments (RHIC, LHC, and the future EIC). Our focus is on developing a demonstrator for real-time processing of high-rate data streams from the sPHENIX experiment tracking detectors. The limitation of the 15 kHz maximum trigger rate imposed by the calorimeters can be negated by intelligent use of streaming technology in the tracking system. The approach efficiently identifies low...

10.48550/arxiv.2501.04845 preprint EN arXiv (Cornell University) 2025-01-08

In high-energy physics, the increasing luminosity and detector granularity at the Large Hadron Collider are driving the need for more efficient data processing solutions. Machine Learning has emerged as a promising tool for reconstructing charged particle tracks, due to its potentially linear computational scaling with the number of hits. The recent implementation of a graph neural network-based track reconstruction pipeline in the first level trigger of the LHCb experiment on GPUs serves as a platform for comparative studies between...

10.48550/arxiv.2502.02304 preprint EN arXiv (Cornell University) 2025-02-04

As machine learning (ML) increasingly serves as a tool for addressing real-time challenges in scientific applications, the development of advanced tooling has significantly reduced the time required to iterate on various designs. These advancements have solved major obstacles but have also exposed new challenges. For example, processes that were not previously considered bottlenecks, such as model synthesis, are now becoming limiting factors in rapid iteration. To reduce these emerging constraints,...

10.1145/3706628.3708827 article EN 2025-02-26

Abstract This study presents an efficient implementation of transformer architectures in Field-Programmable Gate Arrays (FPGAs) using hls4ml. We demonstrate a strategy for implementing the multi-head attention, softmax, and normalization layers and evaluate three distinct models. Their deployment on a VU13P FPGA chip achieved a latency of less than 2 μs, demonstrating the potential for real-time applications. hls4ml's compatibility with any TensorFlow-built model further enhances the scalability and applicability of this work.

10.1088/1748-0221/20/04/p04014 article EN cc-by Journal of Instrumentation 2025-04-01
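
As a hedged sketch, the block below assembles a minimal Keras transformer encoder (multi-head attention, normalization, feed-forward) of the general kind the study deploys; the dimensions, head count, and regression head are illustrative assumptions.

```python
# Hypothetical sketch: a minimal transformer encoder block in Keras. Sequence
# length, model width, and head count are illustrative assumptions.
from tensorflow.keras import layers, models

seq_len, d_model, n_heads = 8, 16, 2

inputs = layers.Input(shape=(seq_len, d_model))
attn = layers.MultiHeadAttention(num_heads=n_heads, key_dim=d_model // n_heads)(inputs, inputs)
x = layers.LayerNormalization()(inputs + attn)       # residual + normalization
ff = layers.Dense(d_model, activation='relu')(x)     # feed-forward sub-block
x = layers.LayerNormalization()(x + ff)              # second residual + normalization
outputs = layers.Dense(1)(layers.Flatten()(x))       # illustrative regression head
model = models.Model(inputs, outputs)
model.summary()
# A trained model of this form could then be passed through the hls4ml conversion
# flow sketched earlier (config_from_keras_model + convert_from_keras_model).
```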

This paper presents novel reconfigurable architectures for reducing the latency of recurrent neural networks (RNNs) that are used for detecting gravitational waves. Gravitational-wave interferometers such as the LIGO detectors capture cosmic events such as black hole mergers, which happen at unknown times and with varying durations, producing time-series data. We have developed a new architecture capable of accelerating RNN inference for analyzing time-series data from the LIGO detectors. This architecture is based on optimizing the initiation intervals (II) in...

10.1109/asap52443.2021.00025 preprint EN 2021-07-01

Abstract In this paper, we investigate how field programmable gate arrays can serve as hardware accelerators for real-time semantic segmentation tasks relevant to autonomous driving. Considering compressed versions of the ENet convolutional neural network architecture, we demonstrate a fully-on-chip deployment with a latency of 4.9 ms per image, using less than 30% of the available resources on a Xilinx ZCU102 evaluation board. The latency is reduced to 3 ms per image when increasing the batch size to ten, corresponding to the use case...

10.1088/2632-2153/ac9cb5 article EN cc-by Machine Learning Science and Technology 2022-10-21

We describe the implementation of Boosted Decision Trees in the hls4ml library, which allows the translation of a trained model into FPGA firmware through an automated conversion process. Thanks to its fully on-chip implementation, hls4ml performs inference of Boosted Decision Tree models with extremely low latency. With a typical latency of less than 100 ns, this solution is suitable for FPGA-based real-time processing, such as in the Level-1 Trigger system of a collider experiment. These developments open up prospects for physicists to deploy BDTs...

10.1088/1748-0221/15/05/p05026 article EN cc-by Journal of Instrumentation 2020-05-29
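
For illustration, the sketch below trains a small scikit-learn BDT of the kind such a library translates into firmware; the toy dataset and hyperparameters are assumptions, and the automated firmware conversion step itself is not shown.

```python
# Hypothetical sketch: training a small BDT with scikit-learn. Dataset and
# hyperparameters are illustrative; firmware conversion is handled separately
# by the library's automated flow and is not reproduced here.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))                    # 8 toy input features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)     # toy binary classification target

bdt = GradientBoostingClassifier(n_estimators=50, max_depth=3)
bdt.fit(X, y)
print('training accuracy:', bdt.score(X, y))
```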

Abstract Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers—long short-term memory and gated recurrent unit—within the hls4ml framework. We demonstrate that our implementation is capable of producing designs for both small and large...

10.1088/2632-2153/acc0d7 article EN cc-by Machine Learning Science and Technology 2023-03-02
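
A minimal sketch, assuming small Keras LSTM and GRU models, of the recurrent architectures this implementation targets; the sequence length and layer widths are illustrative.

```python
# Hypothetical sketch: small LSTM and GRU models of the kind supported by the
# hls4ml recurrent-layer implementation. Sequence length and widths are assumptions.
from tensorflow.keras import layers, models

seq_len, n_features = 20, 6

lstm_model = models.Sequential([
    layers.Input(shape=(seq_len, n_features)),
    layers.LSTM(32),
    layers.Dense(1, activation='sigmoid'),
])

gru_model = models.Sequential([
    layers.Input(shape=(seq_len, n_features)),
    layers.GRU(32),
    layers.Dense(1, activation='sigmoid'),
])
# Either model could then go through the hls4ml conversion flow sketched earlier.
```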