- Particle Detector Development and Performance
- Particle Physics Theoretical and Experimental Studies
- Advanced Neural Network Applications
- Parallel Computing and Optimization Techniques
- Radiation Detection and Scintillator Technologies
- Computational Physics and Python Applications
- Medical Imaging Techniques and Applications
- Neural Networks and Applications
- Adversarial Robustness in Machine Learning
- Model Reduction and Neural Networks
- Advanced Data Storage Technologies
- Cold Atom Physics and Bose-Einstein Condensates
- Machine Learning and Data Classification
- CCD and CMOS Imaging Sensors
- Numerical Methods and Algorithms
- Topic Modeling
- Distributed and Parallel Computing Systems
- Physics of Superconductivity and Magnetism
- Particle Accelerators and Beam Dynamics
- Radiation Effects in Electronics
- Advanced Database Systems and Queries
- Atomic and Subatomic Physics Research
- Superconducting Materials and Applications
- Laser-Plasma Interactions and Diagnostics
- Scientific Computing and Data Management
European Organization for Nuclear Research
2019-2025
Massachusetts Institute of Technology
2023-2025
University of Michigan
2025
Rensselaer Polytechnic Institute
2025
Brookhaven National Laboratory
2025
Oak Ridge National Laboratory
2025
Central China Normal University
2025
Los Alamos National Laboratory
2025
New Jersey Institute of Technology
2025
Georgia Institute of Technology
2025
Compact symbolic expressions have been shown to be more efficient than neural network (NN) models in terms of resource consumption and inference speed when implemented on custom hardware such as field-programmable gate arrays (FPGAs), while maintaining comparable accuracy (Tsoi et al 2024 EPJ Web Conf. 295 09036). These capabilities are highly valuable in environments with stringent computational constraints, such as high-energy physics experiments at the CERN Large Hadron Collider. However, ...
We introduce an automated tool for deploying ultra-low-latency, low-power deep neural networks with convolutional layers on FPGAs. By extending the hls4ml library, we demonstrate an inference latency of $5\,\mu$s using convolutional architectures, targeting microsecond-latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers dataset, we apply various methods of model compression in order to fit the computational constraints of a typical FPGA device used in the trigger and ...
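One standard compression step before FPGA deployment is magnitude pruning: zeroing the smallest weights so that the corresponding multipliers vanish from the synthesized firmware. A minimal sketch in plain Python (the sparsity target and weight values are illustrative, not taken from the paper):

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.
    Pruned (zero) weights cost no multipliers once the network
    is synthesized to FPGA firmware."""
    n_zero = int(len(weights) * sparsity)
    # Indices of the n_zero smallest-magnitude weights.
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:n_zero]
    drop = set(smallest)
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

# Example: prune half of a toy weight vector.
pruned = prune_by_magnitude([0.5, -0.1, 0.9, 0.05], sparsity=0.5)
```

In practice pruning is applied iteratively during training so the remaining weights can compensate; this one-shot version only shows the selection rule.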
We present the implementation of binary and ternary neural networks in the hls4ml library, designed to automatically convert deep neural network models into digital circuits with FPGA firmware. Starting from benchmark models trained with floating-point precision, we investigate different strategies to reduce the network's resource consumption by reducing the numerical precision of the network parameters to binary or ternary. We discuss the trade-off between model accuracy and resource consumption. In addition, we show how to balance latency and accuracy while retaining full performance on a selected subset ...
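The core of ternarization is mapping each floating-point weight to {-1, 0, +1}. A minimal sketch, assuming the common threshold heuristic $\Delta = 0.7 \cdot \mathrm{mean}(|w|)$ from the ternary-weight-network literature (the paper's exact scheme may differ):

```python
def ternarize(weights, delta_scale=0.7):
    """Quantize float weights to {-1, 0, +1}.
    Weights below the threshold delta become 0; the rest keep
    only their sign, so each multiply reduces to add/subtract/skip."""
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = delta_scale * mean_abs
    return [0 if abs(w) < delta else (1 if w > 0 else -1) for w in weights]

# Example: small weights collapse to zero, large ones to their sign.
q = ternarize([0.9, -0.04, 0.5, -0.7, 0.02])
```

Binary networks are the special case with no zero level, keeping only the sign, which removes multipliers entirely from the firmware.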
Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than 1$\mu\mathrm{s}$ on an FPGA. To do so, we consider a representative ...
Despite advances in the programmable logic capabilities of modern trigger systems, a significant bottleneck remains in the amount of data to be transported from the detector to off-detector logic, where trigger decisions are made. We demonstrate that a neural network autoencoder model can be implemented in a radiation-tolerant ASIC to perform lossy data compression, alleviating the data transmission problem while preserving critical information of the shower energy profile. For our application, we consider the high-granularity calorimeter of the CMS experiment at the CERN Large ...
The high-energy physics community is investigating the potential of deploying machine-learning-based solutions on Field-Programmable Gate Arrays (FPGAs) to enhance physics sensitivity while still meeting data processing time constraints. In this contribution, we introduce a novel end-to-end procedure that utilizes a machine learning technique called symbolic regression (SR), which searches the equation space to discover algebraic relations approximating a dataset. We use PySR (a software to uncover these expressions ...
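The essence of symbolic regression is scoring candidate closed-form expressions against data and keeping the best. PySR does this with an evolutionary search over a rich operator grammar; the toy sketch below only illustrates the scoring-and-selection idea over a hand-written candidate list (the dataset and candidates are invented for illustration):

```python
def fit_symbolic(xs, ys, candidates):
    """Tiny symbolic-regression-style search: score each candidate
    (name, function) pair by mean squared error, return the best.
    Real SR tools like PySR evolve expressions instead of enumerating."""
    def mse(f):
        return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return min(candidates, key=lambda c: mse(c[1]))

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 5.0, 10.0]          # generated by y = x**2 + 1
candidates = [
    ("x",         lambda x: x),
    ("2*x + 1",   lambda x: 2 * x + 1),
    ("x**2 + 1",  lambda x: x ** 2 + 1),
]
best_name, best_fn = fit_symbolic(xs, ys, candidates)
```

The recovered expression, unlike a neural network, maps directly to a handful of fixed-point multipliers and adders in HLS, which is what makes SR attractive for FPGA triggers.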
We develop and study FPGA implementations of algorithms for charged particle tracking based on graph neural networks. The two complementary designs are based on OpenCL, a framework for writing programs that execute across heterogeneous platforms, and hls4ml, a high-level-synthesis-based compiler for neural network to firmware conversion. We evaluate and compare the resource usage, latency, and tracking performance of our implementations on a benchmark dataset. We find that a considerable speedup over CPU-based execution is possible, potentially enabling such algorithms to be used ...
This R&D project, initiated by the DOE Nuclear Physics AI-Machine Learning initiative in 2022, leverages AI to address data processing challenges in high-energy nuclear physics experiments (RHIC, LHC, and the future EIC). Our focus is on developing a demonstrator for real-time processing of high-rate data streams from the sPHENIX experiment's tracking detectors. The limitation of the 15 kHz maximum trigger rate imposed by the calorimeters can be negated by intelligent use of a streaming technology system. Our approach efficiently identifies low ...
In high-energy physics, the increasing luminosity and detector granularity at the Large Hadron Collider are driving the need for more efficient data processing solutions. Machine learning has emerged as a promising tool for reconstructing charged particle tracks, due to its potentially linear computational scaling with the number of hits. The recent implementation of a graph neural network-based track reconstruction pipeline in the first-level trigger of the LHCb experiment on GPUs serves as a platform for comparative studies between ...
As machine learning (ML) increasingly serves as a tool for addressing real-time challenges in scientific applications, the development of advanced tooling has significantly reduced the time required to iterate on various designs. These advancements have solved major obstacles, but they have also exposed new challenges. For example, processes that were not previously considered bottlenecks, such as model synthesis, are now becoming limiting factors in rapid iteration. To reduce these emerging constraints, ...
This study presents an efficient implementation of transformer architectures in Field-Programmable Gate Arrays (FPGAs) using hls4ml. We demonstrate a strategy for implementing the multi-head attention, softmax, and normalization layers, and evaluate three distinct models. Their deployment on a VU13P FPGA chip achieved a latency of less than 2 μs, demonstrating the potential for real-time applications. hls4ml's compatibility with any TensorFlow-built model further enhances the scalability and applicability of this work.
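The softmax inside attention is one of the layers the abstract singles out, because its exponential and division are awkward in fixed-point logic. The floating-point reference form, shown below with the standard max-subtraction for numerical stability, is what hardware implementations typically approximate with lookup tables (this is a generic sketch, not the paper's HLS design):

```python
import math

def softmax(xs):
    """Numerically stable softmax: subtract the max before
    exponentiating so exp() never overflows. FPGA implementations
    usually replace exp and the final division with lookup tables."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Example: attention scores normalized to a probability distribution.
probs = softmax([1.0, 2.0, 3.0])
```

In multi-head attention this normalization is applied row-wise to each head's scaled dot-product score matrix before the weighted sum over values.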
This paper presents novel reconfigurable architectures for reducing the latency of recurrent neural networks (RNNs) that are used for detecting gravitational waves. Gravitational-wave interferometers such as the LIGO detectors capture cosmic events such as black hole mergers, which happen at unknown times and with varying durations, producing time-series data. We have developed a new architecture capable of accelerating RNN inference for analyzing data from the detectors. It is based on optimizing the initiation intervals (II) in ...
In this paper, we investigate how field-programmable gate arrays can serve as hardware accelerators for real-time semantic segmentation tasks relevant to autonomous driving. Considering compressed versions of the ENet convolutional neural network architecture, we demonstrate a fully-on-chip deployment with a latency of 4.9 ms per image, using less than 30% of the available resources on a Xilinx ZCU102 evaluation board. The latency is reduced to 3 ms per image when increasing the batch size to ten, corresponding to the use case ...
We describe the implementation of boosted decision trees (BDTs) in the hls4ml library, which allows the translation of a trained model into FPGA firmware through an automated conversion process. Thanks to its fully on-chip implementation, hls4ml performs the inference of BDT models with extremely low latency. With a typical latency of less than 100 ns, this solution is suitable for FPGA-based real-time processing, such as in the Level-1 trigger system of a collider experiment. These developments open up prospects for physicists to deploy BDTs ...
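The reason BDT inference reaches sub-100 ns latencies is structural: each tree unrolls into a fixed cascade of threshold comparisons, every tree evaluates in parallel, and the ensemble output is a sum. A sketch of that lowered form (all thresholds and leaf scores here are invented for illustration, not from a trained model):

```python
def tree_score(x):
    """One decision tree unrolled into nested comparisons -- the shape
    HLS synthesizes into parallel comparators with a fixed depth."""
    if x[0] < 0.5:
        return 0.2 if x[1] < 1.0 else -0.1
    else:
        return 0.7 if x[1] < 2.0 else 0.4

def bdt_score(x, trees):
    """A boosted ensemble is the sum of per-tree scores; in firmware
    the trees run concurrently and only the adder tree is shared."""
    return sum(t(x) for t in trees)

# Example: a two-tree ensemble reusing the same toy tree.
score = bdt_score([0.4, 0.5], [tree_score, tree_score])
```

Because there are no multiplications at all, the design maps to comparators and adders only, which is why the latency is dominated by routing rather than arithmetic.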
Recurrent neural networks have been shown to be effective architectures for many tasks in high-energy physics, and have thus been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers, long short-term memory and gated recurrent unit, within the hls4ml framework. We demonstrate that our implementation is capable of producing designs for both small and large ...
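What makes recurrent layers hard on FPGAs is the sequential data dependence: each step's hidden state feeds the next, so the gate arithmetic below must complete before the following timestep can start. A scalar (1-d input, 1-d hidden state) GRU step as a reference, with all weights passed in explicitly (the parameter names and values are illustrative, not hls4ml's internal layout):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    """One scalar GRU update. p holds weights/biases for the update
    gate z, reset gate r, and the candidate hidden state."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])        # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])        # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])
    return (1.0 - z) * h + z * h_cand                       # blend old and new

# Example: with all-zero parameters, z = 0.5 and h_cand = 0,
# so the state simply halves each step.
zeros = {k: 0.0 for k in ("wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh")}
h_next = gru_step(1.0, 2.0, zeros)
```

In an HLS implementation the sigmoid and tanh become fixed-point lookup tables, and the achievable initiation interval is set by this state-to-state dependency chain.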