- Particle physics theoretical and experimental studies
- Quantum Chromodynamics and Particle Interactions
- High-Energy Particle Collisions Research
- Neutrino Physics Research
- Particle Detector Development and Performance
- Computational Physics and Python Applications
- Black Holes and Theoretical Physics
- Dark Matter and Cosmic Phenomena
- Particle Accelerators and Free-Electron Lasers
- Medical Imaging Techniques and Applications
- Superconducting Materials and Applications
- Nuclear physics research studies
- Distributed and Parallel Computing Systems
- Atomic and Subatomic Physics Research
- Stochastic processes and statistical mechanics
- Adversarial Robustness in Machine Learning
- Advanced Data Storage Technologies
- International Science and Diplomacy
- Algorithms and Data Compression
- Advanced Neural Network Applications
- Parallel Computing and Optimization Techniques
- Markov Chains and Monte Carlo Methods
- Radiation Detection and Scintillator Technologies
- Robotics and Sensor-Based Localization
- Congenital limb and hand anomalies
Massachusetts Institute of Technology
2021-2025
University of Cincinnati
2023
TU Dortmund University
2019-2022
European Organization for Nuclear Research
2018-2021
The NSF AI Institute for Artificial Intelligence and Fundamental Interactions
2021
Otto-von-Guericke University Magdeburg
2018
Inflammation plays an important role in the pathogenesis of ischemic stroke, including both acute and prolonged inflammatory processes. The role of neutrophil granulocytes as the first drivers of the immune reaction from the blood side is under debate due to controversial findings. In bone marrow chimeric mice we were able to study the dynamics of tdTomato-expressing neutrophils and GFP-expressing microglia after photothrombosis using intravital two-photon microscopy. We demonstrate neutrophil infiltration into the brain parenchyma and confirm a...
We aim to understand grokking, a phenomenon where models generalize long after overfitting their training set. We present both a microscopic analysis anchored by an effective theory and a macroscopic analysis of phase diagrams describing learning performance across hyperparameters. We find that generalization originates from structured representations, whose dynamics and dependence on training set size can be predicted by our effective theory in a toy setting. We observe empirically the presence of four learning phases: comprehension, memorization, confusion....
The LHCb experiment at CERN is undergoing an upgrade in preparation for the Run 3 data-taking period of the LHC. As part of this upgrade, the trigger is moving to a fully software implementation operating at the LHC bunch-crossing rate. We present an evaluation of a CPU-based and a GPU-based implementation of the first stage of the High Level Trigger. After a detailed comparison, both options are found to be viable. This document summarizes the performance and implementation details of these options, the outcome of which has led to the choice of the GPU-based implementation as the baseline.
The Lipschitz constant of the map between input and output space represented by a neural network is a natural metric for assessing the robustness of a model. We present a new method to constrain the Lipschitz constant of dense deep learning models that can also be generalized to other architectures. The method relies on a simple weight normalization scheme during training that ensures the Lipschitz constant of every layer is below an upper limit specified by the analyst. A monotonic residual connection can then be used to make the model monotonic in any subset of its inputs, which is useful in scenarios where...
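The weight-normalization idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it bounds each layer's Lipschitz constant (with respect to the max norm) by rescaling weights whose infinity operator norm exceeds a user-chosen limit, so the network's overall constant is bounded by the product of the per-layer limits.

```python
import numpy as np

def normalize_weights(W, limit):
    """Rescale W only if its infinity operator norm exceeds `limit`.

    The infinity norm (max absolute row sum) upper-bounds the Lipschitz
    constant of x -> W @ x with respect to the max norm.
    """
    norm = np.abs(W).sum(axis=1).max()
    return W * (limit / norm) if norm > limit else W

# A toy two-layer network: with a 1-Lipschitz activation (ReLU), its
# Lipschitz constant is bounded by the product of the per-layer limits.
rng = np.random.default_rng(0)
W1 = normalize_weights(rng.normal(size=(16, 8)), limit=2.0)
W2 = normalize_weights(rng.normal(size=(1, 16)), limit=0.5)

def net(x):
    return W2 @ np.maximum(W1 @ x, 0.0)

# Empirical check: |f(x) - f(y)| <= 2.0 * 0.5 * max|x - y|
x, y = rng.normal(size=8), rng.normal(size=8)
lip = abs(net(x) - net(y))[0] / np.abs(x - y).max()
```

In a real training loop the rescaling would be applied after every optimizer step (or folded into the forward pass), rather than once at initialization.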
Learning with Errors (LWE) is a hard math problem underlying recently standardized post-quantum cryptography (PQC) systems for key exchange and digital signatures. Prior work proposed new machine learning (ML)-based attacks on LWE problems with small, sparse secrets, but these attacks require millions of LWE samples to train on and take days to recover secrets. We propose three key methods -- better preprocessing, angular embeddings and model pre-training -- to improve these attacks, speeding up preprocessing by $25\times$ and improving...
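For readers unfamiliar with the problem setup, the following toy sketch generates LWE samples with a small, sparse binary secret, the regime targeted by these ML-based attacks. Dimensions and error distribution are illustrative only, far below cryptographic sizes.

```python
import numpy as np

rng = np.random.default_rng(1)
q, n, m = 3329, 32, 64   # toy modulus and dimensions (not cryptographic)
h = 4                    # Hamming weight of the sparse binary secret

# Sparse binary secret: h ones at random positions.
s = np.zeros(n, dtype=np.int64)
s[rng.choice(n, size=h, replace=False)] = 1

A = rng.integers(0, q, size=(m, n))   # uniform public matrix
e = rng.integers(-2, 3, size=m)       # small centered error
b = (A @ s + e) % q                   # LWE samples are the pairs (A, b)

# Recovering s from (A, b) without e is the LWE problem; the residual
# for the true secret is just the small error term (mod q).
residual = (b - A @ s) % q
```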
The monotonic dependence of the outputs of a neural network on some of its inputs is a crucial inductive bias in many scenarios where domain knowledge dictates such behavior. This is especially important for interpretability and fairness considerations. In a broader context, scenarios in which monotonicity matters can be found in finance, medicine, physics, and other disciplines. It is thus desirable to build network architectures that implement this inductive bias provably. In this work, we propose a weight-constrained architecture with a single residual connection...
A novel neural architecture was recently developed that enforces an exact upper bound on the Lipschitz constant of the model by constraining the norm of its weights in a minimal way, resulting in higher expressiveness compared to other techniques. We present a new and interesting direction for this architecture: estimation of the Wasserstein metric (Earth Mover's Distance) in optimal transport by employing the Kantorovich-Rubinstein duality to enable its use in geometric fitting applications. Specifically, we focus on the field of high-energy...
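For intuition: the Kantorovich-Rubinstein duality states $W_1(P,Q) = \sup_{\|f\|_L \le 1} \mathbb{E}_P[f] - \mathbb{E}_Q[f]$, which is what a 1-Lipschitz-constrained network maximizes. In one dimension the optimum is known in closed form, so a Lipschitz critic can be sanity-checked against sorted samples. A small sketch (my own baseline, not the paper's code):

```python
import numpy as np

def wasserstein_1d(p_samples, q_samples):
    """Exact 1-Wasserstein distance between equal-size 1D empirical samples.

    In one dimension the optimal transport plan matches order statistics,
    so W1 reduces to the mean absolute difference of the sorted samples.
    A Lipschitz-constrained critic trained via the Kantorovich-Rubinstein
    dual should approximate this value from the samples alone.
    """
    return np.mean(np.abs(np.sort(p_samples) - np.sort(q_samples)))
```

For example, moving a point mass from 0 to 1 costs exactly 1, and the distance is invariant under a common shift of both samples.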
The upgraded LHCb detector, due to start data taking in 2022, will have to process an average data rate of 4 TB/s in real time. Because LHCb's physics objectives require that the full detector information for every LHC bunch crossing is read out and made available for real-time processing, this bandwidth challenge is equivalent to the ATLAS and CMS HL-LHC software read-out, but deliverable five years earlier. Over the past six years, the collaboration has undertaken a bottom-up rewrite of its software infrastructure, pattern...
Sparse binary LWE secrets are under consideration for standardization for Homomorphic Encryption and its applications to private computation. Known attacks on sparse binary secrets include the dual attack and the hybrid dual-meet-in-the-middle attack, which requires significant memory. In this paper, we provide a new statistical attack with a low memory requirement. The attack relies on some initial lattice reduction. The key observation is that, after lattice reduction is applied to the rows of a q-ary-like embedded random matrix $\mathbf{A}$, the entries with high variance...
The task of identifying B meson flavor at the primary interaction point in the LHCb detector is crucial for measurements of mixing and time-dependent CP violation. Flavour tagging is usually done with a small number of expert systems that find important tracks to infer the flavour from. Recent advances show that replacing all of those systems with one ML algorithm that considers the whole event yields an increase in tagging power. However, training the current classifier takes a long time and it is not suitable for use in real-time triggers. In this work we present a new...
The operating conditions defining the current data taking campaign at the Large Hadron Collider, known as Run 3, present unparalleled challenges for the real-time data acquisition workflow of the LHCb experiment at CERN. To address the anticipated surge in luminosity and the consequent event rate, the experiment is transitioning to a fully software-based trigger system. This evolution has necessitated innovations in hardware configurations, software paradigms, and algorithmic design. A significant advancement is the integration of monotonic Lipschitz...
Memory Mosaics are networks of associative memories working in concert to achieve a prediction task of interest. Like transformers, memory mosaics possess compositional capabilities and in-context learning capabilities. Unlike transformers, they possess these capabilities in comparatively transparent ways. We demonstrate these capabilities on toy examples, and we also show that memory mosaics perform as well as or better than transformers on medium-scale language modeling tasks.
We pursue the use of deep learning methods to improve state-of-the-art computations in theoretical high-energy physics. Planar N = 4 Super Yang-Mills theory is a close cousin of the theory that describes Higgs boson production at the Large Hadron Collider; its scattering amplitudes are large mathematical expressions containing integer coefficients. In this paper, we apply Transformers to predict these coefficients. The problem can be formulated in a language-like representation amenable to standard cross-entropy training objectives....
Mechanistic Interpretability (MI) promises a path toward fully understanding how neural networks make their predictions. Prior work demonstrates that even when trained to perform simple arithmetic, models can implement a variety of algorithms (sometimes concurrently) depending on initialization and hyperparameters. Does this mean neuron-level interpretability techniques have limited applicability? We argue that high-dimensional neural networks can learn low-dimensional representations of their training data that are useful beyond...
Today's best language models still struggle with hallucinations: factually incorrect generations, which impede their ability to reliably retrieve information seen during training. The reversal curse, where models cannot recall information when probed in a different order than was encountered during training, exemplifies this limitation in retrieval. We reframe the reversal curse as a factorization curse - a failure of models to learn the same joint distribution under different factorizations. Through a series of controlled experiments with increasing levels of realism, including...
Large language models (LLMs) with long context windows have gained significant attention. However, the KV cache, stored to avoid re-computation, becomes a bottleneck. Various dynamic sparse or TopK-based attention approximation methods have been proposed to leverage the common insight that attention is sparse. In this paper, we first show that TopK attention itself suffers from quality degradation in certain downstream tasks because attention is not always as sparse as expected. Rather than selecting the keys and values with the highest attention scores, sampling...
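The TopK approximation being criticized above can be made concrete in a few lines. This toy sketch (single query, no batching or masking) keeps only the k highest-scoring keys; when the attention distribution is not actually sparse, the dropped probability mass biases the output, which is the failure mode the abstract describes.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def topk_attention(q, K, V, k):
    """Approximate attention using only the k keys with the highest scores."""
    scores = K @ q
    idx = np.argsort(scores)[-k:]       # indices of the top-k scores
    w = softmax(scores[idx])            # renormalize over the kept keys
    return w @ V[idx]

rng = np.random.default_rng(3)
K = rng.normal(size=(128, 16))
V = rng.normal(size=(128, 16))
q = rng.normal(size=16)

full = softmax(K @ q) @ V               # exact attention output
approx = topk_attention(q, K, V, k=16)  # biased when attention isn't sparse
err = np.linalg.norm(full - approx)
```

With k equal to the number of keys the approximation is exact; the error grows as k shrinks relative to the effective support of the attention distribution.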
Despite their remarkable success in language modeling, transformers trained to predict the next token in a sequence struggle with long-term planning. This limitation is particularly evident in tasks requiring foresight to plan multiple steps ahead, such as maze navigation. The standard single next-token prediction objective, however, offers no explicit mechanism to look ahead - or to revisit the path taken so far. Consequently, in this work we study whether explicitly predicting multiple steps ahead (and backwards) can improve transformers' maze navigation. We train...
The physics programme of the LHCb experiment at the Large Hadron Collider requires an efficient and precise reconstruction of particle collision vertices. The upgraded detector relies on a fully software-based trigger with an online rate of 30 MHz, necessitating fast vertex finding algorithms. This paper describes a new approach developed for this purpose. The algorithm is based on cluster finding within a histogram of trajectory projections along the beamline, followed by an adaptive vertex fit. Its implementations and optimisations on x86 and GPU architectures...
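The histogram-clustering step can be illustrated with a toy version (my own simplification; the production algorithm adds an adaptive vertex fit and many detector-specific details): project tracks onto the beamline, histogram the z positions, and treat runs of bins above a count threshold as vertex candidates.

```python
import numpy as np

def find_vertices(z_points, bin_width=0.5, threshold=5):
    """Seed primary-vertex candidates from a 1D histogram along the beamline.

    Bins with at least `threshold` entries seed candidates; adjacent hot
    bins are merged and each cluster position is the mean of its entries.
    """
    edges = np.arange(z_points.min(), z_points.max() + bin_width, bin_width)
    counts, edges = np.histogram(z_points, bins=edges)
    vertices, cluster = [], []
    for i, hot in enumerate(counts >= threshold):
        if hot:
            cluster.append(i)
        elif cluster:
            sel = (z_points >= edges[cluster[0]]) & (z_points < edges[cluster[-1] + 1])
            vertices.append(z_points[sel].mean())
            cluster = []
    if cluster:  # close a cluster that runs to the last bin
        sel = (z_points >= edges[cluster[0]]) & (z_points < edges[cluster[-1] + 1])
        vertices.append(z_points[sel].mean())
    return vertices

# Two simulated interaction regions along z (arbitrary units).
rng = np.random.default_rng(4)
z = np.concatenate([rng.normal(0.0, 0.1, 40), rng.normal(30.0, 0.1, 25)])
pvs = find_vertices(z)
```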
During Run 3 of the LHC, the LHCb detector will process a 30 MHz event rate with full readout followed by a software trigger. To deal with the increased computational requirements, the software framework is being reviewed and optimized on a large scale. One challenge is the efficient scheduling of O(10^3)-O(10^4) algorithms in the High Level Trigger (HLT) application. This document describes the design of a new algorithm scheduler which allows for static-order intra-event scheduling with minimum complexity while still providing the required flexibility.
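The static-order idea can be sketched abstractly (algorithm names below are illustrative, not actual LHCb HLT algorithms): topologically sort the dependency graph once, then process every event in that fixed order, skipping algorithms masked off by control flow. This keeps per-event scheduling overhead minimal while respecting data dependencies.

```python
from graphlib import TopologicalSorter

# Toy dependency graph: each algorithm maps to the algorithms it consumes.
deps = {
    "decode_velo": set(),
    "velo_tracking": {"decode_velo"},
    "decode_ut": set(),
    "ut_tracking": {"velo_tracking", "decode_ut"},
    "pv_finder": {"velo_tracking"},
    "trigger_line": {"ut_tracking", "pv_finder"},
}

# Compute one static execution order up front, before any event is seen.
static_order = list(TopologicalSorter(deps).static_order())

def process_event(active):
    """Run algorithms in the fixed order; skip inactive or blocked ones."""
    executed = []
    for alg in static_order:
        if alg in active and all(d in executed for d in deps[alg]):
            executed.append(alg)
    return executed
```

Because the order is fixed, per-event work is a single linear pass; the control-flow flexibility comes entirely from the `active` mask.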