- Particle Physics Theoretical and Experimental Studies
- High-Energy Particle Collisions Research
- Distributed and Parallel Computing Systems
- Particle Detector Development and Performance
- Computational Physics and Python Applications
- Quantum Chromodynamics and Particle Interactions
- Gaussian Processes and Bayesian Inference
- Superconducting Materials and Applications
- Particle Accelerators and Free-Electron Lasers
- Scientific Computing and Data Management
- Particle Accelerators and Beam Dynamics
- Stochastic Gradient Optimization Techniques
- Computer Graphics and Visualization Techniques
- Generative Adversarial Networks and Image Synthesis
- Machine Learning and Data Classification
- Astrophysics and Cosmic Phenomena
- Quantum Computing Algorithms and Architecture
- Neural Networks and Applications
- Big Data and Business Intelligence
- Matrix Theory and Algorithms
- Anomaly Detection Techniques and Applications
- Machine Learning in Materials Science
- Neutrino Physics Research
- Quantum, Superfluid, Helium Dynamics
- Sparse and Compressive Sensing Techniques
Massachusetts Institute of Technology
2017-2023
The NSF AI Institute for Artificial Intelligence and Fundamental Interactions
2021-2022
Moscow Institute of Thermal Technology
2019-2022
Harvard University
2019-2020
Center for Theoretical Biological Physics
2019
Johns Hopkins University Applied Physics Laboratory
1968
Johns Hopkins University
1964
A key question for machine learning approaches in particle physics is how to best represent and learn from collider events. As an event is intrinsically a variable-length unordered set of particles, we build upon recent efforts to learn directly from sets of features or "point clouds". Adapting and specializing the "Deep Sets" framework to particle physics, we introduce Energy Flow Networks, which respect infrared and collinear safety by construction. We also develop Particle Flow Networks, which allow for general energy dependence and the inclusion of additional...
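The Deep Sets structure behind Energy Flow Networks can be sketched in a few lines of plain Python: a per-particle map weighted by energy fractions is summed into a fixed-size latent vector, then passed through an event-level function. The functions `phi` and `F` below are toy stand-ins, not the paper's trained networks; the sketch only illustrates why permutation invariance and infrared safety hold by construction.

```python
import math

def efn_score(particles, phi, F):
    """Energy Flow Network structure: F( sum_i z_i * phi(angles_i) ).
    particles: list of (z, y, a) with z the particle's energy fraction."""
    latent = [0.0, 0.0]
    for z, y, a in particles:
        emb = phi(y, a)
        latent = [l + z * v for l, v in zip(latent, emb)]  # energy-weighted sum pool
    return F(latent)

# Toy per-particle map and event-level function (illustrative only).
phi = lambda y, a: [math.tanh(y), math.tanh(a)]
F = lambda latent: sum(latent)

jet = [(0.5, 0.1, 0.2), (0.3, -0.2, 0.0), (0.2, 0.05, -0.1)]
s = efn_score(jet, phi, F)

# Permutation invariance: reordering the particles leaves the output unchanged.
assert abs(s - efn_score(list(reversed(jet)), phi, F)) < 1e-12
# Infrared safety by construction: a zero-energy particle has no effect.
assert abs(s - efn_score(jet + [(0.0, 9.9, 9.9)], phi, F)) < 1e-12
```

The energy weighting in the sum is what distinguishes an EFN from a generic Deep Sets model: because each particle enters linearly in its energy fraction, soft emissions decouple automatically.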
We introduce the energy flow polynomials: a complete set of jet substructure observables which form a discrete linear basis for all infrared- and collinear-safe observables. Energy flow polynomials are multiparticle energy correlators with specific angular structures that are a direct consequence of infrared and collinear safety. We establish a powerful graph-theoretic representation of the energy flow polynomials which allows us to design efficient algorithms for their computation. Many common jet observables are exact linear combinations of energy flow polynomials, and we demonstrate the linear spanning...
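In the graph-theoretic picture, each energy flow polynomial corresponds to a multigraph: vertices index particles (summed over, weighted by energy fractions) and each edge contributes a pairwise angular distance. A naive evaluator under these assumptions can be written directly from that correspondence; the quadratic angular measure below is a simplification for illustration.

```python
import itertools, math

def efp(particles, n_vertices, edges):
    """Naive O(M^n) evaluation of an energy flow polynomial for a graph.
    particles: list of (z, y, a); vertex i contributes z_i, and each
    edge (j, k) contributes the angular distance theta between the
    particles assigned to vertices j and k."""
    total = 0.0
    M = len(particles)
    for idx in itertools.product(range(M), repeat=n_vertices):
        term = 1.0
        for i in idx:
            term *= particles[i][0]                        # product of z_i
        for j, k in edges:
            p, q = particles[idx[j]], particles[idx[k]]
            term *= math.hypot(p[1] - q[1], p[2] - q[2])   # theta_{ij}
        total += term
    return total

jet = [(0.6, 0.0, 0.0), (0.3, 0.4, 0.1), (0.1, -0.2, 0.3)]
# The "dumbbell" graph (2 vertices, 1 edge): sum_{i,j} z_i z_j theta_ij.
val = efp(jet, 2, [(0, 1)])
# The edgeless 1-vertex graph is just sum_i z_i = 1 for normalized fractions.
assert abs(efp(jet, 1, []) - 1.0) < 1e-12
```

The paper's efficient algorithms exploit graph structure to beat this M^n scaling; the brute-force sum is only the defining formula made explicit.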
Artificial intelligence offers the potential to automate challenging data-processing tasks in collider physics. To establish its prospects, we explore to what extent deep learning with convolutional neural networks can discriminate quark and gluon jets better than observables designed by physicists. Our approach builds upon the paradigm that a jet can be treated as an image, with intensity given by the local calorimeter deposits. We supplement this construction by adding color to the images, with red, green, and blue intensities given by...
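The jet-image construction itself is simple to sketch: discretize the particles of a jet into a pixel grid and sum transverse momentum per pixel, giving one "color" channel of the image. The grid size and extent below are illustrative choices, not the paper's settings.

```python
def jet_image(particles, npix=9, extent=0.4):
    """Discretize particles (pt, y, a) into an npix x npix grid of summed pt,
    centered on (0, 0) in the rapidity-azimuth plane: one channel of a jet image."""
    img = [[0.0] * npix for _ in range(npix)]
    step = 2 * extent / npix
    for pt, y, a in particles:
        i = int((y + extent) / step)   # rapidity pixel index
        j = int((a + extent) / step)   # azimuth pixel index
        if 0 <= i < npix and 0 <= j < npix:
            img[i][j] += pt            # intensity = summed pt in this pixel
    return img

jet = [(120.0, 0.02, -0.01), (45.0, 0.1, 0.2), (8.0, -0.3, 0.05)]
img = jet_image(jet)
assert sum(map(sum, img)) == 173.0     # all three particles land in the grid
```

Stacking several such channels (e.g. separate grids for charged pt, neutral pt, and track counts) produces the multichannel "color" images the abstract refers to.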
Jets of hadrons produced at high-energy colliders provide experimental access to the dynamics of asymptotically free quarks and gluons and their confinement into hadrons. In this Letter, we show that the high energies of the Large Hadron Collider (LHC), together with the exceptional resolution of its detectors, allow multipoint correlation functions of energy flow operators to be directly measured within jets for the first time. Using Open Data from the CMS experiment, we show that reformulating jet substructure in terms of these correlators...
Collider data must be corrected for detector effects ("unfolded") to be compared with many theoretical calculations and measurements from other experiments. Unfolding is traditionally done on individual, binned observables without including all of the information relevant for characterizing the detector response. We introduce OmniFold, an unfolding method that iteratively reweights a simulated dataset, using machine learning to capitalize on all available information. Our approach is unbinned, works with arbitrarily...
Based on the established task of identifying boosted, hadronically decaying top quarks, we compare a wide range of modern machine learning approaches. Unlike most established methods, they rely on low-level input, for instance calorimeter output. While their network architectures are vastly different, their performance is comparatively similar. In general, we find that these new approaches are extremely powerful and great fun.
When are two collider events similar? Despite the simplicity and generality of this question, there is no established notion of the distance between two events. To address this, we develop a metric for the space of collider events based on the earth mover's distance: the "work" required to rearrange the radiation pattern of one event into another. We expose interesting connections between this metric and the structure of infrared- and collinear-safe observables, providing a novel technique to quantify event modifications due to hadronization, pileup, and detector effects. We showcase how this metrization...
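Real collider events live in the two-dimensional rapidity-azimuth plane and require an optimal-transport solver, but the "work" intuition behind the earth mover's distance is already visible in a one-dimensional toy: for two energy distributions of equal total energy on a common grid, the EMD is the integrated absolute difference of their cumulative distributions.

```python
def emd_1d(ev_a, ev_b, grid):
    """1D earth mover's distance between two energy distributions sampled on a
    common grid (equal total energy assumed): integral of |CDF_a - CDF_b|."""
    work, flow = 0.0, 0.0
    for i in range(len(grid) - 1):
        flow += ev_a[i] - ev_b[i]            # net energy carried past this edge
        work += abs(flow) * (grid[i + 1] - grid[i])
    return work

grid = [0.0, 1.0, 2.0, 3.0]
a = [1.0, 0.0, 0.0, 0.0]   # all energy at x = 0
b = [0.0, 0.0, 0.0, 1.0]   # all energy at x = 3
assert emd_1d(a, b, grid) == 3.0   # move one unit of energy a distance of 3
```

In two dimensions the transport plan is no longer unique and must be optimized, which is what dedicated EMD solvers handle in the full event-space metric.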
A persistent challenge in practical classification tasks is that labeled training sets are not always available. In particle physics, this challenge is often surmounted by the use of simulations. These simulations accurately reproduce most features of the data, but cannot be trusted to capture all of the complex correlations exploitable by modern machine learning methods. Recent work in weakly supervised learning has shown that simple, low-dimensional classifiers can be trained using only the impure mixtures present in data. Here, we demonstrate...
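The key fact enabling training on impure mixtures is that the likelihood ratio between two mixed samples is monotonically related to the signal/background likelihood ratio, so a classifier trained to separate the mixtures is also optimal for the pure classes. A minimal numeric sketch with hypothetical histograms:

```python
def mixture_likelihood_ratio(h_m1, h_m2):
    """Per-bin likelihood ratio between two *mixed* samples."""
    return [a / b for a, b in zip(h_m1, h_m2)]

# Pure (unobservable) signal and background histograms, and two mixtures
# with different, unknown signal fractions (toy numbers).
S = [0.7, 0.2, 0.1]
B = [0.1, 0.3, 0.6]
m1 = [0.8 * s + 0.2 * b for s, b in zip(S, B)]   # signal-enriched sample
m2 = [0.3 * s + 0.7 * b for s, b in zip(S, B)]   # background-enriched sample

mix_ratio = mixture_likelihood_ratio(m1, m2)
sb_ratio = [s / b for s, b in zip(S, B)]

# The mixture ratio orders the bins exactly as the true S/B ratio does,
# so thresholding on it gives the same classifier as thresholding on S/B.
order = lambda r: sorted(range(len(r)), key=lambda i: r[i])
assert order(mix_ratio) == order(sb_ratio)
```

Only the bin *ordering* is preserved, not the ratio values themselves, which is exactly what a classifier's decision boundary needs.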
While "quark" and "gluon" jets are often treated as separate, well-defined objects in both theoretical and experimental contexts, no precise, practical, hadron-level definition of jet flavor presently exists. To remedy this issue, we develop and advocate for a data-driven, operational definition of quark and gluon jets that is readily applicable at colliders. Rather than specifying a per-jet label, we aggregately define quark and gluon jets at the distribution level in terms of measured hadronic cross sections. Intuitively, quark and gluon jets emerge as the two maximally separable...
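The "maximally separable" construction can be made concrete with a two-bin toy: given two normalized mixture histograms, subtract as much of each from the other as possible (the subtraction amounts are the reducibility factors), and the remainders are the two topics. This is a sketch of the demixing step only, with made-up histograms; real analyses use many-bin substructure observables.

```python
def extract_topics(m1, m2):
    """Demix two normalized mixture histograms into the two maximally
    separable 'topics'. kappa(i|j) = min over bins of m_i/m_j."""
    k12 = min(a / b for a, b in zip(m1, m2))   # how much m2 fits inside m1
    k21 = min(b / a for a, b in zip(m1, m2))   # how much m1 fits inside m2
    t1 = [(a - k12 * b) / (1.0 - k12) for a, b in zip(m1, m2)]
    t2 = [(b - k21 * a) / (1.0 - k21) for a, b in zip(m1, m2)]
    return t1, t2

# Two samples with different (unknown) quark/gluon fractions.
m1 = [0.62, 0.38]   # quark-enriched histogram
m2 = [0.38, 0.62]   # gluon-enriched histogram
t1, t2 = extract_topics(m1, m2)

# Each topic has an "anchor" bin where the other topic vanishes,
# and both remain normalized.
assert abs(t1[1]) < 1e-9 and abs(t2[0]) < 1e-9
assert abs(sum(t1) - 1.0) < 1e-9
```

With only two bins the extracted topics saturate at maximally separable distributions; with finer binning and mutually irreducible underlying classes, they approach the true quark and gluon distributions.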
Pileup involves the contamination of the energy distribution arising from the primary collision of interest (leading vertex) by radiation from soft collisions (pileup). We develop a new technique for removing this contamination using machine learning and convolutional neural networks. The network takes as input the charged particles from the leading vertex, the charged pileup particles, and all neutral particles, and outputs the energy distribution coming from the leading vertex alone. The PUMML algorithm performs remarkably well at eliminating pileup distortion on a wide range of simple and complex jet observables. We test the robustness...
We explore the metric space of jets using public collider data from the CMS experiment. Starting from 2.3 fb^-1 of 7 TeV proton-proton collisions collected at the Large Hadron Collider in 2011, we isolate a sample of 1,690,984 central jets with transverse momentum above 375 GeV. To validate the performance of the detector in reconstructing the energy flow of jets, we compare the CMS Open Data to corresponding simulated samples for a variety of jet kinematic and substructure observables. Even without unfolding, we find very good agreement for track-based...
We establish that many fundamental concepts and techniques in quantum field theory and collider physics can be naturally understood and unified through a simple new geometric language. The idea is to equip the space of collider events with a metric, from which other geometric objects can be rigorously defined. Our analysis is based on the energy mover's distance, which quantifies the "work" required to rearrange one event into another. This metric, which operates purely at the level of observable energy flow information, allows for a clarified definition of infrared...
We study quark and gluon jets separately using public collider data from the CMS experiment. Our analysis is based on 2.3 fb^-1 of proton-proton collisions at √s = 7 TeV, collected at the Large Hadron Collider in 2011. We define two nonoverlapping samples via a pseudorapidity cut: central jets with |η| ≤ 0.65 and forward jets with |η| > 0.65, and employ jet topic modeling to extract individual distributions...
Multiparticle correlators are mathematical objects frequently encountered in quantum field theory and collider physics. By translating multiparticle correlators into the language of graph theory, we can gain new insights into their structure as well as identify efficient ways to manipulate them. We highlight the power of this graph-theoretic approach by "cutting open" the vertices and edges of the graphs, allowing us to systematically classify linear relations among multiparticle correlators and develop faster methods for their computation. The naive computational...
A common setting for scientific inference is the ability to sample from a high-fidelity forward model (simulation) without having an explicit probability density of the data. We propose a simulation-based maximum likelihood deconvolution approach in this setting called OmniFold. Deep learning enables the approach to be naturally unbinned and both variable- and high-dimensional. In contrast to parameter estimation, the goal is to remove detector distortions in order to enable a variety of downstream tasks. Our approach is the deep learning generalization of the Richardson-Lucy...
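The classical, binned special case that OmniFold generalizes is Richardson-Lucy deconvolution: iteratively reweight a truth-level prior so that, folded through the detector response, it matches the measured data. A self-contained toy with a hypothetical 2-bin response matrix:

```python
def richardson_lucy(measured, response, n_iter=200):
    """Binned Richardson-Lucy deconvolution.
    response[j][i] = P(measured bin j | truth bin i); columns sum to 1."""
    nt = len(response[0])
    truth = [sum(measured) / nt] * nt          # flat starting prior
    for _ in range(n_iter):
        folded = [sum(response[j][i] * truth[i] for i in range(nt))
                  for j in range(len(measured))]
        # Reweight each truth bin by the response-weighted data/folded ratio.
        truth = [truth[i] * sum(response[j][i] * measured[j] / folded[j]
                                for j in range(len(measured)))
                 for i in range(nt)]
    return truth

# Toy detector that migrates 20% of events to the neighboring bin.
R = [[0.8, 0.2],
     [0.2, 0.8]]
data = [0.8 * 70 + 0.2 * 30, 0.2 * 70 + 0.8 * 30]   # folded from truth [70, 30]
unfolded = richardson_lucy(data, R)
assert all(abs(u - t) < 1e-6 for u, t in zip(unfolded, [70.0, 30.0]))
```

OmniFold replaces the per-bin ratio with classifier-based reweighting of full simulated events, which is what removes the binning and dimensionality restrictions.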
Jet grooming is an important strategy for analyzing relativistic particle collisions in the presence of contaminating radiation. Most jet grooming techniques introduce hard cutoffs to remove soft radiation, leading to discontinuous behavior and associated experimental and theoretical challenges. In this paper, we introduce Pileup and Infrared Radiation Annihilation (Piranha), a paradigm for continuous jet grooming that overcomes the discontinuity and infrared sensitivity of hard-cutoff procedures. We motivate Piranha from the perspective of optimal...
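The discontinuity being addressed is easy to exhibit numerically. Piranha itself is formulated through optimal transport; the toy below only contrasts a hard energy-fraction cut with a continuous uniform subtraction (an assumed stand-in for a continuous groomer) to show why hard cutoffs are infrared-sensitive.

```python
def hard_groom(zs, zcut):
    """Hard-cutoff grooming: discard particles with energy fraction below zcut."""
    return [z for z in zs if z >= zcut]

def continuous_groom(zs, delta):
    """Continuous grooming sketch: uniformly subtract delta from every
    particle's energy fraction, clamping at zero -- no discontinuous drops."""
    return [max(z - delta, 0.0) for z in zs]

# An infinitesimal change in a soft particle's energy, straddling the cutoff:
eps = 1e-6
below = hard_groom([0.9, 0.1 - eps], 0.1)     # soft particle removed entirely
above = hard_groom([0.9, 0.1 + eps], 0.1)     # soft particle kept entirely
jump_hard = abs(sum(above) - sum(below))      # groomed energy jumps by ~0.1

jump_cont = abs(sum(continuous_groom([0.9, 0.1 + eps], 0.05))
                - sum(continuous_groom([0.9, 0.1 - eps], 0.05)))

assert jump_hard > 0.09      # O(1) response to an O(eps) perturbation
assert jump_cont < 1e-5      # continuous response to the same perturbation
```

The hard cut's O(1) response to an arbitrarily small energy shift is the discontinuity that complicates both experimental uncertainties and perturbative calculations.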
Direct searches for new particles at colliders have traditionally been factorized into model proposals by theorists and model testing by experimentalists. With the recent advent of machine learning methods that allow for the simultaneous unfolding of all observables in a given phase space region, there is an opportunity to blur these traditional boundaries by performing searches on unfolded data. This could facilitate a research program in which data are explored in their natural high dimensionality with as little bias as possible. We...
We present the Pileup Mitigation with Machine Learning (PUMML) algorithm for pileup removal at the Large Hadron Collider (LHC), based on the jet images framework and using state-of-the-art machine learning techniques. We demonstrate that our algorithm outperforms existing methods on a wide range of observables up to pileup levels of 140 collisions per bunch crossing. We also investigate what aspects of the event our algorithms are utilizing by examining the learned parameters of a simplified version of the model.