- Particle physics theoretical and experimental studies
- Topic Modeling
- Natural Language Processing Techniques
- Particle Detector Development and Performance
- High-Energy Particle Collisions Research
- Adversarial Robustness in Machine Learning
- Computational Physics and Python Applications
- Quantum Chromodynamics and Particle Interactions
- Quantum Mechanics and Applications
- Cosmology and Gravitation Theories
- Machine Learning and Data Classification
- Gaussian Processes and Bayesian Inference
- Genomics and Phylogenetic Studies
- Cold Atom Physics and Bose-Einstein Condensates
- Black Holes and Theoretical Physics
- Multimodal Machine Learning Applications
- Domain Adaptation and Few-Shot Learning
- Computer Graphics and Visualization Techniques
- Infectious Diseases and Mycology
- Anomaly Detection Techniques and Applications
- Ferroelectric and Negative Capacitance Devices
- Neural Networks and Applications
- Markov Chains and Monte Carlo Methods
- Relativity and Gravitational Theory
- Big Data and Business Intelligence
Google (United States)
2019-2021
Lawrence Berkeley National Laboratory
2019-2020
University of California, Berkeley
2019-2020
Harvard University
2014-2019
Harvard University Press
2016
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench...
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks, notably being the first model to achieve human-expert performance on the well-studied exam...
Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding. Nevertheless, state-of-the-art models have generally struggled with tasks that require quantitative reasoning, such as solving mathematics, science, and engineering problems at the college level. To help close this gap, we introduce Minerva, a large language model pretrained on general natural language data and further trained on technical content. The model achieves state-of-the-art performance on technical benchmarks without the use of external tools. We also evaluate our model on over two hundred...
In applications of machine learning to particle physics, a persistent challenge is how to go beyond discrimination to learn about the underlying physics. To this end, a powerful tool would be a framework for unsupervised learning, where the machine learns the intricate high-dimensional contours of the data upon which it is trained, without reference to pre-established labels. In order to approach such a complex task, an unsupervised network must be structured intelligently, based on a qualitative understanding of the data. In this paper, we scaffold the neural network's...
Large pre-trained language models perform remarkably well on tasks that can be done "in one pass", such as generating realistic text or synthesizing computer programs. However, they struggle with tasks that require unbounded multi-step computation, such as adding integers or executing programs. Surprisingly, we find that these same models are able to perform complex multi-step computations, even in the few-shot regime, when asked to perform the operation "step by step", showing the results of intermediate computations. In particular, we train transformers to perform multi-step computations by asking them to emit...
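To make the "step by step" idea concrete for the integer-addition case mentioned above, here is a minimal sketch of what an intermediate-computation trace could look like. The function name `addition_scratchpad` and the trace format are assumptions made for illustration; the truncated abstract does not specify the format the authors used.

```python
def addition_scratchpad(a: int, b: int) -> str:
    """Return a digit-by-digit trace of a + b (hypothetical format).

    Assumes a and b are non-negative integers.
    """
    xs, ys = str(a)[::-1], str(b)[::-1]  # least-significant digit first
    lines = [f"Input: {a} + {b}"]
    carry, digits = 0, []
    for i in range(max(len(xs), len(ys))):
        x = int(xs[i]) if i < len(xs) else 0
        y = int(ys[i]) if i < len(ys) else 0
        total = x + y + carry
        # Each line exposes one intermediate result instead of only the answer.
        lines.append(
            f"{x} + {y} + carry {carry} = {total} "
            f"-> write {total % 10}, carry {total // 10}"
        )
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    lines.append("Answer: " + "".join(reversed(digits)))
    return "\n".join(lines)


if __name__ == "__main__":
    print(addition_scratchpad(29, 57))
```

A trace like this turns one hard prediction into a sequence of small ones, each of which depends only on the current column and the carry.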
Collider data must be corrected for detector effects ("unfolded") to be compared with many theoretical calculations and measurements from other experiments. Unfolding is traditionally done for individual, binned observables without including all information relevant for characterizing the detector response. We introduce OmniFold, an unfolding method that iteratively reweights a simulated dataset, using machine learning to capitalize on all available information. Our approach is unbinned, works for arbitrarily...
Given the lack of evidence for new particle discoveries at the Large Hadron Collider (LHC), it is critical to broaden the search program. A variety of model-independent searches have been proposed, adding sensitivity to unexpected signals. There are generally two types of such searches: those that rely heavily on simulations and those that are entirely based on (unlabeled) data. This paper introduces a hybrid method that makes the best of both approaches. For potential signals that are resonant in one known feature, this new method first learns...
In a classically scale-invariant quantum field theory, tunneling rates are infrared divergent due to the existence of instantons of any size. While one expects such divergences to be resolved by quantum effects, it has been unclear how higher-loop corrections can resolve a problem appearing already at one loop. With a careful power counting, we uncover a series of loop contributions that dominate over the one-loop result and sum all the necessary terms. We also clarify previously incomplete treatments of related issues...
The stability of the standard model is determined by the true minimum of the effective Higgs potential. We show that the potential at its minimum when computed by the traditional method is strongly dependent on the gauge parameter. It moreover depends on the scale where the potential is calculated. We provide a consistent method for determining absolute stability independent of both gauge and calculation scale, order by order in perturbation theory. This leads to revised stability bounds $m_h^{\text{pole}} > (129.4 \pm 2.3)\,\mathrm{GeV}$...
Tunneling in quantum field theory is worth understanding properly, not least because it controls the long-term fate of our Universe. There are, however, a number of features of tunneling rate calculations which lack a desirable transparency, such as the necessity of analytic continuation, the appropriateness of using an effective instead of classical potential, and the sensitivity to short-distance physics. This paper attempts to review in pedagogical detail the physical origin of tunneling and its connection to the path integral. Both the traditional...
It is well known that effective potentials can be gauge dependent while their values at extrema should be gauge invariant. Unfortunately, establishing this invariance in perturbation theory is not straightforward, since contributions from arbitrarily high-order loops can be of the same size. We show in massless scalar QED that an infinite class of loops can be summed (and must be summed) to give a gauge-invariant value for the potential at its minimum. In addition, we show that the exact potential depends on both the scale at which it is calculated and the normalization of the fields,...
Precise scientific analysis in collider-based particle physics is possible because of complex simulations that connect fundamental theories to observable quantities. The significant computational cost of these programs limits the scope, precision, and accuracy of Standard Model measurements and searches for new phenomena. We therefore introduce Deep neural networks using Classification for Tuning and Reweighting (DCTR), a neural network-based approach to reweight and fit simulations using all kinematic and flavor information, the full phase...
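The general idea behind classifier-based reweighting can be illustrated with a toy example. The sketch below is not the DCTR implementation: it swaps the deep neural network for a one-dimensional logistic regression, uses two unit-width Gaussians as stand-ins for a simulation and a target, and every name in it (`fit_logistic`, `reweighted_mean`) is hypothetical. It relies on the standard likelihood-ratio trick: a classifier output f(x) trained to separate target (label 1) from simulation (label 0) gives weights f/(1-f) that approximate the density ratio.

```python
import math
import random


def fit_logistic(xs, ys, steps=200, lr=0.5):
    """Fit p(y=1|x) = sigmoid(w*x + b) by full-batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def reweighted_mean(n=4000, shift=0.5, seed=0):
    """Reweight a toy 'simulation' N(0, 1) toward a toy 'target' N(shift, 1)."""
    rng = random.Random(seed)
    sim = [rng.gauss(0.0, 1.0) for _ in range(n)]
    target = [rng.gauss(shift, 1.0) for _ in range(n)]
    w, b = fit_logistic(sim + target, [0] * n + [1] * n)
    # For a sigmoid classifier, f / (1 - f) simplifies to exp(w*x + b).
    weights = [math.exp(w * x + b) for x in sim]
    return sum(wt * x for wt, x in zip(weights, sim)) / sum(weights)


if __name__ == "__main__":
    # The weighted mean of the simulation should move from about 0 toward shift.
    print(reweighted_mean())
```

Logistic regression is sufficient here only because the log density ratio of two equal-width Gaussians is linear in x; realistic, high-dimensional phase space is where a neural network classifier would be needed.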
The ability to extrapolate from short problem instances to longer ones is an important form of out-of-distribution generalization in reasoning tasks, and is crucial when learning from datasets where longer problem instances are rare. These include theorem proving, solving quantitative mathematics problems, and reading/summarizing novels. In this paper, we run careful empirical studies exploring the length generalization capabilities of transformer-based language models. We first establish that naively finetuning transformers on length generalization tasks shows...
Empirical studies suggest that machine learning models often rely on features, such as the background, that may be spuriously correlated with the label only during training time, resulting in poor accuracy during test-time. In this work, we identify the fundamental factors that give rise to this behavior, by explaining why models fail this way even in easy-to-learn tasks where one would expect these models to succeed. In particular, through a theoretical study of gradient-descent-trained linear classifiers on some easy-to-learn tasks, we uncover two...
The decay rates of quasistable states in quantum field theories are usually calculated using instanton methods. Standard derivations of these methods rely in a crucial way upon deformations and analytic continuations of the physical potential, and on the saddle point approximation. While the resulting procedure can be checked against other semi-classical approaches in some one-dimensional cases, it is challenging to trace the role of the relevant physical scales, and any intuitive handle on the precision of the approximations involved is at best obscure....
JUNIPR is an approach to unsupervised learning in particle physics that scaffolds a probabilistic model for jets around their representation as binary trees. Separate JUNIPR models can be learned for different event or jet types, then compared and explored for physical insight. The relative probabilities can also be used for discrimination. In this Letter, we show how the training of the separate models can be refined in the context of classification to optimize discrimination power. We refer to this refined approach as binary JUNIPR. Binary JUNIPR achieves state-of-the-art performance...
Although machine learning models typically experience a drop in performance on out-of-distribution data, accuracies on in- versus out-of-distribution data are widely observed to follow a single linear trend when evaluated across a testbed of models. Models that are more accurate on the out-of-distribution data relative to this baseline exhibit "effective robustness" and are exceedingly rare. Identifying such models, and understanding their properties, is key to improving out-of-distribution performance. We conduct a thorough empirical investigation of effective robustness during...
Histogram-based template fits are the main technique used for estimating parameters of high energy physics Monte Carlo generators. Parametrized neural network reweighting can be used to extend this fitting procedure to many dimensions and does not require binning. If the fit is to be performed using reconstructed data, then expensive detector simulations must be used for training the neural networks. We introduce a new two-level fitting approach that only requires one dataset with detector simulation and then a set of additional generation-level datasets without...
A common setting for scientific inference is the ability to sample from a high-fidelity forward model (simulation) without having an explicit probability density of the data. We propose a simulation-based maximum likelihood deconvolution approach in this setting called OmniFold. Deep learning enables this approach to be naturally unbinned and (variable-, and) high-dimensional. In contrast to model parameter estimation, the goal of deconvolution is to remove detector distortions in order to enable a variety of down-stream inference tasks. Our approach is the deep learning generalization of the common Richardson-Lucy...
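For orientation, the classic binned Richardson-Lucy iteration named above can be sketched in a few lines. This is the standard textbook form of the algorithm, not OmniFold itself; the function name, the two-bin response matrix, and the flat starting guess are assumptions chosen for illustration, and the sketch assumes every true event is measured in some bin (each column of the response sums to 1).

```python
def richardson_lucy(measured, response, prior, iterations=10):
    """Binned Richardson-Lucy (iterative Bayesian) deconvolution.

    measured[i]    : observed counts in detector-level bin i
    response[i][j] : P(measured in bin i | true bin j); columns sum to 1
    prior[j]       : starting guess for the true counts in bin j
    """
    truth = list(prior)
    n_meas, n_true = len(measured), len(truth)
    for _ in range(iterations):
        # Fold the current truth estimate through the detector response.
        folded = [
            sum(response[i][j] * truth[j] for j in range(n_true))
            for i in range(n_meas)
        ]
        # Multiplicative update: scale each true bin by how well the folded
        # estimate matches the measurement in the bins it feeds into.
        truth = [
            truth[j] * sum(
                response[i][j] * measured[i] / folded[i]
                for i in range(n_meas)
                if folded[i] > 0
            )
            for j in range(n_true)
        ]
    return truth


if __name__ == "__main__":
    # Toy case: true counts [100, 50] smeared by 20% bin migration.
    response = [[0.8, 0.2], [0.2, 0.8]]
    measured = [90.0, 60.0]
    print(richardson_lucy(measured, response, prior=[75.0, 75.0], iterations=100))
```

In this toy case the measured counts are exactly the folded truth, so the iteration moves from the flat starting guess back toward [100, 50]. With noisy data the number of iterations acts as a regularization choice.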
Wide neural networks have proven to be a rich class of architectures for both theory and practice. Motivated by the observation that finite width convolutional networks appear to outperform infinite width networks, we study scaling laws for wide CNNs and networks with skip connections. Following the approach of (Dyer & Gur-Ari, 2019), we present a simple diagrammatic recipe to derive the asymptotic width dependence for many quantities of interest. These scaling relationships provide a solvable description for the training dynamics of wide convolutional networks. We test these relations across...
The measurement of the top quark mass has large systematic uncertainties coming from the Monte Carlo simulations that are used to match theory and experiment. We explore how much that uncertainty can be reduced by using jet grooming procedures. Using the ATLAS A14 tunes of Pythia, we estimate the uncertainty from the choice of tuning parameters in what is meant by the Monte Carlo mass to be around 530 MeV without any corrections. This uncertainty can be reduced by 60% to 200 MeV by calibrating to the W mass and by 70% to 140 MeV by additionally applying soft-drop jet grooming (or to 170 MeV with trimming). At $e^{+}e^{-}$ colliders, the associated uncertainty is around 110 MeV, reducing...