- Statistical Methods and Inference
- Neural Networks and Applications
- Financial Risk and Volatility Modeling
- Bayesian Methods and Mixture Models
- Stochastic Processes and Financial Applications
- Statistical Methods and Bayesian Inference
- Stochastic Gradient Optimization Techniques
- Advanced Statistical Methods and Models
- Sparse and Compressive Sensing Techniques
- Face and Expression Recognition
- Statistical Mechanics and Entropy
- Image and Signal Denoising Methods
- Gaussian Processes and Bayesian Inference
- Advanced Statistical Process Monitoring
- Machine Learning and Algorithms
- Markov Chains and Monte Carlo Methods
- Complex Systems and Time Series Analysis
- Numerical Methods in Inverse Problems
- Monetary Policy and Economic Impact
- Aortic Aneurysm Repair Treatments
- Model Reduction and Neural Networks
- Machine Learning in Materials Science
- Probabilistic and Robust Engineering Design
- Medical Image Segmentation Techniques
- Bayesian Modeling and Causal Inference
University of Twente
2019-2025
Leiden University
2014-2021
University of Göttingen
2009-2019
University of Canterbury
2019
King's College London
2019
École Nationale de la Statistique et de l'Administration Économique
2014
Centre de Recherche en Économie et Statistique
2014
Vrije Universiteit Amsterdam
2013
Gesellschaft für Mathematik und Datenverarbeitung
2010
Consider the multivariate nonparametric regression model. It is shown that estimators based on sparsely connected deep neural networks with ReLU activation function and properly chosen network architecture achieve the minimax rates of convergence (up to $\log n$-factors) under a general composition assumption on the regression function. The framework includes many well-studied structural constraints such as (generalized) additive models. While there is a lot of flexibility in the network architecture, the tuning parameter is the sparsity of the network...
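As a minimal illustration of the network class in this abstract (not the paper's estimator), the following numpy sketch builds a deep ReLU network whose weight matrices are randomly zeroed out, counting the sparsity as the number of non-zero parameters; the widths and the sparsity level are assumptions of the example.

```python
# Sketch of a sparsely connected deep ReLU network; illustrative only.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, weights, biases):
    """Forward pass: ReLU hidden layers, linear output (regression)."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(W @ h + b)
    return weights[-1] @ h + biases[-1]

rng = np.random.default_rng(0)
d, widths = 4, [4, 8, 8, 1]          # input dimension and layer widths (assumed)
weights, biases = [], []
for m_in, m_out in zip([d] + widths[:-1], widths):
    W = rng.normal(size=(m_out, m_in))
    W *= rng.random(W.shape) < 0.3   # zero out ~70% of entries: sparsity
    weights.append(W)
    biases.append(np.zeros(m_out))

sparsity = sum(int(np.count_nonzero(W)) for W in weights)
print("non-zero network parameters s =", sparsity)
print("f(x) =", forward(rng.normal(size=d), weights, biases))
```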
Deep neural networks (DNNs) generate much richer function spaces than shallow networks. Since the function spaces induced by shallow networks have several approximation theoretic drawbacks, this explains, however, not necessarily the success of deep networks. In this article we take another route by comparing the expressive power of DNNs with ReLU activation to linear spline methods. We show that MARS (multivariate adaptive regression splines) is improper learnable in the sense that for any given function that can be expressed as a MARS function with $M$ parameters there exists a multilayer...
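The comparison rests on the fact that a MARS hinge basis function $\max(0,\pm(x_j - t))$ is exactly a single ReLU neuron with a one-hot weight vector; a small sketch of this correspondence (the index and knot below are arbitrary):

```python
# A MARS hinge max(0, x_j - t) equals one ReLU neuron relu(w.x + b)
# with w = e_j and b = -t; this verifies the identity numerically.
import numpy as np

def hinge(x, j, t, sign=+1):
    return np.maximum(sign * (x[..., j] - t), 0.0)

def relu_neuron(x, w, b):
    return np.maximum(x @ w + b, 0.0)

x = np.random.default_rng(1).normal(size=(5, 3))
j, t = 1, 0.2                 # arbitrary coordinate and knot
w = np.zeros(3); w[j] = 1.0   # one-hot weight vector
print(np.allclose(hinge(x, j, t), relu_neuron(x, w, -t)))  # True
```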
We study full Bayesian procedures for high-dimensional linear regression under sparsity constraints. The prior is a mixture of point masses at zero and continuous distributions. Under compatibility conditions on the design matrix, the posterior distribution is shown to contract at the optimal rate for recovery of the unknown sparse vector, and to give optimal prediction of the response vector. It is also shown to select the correct sparse model, or at least the coefficients that are significantly different from zero. The asymptotic shape of the posterior distribution is characterized and employed...
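A minimal sketch of a draw from such a prior, assuming a Laplace slab and illustrative values for the mixing weight and slab scale:

```python
# Spike-and-slab prior draw for a p-dimensional coefficient vector:
# point mass at zero with prob 1-w, Laplace slab with prob w.
# The values of w and the slab scale are assumptions of this sketch.
import numpy as np

rng = np.random.default_rng(2)
p, w, lam = 200, 0.05, 1.0
active = rng.random(p) < w                                 # spike/slab indicators
theta = np.where(active, rng.laplace(scale=1.0 / lam, size=p), 0.0)
print("non-zero coordinates:", int(active.sum()))
```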
There is a longstanding debate whether the Kolmogorov–Arnold representation theorem can explain the use of more than one hidden layer in neural networks. The theorem decomposes a multivariate function into an interior and an outer function and therefore has indeed a similar structure as a network with two hidden layers. But there are distinctive differences. One of the main obstacles is that the outer function depends on the represented function and can be wildly varying even if the represented function is smooth. We derive modifications of the representation that transfer smoothness properties of the represented function to the outer function, so that it can be well approximated by ReLU networks. It appears...
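For reference, the classical representation referred to here states that every continuous $f:[0,1]^d\to\mathbb{R}$ can be written as

$$f(x_1,\dots,x_d)=\sum_{q=0}^{2d}\Phi_q\Big(\sum_{p=1}^{d}\phi_{q,p}(x_p)\Big),$$

with continuous univariate inner functions $\phi_{q,p}$ and outer functions $\Phi_q$; the inner sums correspond to a first hidden layer and the $\Phi_q$ to a second.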
We investigate the problem of deriving posterior concentration rates under different loss functions in nonparametric Bayes. We first provide a lower bound on posterior coverages of shrinking neighbourhoods that relates the metric or loss under which the neighbourhood is considered, and an intrinsic pre-metric linked to frequentist separation rates. In the Gaussian white noise model, we construct feasible priors based on a spike-and-slab procedure reminiscent of wavelet thresholding that achieve adaptive contraction rates under both the $L^2$ and $L^{\infty}$ metrics when...
The first Bayesian results for the sparse normal means problem were proven for spike-and-slab priors. However, these priors are less convenient from a computational point of view. In the meanwhile, a large number of continuous shrinkage priors has been proposed. Many of these priors can be written as a scale mixture of normals, which makes them particularly easy to implement. We propose general conditions on the prior on the local variance in scale mixtures of normals such that posterior contraction at the minimax rate is assured. The conditions require tails at least as heavy...
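For illustration, the horseshoe prior is one such scale mixture of normals: half-Cauchy local scales $\lambda_i$ and $\theta_i\mid\lambda_i\sim N(0,\lambda_i^2\tau^2)$. A sketch, with the global scale $\tau$ chosen arbitrarily:

```python
# Horseshoe prior as a scale mixture of normals: half-Cauchy local
# variances; the global scale tau is an assumption of this illustration.
import numpy as np

rng = np.random.default_rng(3)
n, tau = 1000, 0.1
lam = np.abs(rng.standard_cauchy(n))   # half-Cauchy local scales
theta = rng.normal(scale=lam * tau)    # theta_i | lam_i ~ N(0, lam_i^2 tau^2)
print("fraction |theta| < 0.01:", float(np.mean(np.abs(theta) < 0.01)))
```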
Whereas recovery of the manifold from data is a well-studied topic, approximation rates for functions defined on manifolds are less known. In this work, we study the regression problem with inputs on a $d^*$-dimensional manifold that is embedded into a space with potentially much larger ambient dimension. It is shown that sparsely connected deep ReLU networks can approximate a Hölder function with smoothness index $\beta$ up to error $\varepsilon$ using of the order of $\varepsilon^{-d^*/\beta}\log(1/\varepsilon)$ many non-zero network parameters. As an application, we derive...
We derive multiscale statistics for deconvolution in order to detect qualitative features of the unknown density. An important example covered within this framework is a test for local monotonicity on all scales simultaneously. We investigate the moderately ill-posed setting, where the Fourier transform of the error density in the deconvolution model is of polynomial decay. For the testing, we consider a calibration motivated by the modulus of continuity of Brownian motion. We study the performance of our results from both a theoretical and a simulation based point of view. A...
The ordinal pattern of a fixed number of consecutive values in a time series is the spatial ordering of these values. Counting how often a specific pattern occurs provides important insights into the properties of the series. In this work, we prove asymptotic normality of the relative pattern frequency for time series with linear increments. Moreover, we apply this result to detect changes in the distribution.
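A short sketch of the basic counting step, mapping each window of $m$ consecutive values to the permutation that sorts it and tabulating relative frequencies (the series below is a toy example):

```python
# Counting ordinal patterns of length m: each window of m consecutive
# values is mapped to the permutation that sorts it.
from collections import Counter
import numpy as np

def ordinal_patterns(x, m=3):
    return Counter(tuple(np.argsort(x[i:i + m])) for i in range(len(x) - m + 1))

rng = np.random.default_rng(4)
x = np.cumsum(rng.normal(size=1000))   # toy series with i.i.d. increments
counts = ordinal_patterns(x, m=3)
n_windows = sum(counts.values())
for pattern, c in sorted(counts.items()):
    print(pattern, round(c / n_windows, 3))  # relative pattern frequencies
```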
For classification problems, trained deep neural networks return probabilities of class memberships. In this work we study the convergence of the learned probabilities to the true conditional class probabilities. More specifically, we consider sparse ReLU network reconstructions minimizing the cross-entropy loss in the multiclass setup. Interesting phenomena occur when the class membership probabilities are close to zero. Convergence rates are derived that depend on the near-zero behaviour via a margin-type condition.
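For context, a minimal sketch of how a network output is turned into class probabilities via softmax and scored by the cross-entropy loss (the logits below are arbitrary):

```python
# Softmax maps network outputs to a probability vector; cross-entropy
# penalizes low probability assigned to the observed class label.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # shift for numerical stability
    return e / e.sum()

def cross_entropy(p_hat, label):
    return -np.log(p_hat[label])

p_hat = softmax(np.array([2.0, -1.0, 0.5]))   # arbitrary logits
print("p_hat =", p_hat.round(3), " loss for label 0:", cross_entropy(p_hat, 0))
```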
Convergence properties of empirical risk minimizers can be conveniently expressed in terms of the associated population risk. To derive bounds for the performance of the estimator under covariate shift, however, pointwise convergence rates are required. Under weak assumptions on the design distribution, it is shown that least squares estimators (LSE) over 1-Lipschitz functions are also minimax rate optimal with respect to a weighted uniform norm, where the weighting accounts in a natural way for the non-uniformity of the design distribution...
We consider the models $Y_{i,n}=\int_0^{i/n}\sigma(s)\,dW_s+\tau(i/n)\epsilon_{i,n}$ and $\tilde Y_{i,n}=\sigma(i/n)W_{i/n}+\tau(i/n)\epsilon_{i,n}$, $i=1,\dots,n$, where $(W_t)_{t\in[0,1]}$ denotes a standard Brownian motion and the $\epsilon_{i,n}$ are centered i.i.d. random variables with $E(\epsilon_{i,n}^2)=1$ and finite fourth moment. Furthermore, $\sigma$ and $\tau$ are unknown deterministic functions, and $(W_t)_{t\in[0,1]}$ and $(\epsilon_{1,n},\dots,\epsilon_{n,n})$ are assumed to be independent processes. Based on a spectral decomposition of the covariance structures we derive series estimators for $\sigma^2$ and $\tau^2$ and investigate their rate of convergence of the MISE in dependence of the smoothness. To...
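A simplified simulation sketch, not the paper's spectral series estimator: with constant $\sigma$, squared increments have expectation about $\sigma^2/n+2\tau^2$, so for large $n$ half the mean squared increment approximates the noise variance $\tau^2$:

```python
# Toy illustration of the noisy high-frequency model: the microstructure
# noise tau dominates squared increments, E[(Y_i - Y_{i-1})^2] ~ sigma^2/n
# + 2 tau^2. All parameter values are assumptions of this sketch.
import numpy as np

rng = np.random.default_rng(5)
n, sigma, tau = 100_000, 1.0, 0.01
X = np.cumsum(sigma * rng.normal(size=n) / np.sqrt(n))  # int sigma dW, sigma constant
Y = X + tau * rng.normal(size=n)                        # observations with noise
tau2_hat = 0.5 * np.mean(np.diff(Y) ** 2)               # small sigma^2/(2n) bias remains
print("tau^2 =", tau**2, " estimate =", tau2_hat)
```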
We study nonparametric estimation of the diffusion coefficient from discrete observations, when the observations are corrupted by additive noise. Such problems have been developed over the last ten years in several fields of application, in particular for the modelling of high-frequency financial data, though mostly from a parametric or semiparametric point of view. This work concerns estimation of the (possibly stochastic) diffusion coefficient trajectory in a relatively general framework...
The random coefficients model is an extension of the linear regression model that allows for unobserved heterogeneity in the population by modeling the regression coefficients as random variables. Given data from this model, the statistical challenge is to recover information about the joint density of the random coefficients, which is a multivariate and ill-posed problem. Because of the curse of dimensionality and the ill-posedness, nonparametric estimation of the joint density is difficult and suffers from slow convergence rates. Larger features, such as an increase of the density along some direction or a well-accentuated mode, can, however, be...
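A sketch of data generation from the model $Y=B_0+B^\top X$ with coefficients drawn anew for each observation; the normal coefficient density below is purely for illustration:

```python
# Random coefficients model: each observation gets its own draw of the
# intercept and slopes from an (unknown) joint density, here normal.
import numpy as np

rng = np.random.default_rng(8)
n, d = 1000, 2
X = rng.normal(size=(n, d))
b0 = rng.normal(1.0, 0.5, size=n)               # random intercepts
b = rng.normal([0.5, -1.0], 0.3, size=(n, d))   # random slopes
Y = b0 + np.einsum("ij,ij->i", b, X)            # row-wise inner products
print("first rows of (Y, X):\n", np.c_[Y[:3], X[:3]])
```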
It is well-known that density estimation on the unit interval is asymptotically equivalent to a Gaussian white noise experiment, provided the densities have Hölder smoothness larger than $1/2$ and are uniformly bounded away from zero. We derive matching lower and constructive upper bounds for the Le Cam deficiencies between these experiments, with explicit dependence on both the sample size and the size of the densities in the parameter space. As a consequence, we obtain sharp conditions on how small the densities can be for asymptotic equivalence to hold. The related case...
We study a class of statistical inverse problems with nonlinear pointwise operators motivated by concrete statistical applications. A two-step procedure is proposed, where the first step smoothes the data and inverts the nonlinearity. This reduces the initial problem to a linear inverse problem with deterministic noise, which is then solved in a second step. The noise reduction step is based on wavelet thresholding and is shown to be minimax optimal (up to logarithmic factors) in a function-dependent sense. Our analysis relies on a modified notion of Hölder smoothness scales that...
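A sketch of the wavelet-thresholding idea behind the noise reduction step, using a hand-rolled one-level Haar transform and soft thresholding; the paper's procedure and threshold choice are more involved:

```python
# One-level Haar transform plus soft thresholding of detail coefficients;
# the signal, noise level, and threshold are assumptions of this sketch.
import numpy as np

def haar(x):                        # one decomposition level (len(x) even)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def inv_haar(a, d):                 # exact inverse of haar()
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def soft(d, t):                     # soft thresholding
    return np.sign(d) * np.maximum(np.abs(d) - t, 0.0)

rng = np.random.default_rng(6)
n = 1024
signal = np.sin(2 * np.pi * np.arange(n) / n)
noisy = signal + 0.2 * rng.normal(size=n)
a, d = haar(noisy)
denoised = inv_haar(a, soft(d, t=0.2 * np.sqrt(2 * np.log(n))))
print("MSE noisy   :", np.mean((noisy - signal) ** 2))
print("MSE denoised:", np.mean((denoised - signal) ** 2))
```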
It is a common phenomenon that for high-dimensional and nonparametric statistical models, rate-optimal estimators balance squared bias and variance. Although this balancing is widely observed, little is known about whether methods exist that could avoid the trade-off between bias and variance. We propose a general strategy to obtain lower bounds on the variance of any estimator with bias smaller than a prespecified bound. This shows to which extent the bias-variance trade-off is unavoidable and allows to quantify the loss of performance for methods that do not obey it. The approach is based...
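For context, the trade-off refers to the standard risk decomposition, stated here for a real-valued parameter $\theta$ and estimator $\hat\theta$:

$$\mathbb{E}\big[(\hat\theta-\theta)^2\big]=\big(\mathbb{E}[\hat\theta]-\theta\big)^2+\operatorname{Var}(\hat\theta),$$

so a lower bound on $\operatorname{Var}(\hat\theta)$, uniform over all estimators whose bias stays below a prespecified level, lower-bounds the attainable risk.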
Given data from a Poisson point process with intensity $(x,y)\mapsto n\mathbf{1}(f(x)\leq y)$, frequentist properties for the Bayesian reconstruction of the support boundary function $f$ are derived. We mainly study compound Poisson process priors with fixed intensity, proving that the posterior contracts with nearly optimal rate for monotone boundaries and adapts to Hölder smooth boundaries. We then derive a limiting shape result for the prior on a function space with increasing parameter dimension. It is shown that the marginal posterior of the mean functional performs an automatic bias...
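A simulation sketch of the observation model: a rate-$n$ Poisson process on the unit square, thinned to the region above the boundary, has exactly the stated intensity (the boundary $f$ below is an arbitrary monotone example):

```python
# Support-boundary model: Poisson point process on [0,1]^2 with intensity
# n * 1(f(x) <= y), simulated by thinning a rate-n process on the square.
import numpy as np

rng = np.random.default_rng(9)
n = 500
f = lambda x: 0.3 + 0.4 * x        # assumed monotone boundary, for illustration
N = rng.poisson(n)                 # number of points of the rate-n process
x, y = rng.random(N), rng.random(N)
keep = y >= f(x)                   # keep only points above the boundary
print("observed points above the boundary:", int(keep.sum()))
```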
Recently, significant progress has been made regarding the statistical understanding of artificial neural networks (ANNs). ANNs are motivated by the functioning of the brain, but differ in several crucial aspects. In particular, the locality of the updating rule for the connection parameters in biological neural networks (BNNs) makes it biologically implausible that learning in the brain is based on gradient descent. In this work, we look at learning in the brain as a method for supervised learning. The main contribution is to relate the local updating rule of the connection parameters in BNNs to a zero-order optimization method.
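For illustration, a toy zero-order scheme of the kind the comparison refers to: gradients are replaced by two-point loss evaluations along random directions (the step sizes and the objective are assumptions of this sketch, not the paper's specific scheme):

```python
# Zero-order (derivative-free) optimization: estimate a descent direction
# from two loss evaluations along a random perturbation u.
import numpy as np

def zero_order_step(theta, loss, lr=0.1, h=1e-3, rng=np.random.default_rng(7)):
    u = rng.normal(size=theta.shape)                       # random direction
    g = (loss(theta + h * u) - loss(theta - h * u)) / (2 * h) * u
    return theta - lr * g                                  # no gradients used

loss = lambda th: np.sum((th - 1.0) ** 2)                  # toy quadratic objective
theta = np.zeros(5)
for _ in range(500):
    theta = zero_order_step(theta, loss)
print("theta ~", theta.round(2))                           # close to the minimizer 1
```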