- Generative Adversarial Networks and Image Synthesis
- Gaussian Processes and Bayesian Inference
- Human Pose and Action Recognition
- Morphological variations and asymmetry
- Neural Networks and Applications
- Model Reduction and Neural Networks
- Anomaly Detection Techniques and Applications
- Human Motion and Animation
- Computational Physics and Python Applications
- AI in cancer detection
- 3D Shape Modeling and Analysis
- Sparse and Compressive Sensing Techniques
- Face and Expression Recognition
- Advanced Image and Video Retrieval Techniques
- Machine Learning and Data Classification
- Video Surveillance and Tracking Methods
- Advanced Vision and Imaging
- Topological and Geometric Data Analysis
- Robot Manipulation and Learning
- Bayesian Methods and Mixture Models
- Image Retrieval and Classification Techniques
- Medical Image Segmentation Techniques
- Domain Adaptation and Few-Shot Learning
- Face recognition and analysis
- Functional Brain Connectivity Studies
Technical University of Denmark
2016-2025
Universidad de Zaragoza
2020
University of California, Berkeley
2020
Danmarks Nationalbank
2016
Compute Canada
2015
Max Planck Society
2012-2014
Max Planck Institute for Intelligent Systems
2012-2014
University of Copenhagen
2008-2012
Lifelong place recognition is an essential and challenging task in computer vision with vast applications in robust localization and efficient large-scale 3D reconstruction. Progress is currently hindered by a lack of large, diverse, publicly available datasets. We contribute Mapillary Street-Level Sequences (SLS), a large dataset for urban and suburban place recognition from image sequences. It contains more than 1.6 million images curated from the Mapillary collaborative mapping platform. The dataset is orders of magnitude larger than current data sources,...
How we choose to represent our data has a fundamental impact on our ability to subsequently extract information from them. Machine learning promises to automatically determine efficient representations from large unstructured datasets, such as those arising in biology. However, empirical evidence suggests that seemingly minor changes to these machine learning models yield drastically different data representations that result in different biological interpretations of the data. This begs the question of what even constitutes the most meaningful representation. Here,...
We consider kernel methods on general geodesic metric spaces and provide both negative and positive results. First we show that the common Gaussian kernel can only be generalized to a positive definite kernel on a geodesic metric space if the space is flat. As a result, for data on a Riemannian manifold, the geodesic Gaussian kernel is only positive definite if the manifold is Euclidean. This implies that any attempt to design geodesic Gaussian kernels on curved manifolds is futile. However, we show that for spaces with conditionally negative definite distances the geodesic Laplacian kernel can be generalized while retaining positive definiteness. This covers some curved spaces, including spheres and hyperbolic spaces. Our theoretical results are verified empirically.
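The positive result above can be checked numerically on a small example. The sketch below (an illustrative check, not the paper's code) samples points on the sphere S^2, builds the geodesic Laplacian kernel exp(-λ d(x, y)) from great-circle distances, and verifies that the kernel matrix is positive semi-definite up to numerical noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample points on the unit sphere S^2 (a curved geodesic metric space).
X = rng.normal(size=(40, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)

# Geodesic (great-circle) distances: d(x, y) = arccos(<x, y>).
G = np.clip(X @ X.T, -1.0, 1.0)
D = np.arccos(G)

lam = 1.0
K_laplace = np.exp(-lam * D)  # geodesic Laplacian kernel
eigs = np.linalg.eigvalsh(K_laplace)
print(eigs.min())  # non-negative up to floating-point noise
```

Repeating the experiment with the geodesic Gaussian kernel exp(-λ d²) instead will, for suitable λ and point sets, produce negative eigenvalues, which is the negative side of the result.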
As the collection of large datasets becomes increasingly automated, the occurrence of outliers will increase -- "big data" implies "big outliers". While principal component analysis (PCA) is often used to reduce the size of data, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA do not scale beyond small-to-medium sized datasets. To address this, we introduce the Grassmann Average (GA), which expresses dimensionality...
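The core idea can be sketched in a few lines: each zero-mean observation spans a one-dimensional subspace, and the leading component is taken as an average of those subspaces. Below is a simplified, unweighted fixed-point variant of that idea (the published algorithm also weights observations by their norms); it is an illustration under those assumptions, not the reference implementation.

```python
import numpy as np

def grassmann_average(X, iters=20, seed=0):
    """Leading component as an average of the 1D subspaces spanned by
    the rows of a zero-mean data matrix X. Each iteration sign-aligns
    the observations to the current estimate and averages them."""
    rng = np.random.default_rng(seed)
    q = rng.normal(size=X.shape[1])
    q /= np.linalg.norm(q)
    for _ in range(iters):
        signs = np.sign(X @ q)
        signs[signs == 0] = 1.0  # break ties consistently
        q_new = (signs[:, None] * X).mean(axis=0)
        q = q_new / np.linalg.norm(q_new)
    return q

# On outlier-free Gaussian data the direction agrees with the leading
# PCA eigenvector.
rng = np.random.default_rng(1)
C = np.array([[3.0, 1.0], [1.0, 0.5]])
X = rng.normal(size=(2000, 2)) @ np.linalg.cholesky(C).T
X -= X.mean(axis=0)
q = grassmann_average(X)
v = np.linalg.eigh(np.cov(X.T))[1][:, -1]
print(abs(q @ v))  # close to 1
```

The sign-alignment step is what makes the average well-defined on the Grassmannian, where x and -x span the same subspace.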
Deep generative models provide a systematic way to learn nonlinear data distributions through a set of latent variables and a nonlinear "generator" function that maps latent points into the input space. The nonlinearity of the generator implies that the latent space gives a distorted view of the input space. Under mild conditions, we show that this distortion can be characterized by a stochastic Riemannian metric, and we demonstrate that distances and interpolants are significantly improved under this metric. This in turn improves probability distributions, sampling algorithms and clustering. Our...
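The distortion in question is the pull-back of the Euclidean metric through the generator. A minimal deterministic toy version (the abstract's metric is stochastic; here a fixed closed-form generator `g` and a finite-difference Jacobian stand in for a trained network) measures the length of a straight latent line in the input space via G(z) = Jᵀ J:

```python
import numpy as np

# Toy generator g: R^1 -> R^2 mapping a latent line onto a curve.
def g(z):
    return np.array([np.cos(z), np.sin(2.0 * z)])

def jacobian(z, eps=1e-6):
    # Central finite differences stand in for autodiff.
    return (g(z + eps) - g(z - eps))[:, None] / (2 * eps)

def metric(z):
    J = jacobian(z)
    return J.T @ J  # pull-back Riemannian metric G(z) = J^T J

def curve_length(z0, z1, steps=2000):
    # Length of the straight latent line, measured in the input space:
    # integrate sqrt(dz^T G(z) dz) along the line.
    zs = np.linspace(z0, z1, steps)
    dz = zs[1] - zs[0]
    return sum(np.sqrt(dz * metric(z) * dz).item() for z in zs[:-1])

print(curve_length(0.0, np.pi))
```

The computed length matches the arc length of the image curve g([0, π]), which is exactly why latent straight lines are a poor proxy for input-space distances unless measured under this metric.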
Data augmentation is a key element in training high-dimensional models. In this approach, one synthesizes new observations by applying pre-specified transformations to the original data; e.g. new images are formed by rotating old ones. Current schemes, however, rely on manual specification of the applied transformations, making data augmentation an implicit form of feature engineering. With an eye towards true end-to-end learning, we suggest learning the applied transformations on a per-class basis. Particularly, we align image pairs within each class...
Triple-negative breast cancer (TNBC) is an aggressive and difficult-to-treat cancer type that represents approximately 15% of all breast cancers. Recently, stromal tumor-infiltrating lymphocytes (sTIL) resurfaced as a strong prognostic biomarker for overall survival (OS) of TNBC patients. Manual assessment has innate limitations that hinder clinical adoption, and the International Immuno-Oncology Biomarker Working Group (TIL-WG) has therefore envisioned that computational assessment of sTIL could overcome these limitations and recommended that any algorithm...
Racial bias in medicine, particularly in dermatology, presents significant ethical and clinical challenges. It often results from the underrepresentation of darker skin tones in training datasets for machine learning models. While efforts to address bias in dermatology have focused on improving dataset diversity and mitigating disparities in discriminative models, the impact of racial bias on generative models remains underexplored. Generative models, such as Variational Autoencoders (VAEs), are increasingly used in healthcare...
Estimating means on Riemannian manifolds is generally computationally expensive because the distance function is not known in closed form for most manifolds. To overcome this, we show that diffusion means can be efficiently estimated using score matching with the gradient of Brownian motion transition densities, following the same principle as diffusion models. Empirically, this is more efficient than Monte Carlo simulation while retaining accuracy, and it is also applicable to learned manifolds. Our method, furthermore, extends to computing the Fréchet...
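The principle is easiest to see in flat space, where everything is closed form: the score of the Brownian transition density p_t(x | y) = N(x; y, tI) with respect to y is (x − y)/t, and gradient ascent on the summed log-likelihood recovers the ordinary mean. The sketch below illustrates only this flat-space special case; the paper's contribution is approximating the score on curved and learned manifolds, where no closed form exists.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, scale=1.0, size=(500, 3))  # data in R^3

t = 0.5            # diffusion time
y = np.zeros(3)    # initial guess for the diffusion mean
lr = 0.1
for _ in range(200):
    # Score of the Brownian transition density in R^d:
    # grad_y log p_t(x | y) = (x - y) / t. Ascend the summed log-likelihood.
    grad = np.mean((X - y) / t, axis=0)
    y = y + lr * grad

print(y)  # converges to the sample mean: in flat space the
          # diffusion mean coincides with the usual mean
```

On a manifold the same fixed-point structure holds, but the score term must come from a learned or matched approximation of the heat-kernel gradient.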
Euclidean statistics are often generalized to Riemannian manifolds by replacing straight-line interpolations with geodesic ones. While these models are familiar-looking, they are restricted by the inflexibility of geodesics, and they rely on constructions that are optimal only in Euclidean domains. We consider extensions of Principal Component Analysis (PCA) to Riemannian manifolds. Classic approaches seek a geodesic curve passing through the mean that optimizes a criterion of interest. The requirements that the solution both is a geodesic and must pass through the mean tend to imply that the methods work...
Spatial Transformer layers allow neural networks, at least in principle, to be invariant to large spatial transformations in image data. The model has, however, seen limited uptake as most practical implementations support only transformations that are too restricted, e.g. affine or homographic maps, and/or destructive maps, such as thin plate splines. We investigate the use of flexible diffeomorphic transformations within such networks and demonstrate that significant performance gains can be attained over currently-used models. The learned transformations are found to be both...
Deep latent variable models learn condensed representations of data that, hopefully, reflect the inner workings of the studied phenomena. Unfortunately, these latent representations are not statistically identifiable, meaning they cannot be uniquely determined. Domain experts, therefore, need to tread carefully when interpreting them. Current solutions limit the lack of identifiability through additional constraints on the model, e.g. by requiring labeled training data, or by restricting the expressivity of the model. We change the goal: instead...
We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution. Such methods have concrete value in statistics on Riemannian manifolds, where non-analytic ordinary differential equations are involved in virtually all computations. The probabilistic formulation permits marginalising the uncertainty of the numerical solution, such that statistics are less sensitive to inaccuracies. This leads to new algorithms for mean value computations and principal geodesic analysis....
We propose novel finite-dimensional spaces of well-behaved $\mathbb{R}^n \rightarrow \mathbb{R}^n$ transformations. The latter are obtained by (fast and highly-accurate) integration of continuous piecewise-affine velocity fields. The proposed method is simple yet highly expressive, effortlessly handles optional constraints (e.g., volume preservation and/or boundary conditions), and supports convenient modeling choices such as...
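A one-dimensional sketch of the construction, under simplifying assumptions: a continuous piecewise-affine velocity field on [0, 1] (zero at the boundary), integrated with plain explicit Euler steps, yields a monotone, hence invertible, warp of the interval. The paper's version is n-dimensional and uses a fast closed-form cell-wise integration rather than generic time stepping.

```python
import numpy as np

# Continuous piecewise-affine (CPA) velocity field on [0, 1], defined by
# its values at a few knots; zero velocity at the boundary keeps the
# endpoints fixed.
knots = np.linspace(0.0, 1.0, 5)
values = np.array([0.0, 0.3, -0.2, 0.4, 0.0])

def v(x):
    return np.interp(x, knots, values)

def transform(x0, T=1.0, steps=100):
    # Integrate dx/dt = v(x) with explicit Euler; the flow of a Lipschitz
    # velocity field is a diffeomorphism of the interval.
    x = np.array(x0, dtype=float)
    dt = T / steps
    for _ in range(steps):
        x = x + dt * v(x)
    return np.clip(x, 0.0, 1.0)

grid = np.linspace(0.0, 1.0, 50)
warped = transform(grid)
print(np.all(np.diff(warped) > 0))  # strictly increasing => invertible
```

Because the field is parameterized by the finite vector `values`, the space of resulting warps is finite-dimensional, which is what makes it usable as a layer parameterization or an augmentation family.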
Uncertainty quantification in image retrieval is crucial for downstream decisions, yet it remains a challenging and largely unexplored problem. Current methods for estimating uncertainties are poorly calibrated, computationally expensive, or based on heuristics. We present a new method that views image embeddings as stochastic features rather than deterministic features. Our two main contributions are (1) a likelihood that matches the triplet constraint and evaluates the probability of an anchor being closer to a positive...
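The event the likelihood evaluates — the anchor landing closer to the positive than to the negative when embeddings are distributions — can be illustrated with a Monte Carlo estimate. This is an illustrative sketch with diagonal-Gaussian embeddings and hypothetical inputs, not the paper's (closed-form) likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

def triplet_probability(mu_a, var_a, mu_p, var_p, mu_n, var_n,
                        n_samples=10_000):
    """Monte Carlo estimate of P(d(a, p) < d(a, n)) when anchor,
    positive, and negative embeddings are diagonal Gaussians."""
    a = mu_a + np.sqrt(var_a) * rng.normal(size=(n_samples, mu_a.size))
    p = mu_p + np.sqrt(var_p) * rng.normal(size=(n_samples, mu_p.size))
    n = mu_n + np.sqrt(var_n) * rng.normal(size=(n_samples, mu_n.size))
    closer = np.sum((a - p) ** 2, axis=1) < np.sum((a - n) ** 2, axis=1)
    return closer.mean()

mu_a = np.zeros(8)
mu_p = mu_a + 0.1   # positive: near the anchor
mu_n = mu_a + 1.0   # negative: far from the anchor
prob = triplet_probability(mu_a, 0.05, mu_p, 0.05, mu_n, 0.05)
print(prob)  # close to 1: the anchor is almost surely nearer the positive
```

Shrinking the gap between `mu_p` and `mu_n`, or inflating the variances, pushes the probability toward 0.5 — which is exactly the calibration signal a deterministic embedding cannot express.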
In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with the size of the data. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading...