Johannes Ballé

ORCID: 0000-0003-0769-8985
Research Areas
  • Image and Signal Denoising Methods
  • Advanced Image Processing Techniques
  • Advanced Data Compression Techniques
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Vision and Imaging
  • Image and Video Quality Assessment
  • Image Enhancement Techniques
  • Video Coding and Compression Technologies
  • Advanced Image Fusion Techniques
  • Neural Networks and Applications
  • Advanced Neural Network Applications
  • Model Reduction and Neural Networks
  • Visual Attention and Saliency Detection
  • Wireless Communication Security Techniques
  • Visual perception and processing mechanisms
  • Image Retrieval and Classification Techniques
  • Adversarial Robustness in Machine Learning
  • Blind Source Separation Techniques
  • Neural dynamics and brain function
  • Privacy-Preserving Technologies in Data
  • Advanced Image and Video Retrieval Techniques
  • Cell Image Analysis Techniques
  • Stochastic dynamics and bifurcation
  • Experimental Learning in Engineering
  • Teaching and Learning Programming

Google (United States)
2018-2024

New York University
2015-2023

Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute
2023

Texas State University
2023

Institut Universitaire de France
2023

Alibaba Group (China)
2023

Google (Switzerland)
2022

Howard Hughes Medical Institute
2014-2017

Courant Institute of Mathematical Sciences
2014-2017

RWTH Aachen University
2006-2012

We describe an image compression method, consisting of a nonlinear analysis transformation, a uniform quantizer, and a synthesis transformation. The transforms are constructed in three successive stages of convolutional linear filters and activation functions. Unlike most neural networks, the joint nonlinearity is chosen to implement a form of local gain control, inspired by those used to model biological neurons. Using a variant of stochastic gradient descent, we jointly optimize the entire model for rate-distortion...

10.48550/arxiv.1611.01704 preprint EN other-oa arXiv (Cornell University) 2016-01-01
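
The pipeline named in the abstract above (analysis transform, uniform quantizer, synthesis transform, judged by rate-distortion performance) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the linear `analysis` and `synthesis` functions stand in for the learned nonlinear transforms, the rate is an empirical entropy estimate, and `lam` is a hypothetical trade-off weight. The sketch only evaluates the objective; it does not train anything.

```python
import math
from collections import Counter


def analysis(x, scale):
    # Toy stand-in for a learned analysis transform (assumption, not the paper's model).
    return [scale * v for v in x]


def quantize(y):
    # Uniform scalar quantizer with unit step size (round to nearest integer).
    return [math.floor(v + 0.5) for v in y]


def synthesis(q, scale):
    # Toy stand-in for a learned synthesis transform (assumption).
    return [v / scale for v in q]


def empirical_rate(q):
    # Empirical entropy of the quantized symbols, in bits per symbol.
    counts = Counter(q)
    n = len(q)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def mse(x, x_hat):
    # Mean squared error, used here as the distortion measure.
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def rd_loss(x, scale, lam):
    # Rate-distortion objective: rate + lam * distortion.
    q = quantize(analysis(x, scale))
    rate = empirical_rate(q)
    distortion = mse(x, synthesis(q, scale))
    return rate + lam * distortion, rate, distortion


x = [0.1, 0.4, 0.6, 0.9]
loss, rate, distortion = rd_loss(x, scale=2.0, lam=100.0)
print(f"rate={rate:.3f} bits/symbol, distortion={distortion:.4f}, loss={loss:.3f}")
```

A larger `scale` gives finer quantization (lower distortion, higher rate); `lam` sets how the two are traded against each other.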

We describe an end-to-end trainable model for image compression based on variational autoencoders. The model incorporates a hyperprior to effectively capture spatial dependencies in the latent representation. This hyperprior relates to side information, a concept universal to virtually all modern image codecs, but largely unexplored in image compression using artificial neural networks (ANNs). Unlike existing autoencoder compression methods, our model trains a complex prior jointly with the underlying autoencoder. We demonstrate that this model leads to state-of-the-art image compression when...

10.48550/arxiv.1802.01436 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Recent models for learned image compression are based on autoencoders, learning approximately invertible mappings from pixels to a quantized latent representation. These are combined with an entropy model, a prior on the latent representation that can be used with standard arithmetic coding algorithms to yield a compressed bitstream. Recently, hierarchical entropy models have been introduced as a way to exploit more structure in the latents than simple fully factorized priors, improving compression performance while maintaining end-to-end optimization....

10.48550/arxiv.1809.02736 preprint EN other-oa arXiv (Cornell University) 2018-01-01

We introduce a parametric nonlinear transformation that is well-suited for Gaussianizing data from natural images. The data are linearly transformed, and each component is then normalized by a pooled activity measure, computed by exponentiating a weighted sum of rectified and exponentiated components and a constant. We optimize the parameters of the full transformation (linear transform, exponents, weights, constant) over a database of natural images, directly minimizing the negentropy of the responses. The optimized transformation substantially Gaussianizes the data, achieving...

10.48550/arxiv.1511.06281 preprint EN other-oa arXiv (Cornell University) 2015-01-01
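
The normalization described in the abstract above can be sketched in a few lines of Python: each component is divided by a pooled activity measure, obtained by exponentiating a weighted sum of rectified, exponentiated components plus a constant. The parameter names (`beta`, `gamma`, `alpha`, `eps`) and the toy values are assumptions for illustration and do not come from this page; the linear transform is assumed to have been applied already, and no parameter optimization is shown.

```python
def normalize(z, beta, gamma, alpha, eps):
    """Divide each component of z by a pooled activity measure.

    z     : linearly transformed input components
    beta  : per-component constant (assumed positive)
    gamma : weights, gamma[i][j] couples component j into the pool of component i
    alpha : exponents applied to the rectified components
    eps   : exponent applied to the pooled sum
    """
    out = []
    for i, z_i in enumerate(z):
        # Weighted sum of rectified (absolute value) and exponentiated components, plus a constant.
        pool = beta[i] + sum(
            gamma[i][j] * abs(z_j) ** alpha[i][j] for j, z_j in enumerate(z)
        )
        # Exponentiate the pooled sum and use it as the divisor.
        out.append(z_i / pool ** eps[i])
    return out


# Toy example: squared components, square-root pooling.
z = [3.0, -4.0]
beta = [1.0, 1.0]
gamma = [[1.0, 1.0], [1.0, 1.0]]
alpha = [[2.0, 2.0], [2.0, 2.0]]
eps = [0.5, 0.5]
y = normalize(z, beta, gamma, alpha, eps)
print(y)
```

With these toy values both components share the divisor sqrt(1 + 9 + 16) = sqrt(26), so the sign of each component is preserved while its magnitude is scaled by the joint activity.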

Despite considerable progress on end-to-end optimized deep networks for image compression, video coding remains a challenging task. Recently proposed methods for learned video compression use optical flow and bilinear warping for motion compensation and show competitive rate-distortion performance relative to hand-engineered codecs like H.264 and HEVC. However, these learning-based methods rely on complex architectures and training schemes including the use of pre-trained optical flow networks, sequential training of sub-networks, adaptive rate control,...

10.1109/cvpr42600.2020.00853 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

We introduce a general framework for end-to-end optimization of the rate-distortion performance of nonlinear transform codes assuming scalar quantization. The framework can be used to optimize any differentiable pair of analysis and synthesis transforms in combination with any differentiable perceptual metric. As an example, we consider a code built from a linear transform followed by a form of multi-dimensional local gain control. Distortion is measured with a state-of-the-art perceptual metric. When optimized over a large database of images, this representation offers...

10.1109/pcs.2016.7906310 article EN 2016-01-01

We review a class of methods that can be collected under the name nonlinear transform coding (NTC), which over the past few years have become competitive with the best linear transform codecs for images, and have superseded them in terms of rate-distortion performance under established perceptual quality metrics such as MS-SSIM. We assess the empirical rate-distortion performance of NTC with the help of simple example sources, for which the optimal performance of a vector quantizer is easier to estimate than with natural data sources. To this end, we introduce a novel variant of entropy-constrained vector quantization....

10.1109/jstsp.2020.3034501 article EN cc-by IEEE Journal of Selected Topics in Signal Processing 2020-10-28

In today's teaching and learning approaches for first-semester students, practical courses more and more often complement traditional theoretical lectures. This practical element allows an early insight into the real world of engineering, augments student motivation, and enables students to acquire soft skills early. This paper describes a new freshman introduction course, which has been established within the Bachelor of Science curriculum of Electrical Engineering and Information Technology of RWTH Aachen University, Germany. The course is...

10.1109/te.2009.2017272 article EN IEEE Transactions on Education 2009-08-21

We present an image quality metric based on the transformations associated with the early visual system: local luminance subtraction and local gain control. Images are decomposed using a Laplacian pyramid, which subtracts a local estimate of the mean luminance at multiple scales. Each pyramid coefficient is then divided by a local estimate of amplitude (weighted sum of absolute values of neighbors), where the weights are optimized for prediction of amplitude using (undistorted) images from a separate database. We define the quality of a distorted image, relative to its undistorted original, as...

10.2352/issn.2470-1173.2016.16.hvei-103 article EN Electronic Imaging 2016-02-14

We assess the performance of two techniques in the context of nonlinear transform coding with artificial neural networks, Sadam and GDN. Both techniques have been successfully used in state-of-the-art image compression methods, but their performance has not been individually assessed to this point. Together, the techniques stabilize the training procedure of nonlinear image transforms and increase their capacity to approximate the (unknown) rate-distortion optimal transform functions. Besides comparing the techniques to established alternatives, we detail the implementation of both methods and provide open-source code...

10.1109/pcs.2018.8456272 article EN 2018-06-01

We develop a framework for rendering photographic images, taking into account display limitations, so as to optimize perceptual similarity between the rendered image and the original scene. We formulate this as a constrained optimization problem, in which we minimize a measure of perceptual dissimilarity, the Normalized Laplacian Pyramid Distance (NLPD), which mimics the early stage transformations of the human visual system. When rendering images acquired with higher dynamic range than that of the display, we find that the optimized solution boosts the contrast...

10.1364/josaa.34.001511 article EN Journal of the Optical Society of America A 2017-08-10

Pre-trained convolutional neural networks (CNNs) are powerful off-the-shelf feature generators and have been shown to perform very well on a variety of tasks. Unfortunately, the generated features are high dimensional and expensive to store: potentially hundreds of thousands of floats per example when processing videos. Traditional entropy based lossless compression methods are of little help as they do not yield the desired level of compression, while general purpose lossy compression methods based on energy compaction (e.g. PCA followed by...

10.1109/icip40778.2020.9190860 article EN 2020 IEEE International Conference on Image Processing (ICIP) 2020-09-30

Image compression using neural networks has reached or exceeded non-neural methods (such as JPEG, WebP, BPG). While these networks are state of the art in rate-distortion performance, computational feasibility of these models remains a challenge. We apply automatic network optimization techniques to reduce the computational complexity of a popular architecture used in neural image compression, analyze the decoder complexity in execution runtime and explore the trade-offs between two distortion metrics, rate-distortion performance and run-time performance to design and research more...

10.48550/arxiv.1912.08771 preprint EN other-oa arXiv (Cornell University) 2019-01-01

In this paper, we investigate the use of linear, parametric models of static and dynamic texture in the context of conventional transform coding of images and video. We propose a hybrid approach incorporating both conventional transform coding and texture-specific methods for improvement of coding efficiency. Regarding static (i.e., purely spatial) texture, we show that Gaussian Markov random fields (GMRFs) can be used for analysis/synthesis of a certain class of texture. The properties of the model allow us to derive optimal methods for classification, analysis, quantization and synthesis. For...

10.1109/jstsp.2011.2166246 article EN IEEE Journal of Selected Topics in Signal Processing 2011-08-31

Some forms of novel visual media enable the viewer to explore a 3D scene from essentially arbitrary viewpoints, by interpolating between a discrete set of original views. Compared to 2D imagery, these types of applications require much larger amounts of storage space, which we seek to reduce. Existing approaches for compressing 3D scenes are often based on a separation of compression and rendering: each of the original views is compressed using traditional 2D image formats; the receiver decompresses the views and then performs the rendering. We unify these steps...

10.1109/pcs50896.2021.9477505 article EN 2021-06-01

Efficient intra prediction is an important aspect of video coding with high compression efficiency. H.264/AVC applies directional prediction from neighboring pixels on an adjustable block size for local decorrelation. In this paper, we present an extended prediction scheme in the context of H.264/AVC that comprises two additional prediction methods exploiting self-similar properties of the encoded texture. A new macroblock type is implemented, allowing for a flexible selection of the available prediction methods for sub-partitions of the macroblock. Depending on the content of the sequence, substantial gains...

10.1109/icip.2007.4379529 article EN 2007-01-01

We develop a method for comparing hierarchical image representations in terms of their ability to explain perceptual sensitivity in humans. Specifically, we utilize Fisher information to establish a model-derived prediction of sensitivity to local perturbations of an image. For a given image, we compute the eigenvectors of the Fisher information matrix with largest and smallest eigenvalues, corresponding to the model-predicted most- and least-noticeable image distortions, respectively. For human subjects, we then measure the amount of each distortion that can be reliably detected...

10.48550/arxiv.1710.02266 preprint EN other-oa arXiv (Cornell University) 2017-01-01
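
The eigenvector computation mentioned in the abstract above can be illustrated with a small Python sketch. It rests on assumptions that are not stated on this page: the model response is taken to be corrupted by additive white Gaussian noise, so that the Fisher information matrix reduces to J^T J, where J is the Jacobian of the model at the given input. The function `toy_model`, the finite-difference Jacobian, and the power iteration are hypothetical stand-ins chosen to keep the example self-contained.

```python
import math


def toy_model(x):
    # Hypothetical two-dimensional "representation"; not a model from the paper.
    return [2.0 * x[0] + x[0] ** 3, 0.5 * x[1]]


def jacobian(f, x, h=1e-5):
    # Central finite differences: J[i][j] approximates d f_i / d x_j.
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J


def fisher_matrix(J):
    # Under the additive white Gaussian noise assumption: F = J^T J.
    m, n = len(J), len(J[0])
    return [[sum(J[k][a] * J[k][b] for k in range(m)) for b in range(n)]
            for a in range(n)]


def power_iteration(mat, iters=200):
    # Returns the dominant eigenvalue and eigenvector of a symmetric PSD matrix.
    n = len(mat)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


x0 = [0.0, 0.0]
F = fisher_matrix(jacobian(toy_model, x0))

# Largest eigenvalue: direction the model predicts to be most noticeable.
lam_max, v_max = power_iteration(F)

# Smallest eigenvalue via the shifted matrix lam_max * I - F.
n = len(F)
shifted = [[(lam_max if i == j else 0.0) - F[i][j] for j in range(n)]
           for i in range(n)]
mu, v_min = power_iteration(shifted)
lam_min = lam_max - mu

print("most noticeable direction:", v_max, "eigenvalue", lam_max)
print("least noticeable direction:", v_min, "eigenvalue", lam_min)
```

For an image-sized input the Jacobian would not be formed explicitly; the power iteration is used here because it only needs matrix-vector products, which is the design choice that would carry over to larger models.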

We consider lossy compression of an information source when the decoder has lossless access to a correlated one. This setup, also known as the Wyner-Ziv problem, is a special case of distributed source coding. To this day, real-world applications of this problem have neither been fully developed nor heavily investigated. We propose a data-driven method based on machine learning that leverages the universal function approximation capability of artificial neural networks. We find that our neural network-based compression scheme re-discovers some...

10.1109/isit54713.2023.10206542 article EN 2023 IEEE International Symposium on Information Theory (ISIT) 2023-06-25

We introduce a distortion measure for images, Wasserstein distortion, that simultaneously generalizes pixel-level fidelity on the one hand and realism or perceptual quality on the other. We discuss its metric properties. Pairs of images that are close under Wasserstein distortion illustrate its utility. In particular, we generate random images that have high fidelity to a reference image in one location and smoothly transition to an independent realization as one moves away from this point. Wasserstein distortion represents a generalization and synthesis of prior work on texture generation, models of early...

10.1109/ciss59072.2024.10480168 article EN 2024-03-13