Victor Lempitsky

ORCID: 0000-0003-4118-710X
Research Areas
  • Advanced Vision and Imaging
  • Advanced Image and Video Retrieval Techniques
  • Generative Adversarial Networks and Image Synthesis
  • 3D Shape Modeling and Analysis
  • Computer Graphics and Visualization Techniques
  • Domain Adaptation and Few-Shot Learning
  • Advanced Neural Network Applications
  • Face recognition and analysis
  • Robotics and Sensor-Based Localization
  • Image Retrieval and Classification Techniques
  • Advanced Image Processing Techniques
  • Human Pose and Action Recognition
  • Medical Image Segmentation Techniques
  • Cell Image Analysis Techniques
  • Video Surveillance and Tracking Methods
  • Image Processing Techniques and Applications
  • Image and Object Detection Techniques
  • Digital Media Forensic Detection
  • Image Enhancement Techniques
  • Image and Signal Denoising Methods
  • 3D Surveying and Cultural Heritage
  • Digital Imaging for Blood Diseases
  • Anomaly Detection Techniques and Applications
  • Adversarial Robustness in Machine Learning
  • Remote-Sensing Image Classification

Skolkovo Institute of Science and Technology
2013-2022

Samsung (Russia)
2019-2022

Yandex (Russia)
2012-2022

National Academy of Sciences of Armenia
2022

Samsung (United States)
2018-2021

Samsung (South Korea)
2020-2021

Samsung (United Kingdom)
2019

Institut national de recherche en informatique et en automatique
2016

Moscow Institute of Physics and Technology
2015-2016

Massachusetts Institute of Technology
2015

In this paper we revisit the fast stylization method introduced in Ulyanov et al. (2016). We show how a small change in the stylization architecture results in a significant qualitative improvement in the generated images. The change is limited to swapping batch normalization with instance normalization, and to applying the latter both at training and testing times. The resulting method can be used to train high-performance architectures for real-time image generation. The code will be made available on GitHub at https://github.com/DmitryUlyanov/texture_nets. Full...

10.48550/arxiv.1607.08022 preprint EN other-oa arXiv (Cornell University) 2016-01-01
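
The swap described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation (that lives in the linked repository); the nested-list layout `batch[n][c]`, the function names, and the `eps` value are assumptions made for illustration, and the learned scale and shift parameters are omitted.

```python
import math

def _normalize(values, eps=1e-5):
    """Shift and scale a flat list of numbers to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def instance_norm(batch, eps=1e-5):
    """batch[n][c] is a flat list of pixel values for image n, channel c.
    Every (image, channel) pair is normalized with its own statistics."""
    return [[_normalize(channel, eps) for channel in image] for image in batch]

def batch_norm(batch, eps=1e-5):
    """Training-mode batch normalization: the statistics of each channel
    are pooled over all images in the batch."""
    n_channels = len(batch[0])
    out = [[None] * n_channels for _ in batch]
    for c in range(n_channels):
        pooled = [v for image in batch for v in image[c]]
        mean = sum(pooled) / len(pooled)
        var = sum((v - mean) ** 2 for v in pooled) / len(pooled)
        for n, image in enumerate(batch):
            out[n][c] = [(v - mean) / math.sqrt(var + eps) for v in image[c]]
    return out

# Two single-channel "images" that differ only in contrast (hypothetical data).
batch = [[[0.0, 1.0, 2.0, 3.0]], [[0.0, 10.0, 20.0, 30.0]]]
inst = instance_norm(batch)
bn = batch_norm(batch)
```

In this toy batch, instance normalization maps both images to nearly the same values, while batch normalization keeps their contrast difference because the statistics are shared across the batch.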

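Top-performing deep architectures are trained on massive amounts of labeled data. In the absence of labeled data for a certain task, domain adaptation often provides an attractive option given that labeled data of similar nature but from a different domain (e.g. synthetic images) are available. Here, we propose a new approach to domain adaptation in deep architectures that can be trained on a large amount of labeled data from the source domain and a large amount of unlabeled data from the target domain (no labeled target-domain data is necessary). As the training progresses, the approach promotes the emergence of "deep" features that are (i) discriminative for the main learning task on the source domain and (ii) invariant with...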

10.48550/arxiv.1409.7495 preprint EN other-oa arXiv (Cornell University) 2014-01-01

Deep convolutional networks have become a popular tool for image generation and restoration. Generally, their excellent performance is imputed to their ability to learn realistic image priors from a large number of example images. In this paper, we show that, on the contrary, the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning. In order to do so, we show that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising,...

10.1109/cvpr.2018.00984 article EN 2018-06-01

We present a new deep learning architecture (called Kd-network) that is designed for 3D model recognition tasks and works with unstructured point clouds. The new architecture performs multiplicative transformations and shares parameters of these transformations according to the subdivisions of the point clouds imposed onto them by kd-trees. Unlike the currently dominant convolutional architectures that usually require rasterization on uniform two-dimensional or three-dimensional grids, Kd-networks do not rely on such grids in any way and therefore avoid poor...

10.1109/iccv.2017.99 article EN 2017-10-01

A large number of novel encodings for bag of visual words models have been proposed in the past two years to improve on the standard histogram of quantized local features. Examples include locality-constrained linear encoding [23], improved Fisher encoding [17], super vector encoding [27], and kernel codebook encoding [20]. While several authors have reported very good results on the challenging PASCAL VOC classification data by means of these new techniques, differences in the feature computation and learning algorithms, missing details in the description...

10.5244/c.25.76 article EN 2011-01-01

The recent work of Gatys et al., who characterized the style of an image by the statistics of convolutional neural network filters, ignited a renewed interest in the texture generation and image stylization problems. While their image generation technique uses a slow optimization process, recently several authors have proposed to learn generator neural networks that can produce similar outputs in one quick forward pass. While generator networks are promising, they are still inferior in visual quality and diversity compared to generation-by-optimization. In this work, we advance...

10.1109/cvpr.2017.437 article EN 2017-07-01

Several recent works have shown that image descriptors produced by deep convolutional neural networks provide state-of-the-art performance for image classification and retrieval problems. It has also been shown that the activations from the convolutional layers can be interpreted as local features describing particular image regions. These local features can be aggregated using aggregation methods developed for local features (e.g. Fisher vectors), thus providing a new powerful global descriptor. In this paper we investigate possible ways to aggregate local deep features to produce compact...

10.1109/iccv.2015.150 article EN 2015-12-01

Several recent works have shown how highly realistic human head images can be obtained by training convolutional neural networks to generate them. In order to create a personalized talking head model, these works require training on a large dataset of images of a single person. However, in many practical scenarios, such personalized talking head models need to be learned from a few image views of a person, potentially even a single image. Here, we present a system with such few-shot capability. It performs lengthy meta-learning on a large dataset of videos, and after that is able to frame few- and one-shot...

10.1109/iccv.2019.00955 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

The paper introduces Hough forests, which are random forests adapted to perform a generalized Hough transform in an efficient way. Compared to previous Hough-based systems such as implicit shape models, Hough forests improve the performance of the generalized Hough transform for object detection on a categorical level. At the same time, their flexibility permits extensions of the Hough transform to new domains such as object tracking and action recognition. Hough forests can be regarded as task-adapted codebooks of local appearance that allow fast supervised training and fast matching at test time. They achieve...

10.1109/tpami.2011.70 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2011-04-06

Modern image inpainting systems, despite the significant progress, often struggle with large missing areas, complex geometric structures, and high-resolution images. We find that one of the main reasons for that is the lack of an effective receptive field in both the inpainting network and the loss function. To alleviate this issue, we propose a new method called large mask inpainting (LaMa). LaMa is based on i) a new inpainting network architecture that uses fast Fourier convolutions (FFCs), which have an image-wide receptive field; ii) a high receptive field perceptual loss; iii) large training masks, which unlocks...

10.1109/wacv51458.2022.00323 article EN 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022-01-01

We present a method for the detection of instances of an object class, such as cars or pedestrians, in natural images. Similarly to some previous works, this is accomplished via a generalized Hough transform, where the detections of individual object parts cast probabilistic votes for possible locations of the centroid of the whole object; the detection hypotheses then correspond to the maxima of the Hough image that accumulates the votes from all parts. However, whereas the previous methods detect the object parts using generative codebooks of part appearances, we take a more discriminative approach...

10.1109/cvpr.2009.5206740 article EN 2009 IEEE Conference on Computer Vision and Pattern Recognition 2009-06-01
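
The voting step described in the abstract above can be sketched as follows. This is a generic illustration of generalized Hough voting, not the paper's method: the part detections, offsets, and weights are hypothetical hand-written values, whereas the paper obtains them from a learned part detector.

```python
def hough_votes(height, width, part_detections):
    """Accumulate weighted centroid votes into a Hough image.

    part_detections is a list of (row, col, votes); votes is a list of
    (d_row, d_col, weight) offsets from the part to a possible centroid.
    """
    hough = [[0.0] * width for _ in range(height)]
    for row, col, votes in part_detections:
        for d_row, d_col, weight in votes:
            r, c = row + d_row, col + d_col
            # Votes that fall outside the image are discarded.
            if 0 <= r < height and 0 <= c < width:
                hough[r][c] += weight
    return hough

def strongest_hypothesis(hough):
    """Return (row, col, score) of the global maximum of the Hough image."""
    cells = ((r, c, score)
             for r, line in enumerate(hough)
             for c, score in enumerate(line))
    return max(cells, key=lambda cell: cell[2])

# Three hypothetical part detections; each has one vote pointing at (5, 5).
detections = [
    (2, 2, [(3, 3, 0.8), (1, 0, 0.2)]),
    (8, 5, [(-3, 0, 0.7), (0, 2, 0.3)]),
    (5, 9, [(0, -4, 0.9)]),
]
hough = hough_votes(10, 10, detections)
best = strongest_hypothesis(hough)
```

A practical detector would usually smooth the Hough image and keep several local maxima instead of a single global one; the sketch keeps only the accumulation and the argmax.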

We revisit the idea of brain damage, i.e. the pruning of the coefficients of a neural network, and suggest how brain damage can be modified and used to speed up convolutional layers in ConvNets. The approach uses the fact that many efficient implementations reduce generalized convolutions to matrix multiplications. The suggested brain damage process prunes the convolutional kernel tensor in a group-wise fashion. After such pruning, convolutions can be reduced to multiplications of thinned dense matrices, which leads to speedup. We investigate different ways to add group-wise pruning to the learning process, and show...

10.1109/cvpr.2016.280 article EN 2016-06-01

Gatys et al. recently demonstrated that deep networks can generate beautiful textures and stylized images from a single texture example. However, their methods require a slow and memory-consuming optimization process. We propose here an alternative approach that moves the computational burden to a learning stage. Given a single example of a texture, our approach trains compact feed-forward convolutional networks to generate multiple samples of the same texture of arbitrary size and to transfer artistic style from a given image to any other image. The resulting networks are remarkably...

10.48550/arxiv.1603.03417 preprint EN other-oa arXiv (Cornell University) 2016-01-01

A user-provided object bounding box is a simple and popular interaction paradigm considered by many existing interactive image segmentation frameworks. However, these frameworks tend to exploit the provided bounding box merely to exclude its exterior from consideration and sometimes to initialize the energy minimization. In this paper, we discuss how the bounding box can be further used to impose a powerful topological prior, which prevents the solution from excessive shrinking and ensures that the user-provided box bounds the segmentation in a sufficiently tight way. The prior...

10.1109/iccv.2009.5459262 article EN 2009-09-01

We present two novel solutions for multi-view 3D human pose estimation based on new learnable triangulation methods that combine 3D information from multiple 2D views. The first (baseline) solution is a basic differentiable algebraic triangulation with an addition of confidence weights estimated from the input images. The second solution is based on a novel method of volumetric aggregation from intermediate 2D backbone feature maps. The aggregated volume is then refined via 3D convolutions that produce final 3D joint heatmaps and allow implicit modelling of a human pose prior. Crucially,...

10.1109/iccv.2019.00781 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

We introduce a new compression scheme for high-dimensional vectors that approximates the vectors using sums of M codewords coming from M different codebooks. We show that the proposed scheme permits efficient distance and scalar product computations between compressed and uncompressed vectors. We further suggest vector encoding and codebook learning algorithms that can minimize the coding error within the proposed scheme. In the experiments, we demonstrate that the proposed compression can be used instead of or together with product quantization. Compared to product quantization and its optimized versions,...

10.1109/cvpr.2014.124 article EN 2014 IEEE Conference on Computer Vision and Pattern Recognition 2014-06-01
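
The scalar product computation mentioned in the abstract above can be sketched under simple assumptions. The codebooks, the code, and the query below are hypothetical toy values, and the function names are illustrative. The sketch covers only decoding and lookup-based scalar products; the encoding and codebook learning algorithms from the paper are not shown.

```python
def dot(a, b):
    """Plain scalar product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def decode(codebooks, code):
    """Reconstruct the approximation of a compressed vector: the sum of one
    full-dimensional codeword from each of the M codebooks."""
    dim = len(codebooks[0][0])
    x = [0.0] * dim
    for codebook, idx in zip(codebooks, code):
        for d in range(dim):
            x[d] += codebook[idx][d]
    return x

def build_lookup(codebooks, query):
    """Precompute <query, codeword> for every codeword of every codebook."""
    return [[dot(query, codeword) for codeword in codebook]
            for codebook in codebooks]

def scalar_product(lookup, code):
    """<query, x> for a compressed x: M table lookups and M - 1 additions."""
    return sum(table[idx] for table, idx in zip(lookup, code))

# M = 2 codebooks with K = 2 codewords each, in 3 dimensions (toy values).
codebooks = [
    [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]],
    [[0.0, -1.0, 1.0], [2.0, 2.0, 0.0]],
]
code = [1, 0]              # one codeword index per codebook
query = [3.0, 1.0, -2.0]   # uncompressed query vector

lookup = build_lookup(codebooks, query)
fast = scalar_product(lookup, code)
slow = dot(query, decode(codebooks, code))
```

The lookup tables are built once per query and reused for every compressed vector. A squared Euclidean distance additionally needs the squared norm of the reconstruction, since ||q - x||^2 = ||q||^2 - 2<q, x> + ||x||^2; how that norm is obtained is left out of this sketch.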