Taras Khakhulin

ORCID: 0000-0003-4753-4811
Research Areas
  • Advanced Vision and Imaging
  • Generative Adversarial Networks and Image Synthesis
  • Computer Graphics and Visualization Techniques
  • Topic Modeling
  • Advanced Text Analysis Techniques
  • Digital Media Forensic Detection
  • Natural Language Processing Techniques
  • Advanced Image Processing Techniques
  • Robotics and Sensor-Based Localization
  • Cell Image Analysis Techniques
  • 3D Shape Modeling and Analysis
  • Face recognition and analysis
  • Human Pose and Action Recognition
  • Web Data Mining and Analysis
  • Image and Video Stabilization
  • Image Enhancement Techniques
  • Speech and dialogue systems
  • Reinforcement Learning in Robotics
  • Advanced Neural Network Applications
  • Parallel Computing and Optimization Techniques
  • Quantum many-body systems
  • Constraint Satisfaction and Optimization
  • Sentiment Analysis and Opinion Mining
  • Quantum Computing Algorithms and Architecture
  • Image Processing Techniques and Applications

Samsung (Russia)
2020-2023

Skolkovo Institute of Science and Technology
2018-2023

Synthace (United Kingdom)
2023

Moscow Institute of Physics and Technology
2018

Existing image generator networks rely heavily on spatial convolutions and, optionally, self-attention blocks in order to gradually synthesize images in a coarse-to-fine manner. Here, we present a new architecture for image generators, where the color value at each pixel is computed independently given the value of a random latent vector and the coordinate of that pixel. No spatial convolutions or similar operations that propagate information across pixels are involved during the synthesis. We analyze the modeling capabilities of such generators when...

10.1109/cvpr46437.2021.01405 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Representing human performance at high fidelity is an essential building block in diverse applications, such as film production, computer games or videoconferencing. To close the gap to production-level quality, we introduce HumanRF, a 4D dynamic neural scene representation that captures full-body appearance in motion from multi-view video input, and enables playback from novel, unseen viewpoints. Our novel representation acts as a dynamic video encoding that captures fine details at high compression rates by factorizing space-time into a temporal...

10.1145/3592415 article EN ACM Transactions on Graphics 2023-07-26

Mikhail Burtsev, Alexander Seliverstov, Rafael Airapetyan, Mikhail Arkhipov, Dilyara Baymurzina, Nickolay Bushkov, Olga Gureenkova, Taras Khakhulin, Yuri Kuratov, Denis Kuznetsov, Alexey Litinsky, Varvara Logacheva, Alexey Lymar, Valentin Malykh, Maxim Petrov, Vadim Polulyakh, Leonid Pugachev, Alexey Sorokin, Maria Vikhreva, Marat Zaynutdinov. Proceedings of ACL 2018, System Demonstrations. 2018.

10.18653/v1/p18-4021 article EN cc-by 2018-01-01

In this work, we advance the neural head avatar technology to the megapixel resolution while focusing on the particularly challenging task of cross-driving synthesis, i.e., when the appearance of the driving image is substantially different from the animated source image. We propose a set of new neural architectures and training methods that can leverage both medium-resolution video data and high-resolution image data to achieve the desired levels of rendered image quality and generalization to novel views and motion. We demonstrate that the suggested architectures and methods produce convincing...

10.1145/3503161.3547838 article EN Proceedings of the 30th ACM International Conference on Multimedia 2022-10-10

Modeling daytime changes in high-resolution photographs, e.g., re-rendering the same scene under different illuminations typical for day, night, or dawn, is a challenging image manipulation task. We present the high-resolution daytime translation (HiDT) model for this task. HiDT combines a generative image-to-image model and a new upsampling scheme that allows applying image translation at high resolution. The model demonstrates competitive results in terms of both commonly used GAN metrics and human evaluation. Importantly, this good performance comes as a result...

10.1109/cvpr42600.2020.00751 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

We present a new method for lightweight novel-view synthesis that generalizes to an arbitrary forward-facing scene. Recent approaches are computationally expensive, require per-scene optimization, or produce a memory-expensive representation. We start by representing the scene with a set of fronto-parallel semitransparent planes and afterwards convert them to deformable layers in an end-to-end manner. Additionally, we employ a feed-forward refinement procedure that corrects the estimated representation by aggregating...

10.1109/wacv56688.2023.00429 article EN 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01

Representing scenes with multiple semitransparent colored layers has been a popular and successful choice for real-time novel view synthesis. Existing approaches infer colors and transparency values over regularly spaced layers of planar or spherical shape. In this work, we introduce a new view synthesis approach based on multiple semitransparent layers with scene-adapted geometry. Our approach infers such representations from stereo pairs in two stages. The first stage produces the geometry of a small number of data-adaptive layers from a given pair of views. The second stage infers the color and the transparency values for these...

10.1109/cvpr52688.2022.00849 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

We suggest a new language-independent architecture of robust word vectors (RoVe). It is designed to alleviate the issue of typos, which are common in almost any user-generated content and hinder automatic text processing. Our model is morphologically motivated, which allows it to deal with unseen word forms in morphologically rich languages. We present results on a number of Natural Language Processing (NLP) tasks and languages for a variety of related architectures and show that the proposed architecture is typo-proof.

10.18653/v1/w18-6108 article EN cc-by 2018-01-01

Tensor networks are the main building blocks in a wide variety of computational sciences, ranging from many-body theory and quantum computing to probability and machine learning. Here we propose a parallel algorithm for the contraction of tensor networks using probabilistic graphical models. Our approach is based on the heuristic solution of the $\mu$-treewidth deletion problem in graph theory. We apply the resulting algorithm to the simulation of random quantum circuits and discuss the extensions for general tensor network contractions.

10.1103/physreva.102.062614 article EN Physical Review A 2020-12-28

Aspect extraction from user reviews is one of the sources for building dialog systems, which are on the rise now. A typical user of a conversation system has no time to check spelling or grammar in his or her utterances. As a result, the utterances contain typos and errors, so noise robustness should be considered a significant feature of an aspect extraction model. We analyze the noise robustness of the state-of-the-art Attention-Based Aspect Extraction technique and propose extensions for this model, which lead to more robust behaviour in the presence of typos....

10.1109/ic-aiai.2018.8674450 article EN 2018-10-01

We propose a Reinforcement Learning based approach to approximately solve the Tree Decomposition (TD) problem. TD is a combinatorial problem, which is central to the analysis of graph minor structure and computational complexity, as well as in the algorithms of probabilistic inference, register allocation, and other practical tasks. Recently, it has been shown that combinatorial problems can be successfully solved by learned heuristics. However, the majority of existing works do not address the question of the generalization of learning-based...

10.48550/arxiv.1910.08371 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Modeling daytime changes in high-resolution photographs, e.g., re-rendering the same scene under different illuminations typical for day, night, or dawn, is a challenging image manipulation task. We present the high-resolution daytime translation (HiDT) model for this task. HiDT combines a generative image-to-image model and a new upsampling scheme that allows applying image translation at high resolution. The model demonstrates competitive results in terms of both commonly used GAN metrics and human evaluation. Importantly, this good performance comes as a result...

10.48550/arxiv.2003.08791 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Existing image generator networks rely heavily on spatial convolutions and, optionally, self-attention blocks in order to gradually synthesize images in a coarse-to-fine manner. Here, we present a new architecture for image generators, where the color value at each pixel is computed independently given the value of a random latent vector and the coordinate of that pixel. No spatial convolutions or similar operations that propagate information across pixels are involved during the synthesis. We analyze the modeling capabilities of such generators when...

10.48550/arxiv.2011.13775 preprint EN cc-by arXiv (Cornell University) 2020-01-01

Representing scenes with multiple semi-transparent colored layers has been a popular and successful choice for real-time novel view synthesis. Existing approaches infer colors and transparency values over regularly-spaced layers of planar or spherical shape. In this work, we introduce a new view synthesis approach based on multiple semi-transparent layers with scene-adapted geometry. Our approach infers such representations from stereo pairs in two stages. The first stage infers the geometry of a small number of data-adaptive layers from a given pair of views. The second stage infers the color and the transparency values for these...

10.48550/arxiv.2201.05023 preprint EN cc-by arXiv (Cornell University) 2022-01-01

We present a system for realistic one-shot mesh-based human head avatars creation, ROME for short. Using a single photograph, our model estimates a person-specific head mesh and the associated neural texture, which encodes both local photometric and geometric details. The resulting avatars are rigged and can be rendered using a neural network, which is trained alongside the mesh and texture estimators on a dataset of in-the-wild videos. In the experiments, we observe that our system performs competitively both in terms of head geometry recovery and the quality of renders, especially...

10.48550/arxiv.2206.08343 preprint EN cc-by-sa arXiv (Cornell University) 2022-01-01

We present a new method for lightweight novel-view synthesis that generalizes to an arbitrary forward-facing scene. Recent approaches are computationally expensive, require per-scene optimization, or produce a memory-expensive representation. We start by representing the scene with a set of fronto-parallel semitransparent planes and afterward convert them to deformable layers in an end-to-end manner. Additionally, we employ a feed-forward refinement procedure that corrects the estimated representation by aggregating...

10.48550/arxiv.2210.01602 preprint EN cc-by-sa arXiv (Cornell University) 2022-01-01

In this work, we advance the neural head avatar technology to the megapixel resolution while focusing on the particularly challenging task of cross-driving synthesis, i.e., when the appearance of the driving image is substantially different from the animated source image. We propose a set of new neural architectures and training methods that can leverage both medium-resolution video data and high-resolution image data to achieve the desired levels of rendered image quality and generalization to novel views and motion. We demonstrate that the suggested architectures and methods produce convincing...

10.48550/arxiv.2207.07621 preprint EN cc-by-nc-sa arXiv (Cornell University) 2022-01-01