- Advanced Optical Imaging Technologies
- Image and Video Quality Assessment
- Advanced Vision and Imaging
- Advanced Image Processing Techniques
- Visual Perception and Processing Mechanisms
- Advanced Image and Video Retrieval Techniques
- Image Retrieval and Classification Techniques
- Advanced Data Compression Techniques
- Image Enhancement Techniques
- Gaze Tracking and Assistive Technology
- Advanced Image Fusion Techniques
- Virtual Reality Applications and Impacts
- Image and Signal Denoising Methods
- Video Coding and Compression Technologies
- Multimedia Communication and Technology
- Digital Holography and Microscopy
- Visual Attention and Saliency Detection
- Generative Adversarial Networks and Image Synthesis
- Internet of Things and Social Network Interactions
- Ophthalmology and Eye Disorders
- Glaucoma and Retinal Disorders
- Advanced Optical System Design
- Augmented Reality Applications
- Video Analysis and Summarization
- Surface Roughness and Optical Measurements
Moscow Power Engineering Institute
2025
Huawei Technologies (Germany)
2019-2024
Huawei German Research Center
2021-2024
Tampere University
2007-2015
International Society for Optics and Photonics
2015
Society for Imaging Science and Technology
2015
Tampere University of Applied Sciences
2007-2014
Signal Processing (United States)
2008
Entropy estimation is essential for the performance of learned image compression. It has been demonstrated that a transformer-based entropy model is of critical importance for achieving a high compression ratio, however, at the expense of significant computational effort. In this work, we introduce the Efficient Contextformer (eContextformer), a computationally efficient autoregressive context model. The eContextformer efficiently fuses the patch-wise, checkered, and channel-wise grouping techniques for parallel context modeling,...
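To make the "checkered" grouping in the abstract above concrete, here is a minimal sketch of a generic checkerboard split of latent positions. It is an illustration under stated assumptions, not the eContextformer implementation: the function names `checkerboard_groups` and `neighbours` and the simple two-group split are hypothetical. The idea is that every position in the second group has all of its direct neighbours in the first group, so the second group can be modeled in parallel once the first is known.

```python
def checkerboard_groups(height, width):
    """Split grid positions into two interleaved groups by (row + col) parity.

    Hypothetical helper for illustration: 'anchors' would be coded first,
    'non_anchors' afterwards, all at once, conditioned on the anchors.
    """
    anchors, non_anchors = [], []
    for r in range(height):
        for c in range(width):
            if (r + c) % 2 == 0:
                anchors.append((r, c))
            else:
                non_anchors.append((r, c))
    return anchors, non_anchors


def neighbours(r, c, height, width):
    """Return the 4-connected neighbours of (r, c) that lie inside the grid."""
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < height and 0 <= cc < width:
            result.append((rr, cc))
    return result


if __name__ == "__main__":
    first, second = checkerboard_groups(4, 4)
    print(len(first), len(second))
```

On a 4x4 grid the two groups each hold half of the positions, which is what allows the second pass to run in a single parallel step instead of position by position.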
This paper aims at providing an overview of the core technologies enabling the delivery of 3-D media to next-generation mobile devices. To succeed in the design of the corresponding system, a profound knowledge of the human visual system and the cues that form the perception of depth, combined with an understanding of the user requirements for designing the experience of 3-D media, is required. These aspects are addressed first and related to the critical parts of the generic system within a novel user-centered research framework. Next-generation devices are characterized...
We identify, categorize and simulate artifacts which might occur during the delivery of stereoscopic video to mobile devices. We consider the stages of the 3D dataflow: content creation, conversion to the desired format (multiview or source-plus-depth), coding/decoding, transmission, and visualization on the display. Human vision works by assessing various depth cues: accommodation, binocular cues, pictorial cues, and motion parallax. As a consequence, any artifact that modifies these cues impairs the quality of the scene. The perceptibility of each artifact can...
Accommodation-free displays, also known as Maxwellian displays, keep the displayed image sharp regardless of the viewer's focal distance. However, they typically suffer from a small eye-box and a limited effective field of view (FOV), which requires careful alignment before the viewer can see the image. This paper presents a high-quality accommodation-free head-mounted display (aHMD) based on pixel beam scanning for forming the image directly on the retina. It has an enlarged FOV for easy viewing, obtained by replicating points with an array of splitters. A prototype aHMD is...
Autostereoscopic displays utilizing slanted lenticular sheets produce specific artifacts. These artifacts affect the perception of a 3D scene, and are caused by a process which can be modeled as inter-channel crosstalk. We propose a methodology for measuring such crosstalk for an arbitrary multiview display. The measured data might be used for optimizing image sets for a given...
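One common way to picture inter-channel crosstalk is as a linear mixture: what is seen from a given observation position is a weighted sum of all views, not a single view. The sketch below assumes exactly that linear model; the function name `apply_crosstalk` and the weight values are hypothetical and are not taken from the measurement methodology in the abstract above.

```python
def apply_crosstalk(views, weights):
    """Mix per-view intensities with a crosstalk weight matrix.

    views      : list of N intensities, one per view, at one screen position.
    weights    : N_obs x N matrix; weights[i][j] is the assumed fraction of
                 view j that is visible from observation position i.
    Returns the list of intensities seen from each observation position.
    """
    for row in weights:
        if len(row) != len(views):
            raise ValueError("each weight row needs one entry per view")
    return [sum(w * v for w, v in zip(row, views)) for row in weights]


if __name__ == "__main__":
    # Two views: view 0 is white (1.0), view 1 is black (0.0).
    views = [1.0, 0.0]
    # Assumed weights: 10% of view 1 leaks into position 0,
    # 20% of view 0 leaks into position 1.
    weights = [[0.9, 0.1],
               [0.2, 0.8]]
    print(apply_crosstalk(views, weights))
```

With an identity weight matrix the output equals the input views, which corresponds to a display with no crosstalk.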
Image restoration is a critical component of image processing pipelines and of low-level computer vision tasks. Conventional approaches are mostly based on hand-crafted priors. The inter-channel correlation of color images is not fully exploited. Motivated by the special characteristics of inter-channel correlation (higher correlation for the red/green and green/blue channels than for red/blue) and by the general characteristics of distorted images (the green channel always shows the best quality among the three components), in this paper, a three-stage convolutional neural network (CNN)...
We perform a comparative analysis of the visual quality of multiple 3D displays: seven portable ones and a large television set. We discuss two groups of parameters that influence the perceived quality of mobile displays. The first group is related to the optical parameters of the displays, such as crosstalk or the size of the sweet spots. The second group includes content parameters, such as the objective and subjective comfort disparity range, suitable for a given display. We identify eight important parameters to be measured, and for each parameter we present a measurement methodology, give...
Mobile 3D television is a new form of media experience, which combines the freedom of mobility with the greater realism of presenting visual scenes in 3D. Achieving this combination is a challenging task, as the viewing experience has to be achieved with the limited resources of the mobile delivery channel, such as bandwidth, and of a power-constrained handheld player. This challenge sets the need for tight optimization of the overall 3DTV system. The presence of depth and the compression artifacts in the played video are two major factors that influence the viewer's...
Multi-view displays employ an optical layer which distributes the light of the underlying TFT-LCD panel in different directions. Certain properties of this layer create specific artifacts, such as ghost images, moiré patterns, and masking. The display was modeled as an image-processing channel, and the display parameters related to the model, which are important for the design of algorithms for artifact mitigation, were identified. The identified parameters are the interleaving pattern, the angular visibility, and the frequency throughput of the display. A methodology for deriving...
The aim of this study is two-fold: first, to compare how certain visual aids contribute to depth estimation tasks on a portable autostereoscopic display; and second, to examine how these cues impact the perceived quality. These were studied quantitatively and subjectively using the display in a controlled laboratory environment. Test participants evaluated object depths in three-dimensional images, where either 2D cues, 3D cues, or their combinations were provided. The study was conducted at three different compression levels in order to examine how image quality affects...
The use of 3D video is growing in several fields such as entertainment, military simulations, and medical applications. However, the process of recording, transmitting, and processing is prone to errors, thus producing artifacts that may affect the perceived quality. Nowadays, a challenging task is the definition of a new metric able to predict the quality with low computational complexity, in order to be used in real time. Research in this field is very active due to the analysis of the influence of stereoscopic cues. In this paper we present a novel metric based on...
We present the receiver-side components of the mobile 3DTV technology being developed. We focus on the DVB-T/H receiving module, the 3D video decoder and player, and the auto-stereoscopic display. These three components have been integrated within the OMAP 3430 EVM. We developed a software decapsulator which decapsulates MPE-FEC tables out of MPEG-2 transport streams into RTP to feed an H.264-based decoder modified to decode side-by-side stereo video. An interface card connects a parallax barrier display to the system. We provide details about the implementation of these...
We propose an approach for optimizing the visual quality of a multiview 3D display for a single viewer. The approach combines an eye-position tracking system with on-the-fly optimization of the image content. The algorithm uses video input from a pair of off-the-shelf web-cameras and employs fast and robust face and facial feature detection algorithms to provide features for subsequent stereo matching and distance estimation. Based on the measurements of the position of the user's eyes, the following improvements are achieved: continuous head parallax...
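The distance-estimation step in the abstract above relies on matching the same facial feature in two camera images. A minimal sketch of the standard triangulation relation for a rectified stereo pair, Z = f·B/d, is shown below. The rectified pinhole-camera setup, the function name `depth_from_disparity`, and the numeric values are assumptions for illustration; the abstract does not state which formulation the authors used.

```python
def depth_from_disparity(focal_px, baseline_m, x_left_px, x_right_px):
    """Estimate distance to a matched feature from a rectified stereo pair.

    focal_px   : focal length of both cameras, in pixels (assumed equal).
    baseline_m : distance between the two camera centres, in metres.
    x_left_px  : horizontal pixel coordinate of the feature in the left image.
    x_right_px : horizontal pixel coordinate of the feature in the right image.
    Returns the distance along the optical axis, in metres.
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


if __name__ == "__main__":
    # Hypothetical values: 800 px focal length, cameras 10 cm apart,
    # and an eye corner matched at x=420 (left) and x=380 (right).
    print(depth_from_disparity(800.0, 0.1, 420.0, 380.0))
```

Because the distance is inversely proportional to the disparity, a one-pixel matching error matters more for a far viewer than for a near one, which is one reason robust feature detection is needed before matching.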
In this paper, we address the problem of anti-aliasing filtering of images to be displayed on auto-stereoscopic displays. Auto-stereoscopic displays are constructed to create a 3D visual effect without special glasses, by utilizing an extra optical layer that casts the light in different directions. The topology of such displays is a compromise between the number of views generated and the spatial resolution per view, the latter being a fraction of the full 2D resolution. Usually, this is achieved by slanted non-rectangular sub-sampling grids, causing however corresponding...
While mobile 3DTV system components such as stereo video compression techniques, transmission channels and auto-stereoscopic displays are available with a good level of maturity, their joint work crucially depends on the quality and user acceptance. We address these two key factors by rigorously in...
In this paper, we describe a system for optimized visualization of stereo images on a mobile platform. The system utilizes a front camera and face and eye tracking to find the position of the observer's eyes. Depending on that position, the left and right views targeting the corresponding eyes are maintained properly, based on the measured optical characteristics of the parallax-barrier 3D display used.
We present a system for 3D visualisation which combines user tracking, as used by displays with steerable optics, with the generation of multiple views, typical of displays with a fixed optical filter. Instead of eye tracking as the user-tracking approach, we propose less computationally demanding head tracking based on face detection. We investigate whether the precise delivery of different images to each eye of the observer can be handled by the optics of the multiview display, and whether continuous parallax can be achieved.
Multiview displays suffer from two common artifacts: Moiré, caused by aliasing, and ghosting, caused by crosstalk. By measuring the angular brightness function of each TFT element, we create a so-called mask, which allows us to simulate the display output for a given input image. We consider the multiview display as an image processing channel and model the distortions of the signal. We test the channel using a set of signals with various frequency components as input, analyzing the output in the frequency domain. We derive the bandpass region of the display, where the introduced distortions are under a certain...
Multi-view autostereoscopic displays can be modelled as a multirate system. The design of such displays involves a compromise between the number of different views and the spatial resolution of each view. Images to be visualised on these displays are prone to aliasing errors. Careful antialiasing requires knowledge of the display frequency response, which is determined mainly by the view sub-sampling topology but is also influenced by some other, generally non-linear, aliasing-causing effects. In this work, a methodology...
Nowadays, there are many metrics for overall Quality of Experience (QoE), both Full Reference (FR) ones, such as Peak Signal-to-Noise Ratio (PSNR) or Structural Similarity (SSIM), and No Reference (NR) ones, such as Video Quality Indicators (VQI), which are successfully used in video processing systems to evaluate videos whose quality is degraded by different scenarios. However, they are not suitable for video sequences used for recognition tasks (Target Recognition Videos, TRV). Therefore, correctly estimating the performance of the pipeline in manual...
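For readers unfamiliar with the full-reference metrics named above, PSNR is defined as 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference and the distorted signal. The sketch below computes it for flat lists of pixel values; the function name `psnr` and the 8-bit default peak of 255 are assumptions for illustration and are not taken from the paper.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel lists.

    reference, distorted : flat sequences of pixel intensities.
    max_value            : peak intensity (255 assumed for 8-bit images).
    Returns math.inf when the two inputs are identical.
    """
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ref = [100, 100, 100, 100]
    dist = [110, 90, 110, 90]  # every pixel is off by 10 levels
    print(round(psnr(ref, dist), 2))
```

The metric depends only on pixel-wise differences from the reference, which helps explain the abstract's point: it has no notion of whether a target in the frame is still recognizable.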
In the realm of modern video processing systems, traditional metrics such as Peak Signal-to-Noise Ratio and Structural Similarity are often insufficient for evaluating videos intended for recognition tasks, like object or license plate recognition. Recognizing the need for specialized assessment in this domain, this study introduces a novel approach tailored to Automatic License Plate Recognition (ALPR). We developed a robust evaluation framework using a dataset with ground truth coordinates for ALPR. This includes...