NFDI4DS | UHH-SEMS - Publication Details

Esa Rahtu

ORCID: 0000-0001-8767-0864

About

Contact & Profiles

OpenAlex author ID: A5088180438

Research Areas

Advanced Vision and Imaging
Advanced Image and Video Retrieval Techniques
Robotics and Sensor-Based Localization
Advanced Image Processing Techniques
Advanced Neural Network Applications
Image Processing Techniques and Applications
Image Retrieval and Classification Techniques
Video Analysis and Summarization
Visual Attention and Saliency Detection
Human Pose and Action Recognition
Music and Audio Processing
Indoor and Outdoor Localization Technologies
Face recognition and analysis
Multimodal Machine Learning Applications
Domain Adaptation and Few-Shot Learning
Generative Adversarial Networks and Image Synthesis
Speech and Audio Processing
Anomaly Detection Techniques and Applications
3D Surveying and Cultural Heritage
Optical measurement and interference techniques
Image Enhancement Techniques
Video Surveillance and Tracking Methods
Computer Graphics and Visualization Techniques
Image and Signal Denoising Methods
Osteoarthritis Treatment and Mechanisms

Tampere University
2017-2025

Nokia (Finland)
2023

Tampere University of Applied Sciences
2017-2020

Czech Technical University in Prague
2020

Tampere University
2017-2019

ETH Zurich
2019

Signal Processing (United States)
2018-2019

University of Oulu
2008-2017

Lund University
2011

Statistics Finland
2010

Fine-Grained Visual Classification of Aircraft

OPENALEX - Publications

Subhransu Maji, Esa Rahtu, Juho Kannala, Matthew B. Blaschko, Andrea Vedaldi

This paper introduces FGVC-Aircraft, a new dataset containing 10,000 images of aircraft spanning 100 aircraft models, organised in a three-level hierarchy. At the finer level, differences between models are often subtle but always visually measurable, making visual recognition challenging but possible. A benchmark is obtained by defining corresponding classification tasks and evaluation protocols, and baseline results are presented. The construction of this dataset was made possible by the work of aircraft enthusiasts, a strategy that can extend...

10.48550/arxiv.1306.5151 preprint EN other-oa arXiv (Cornell University) 2013-01-01

Automatic Knee Osteoarthritis Diagnosis from Plain Radiographs: A Deep Learning-Based Approach

Aleksei Tiulpin, Jérôme Thevenot, Esa Rahtu, Petri Lehenkari, Simo Saarakkala

Knee osteoarthritis (OA) is the most common musculoskeletal disorder. OA diagnosis is currently conducted by assessing symptoms and evaluating plain radiographs, but this process suffers from subjectivity. In this study, we present a new transparent computer-aided method based on the Deep Siamese Convolutional Neural Network to automatically score knee OA severity according to the Kellgren-Lawrence grading scale. We trained our method using data solely from the Multicenter Osteoarthritis Study and validated it on randomly selected...

10.1038/s41598-018-20132-7 article EN cc-by Scientific Reports 2018-01-23

Siamese network features for image matching

Iaroslav Melekhov, Juho Kannala, Esa Rahtu

Finding matching images across large datasets plays a key role in many computer vision applications such as structure-from-motion (SfM), multi-view 3D reconstruction, image retrieval, and image-based localisation. In this paper, we propose finding matching and non-matching pairs of images by representing them with neural network based feature vectors, whose similarity is measured by Euclidean distance. The feature vectors are obtained with convolutional neural networks which are learnt from labeled examples using a contrastive loss...

10.1109/icpr.2016.7899663 article EN 2016-12-01
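
The matching scheme described in the abstract above (feature vectors compared by Euclidean distance, trained with a contrastive loss) can be illustrated with a minimal sketch. It uses the commonly used margin-based form of the contrastive loss; the margin value, the function names and the toy feature vectors are assumptions made for illustration and are not taken from the paper, which may use a different variant.

```python
import math


def euclidean_distance(feat_a, feat_b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(feat_a, feat_b)))


def contrastive_loss(feat_a, feat_b, is_match, margin=1.0):
    """Margin-based contrastive loss for one image pair (illustrative form).

    Matching pairs are penalised by their squared distance, so they are
    pulled together. Non-matching pairs are penalised only while they are
    closer than `margin` (a hypothetical default), so they are pushed apart.
    """
    d = euclidean_distance(feat_a, feat_b)
    if is_match:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


if __name__ == "__main__":
    # Toy feature vectors standing in for network outputs.
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], is_match=True))
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], is_match=False))
```

In this form a non-matching pair that is already farther apart than the margin contributes zero loss, so training effort concentrates on pairs that are still confusable.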

Recognition of blurred faces using Local Phase Quantization

Timo Ahonen, Esa Rahtu, Ville Ojansivu, Janne Heikkilä

In this paper, recognition of blurred faces using the recently introduced Local Phase Quantization (LPQ) operator is proposed. LPQ is based on quantizing the Fourier transform phase in local neighborhoods. The phase can be shown to be a blur invariant property under certain commonly fulfilled conditions. In face image analysis, histograms of LPQ labels computed within local regions are used as a face descriptor, similarly to the widely used Local Binary Pattern (LBP) methodology for face image description. The experimental results on CMU PIE and FRGC 1.0.4 datasets...

10.1109/icpr.2008.4761847 article EN Proceedings - International Conference on Pattern Recognition 2008-12-01

Multimodal Machine Learning-based Knee Osteoarthritis Progression Prediction from Plain Radiographs and Clinical Data

Aleksei Tiulpin, Stefan Klein, Sita Bierma‐Zeinstra, Jérôme Thevenot, Esa Rahtu and 3 more

Knee osteoarthritis (OA) is the most common musculoskeletal disease without a cure, and current treatment options are limited to symptomatic relief. Prediction of OA progression is a very challenging and timely issue, and it could, if resolved, accelerate disease modifying drug development and ultimately help to prevent millions of total joint replacement surgeries performed annually. Here, we present a multi-modal machine learning-based OA progression prediction model that utilises raw radiographic data, clinical examination...

10.1038/s41598-019-56527-3 article EN cc-by Scientific Reports 2019-12-27

Image-Based Localization Using Hourglass Networks

Iaroslav Melekhov, Juha Ylioinas, Juho Kannala, Esa Rahtu

In this paper, we propose an encoder-decoder convolutional neural network (CNN) architecture for estimating camera pose (orientation and location) from a single RGB-image. The architecture has an hourglass shape consisting of a chain of convolution and up-convolution layers followed by a regression part. The up-convolution layers are introduced to preserve the fine-grained information of the input image. Following the common practice, we train our model in an end-to-end manner utilizing transfer learning from large scale classification data. The experiments demonstrate...

10.1109/iccvw.2017.107 article EN 2017-10-01

Multi-modal Dense Video Captioning

V. Lashin, Esa Rahtu

Dense video captioning is a task of localizing interesting events from an untrimmed video and producing a textual description (captions) for each localized event. Most of the previous works in dense video captioning are solely based on visual information and completely ignore the audio track. However, audio, and speech, in particular, are vital cues for a human observer in understanding an environment. In this paper, we present a new dense video captioning approach that is able to utilize any number of modalities for event description. Specifically, we show how audio and speech modalities may improve a dense video captioning model....

10.1109/cvprw50498.2020.00487 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2020-06-01

DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing

Matias Turkulainen, Xuqian Ren, Iaroslav Melekhov, Otto Seiskari, Esa Rahtu and 1 more

10.1109/wacv61041.2025.00241 article EN IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025-02-26

Learning a category independent object detection cascade

Esa Rahtu, Juho Kannala, Matthew B. Blaschko

Cascades are a popular framework to speed up object detection systems. Here we focus on the first layers of a category independent object detection cascade in which we sample a large number of windows from an objectness prior, and then discriminatively learn to filter these candidate windows by an order of magnitude. We make a number of contributions to cascade design that substantially improve over the state of the art: (i) our novel objectness prior gives much higher recall than competing methods, (ii) we propose objectness features that give high performance with very low computational cost,...

10.1109/iccv.2011.6126351 article EN International Conference on Computer Vision 2011-11-01

Generating Object Segmentation Proposals Using Global and Local Search

Pekka Rantalankila, Juho Kannala, Esa Rahtu

We present a method for generating object segmentation proposals from groups of superpixels. The goal is to propose accurate segmentations for all objects of an image. The proposed object hypotheses can be used as input to object detection systems and thereby improve efficiency by replacing exhaustive search. The segmentations are generated in a class-independent manner and therefore the computational cost of the approach is independent of the number of object classes. Our approach combines both global and local search in the space of sets of superpixels. The local search is implemented by greedily merging adjacent pairs...

10.1109/cvpr.2014.310 article EN IEEE Conference on Computer Vision and Pattern Recognition 2014-06-01

Identification of tumor epithelium and stroma in tissue microarrays using texture analysis

Nina Linder, Juho Konsti, Riku Turkki, Esa Rahtu, Mikael Lundin and 5 more

The aim of the study was to assess whether texture analysis is feasible for automated identification of epithelium and stroma in digitized tumor tissue microarrays (TMAs). Texture analysis based on local binary patterns (LBP) has previously been used successfully in applications such as face recognition and industrial machine vision. TMAs with tissue samples from 643 patients with colorectal cancer were digitized using a whole slide scanner and areas representing epithelium and stroma were annotated in the images. Well-defined images of epithelium (n = 41) and stroma (n = 39) were used for training a support...

10.1186/1746-1596-7-22 article EN cc-by Diagnostic Pathology 2012-03-02

Rethinking the Evaluation of Video Summaries

Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä

Video summarization is a technique to create a short skim of the original video while preserving the main stories/content. There exists a substantial interest in automatizing this process due to the rapid growth of the available material. The recent progress has been facilitated by public benchmark datasets, which enable easy and fair comparison of methods. Currently the established evaluation protocol is to compare the generated summary with respect to a set of reference summaries provided by the dataset. In this paper, we will provide in-depth...

10.1109/cvpr.2019.00778 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Local phase quantization for blur-insensitive image analysis

Esa Rahtu, Janne Heikkilä, Ville Ojansivu, Timo Ahonen

10.1016/j.imavis.2012.04.001 article EN Image and Vision Computing 2012-05-15

A Malaria Diagnostic Tool Based on Computer Vision Screening and Visualization of Plasmodium falciparum Candidate Areas in Digitized Blood Smears

Nina Linder, Riku Turkki, Margarita Walliander, Andreas Mårtensson, Vinod Diwan and 4 more

Introduction: Microscopy is the gold standard for diagnosis of malaria, however, manual evaluation of blood films is highly dependent on skilled personnel in a time-consuming, error-prone and repetitive process. In this study we propose a method using computer vision detection and visualization of only the diagnostically most relevant sample regions in digitized blood smears. Methods: Giemsa-stained thin blood films with P. falciparum ring-stage trophozoites (n = 27) and uninfected controls (n = 20) were digitally scanned with an oil immersion...

10.1371/journal.pone.0104855 article EN cc-by PLoS ONE 2014-08-21

DGC-Net: Dense Geometric Correspondence Network

Iaroslav Melekhov, Aleksei Tiulpin, Torsten Sattler, Marc Pollefeys, Esa Rahtu and 1 more

This paper addresses the challenge of dense pixel correspondence estimation between two images. This problem is closely related to the optical flow estimation task where ConvNets (CNNs) have recently achieved significant progress. While optical flow methods produce very accurate results for small pixel translation and limited appearance variation scenarios, they hardly deal with the strong geometric transformations that we consider in this work. In this paper, we propose a coarse-to-fine CNN-based framework that can leverage the advantages of optical flow approaches...

10.1109/wacv.2019.00115 article EN 2019-01-01

Summarization of User-Generated Sports Video by Using Deep Action Recognition Features

Antonio Tejero-de-Pablos, Yuta Nakashima, Tomokazu Sato, Naokazu Yokoya, Marko Linna and 1 more

Automatically generating a summary of a sports video poses the challenge of detecting interesting moments, or highlights, of a game. Traditional sports video summarization methods leverage editing conventions of broadcast sports video that facilitate the extraction of high-level semantics. However, user-generated videos are not edited and, thus, traditional methods are not suitable to generate a summary. In order to solve this problem, this paper proposes a novel video summarization method that uses players' actions as a cue to determine the highlights of the original video. A deep neural-network-based...

10.1109/tmm.2018.2794265 article EN IEEE Transactions on Multimedia 2018-01-15

Image Coding For Machines: an End-To-End Learned Approach

Nam Le, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Esa Rahtu

Over recent years, deep learning-based computer vision systems have been applied to images at an ever-increasing pace, oftentimes representing the only type of consumption for those images. Given the dramatic explosion in the number of images generated per day, a question arises: how much better would an image codec targeting machine-consumption perform against state-of-the-art codecs targeting human-consumption? In this paper, we propose an image codec for machines which is neural network (NN) based and end-to-end learned. In particular, we propose a set...

10.1109/icassp39728.2021.9414465 article EN IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

ICface: Interpretable and Controllable Face Reenactment Using GANs

Soumya Tripathy, Juho Kannala, Esa Rahtu

This paper presents a generic face animator that is able to control the pose and expressions of a given face image. The animation is driven by human interpretable control signals consisting of head pose angles and the Action Unit (AU) values. The control information can be obtained from multiple sources including external driving videos and manual controls. Due to the interpretable nature of the driving signal, one can easily mix the information between multiple sources (e.g. pose from one image and expression from another) and apply selective postproduction editing. The proposed face animator is implemented as a two stage neural network model that is learned in...

10.1109/wacv45572.2020.9093474 article EN 2020-03-01