Andrea Vedaldi

ORCID: 0000-0003-1374-2858
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Advanced Neural Network Applications
  • Advanced Vision and Imaging
  • Human Pose and Action Recognition
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Generative Adversarial Networks and Image Synthesis
  • Video Surveillance and Tracking Methods
  • 3D Shape Modeling and Analysis
  • Robotics and Sensor-Based Localization
  • Image Retrieval and Classification Techniques
  • Computer Graphics and Visualization Techniques
  • Anomaly Detection Techniques and Applications
  • Face recognition and analysis
  • COVID-19 diagnosis using AI
  • Machine Learning and Data Classification
  • Adversarial Robustness in Machine Learning
  • Handwritten Text Recognition Techniques
  • Cell Image Analysis Techniques
  • Medical Image Segmentation Techniques
  • 3D Surveying and Cultural Heritage
  • Remote-Sensing Image Classification
  • Visual Attention and Saliency Detection
  • Image Processing Techniques and Applications
  • Natural Language Processing Techniques

University of Oxford
2015-2024

Science Oxford
2011-2024

Oxford Research Group
2012-2023

Meta (Israel)
2019-2021

University College London
2021

Landscape Research Group
2021

University of Edinburgh
2018

Naver (South Korea)
2017

Xerox (United States)
2016

Miami University
2014

This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets). We consider two visualisation techniques, based on computing the gradient of the class score with respect to the input image. The first one generates an image, which maximises the class score [Erhan et al., 2009], thus visualising the notion of the class, captured by a ConvNet. The second technique computes a class saliency map, specific to a given image and class. We show that such maps can be employed for weakly supervised object segmentation...

10.48550/arxiv.1312.6034 preprint EN other-oa arXiv (Cornell University) 2013-01-01
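
The gradient-based saliency idea in the abstract above can be illustrated with a toy sketch. It assumes a hypothetical linear class score over a tiny 2x2 RGB image instead of a real ConvNet, and estimates the gradient by finite differences rather than back-propagation. The names `WEIGHTS`, `class_score` and `saliency_map` are illustrative and do not come from the paper.

```python
# Toy stand-in for a ConvNet class score: a linear function of the pixels.
# WEIGHTS[y][x][c] is a hypothetical weight for channel c at pixel (y, x).
WEIGHTS = [
    [[0.5, -2.0, 0.1], [0.0, 0.3, -0.2]],
    [[1.5, 0.2, -0.4], [-0.1, 0.0, 0.05]],
]


def class_score(image):
    """Linear class score S(I) = sum_i w_i * I_i (an assumption for illustration)."""
    return sum(
        WEIGHTS[y][x][c] * image[y][x][c]
        for y in range(2) for x in range(2) for c in range(3)
    )


def saliency_map(image, eps=1e-4):
    """Per-pixel saliency: max over channels of |dS/dI|, via central differences."""
    sal = [[0.0, 0.0], [0.0, 0.0]]
    for y in range(2):
        for x in range(2):
            for c in range(3):
                original = image[y][x][c]
                image[y][x][c] = original + eps
                up = class_score(image)
                image[y][x][c] = original - eps
                down = class_score(image)
                image[y][x][c] = original  # restore the pixel
                grad = (up - down) / (2 * eps)
                sal[y][x] = max(sal[y][x], abs(grad))
    return sal


image = [[[0.2, 0.4, 0.6], [0.1, 0.1, 0.1]],
         [[0.9, 0.8, 0.7], [0.3, 0.2, 0.1]]]
sal = saliency_map(image)
```

For a linear score the gradient equals the weights, so each saliency value is the largest absolute weight at that pixel. With a real network the gradient would depend on the input image and would normally come from a single back-propagation pass.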

The goal of this paper is face recognition – from either a single photograph or from a set of faces tracked in a video. Recent progress in this area has been due to two factors: (i) end to end learning for the task using a convolutional neural network (CNN), and (ii) the availability of very large scale training datasets. We make two contributions: first, we show how a very large scale dataset (2.6M images, over 2.6K people) can be assembled by a combination of automation and human in the loop, and discuss the trade off between data purity and time; second, we traverse through...

10.5244/c.29.41 article EN 2015-01-01

The latest generation of Convolutional Neural Networks (CNN) have achieved impressive results in challenging benchmarks on image recognition and object detection, significantly raising the interest of the community in these methods. Nevertheless, it is still unclear how different CNN methods compare with each other and with previous state-of-the-art shallow representations such as the Bag-of-Visual-Words and the Improved Fisher Vector. This paper conducts a rigorous evaluation of these new techniques, exploring different deep architectures...

10.5244/c.28.6 preprint EN 2014-01-01

In this paper we revisit the fast stylization method introduced in Ulyanov et al. (2016). We show how a small change in the stylization architecture results in a significant qualitative improvement in the generated images. The change is limited to swapping batch normalization with instance normalization, and to applying the latter both at training and testing times. The resulting method can be used to train high-performance architectures for real-time image generation. The code will be made available on GitHub at https://github.com/DmitryUlyanov/texture_nets. Full...

10.48550/arxiv.1607.08022 preprint EN other-oa arXiv (Cornell University) 2016-01-01
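
The swap described in the abstract above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: each channel is a flat list of pixels, the learned scale and shift parameters are omitted, and batch normalization is shown only in its training-time form (statistics pooled over the batch). The helper names are hypothetical.

```python
import math


def normalise(values, eps=1e-5):
    """Shift and scale a flat list of values to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def instance_norm(batch):
    """batch[n][c] is a flat list of pixels; statistics are per sample and channel."""
    return [[normalise(channel) for channel in sample] for sample in batch]


def batch_norm(batch):
    """Statistics are per channel, pooled over every sample in the batch."""
    n_samples, n_channels = len(batch), len(batch[0])
    out = [[None] * n_channels for _ in range(n_samples)]
    for c in range(n_channels):
        sizes = [len(batch[n][c]) for n in range(n_samples)]
        pooled = normalise([v for n in range(n_samples) for v in batch[n][c]])
        start = 0
        for n in range(n_samples):
            out[n][c] = pooled[start:start + sizes[n]]
            start += sizes[n]
    return out


# Two single-channel "images" with the same pattern but different contrast.
batch = [[[1.0, 2.0, 3.0, 4.0]], [[10.0, 20.0, 30.0, 40.0]]]
inst = instance_norm(batch)
bn = batch_norm(batch)
```

Instance normalization maps both images to nearly the same values, because each image is normalised by its own contrast. The batch-normalised outputs still differ, since the statistics are shared across the batch.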

VLFeat is an open and portable library of computer vision algorithms. It aims at facilitating fast prototyping and reproducible research for computer vision scientists and students. It includes rigorous implementations of common building blocks such as feature detectors, feature extractors, (hierarchical) k-means clustering, randomized kd-tree matching, and super-pixelization. The source code and interfaces are fully documented. The library integrates directly with MATLAB, a popular language for computer vision research.

10.1145/1873951.1874249 article EN Proceedings of the 18th ACM International Conference on Multimedia 2010-10-25

MatConvNet is an open source implementation of Convolutional Neural Networks (CNNs) with a deep integration in the MATLAB environment. The toolbox is designed with an emphasis on simplicity and flexibility. It exposes the building blocks of CNNs as easy-to-use MATLAB functions, providing routines for computing convolutions with filter banks, feature pooling, normalisation, and much more. MatConvNet can be easily extended, often using only MATLAB code, allowing fast prototyping of new CNN architectures. At the same time, it supports efficient...

10.1145/2733373.2807412 article EN 2015-10-13

Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system. Nevertheless, our understanding of them remains limited. In this paper we conduct a direct analysis of the visual information contained in representations by asking the following question: given an encoding of an image, to which extent is it possible to reconstruct the image itself? To answer this question we contribute a general framework to invert representations. We show that this method can...

10.1109/cvpr.2015.7299155 article EN 2015-06-01

Patterns and textures are key characteristics of many natural objects: a shirt can be striped, the wings of a butterfly can be veined, and the skin of an animal can be scaly. Aiming at supporting this dimension in image understanding, we address the problem of describing textures with semantic attributes. We identify a vocabulary of forty-seven texture terms and use them to describe a large dataset of patterns collected "in the wild". The resulting Describable Textures Dataset (DTD) is a basis to seek the best representation for recognizing describable texture attributes...

10.1109/cvpr.2014.461 article EN 2014 IEEE Conference on Computer Vision and Pattern Recognition 2014-06-01

Deep convolutional networks have become a popular tool for image generation and restoration. Generally, their excellent performance is imputed to their ability to learn realistic image priors from a large number of example images. In this paper, we show that, on the contrary, the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning. In order to do so, we show that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising,...

10.1109/cvpr.2018.00984 article EN 2018-06-01

The Correlation Filter is an algorithm that trains a linear template to discriminate between images and their translations. It is well suited to object tracking because its formulation in the Fourier domain provides a fast solution, enabling the detector to be re-trained once per frame. Previous works that use the Correlation Filter, however, have adopted features that were either manually designed or trained for a different task. This work is the first to overcome this limitation by interpreting the Correlation Filter learner, which has a closed-form solution, as...

10.1109/cvpr.2017.531 article EN 2017-07-01
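
The closed-form Fourier-domain solution mentioned in the abstract above can be sketched for a one-dimensional, single-channel signal. This is a simplified illustration of a classic correlation filter trained by ridge regression over cyclic shifts, not the paper's end-to-end network; the signal values, the regularisation weight `lam`, and the helper names are all assumptions made for the example.

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for a tiny example)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        out.append(total / n if inverse else total)
    return out


def train_filter(x, y, lam=1e-3):
    """Closed-form ridge regression over all cyclic shifts of x, per frequency."""
    x_hat, y_hat = dft(x), dft(y)
    return [yk * xk.conjugate() / (abs(xk) ** 2 + lam)
            for xk, yk in zip(x_hat, y_hat)]


def respond(filter_hat, z):
    """Response of the trained filter to a new signal z, in the signal domain."""
    z_hat = dft(z)
    response = dft([h * zk for h, zk in zip(filter_hat, z_hat)], inverse=True)
    return [value.real for value in response]


x = [1.0, 3.0, -2.0, 0.5, 4.0, -1.0, 2.0, 0.0]  # training signal (made up)
y = [1.0] + [0.0] * 7                           # desired response: peak at shift 0
filter_hat = train_filter(x, y)

shift = 3
z = x[-shift:] + x[:-shift]                     # x cyclically shifted right by 3
response = respond(filter_hat, z)
peak = max(range(len(response)), key=lambda i: response[i])
```

The peak of the response lands at the index of the applied shift, which is how a tracker of this kind reads off the target's displacement. Because training is an element-wise division per frequency, re-training once per frame is cheap.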

In this paper we introduce a new method for text detection in natural images. The method comprises two contributions: First, a fast and scalable engine to generate synthetic images of text in clutter. This engine overlays synthetic text to existing background images in a natural way, accounting for the local 3D scene geometry. Second, we use the synthetic images to train a Fully-Convolutional Regression Network (FCRN) which efficiently performs text detection and bounding-box regression at all locations and multiple scales in an image. We discuss the relation of FCRN to the recently-introduced YOLO detector, as well as other...

10.1109/cvpr.2016.254 article EN 2016-06-01

As machine learning algorithms are increasingly applied to high impact yet high risk tasks, such as medical diagnosis or autonomous driving, it is critical that researchers can explain how such algorithms arrived at their predictions. In recent years, a number of image saliency methods have been developed to summarize where highly complex neural networks "look" in an image for evidence for their predictions. However, these techniques are limited by their heuristic nature and architectural constraints. In this paper, we make two main contributions: First,...

10.1109/iccv.2017.371 preprint EN 2017-10-01

This paper introduces FGVC-Aircraft, a new dataset containing 10,000 images of aircraft spanning 100 aircraft models, organised in a three-level hierarchy. At the finer level, differences between models are often subtle but always visually measurable, making visual recognition challenging but possible. A benchmark is obtained by defining corresponding classification tasks and evaluation protocols, and baseline results are presented. The construction of this dataset was made possible by the work of aircraft enthusiasts, a strategy that can extend...

10.48550/arxiv.1306.5151 preprint EN other-oa arXiv (Cornell University) 2013-01-01

The focus of this paper is speeding up the application of convolutional neural networks. While delivering impressive results across a range of computer vision and machine learning tasks, these networks are computationally demanding, limiting their deployability. Convolutional layers generally consume the bulk of the processing time, and so in this work we present two simple schemes for drastically speeding up these layers. This is achieved by exploiting cross-channel or filter redundancy to construct a low rank basis of filters that are rank-1...

10.5244/c.28.88 article EN 2014-01-01
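
The benefit of rank-1 filters mentioned in the abstract above rests on a simple identity, sketched here in plain Python. The example only shows that one rank-1 (separable) 2D filter can be applied exactly as two 1D passes; it does not reproduce the paper's schemes for approximating learned filter banks. The filter values, the image, and the function name are made up for illustration.

```python
def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation with no padding ('valid' output size)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [sum(kernel[i][j] * image[y + i][x + j]
             for i in range(kh) for j in range(kw))
         for x in range(ow)]
        for y in range(oh)
    ]


vertical = [1.0, 2.0, 1.0]     # d x 1 filter (made-up values)
horizontal = [-1.0, 0.0, 1.0]  # 1 x d filter (made-up values)

# The rank-1 2D filter is the outer product of the two 1D filters.
full_kernel = [[v * h for h in horizontal] for v in vertical]

image = [[float((3 * r + 5 * c) % 7) for c in range(6)] for r in range(5)]

direct = conv2d_valid(image, full_kernel)               # d*d multiplies per output
rows = conv2d_valid(image, [horizontal])                # horizontal 1 x d pass
two_pass = conv2d_valid(rows, [[v] for v in vertical])  # vertical d x 1 pass
```

Both routes give the same output, but the two-pass route needs roughly 2d multiplications per output pixel instead of d squared, and the saving grows with the filter size.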

We investigate the fine grained object categorization problem of determining the breed of animal from an image. To this end we introduce a new annotated dataset of pets covering 37 different breeds of cats and dogs. The visual problem is very challenging as these animals, particularly cats, are very deformable and there can be quite subtle differences between the breeds. We make a number of contributions: first, we introduce a model to classify a pet breed automatically from an image. The model combines shape, captured by a deformable part model detecting the pet face, and appearance, captured by a bag-of-words model that...

10.1109/cvpr.2012.6248092 article EN 2012 IEEE Conference on Computer Vision and Pattern Recognition 2012-06-01

A large number of novel encodings for bag of visual words models have been proposed in the past two years to improve on the standard histogram of quantized local features. Examples include locality-constrained linear encoding [23], improved Fisher encoding [17], super vector encoding [27], and kernel codebook encoding [20]. While several authors have reported very good results on the challenging PASCAL VOC classification data by means of these new techniques, differences in the feature computation and learning algorithms, missing details in the description...

10.5244/c.25.76 article EN 2011-01-01

Weakly supervised learning of object detection is an important problem in image understanding that still does not have a satisfactory solution. In this paper, we address this problem by exploiting the power of deep convolutional neural networks pre-trained on large-scale image-level classification tasks. We propose a weakly supervised deep detection architecture that modifies one such network to operate at the level of image regions, performing simultaneously region selection and classification. Trained as an image classifier, the architecture implicitly learns object detectors that are...

10.1109/cvpr.2016.311 article EN 2016-06-01

The recent work of Gatys et al., who characterized the style of an image by the statistics of convolutional neural network filters, ignited a renewed interest in the texture generation and image stylization problems. While their image generation technique uses a slow optimization process, recently several authors have proposed to learn generator neural networks that can produce similar outputs in one quick forward pass. While generator networks are promising, they are still inferior in visual quality and diversity compared to generation-by-optimization. In this work, we advance...

10.1109/cvpr.2017.437 article EN 2017-07-01

Our objective is to obtain a state-of-the-art object category detector by employing a state-of-the-art image classifier to search for the object in all possible image sub-windows. We use multiple kernel learning of Varma and Ray (ICCV 2007) to learn an optimal combination of exponential χ² kernels, each of which captures a different feature channel. Our features include the distribution of edges, dense and sparse visual words,...

10.1109/iccv.2009.5459183 article EN 2009-09-01

We present a novel clustering objective that learns a neural network classifier from scratch, given only unlabelled data samples. The model discovers clusters that accurately match semantic classes, achieving state-of-the-art results in eight unsupervised clustering benchmarks spanning image classification and segmentation. These include STL10, an unsupervised variant of ImageNet, and CIFAR10, where we significantly beat the accuracy of our closest competitors by 6.6 and 9.5 absolute percentage points respectively. The method is not...

10.1109/iccv.2019.00996 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01