NFDI4DS | UHH-SEMS - Publication Details

Anton van den Hengel

ORCID: 0000-0003-3027-8364

About

Contact & Profiles

A5028024287

Research Areas

Domain Adaptation and Few-Shot Learning
Advanced Image and Video Retrieval Techniques
Multimodal Machine Learning Applications
Video Surveillance and Tracking Methods
Advanced Vision and Imaging
Human Pose and Action Recognition
Anomaly Detection Techniques and Applications
Advanced Neural Network Applications
Sparse and Compressive Sensing Techniques
Robotics and Sensor-Based Localization
Image and Object Detection Techniques
Advanced Image Processing Techniques
Topic Modeling
Image Processing Techniques and Applications
Face and Expression Recognition
Image Retrieval and Classification Techniques
Machine Learning and Algorithms
Optical measurement and interference techniques
3D Surveying and Cultural Heritage
Machine Learning and Data Classification
Generative Adversarial Networks and Image Synthesis
Image and Signal Denoising Methods
Computer Graphics and Visualization Techniques
Image Enhancement Techniques
Network Security and Intrusion Detection

Australian Centre for Robotic Vision
2016-2025

The University of Adelaide
2016-2025

Amazon (Germany)
2022-2024

Rochester Institute of Technology
2020

Vision Australia
2017

Australian Research Council
2015

Image-Based Recommendations on Styles and Substitutes

OPENALEX - Publications

Julian McAuley Christopher Targett Qinfeng Shi Anton van den Hengel

Humans inevitably develop a sense of the relationships between objects, some of which are based on their appearance. Some pairs of objects might be seen as being alternatives to each other (such as two pairs of jeans), while others may be seen as being complementary (such as a pair of jeans and a matching shirt). This information guides many of the choices that people make, from buying clothes to their interactions with each other. We seek here to model this human sense of the relationships between objects based on their appearance. Our approach is not based on fine-grained modeling of user annotations but rather on capturing the largest dataset...

10.1145/2766462.2767755 article EN Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval 2015-08-04

Wider or Deeper: Revisiting the ResNet Model for Visual Recognition

OPENALEX - Publications

Zifeng Wu Chunhua Shen Anton van den Hengel

10.1016/j.patcog.2019.01.006 article EN Pattern Recognition 2019-01-06

Memorizing Normality to Detect Anomaly: Memory-Augmented Deep Autoencoder for Unsupervised Anomaly Detection

OPENALEX - Publications

Dong Gong Lingqiao Liu Vuong Le Budhaditya Saha Moussa Reda Mansour and 2 more

Deep autoencoder has been extensively used for anomaly detection. Training on the normal data, the autoencoder is expected to produce higher reconstruction error for the abnormal inputs than the normal ones, which is adopted as a criterion for identifying anomalies. However, this assumption does not always hold in practice. It has been observed that sometimes the autoencoder "generalizes" so well that it can also reconstruct anomalies well, leading to the miss detection of anomalies. To mitigate this drawback for autoencoder based anomaly detector, we propose to augment the autoencoder with a memory module and develop an...

10.1109/iccv.2019.00179 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments

OPENALEX - Publications

Peter Anderson Qi Wu Damien Teney Jake Bruce Mark Johnson and 4 more

A robot that can carry out a natural-language instruction has been a dream since before the Jetsons cartoon series imagined a life of leisure mediated by a fleet of attentive robot helpers. It is a dream that remains stubbornly distant. However, recent advances in vision and language methods have made incredible progress in closely related areas. This is significant because a robot interpreting a natural-language navigation instruction on the basis of what it sees is carrying out a vision and language process that is similar to Visual Question Answering. Both tasks can be interpreted as visually grounded...

10.1109/cvpr.2018.00387 article EN 2018-06-01

Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation

OPENALEX - Publications

Guosheng Lin Chunhua Shen Anton van den Hengel Ian Reid

Recent advances in semantic image segmentation have mostly been achieved by training deep convolutional neural networks (CNNs). We show how to improve semantic segmentation through the use of contextual information, specifically, we explore 'patch-patch' context between image regions, and 'patch-background' context. For learning from the patch-patch context, we formulate Conditional Random Fields (CRFs) with CNN-based pairwise potential functions to capture semantic correlations between neighboring patches. Efficient piecewise training of the proposed...

10.1109/cvpr.2016.348 article EN 2016-06-01

A survey of appearance models in visual object tracking

OPENALEX - Publications

Xi Li Weiming Hu Chunhua Shen Zhongfei Zhang Anthony Dick and 1 more

Visual object tracking is a significant computer vision task which can be applied to many domains, such as visual surveillance, human computer interaction, and video compression. Despite extensive research on this topic, it still suffers from difficulties in handling complex object appearance changes caused by factors such as illumination variation, partial occlusion, shape deformation, and camera motion. Therefore, effective modeling of the 2D appearance of tracked objects is a key issue for the success of a visual tracker. In the literature, researchers...

10.1145/2508037.2508039 article EN ACM Transactions on Intelligent Systems and Technology 2013-09-01

Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs

OPENALEX - Publications

Bo Li Chunhua Shen Yuchao Dai Anton van den Hengel Mingyi He

Predicting the depth (or surface normal) of a scene from single monocular color images is a challenging task. This paper tackles this challenging and essentially underdetermined problem by regression on deep convolutional neural network (DCNN) features, combined with a post-processing refining step using conditional random fields (CRF). Our framework works at two levels, super-pixel level and pixel level. First, we design a DCNN model to learn the mapping from multi-scale image patches to depth or surface normal values at the super-pixel level. Second, the estimated...

10.1109/cvpr.2015.7298715 article EN 2015-06-01

REFUGE Challenge: A unified framework for evaluating automated methods for glaucoma assessment from fundus photographs

OPENALEX - Publications

José Ignacio Orlando Huazhu Fu João Barbosa‐Breda Karel Van Keer Deepti R. Bathula and 26 more

10.1016/j.media.2019.101570 article EN Medical Image Analysis 2019-10-08

What Value Do Explicit High Level Concepts Have in Vision to Language Problems?

OPENALEX - Publications

Qi Wu Chunhua Shen Lingqiao Liu Anthony Dick Anton van den Hengel

Much recent progress in Vision-to-Language (V2L) problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). This approach does not explicitly represent high-level semantic concepts, but rather seeks to progress directly from image features to text. In this paper we investigate whether this direct approach succeeds due to, or despite, the fact that it avoids the explicit representation of high-level information. We propose a method of incorporating high-level concepts into the successful CNN-RNN...

10.1109/cvpr.2016.29 article EN 2016-06-01

Graph-Structured Representations for Visual Question Answering

OPENALEX - Publications

Damien Teney Lingqiao Liu Anton van den Hengel

This paper proposes to improve visual question answering (VQA) with structured representations of both scene contents and questions. A key challenge in VQA is to require joint reasoning over the visual and text domains. The predominant CNN/LSTM-based approach to VQA is limited by monolithic vector representations that largely ignore structure in the scene and in the form of the question. CNN feature vectors cannot effectively capture situations as simple as multiple object instances, and LSTMs process questions as series of words, which do not reflect the true complexity of language...

10.1109/cvpr.2017.344 article EN 2017-07-01

Fast Supervised Hashing with Decision Trees for High-Dimensional Data

OPENALEX - Publications

Guosheng Lin Chunhua Shen Qinfeng Shi Anton van den Hengel David Suter

Supervised hashing aims to map the original features to compact binary codes that are able to preserve label based similarity in the Hamming space. Non-linear hash functions have demonstrated the advantage over linear ones due to their powerful generalization capability. In the literature, kernel functions are typically used to achieve non-linearity in hashing, which achieve encouraging retrieval performance at the price of slow evaluation and training time. Here we propose to use boosted decision trees for achieving non-linearity in hashing, which are fast to train and evaluate, hence...

10.1109/cvpr.2014.253 article EN 2014 IEEE Conference on Computer Vision and Pattern Recognition 2014-06-01

FVQA: Fact-Based Visual Question Answering

OPENALEX - Publications

Peng Wang Qi Wu Chunhua Shen Anthony Dick Anton van den Hengel

Visual Question Answering (VQA) has attracted much attention in both computer vision and natural language processing communities, not least because it offers insight into the relationships between two important sources of information. Current datasets, and the models built upon them, have focused on questions which are answerable by direct analysis of the question and image alone. The set of such questions that require no external information to answer is interesting, but very limited. It excludes questions which require common sense, or basic...

10.1109/tpami.2017.2754246 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2017-09-19

From Motion Blur to Motion Flow: A Deep Learning Solution for Removing Heterogeneous Motion Blur

OPENALEX - Publications

Dong Gong Jie Yang Lingqiao Liu Yanning Zhang Ian Reid and 3 more

Removing pixel-wise heterogeneous motion blur is challenging due to the ill-posed nature of the problem. The predominant solution is to estimate the blur kernel by adding a prior, but the extensive literature on the subject indicates the difficulty in identifying a prior which is suitably informative, and general. Rather than imposing a prior based on theory, we propose instead to learn one from the data. Learning a prior over the latent image would require modeling all possible image content. The critical observation underpinning our approach, however, is that...

10.1109/cvpr.2017.405 preprint EN 2017-07-01

Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge

OPENALEX - Publications

Damien Teney Peter Anderson Xiaodong He Anton van den Hengel

Deep Learning has had a transformative impact on Computer Vision, but for all of the success there is also a significant cost. This is that the models and procedures used are so complex and intertwined that it is often impossible to distinguish the impact of the individual design and engineering choices each model embodies. This ambiguity diverts progress in the field, and leads to a situation where developing a state-of-the-art model is as much an art as a science. As a step towards addressing this problem we present a massive exploration of the effects of the myriad architectural...

10.1109/cvpr.2018.00444 article EN 2018-06-01

Visual question answering: A survey of methods and datasets

OPENALEX - Publications

Qi Wu Damien Teney Peng Wang Chunhua Shen Anthony Dick and 1 more

10.1016/j.cviu.2017.05.001 article EN Computer Vision and Image Understanding 2017-05-05

Learning to rank in person re-identification with metric ensembles

OPENALEX - Publications

Sakrapee Paisitkriangkrai Chunhua Shen Anton van den Hengel

We propose an effective structured learning based approach to the problem of person re-identification which outperforms the current state-of-the-art on most benchmark data sets evaluated. Our framework is built on the basis of multiple low-level hand-crafted and high-level visual features. We then formulate two optimization algorithms, which directly optimize evaluation measures commonly used in person re-identification, also known as the Cumulative Matching Characteristic (CMC) curve. Our new approach is practical to many real-world...

10.1109/cvpr.2015.7298794 article EN 2015-06-01

Image Captioning and Visual Question Answering Based on Attributes and External Knowledge

OPENALEX - Publications

Qi Wu Chunhua Shen Peng Wang Anthony Dick Anton van den Hengel

Much of the recent progress in Vision-to-Language problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). This approach does not explicitly represent high-level semantic concepts, but rather seeks to progress directly from image features to text. In this paper we first propose a method of incorporating high-level concepts into the successful CNN-RNN approach, and show that it achieves a significant improvement on the state-of-the-art in both image captioning and visual question answering. We...

10.1109/tpami.2017.2708709 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2017-05-26

Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge from External Sources

OPENALEX - Publications

Qi Wu Peng Wang Chunhua Shen Anthony Dick Anton van den Hengel

We propose a method for visual question answering which combines an internal representation of the content of an image with information extracted from a general knowledge base to answer a broad range of image-based questions. This allows more complex questions to be answered using the predominant neural network-based approach than has previously been possible. It particularly allows questions to be asked about the contents of an image, even when the image itself does not contain the whole answer. The method constructs a textual representation of the semantic content of an image, and merges it with textual information sourced from a knowledge base,...

10.1109/cvpr.2016.500 preprint EN 2016-06-01

Deep Anomaly Detection with Deviation Networks

OPENALEX - Publications

Guansong Pang Chunhua Shen Anton van den Hengel

Although deep learning has been applied to successfully address many data mining problems, relatively limited work has been done on deep learning for anomaly detection. Existing deep anomaly detection methods, which focus on learning new feature representations to enable downstream anomaly detection methods, perform indirect optimization of anomaly scores, leading to data-inefficient learning and suboptimal anomaly scoring. Also, they are typically designed as unsupervised learning due to the lack of large-scale labeled anomaly data. As a result, they are difficult to leverage prior knowledge (e.g., a few labeled anomalies) when such...

10.1145/3292500.3330871 article EN 2019-07-25

Is face recognition really a Compressive Sensing problem?

OPENALEX - Publications

Qinfeng Shi Anders Eriksson Anton van den Hengel Chunhua Shen

Compressive Sensing has become one of the standard methods of face recognition within the literature. We show, however, that the sparsity assumption which underpins much of this work is not supported by the data. This lack of sparsity in the data means that the compressive sensing approach cannot be guaranteed to recover the exact signal, and therefore that sparse approximations may not deliver the robustness or performance desired. In this vein we show that a simple ℓ...

10.1109/cvpr.2011.5995556 article EN 2011-06-01

Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks

OPENALEX - Publications

Peng Wang Qi Wu Jiewei Cao Chunhua Shen Lianli Gao and 1 more

The task in referring expression comprehension is to localize the object instance in an image described by a referring expression phrased in natural language. As a language-to-vision matching task, the key to this problem is to learn a discriminative object feature that can adapt to the expression used. To avoid ambiguity, the expression normally tends to describe not only the properties of the referent itself, but also its relationships to its neighbourhood. To capture and exploit this important information we propose a graph-based, language-guided attention mechanism. Being composed of node attention component...

10.1109/cvpr.2019.00206 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Wider or Deeper: Revisiting the ResNet Model for Visual Recognition

OPENALEX - Publications

Zifeng Wu Chunhua Shen Anton van den Hengel

The trend towards increasingly deep neural networks has been driven by a general observation that increasing depth increases the performance of a network. Recently, however, evidence has been amassing that simply increasing depth may not be the best way to increase performance, particularly given other limitations. Investigations into deep residual networks have also suggested that they may not in fact be operating as a single deep network, but rather as an ensemble of many relatively shallow networks. We examine these issues, and in doing so arrive at a new interpretation...

10.48550/arxiv.1611.10080 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Attention-Guided Network for Ghost-Free High Dynamic Range Imaging

OPENALEX - Publications

Qingsen Yan Dong Gong Qinfeng Shi Anton van den Hengel Chunhua Shen and 2 more

Ghosting artifacts caused by moving objects or misalignments are a key challenge in high dynamic range (HDR) imaging for dynamic scenes. Previous methods first register the input low dynamic range (LDR) images using optical flow before merging them, which is error-prone and causes ghosts in the results. A very recent work tries to bypass optical flows via a deep network with skip-connections, which, however, still suffers from ghosting artifacts for severe movement. To avoid the ghosting from the source, we propose a novel attention-guided end-to-end deep neural network (AHDRNet) to produce...

10.1109/cvpr.2019.00185 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Effective semantic pixel labelling with convolutional networks and Conditional Random Fields

OPENALEX - Publications

Sakrapee Paisitkriangkrai Jamie Sherrah Pranam Janney Anton van den Hengel

Large amounts of available training data and increasing computing power have led to the recent success of deep convolutional neural networks (CNN) on a large number of applications. In this paper, we propose an effective semantic pixel labelling using CNN features, hand-crafted features and Conditional Random Fields (CRFs). Both CNN and hand-crafted features are applied to dense image patches to produce per-pixel class probabilities. The CRF infers a labelling that smooths regions while respecting the edges present in the imagery. The method is applied to the ISPRS 2D...

10.1109/cvprw.2015.7301381 article EN 2015-06-01
