- Advanced Neural Network Applications
- Advanced Image and Video Retrieval Techniques
- Visual Attention and Saliency Detection
- Advanced Vision and Imaging
- Retinal Imaging and Analysis
- 3D Shape Modeling and Analysis
- Human Pose and Action Recognition
- Medical Image Segmentation Techniques
- Domain Adaptation and Few-Shot Learning
- Robotics and Sensor-Based Localization
- 3D Surveying and Cultural Heritage
- Radiomics and Machine Learning in Medical Imaging
- Retinal and Macular Surgery
- Medical Imaging and Analysis
- Glaucoma and retinal disorders
- Automated Road and Building Extraction
- COVID-19 diagnosis using AI
- AI in cancer detection
- Adversarial Robustness in Machine Learning
- Intraocular Surgery and Lenses
- Video Analysis and Summarization
- Multimodal Machine Learning Applications
- Computational Physics and Python Applications
- Hepatocellular Carcinoma Treatment and Prognosis
- Digital Imaging for Blood Diseases
Google (United States)
2023-2025
DeepMind (United Kingdom)
2025
Google (Switzerland)
2022
ETH Zurich
2016-2022
Board of the Swiss Federal Institutes of Technology
2019-2022
This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame. We present One-Shot Video Object Segmentation (OSVOS), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one-shot). Although all frames are processed independently, the results...
In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with varied sizes and appearances with various lesion-to-background levels (hyper-/hypo-dense), created in collaboration with seven hospitals and research institutions. Seventy-five...
This paper explores the use of extreme points in an object (left-most, right-most, top, bottom pixels) as input to obtain precise object segmentation for images and videos. We do so by adding an extra channel to the image in the input of a convolutional neural network (CNN), which contains a Gaussian centered in each of the extreme points. The CNN learns to transform this information into a segmentation of an object that matches those extreme points. We demonstrate the usefulness of this approach for guided segmentation (grabcut-style), interactive segmentation, video object segmentation, and dense segmentation annotation. We show that we obtain the most precise results to date, also...
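The extra input channel described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `sigma` value, and the choice to merge the four Gaussians with a pixel-wise maximum are all assumptions made for this example.

```python
import numpy as np

def extreme_points(mask):
    """Return (x, y) for the left-most, right-most, top and bottom foreground pixels."""
    ys, xs = np.nonzero(mask)
    idx = [xs.argmin(), xs.argmax(), ys.argmin(), ys.argmax()]
    return [(int(xs[i]), int(ys[i])) for i in idx]

def gaussian_heatmap(shape, points, sigma=10.0):
    """Render one 2D Gaussian per point; merge them with a pixel-wise max (assumed)."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    heat = np.zeros((h, w), dtype=np.float32)
    for x, y in points:
        g = np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2.0 * sigma ** 2))
        heat = np.maximum(heat, g.astype(np.float32))
    return heat

def add_extreme_point_channel(image, points, sigma=10.0):
    """Stack the heatmap onto an H x W x C image as channel C + 1."""
    heat = gaussian_heatmap(image.shape[:2], points, sigma)
    return np.concatenate([image.astype(np.float32), heat[..., None]], axis=-1)
```

For an RGB image, the result has four channels and would be fed to a CNN whose first layer accepts four input channels.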
Video Object Segmentation, and video processing in general, has been historically dominated by methods that rely on the temporal consistency and redundancy in consecutive video frames. When the temporal smoothness is suddenly broken, such as when an object is occluded, or some frames are missing in a sequence, the result of these methods can deteriorate significantly. This paper explores the orthogonal approach of processing each frame independently, i.e., disregarding the temporal information. In particular, it tackles the task of semi-supervised video object segmentation: the separation...
We present Convolutional Oriented Boundaries (COB), which produces multiscale oriented contours and region hierarchies starting from generic image classification Convolutional Neural Networks (CNNs). COB is computationally efficient, because it requires a single CNN forward pass for multi-scale contour detection and it uses a novel sparse boundary representation for hierarchical segmentation; it gives a significant leap in performance over the state-of-the-art, and it generalizes very well to unseen categories and datasets....
In this work we address task interference in universal networks by considering that a network is trained on multiple tasks, but performs one task at a time, an approach we refer to as "single-tasking multiple tasks". The network thus modifies its behaviour through task-dependent feature adaptation, or task attention. This gives the network the ability to accentuate the features that are adapted to a task, while shunning irrelevant ones. We further reduce task interference by forcing the task gradients to be statistically indistinguishable through adversarial training, ensuring that the common...
We present the 2019 DAVIS Challenge on Video Object Segmentation, the third edition of the DAVIS Challenge series, a public competition designed for the task of Video Object Segmentation (VOS). In addition to the original semi-supervised track and the interactive track introduced in the previous edition, a new unsupervised multi-object track will be featured this year. In the newly introduced track, participants are asked to provide non-overlapping object proposals on each image, along with an identifier linking them between frames (i.e. video object proposals), without any test-time human...
We address the task of aligning CAD models to a video sequence of a complex scene containing multiple objects. Our method can process arbitrary videos and fully automatically recover the 9 DoF pose for each object appearing in it, thus aligning them in a common 3D coordinate frame. The core idea of our method is to integrate neural network predictions from individual frames with a temporally global, multi-view constraint optimization formulation. This integration process resolves the scale and depth ambiguities in the per-frame predictions, and generally...
Computer vision and robotics are being increasingly applied in medical interventions. Especially in interventions where extreme precision is required, they could make a difference. One such application is robot-assisted retinal microsurgery. In recent works, such interventions are conducted under a stereo-microscope, and with a robot-controlled surgical tool. The complementarity of computer vision and robotics has, however, not yet been fully exploited. In order to improve the robot control, we are interested in three-dimensional (3-D) reconstruction...
A fully automatic technique for segmenting the liver and localizing its unhealthy tissues is a convenient tool in order to diagnose hepatic diseases and assess the response to the according treatments. In this work we propose a method to segment the liver and its lesions from Computed Tomography (CT) scans using Convolutional Neural Networks (CNNs), that have proven good results in a variety of computer vision tasks, including medical imaging. The network that segments the lesions consists of a cascaded architecture, which first focuses on the region of the liver in order to segment the lesions on it....
This paper tackles the task of estimating the topology of road networks from aerial images. Building on top of a global model that performs a dense semantical classification of the pixels of the image, we design a Convolutional Neural Network (CNN) that predicts the local connectivity among the central pixel of an input patch and its border points. By iterating this local connectivity we sweep the whole image and infer the global topology of the road network, inspired by a human delineating a complex network with the tip of their finger. We perform an extensive and comprehensive qualitative and quantitative...
Recent advances in neural reconstruction enable high-quality 3D object reconstruction from casually captured image collections. Current techniques mostly analyze their progress on relatively simple image collections where Structure-from-Motion (SfM) techniques can provide ground-truth (GT) camera poses. We note that SfM techniques tend to fail on in-the-wild image collections such as image search results with varying backgrounds and illuminations. To enable systematic research progress on 3D reconstruction from casual image captures, we propose NAVI: a new dataset of category-agnostic image collections of objects with high-quality 3D scans along...
We propose a method for annotating videos of complex multi-object scenes with a globally-consistent 3D representation of the objects. We annotate each object with a CAD model from a database, and place it in the 3D coordinate frame of the scene with a 9-DoF pose transformation. Our method is semi-automatic and works on commonly-available RGB videos, without requiring a depth sensor. Many steps are performed automatically, and the tasks performed by humans are simple, well-specified, and require only limited reasoning in 3D. This makes them feasible for crowd-sourcing and has...
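As a rough illustration of what a 9-DoF pose transformation can look like, the sketch below assumes the nine degrees of freedom split into a 3D translation, a 3D rotation, and a per-axis 3D scale. That split, the Euler-angle parameterisation, and all function names are assumptions for this example and are not taken from the abstract.

```python
import numpy as np

def rotation_from_euler(rx, ry, rz):
    """Rotation matrix Rz @ Ry @ Rx from angles in radians (assumed convention)."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def pose_9dof(translation, euler_angles, scale):
    """Build a 4x4 matrix that scales, then rotates, then translates a model."""
    M = np.eye(4)
    M[:3, :3] = rotation_from_euler(*euler_angles) @ np.diag(scale)
    M[:3, 3] = translation
    return M

def transform_points(M, points):
    """Map N x 3 model-space points into the scene coordinate frame."""
    points = np.asarray(points, dtype=float)
    return points @ M[:3, :3].T + M[:3, 3]
```

With this parameterisation, the vertices of a CAD model defined in its own canonical frame can be placed in the scene by a single call to `transform_points`.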