- Multimodal Machine Learning Applications
- Advanced Neural Network Applications
- Advanced Image and Video Retrieval Techniques
- Domain Adaptation and Few-Shot Learning
- Disaster Management and Resilience
- Visual Attention and Saliency Detection
- Quantum-Dot Cellular Automata
- Public Relations and Crisis Communication
- Smart Agriculture and AI
- Advanced Memory and Neural Computing
- Slime Mold and Myxomycetes Research
- Video Analysis and Summarization
- COVID-19 diagnosis using AI
- Anomaly Detection Techniques and Applications
- Data Visualization and Analytics
- Topic Modeling
- Autonomous Vehicle Technology and Safety
- Image Retrieval and Classification Techniques
- Semiconductor materials and devices
- Video Surveillance and Tracking Methods
- Generative Adversarial Networks and Image Synthesis
- Image Enhancement Techniques
- Evacuation and Crowd Dynamics
- Sentiment Analysis and Opinion Mining
- Human Pose and Action Recognition
Technical University of Denmark
2022-2024
Pioneer (United States)
2024
Compute Canada
2022
Massachusetts Institute of Technology
2019-2021
University of Edinburgh
2014-2017
Democritus University of Thrace
2010-2016
Manually annotating object bounding boxes is central to building computer vision datasets, and it is very time consuming (annotating ILSVRC [53] took 35s for one high-quality box [62]). It involves clicking on imaginary corners of a tight box around the object. This is difficult as these corners are often outside the actual object and several adjustments are required to obtain a tight box. We propose extreme clicking instead: we ask the annotator to click on four physical points on the object: the top, bottom, left- and right-most points. This task is more natural and these points are easy to find....
Training object class detectors typically requires a large set of images in which objects are annotated by bounding-boxes. However, manually drawing bounding-boxes is very time consuming. We propose a new scheme for training object class detectors which only requires annotators to verify bounding-boxes produced automatically by the learning algorithm. Our scheme iterates between re-training the detector, re-localizing objects in the training images, and human verification. We use the verification signal both to improve re-training and to reduce the search space for re-localisation, which makes these steps different to what...
Training object class detectors typically requires a large set of images with objects annotated by bounding boxes. However, manually drawing bounding boxes is very time consuming. In this paper we greatly reduce annotation time by proposing center-click annotations: we ask annotators to click on the center of an imaginary bounding box which tightly encloses the object instance. We then incorporate these clicks into existing Multiple Instance Learning techniques for weakly supervised object localization, to jointly localize object bounding boxes over all training...
We address the problem of estimating image difficulty defined as the human response time for solving a visual search task. We collect human annotations of image difficulty for the PASCAL VOC 2012 data set through a crowd-sourcing platform. We then analyze what human interpretable image properties can have an impact on visual search difficulty, and how accurate those properties are for predicting difficulty. Next, we build a regression model based on deep features learned with state-of-the-art convolutional neural networks and show better results for predicting the ground-truth difficulty scores produced by human annotators. Our...
In this paper, we are interested in modeling a how-to instructional procedure, such as a cooking recipe, with a meaningful and rich high-level representation. Specifically, we propose to represent cooking recipes and food images as cooking programs. Programs provide a structured representation of the task, capturing cooking semantics and sequential relationships of actions in the form of a graph. This allows them to be easily manipulated by users and executed by agents. To this end, we build a model that is trained to learn a joint embedding between recipes and food images via self-supervision...
Natural disasters, such as floods, tornadoes, or wildfires, are increasingly pervasive as the Earth undergoes global warming. It is difficult to predict when and where an incident will occur, so timely emergency response is critical to saving the lives of those endangered by destructive events. Fortunately, technology can play a role in these situations. Social media posts can be used as a low-latency data source to understand the progression and aftermath of a disaster, yet parsing this data is tedious without automated methods. Prior...
In all living organisms, self-preservation behaviour is almost universal. Even the most simple of living organisms, like slime mould, is typically under intense selective pressure to evolve a response to ensure its evolution and safety in the best possible way. On the other hand, the evacuation of a place can be easily characterized as one of the most stressful situations for the individuals taking part in it. Taking inspiration from slime mould behaviour, we are introducing a computational bio-inspired crowd evacuation model. Cellular Automata (CA) were...
A food recipe is an ordered set of instructions for preparing a particular dish. From a visual perspective, every instruction step can be seen as a way to change the visual appearance of the dish by adding extra objects (e.g., adding an ingredient) or changing the appearance of the existing ones (e.g., cooking the dish). In this paper, we aim to teach a machine how to make a pizza by building a generative model that mirrors this step-by-step procedure. To do so, we learn composable module operations which are able to either add or remove a particular ingredient. Each operator is designed...
Quantum-dot fabrication and characterization is a well-established technology, which is used in photonics, quantum optics, and nanoelectronics. Four quantum-dots placed at the corners of a square form a unit cell, which can hold a bit of information and serve as a basis for quantum-dot cellular automata (QCA) nanoelectronic circuits. Although several basic QCA circuits have been designed, fabricated, and tested, proving that QCA circuits are functional, fast, and low-power, QCA nanoelectronics still remains in its infancy. One of the reasons for this...
Video Object Segmentation (VOS) is crucial for several applications, from video editing to video data generation. Training a VOS model requires an abundance of manually labeled training videos. The de-facto traditional way of annotating objects requires humans to draw detailed segmentation masks on the target objects at each video frame. This annotation process, however, is tedious and time-consuming. To reduce this annotation cost, in this paper, we propose EVA-VOS, a human-in-the-loop annotation framework for video object segmentation. Unlike the traditional approach, we introduce...
Mitigating biases in generative AI, and particularly in text-to-image models, is of high importance given their growing implications in society. The biased datasets used for training pose challenges in ensuring the responsible development of these models, and mitigation through hard prompting or embedding alteration are the most common present solutions. Our work introduces a novel approach to achieve diverse and inclusive synthetic images by learning a direction in the latent space and solely modifying the initial Gaussian noise...
Manually annotating object segmentation masks is very time-consuming. While interactive segmentation methods offer a more efficient alternative, they become unaffordable at a large scale because the cost grows linearly with the number of annotated masks. In this paper, we propose a highly efficient annotation scheme for building large datasets with object segmentation masks. At a large scale, images contain many object instances with similar appearance. We exploit these similarities by using hierarchical clustering on mask predictions made by a segmentation model. We propose a scheme that efficiently searches...
In this paper, we are interested in addressing the problem of damage assessment for vehicles, such as cars. This task requires not only detecting the location and the extent of the damage but also identifying the damaged part. To train a computer vision system for semantic part and damage segmentation in images, we need to manually annotate images with costly pixel annotations for both part categories and damage types. To overcome this need, we propose to use synthetic data to train these models. Synthetic data can provide samples with high variability, pixel-accurate annotations,...
As the global population ages, the number of fall-related incidents is on the rise. Effective fall detection systems, specifically in the healthcare sector, are crucial to mitigate the risks associated with such events. This study evaluates the role of visual context, including background objects, on the accuracy of fall detection classifiers. We present a segmentation pipeline to semi-automatically separate individuals and objects in images. Well-established models like ResNet-18, EfficientNetV2-S, and Swin-Small are trained and evaluated. During...