- Advanced Neuroimaging Techniques and Applications
- Multimodal Machine Learning Applications
- Medical Imaging and Analysis
- Fetal and Pediatric Neurological Disorders
- Medical Image Segmentation Techniques
- Domain Adaptation and Few-Shot Learning
- Generative Adversarial Networks and Image Synthesis
- Image Retrieval and Classification Techniques
- Smart Agriculture and AI
- Advanced MRI Techniques and Applications
- Advanced Image and Video Retrieval Techniques
- Robotics and Sensor-Based Localization
- Natural Language Processing Techniques
- Handwritten Text Recognition Techniques
- MRI in cancer diagnosis
- Advanced Neural Network Applications
- Data Management and Algorithms
- Functional Brain Connectivity Studies
- Machine Learning and Data Classification
- Cerebrospinal fluid and hydrocephalus
- Hand Gesture Recognition Systems
- Philosophy and History of Science
- Brain Tumor Detection and Classification
- Cancer-related molecular mechanisms research
- Insect Pheromone Research and Control
Fondazione Bruno Kessler
2020-2023
Italian Institute of Technology
2020-2023
University of Trento
2020-2023
Institut national de recherche en informatique et en automatique
2023
Kessler Foundation
2020
Politecnico di Milano
2017-2018
Recent advances in vision language models (VLM) have been driven by contrastive models such as CLIP, which learn to associate visual information with their corresponding text descriptions. However, these models have limitations in understanding complex compositional scenes involving multiple objects and their spatial relationships. To address these challenges, we propose a novel approach that diverges from commonly used strategies, which rely on the design of hard-negative augmentations. Instead, our work focuses on integrating...
Self-supervised learning models have been shown to learn rich visual representations without requiring human annotations. However, in many real-world scenarios, labels are partially available, motivating a recent line of work on semi-supervised methods inspired by self-supervised principles. In this paper, we propose a conceptually simple yet empirically powerful approach to turn clustering-based self-supervised methods such as SwAV or DINO into semi-supervised learners. More precisely, we introduce a multi-task framework merging supervised...
Virtual delineation of white matter bundles in the human brain is of paramount importance for multiple applications, such as pre-surgical planning and connectomics. A substantial body of literature is related to methods that automatically segment bundles from diffusion Magnetic Resonance Imaging (dMRI) data indirectly, by exploiting either the idea of connectivity between regions or the geometry of fiber paths obtained with tractography techniques, or, directly, through the information in volumetric data. Despite remarkable...
Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, vision-language model (VLM) applications will significantly impact our relationship with technology. However, there are many challenges that need to be addressed to improve the reliability of those models. While language is discrete,...
Field robotics is a fast developing research field; in particular, precision agriculture is gaining popularity due to the high return in productivity and the reduced pollution impact on the environment. The GRAPE project is an ECHORD++ robotic experiment aimed at the use of a mobile robot for automatic pheromone dispenser distribution in vineyards, to reduce pesticide use thanks to mate disruption. This work describes the autonomous navigation system of such a robot. For this specific scenario a real state of the art does not exist, so we adapted...
Impressive advances in text-to-image (T2I) generative models have yielded a plethora of high performing models which are able to generate aesthetically appealing, photorealistic images. Despite the progress, these models still struggle to produce images that are consistent with the input prompt, oftentimes failing to capture object quantities, relations and attributes properly. Existing solutions to improve prompt-image consistency suffer from the following challenges: (1) they require model fine-tuning, (2) they only focus on...
Building world models that accurately and comprehensively represent the real world is the utmost aspiration for conditional image generative models, as it would enable their use as world simulators. For these models to be successful world models, they should not only excel at image quality and prompt-image consistency but also ensure high representation diversity. However, current research in generative models mostly focuses on creative applications that are predominantly concerned with human preferences of image quality and aesthetics. We note that generative models have inference time mechanisms - or...
Contrastive learning has emerged as an efficient framework to learn multimodal representations. CLIP, a seminal work in this area, achieved impressive results by training on paired image-text data using the contrastive loss. Recent work claims improvements over CLIP using additional non-contrastive losses inspired from self-supervised learning. However, it is sometimes hard to disentangle the contribution of these additional losses from other implementation details, e.g., data augmentation or regularization techniques, used to train the model....
The aim of this work is to improve the virtual dissection of the Inferior Frontal Occipital Fasciculus (IFOF) by combining a recent insight on white matter anatomy from ex-vivo dissection and a data driven approach with a deep learning model. Current methods of tract dissection are not robust with respect to false positives and neglect the neuroanatomical waypoints of a given tract, like the stem. In this work we design a deep learning model to segment the stem of the IFOF and we show how the dissection of the tract can be improved. The proposed method is validated on the Human Connectome Project dataset, where expert neuroanatomists...
Text-to-image diffusion models have been shown to suffer from sample-level memorization, possibly reproducing near-perfect replicas of images that they are trained on, which may be undesirable. To remedy this issue, we develop the first differentially private (DP) retrieval-augmented generation algorithm that is capable of generating high-quality image samples while providing provable privacy guarantees. Specifically, we assume access to a text-to-image diffusion model trained on a small amount of public data, and design a DP...
Learning good representations involves capturing the diverse ways in which data samples relate. Contrastive loss - an objective matching related samples - underlies methods from self-supervised to multimodal learning. Contrastive losses, however, can be viewed more broadly as modifying a similarity graph to indicate how the samples should relate in the embedding space. This view reveals a shortcoming in contrastive learning: the similarity graph is binary, as only one sample is the related positive sample. Crucially, similarities across samples are ignored. Based on this...
As the use of text-to-image generative models increases, so does the adoption of automatic benchmarking methods used in their evaluation. However, while metrics and datasets abound, there are few unified benchmarking libraries that provide a framework for performing evaluations across many datasets and metrics. Furthermore, the rapid introduction of increasingly robust benchmarking methods requires that evaluation libraries remain flexible to new datasets and metrics. Finally, there remains a gap in synthesizing evaluations in order to deliver actionable takeaways about model performance. To enable unified,...
The research in biometric recognition using hand shape has been somewhat stagnating in the last decade. Meanwhile, computer vision and machine learning have experienced a paradigm shift with the renaissance of deep learning, which has set the new state-of-the-art in many related fields. Inspired by successful applications of deep learning for other biometric modalities, we propose a novel approach to 3D hand shape recognition from RGB-D data based on geometric deep learning techniques. We show how to train our model on synthetic data and retain the performance on real samples during test time....
Data augmentation has become a crucial component to train state-of-the-art visual representation models. However, handcrafting combinations of transformations that lead to improved performances is a laborious task, which can result in visually unrealistic samples. To overcome these limitations, recent works have explored the use of generative models as learnable data augmentation tools, showing promising results in narrow application domains, e.g., few-shot learning and low-data medical imaging. In this paper, we...