Lamberto Ballan

ORCID: 0000-0003-0819-851X
Research Areas
  • Advanced Image and Video Retrieval Techniques
  • Multimodal Machine Learning Applications
  • Human Pose and Action Recognition
  • Video Analysis and Summarization
  • Video Surveillance and Tracking Methods
  • Image Retrieval and Classification Techniques
  • Anomaly Detection Techniques and Applications
  • Autonomous Vehicle Technology and Safety
  • Topic Modeling
  • Digital Media Forensic Detection
  • Natural Language Processing Techniques
  • Domain Adaptation and Few-Shot Learning
  • Advanced Vision and Imaging
  • Image Processing Techniques and Applications
  • Digital Imaging for Blood Diseases
  • Advanced Steganography and Watermarking Techniques
  • Human Mobility and Location-Based Analysis
  • Cell Image Analysis Techniques
  • Handwritten Text Recognition Techniques
  • Music and Audio Processing
  • Robotics and Sensor-Based Localization
  • Robotic Path Planning Algorithms
  • Visual Attention and Saliency Detection
  • Traffic Prediction and Management Techniques
  • Social Robot Interaction and HRI

University of Padua
2005-2024

Civita
2017-2024

University of Florence
2009-2022

National Research Council
2022

Amazon (United States)
2022

University of Modena and Reggio Emilia
2022

Universidad de Las Palmas de Gran Canaria
2022

Marche Polytechnic University
2022

Webb Institute
2022

Technical University of Munich
2022

One of the principal problems in image forensics is determining if a particular image is authentic or not. This can be a crucial task when images are used as basic evidence to influence judgment like, for example, in a court of law. To carry out such forensic analysis, various technological instruments have been developed in the literature. In this paper, the problem of detecting if an image has been forged is investigated; in particular, attention has been paid to the case in which an area of an image is copied and then pasted onto another zone to create a duplication or to cancel...

10.1109/tifs.2011.2129512 article EN IEEE Transactions on Information Forensics and Security 2011-03-17

Human motion and behaviour in crowded spaces is influenced by several factors, such as the dynamics of other moving agents in the scene, as well as the static elements that might be perceived as points of attraction or obstacles. In this work, we present a new model for human trajectory prediction which is able to take advantage of both human-human and human-space interactions. The future trajectories of humans are generated by observing their past positions and interactions with the surroundings. To this end, we propose a “context-aware” recurrent neural...

10.1109/icpr.2018.8545447 article EN 2018 24th International Conference on Pattern Recognition (ICPR) 2018-08-01

When an attacker wants to falsify an image, in most cases she/he will perform a JPEG recompression. Different techniques have been developed based on diverse theoretical assumptions, but very effective solutions have not been developed yet. Recently, machine learning approaches have started to appear in the field of image forensics to solve tasks such as acquisition source identification and forgery detection. In this last case, the aim ahead would be to get a trained neural network able, given a to-be-checked image, to reliably localize the forged...

10.1109/cvprw.2017.233 article EN 2017-07-01

Some images that are difficult to recognize on their own may become more clear in the context of a neighborhood of related images with similar social-network metadata. We build on this intuition to improve multilabel image annotation. Our model uses image metadata nonparametrically to generate neighborhoods of related images using Jaccard similarities, then uses a deep neural network to blend visual information from the image and its neighbors. Prior work typically models image metadata parametrically; in contrast, our nonparametric treatment allows our model to perform well even...

10.1109/iccv.2015.525 preprint EN 2015-12-01

Mimicking the human ability to forecast future positions or interpret complex interactions in urban scenarios, such as streets, shopping malls or squares, is essential to develop socially compliant robots or self-driving cars. Autonomous systems may gain an advantage by anticipating human motion to avoid collisions or to naturally behave alongside people. To foresee plausible trajectories, we construct an LSTM (long short-term memory)-based model considering three fundamental factors: people interactions, past...

10.1109/iccvw.2019.00314 article EN 2019-10-01

Automatic image annotation is still an important open problem in multimedia and computer vision. The success of media sharing websites has led to the availability of large collections of images tagged with human-provided labels. Many approaches previously proposed in the literature do not accurately capture the intricate dependencies between image content and annotations. We propose a learning procedure based on Kernel Canonical Correlation Analysis which finds a mapping between visual and textual words by projecting them into...

10.1145/2578726.2578728 article EN 2014-04-01

In open set recognition, a classifier has to detect unknown classes that are not known at training time. In order to recognize new categories, the classifier has to project the input samples of known classes in very compact and separated regions of the features space for discriminating samples of unknown classes. Recently proposed Capsule Networks have been shown to outperform alternatives in many fields, particularly in image recognition; however, they have not been fully applied yet to open-set recognition. In capsule networks, scalar neurons are replaced by capsule vectors or matrices, whose entries represent...

10.1109/iccv48922.2021.00017 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Human trajectory forecasting is a key component of autonomous vehicles, social-aware robots and advanced video-surveillance applications. This challenging task typically requires knowledge about past motion, the environment and likely destination areas. In this context, multi-modality is a fundamental aspect and its effective modeling can be beneficial to any architecture. Inferring accurate trajectories is nevertheless challenging, due to the inherently uncertain nature of the future. To overcome these difficulties,...

10.1109/cvprw56347.2022.00282 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2022-06-01

Accurate prediction of future human positions is an essential task for modern video-surveillance systems. Current state-of-the-art models usually rely on a "history" of past tracked locations (e.g., 3 to 5 seconds) to predict a plausible sequence of future locations (e.g., up to the next 5 seconds). We feel that this common schema neglects critical traits of realistic applications: as the collection of input trajectories involves machine perception (i.e., detection and tracking), incorrect detection and fragmentation errors may accumulate in crowded...

10.1109/cvpr52688.2022.00644 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

We contribute, through this paper, to the design of a novel variational framework able to match and recognize multiple instances of multiple reference logos in image archives. Reference logos and test images are seen as constellations of local features (interest points, regions, etc.) and matched by minimizing an energy function mixing: 1) a fidelity term that measures the quality of feature matching, 2) a neighborhood criterion that captures feature co-occurrence/geometry, and 3) a regularization term that controls the smoothness of the matching solution. We also introduce...

10.1109/tip.2012.2226046 article EN IEEE Transactions on Image Processing 2012-10-22

Some images that are difficult to recognize on their own may become more clear in the context of a neighborhood of related images with similar social-network metadata. We build on this intuition to improve multilabel image annotation. Our model uses image metadata nonparametrically to generate neighborhoods of related images using Jaccard similarities, then uses a deep neural network to blend visual information from the image and its neighbors. Prior work typically models image metadata parametrically; in contrast, our nonparametric treatment allows our model to perform well even...

10.48550/arxiv.1508.07647 preprint EN other-oa arXiv (Cornell University) 2015-01-01

In this paper we describe a system for detection and retrieval of trademarks appearing in sports videos. We propose a compact representation of trademarks and video frame content based on SIFT feature points. This representation can be used to robustly detect, localize, and retrieve trademarks as they appear in a variety of different sports video types. Classification is performed by matching a set of SIFT feature descriptors for each trademark instance against the set of SIFT features detected in every frame of the video. Localization is performed through robust clustering of matched feature points in the video frame. Experimental results are...

10.1145/1290082.1290096 article EN 2007-09-24

An approach for automatic annotation and retrieval of video content uses semantic concept classifiers and ontologies to permit queries expanded to synonyms and specializations.

10.1109/mmul.2010.4 article EN IEEE Multimedia 2010-01-26

In many application scenarios digital images play a basic role, and often it is important to assess if their content is realistic or has been manipulated to mislead the watcher's opinion. Image forensics tools provide answers to similar questions. This paper, in particular, focuses on the problem of detecting if a feigned image has been created by cloning an area of the image onto another zone to make a duplication or to cancel something awkward. The proposed method is based on SIFT features and allows both to understand which are the points involved...

10.1109/icassp.2010.5495485 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2010-01-01

Automated colonoscopy reporting holds great potential for enhancing quality control and improving cost-effectiveness of colonoscopy procedures. A major challenge lies in the automated identification, tracking, and re-association (ReID) of polyp tracklets across full-procedure colonoscopy videos. This is essential for precise polyp counting and enables the computation of key quality metrics, such as Adenoma Detection Rate (ADR) and Polyps Per Colonoscopy (PPC). However, polyp ReID is challenging due to variations in polyp appearance, frequent disappearance from...

10.48550/arxiv.2502.10054 preprint EN arXiv (Cornell University) 2025-02-14

Recognition and classification of human actions for annotation of unconstrained video sequences has proven to be challenging because of the variations in the environment, appearance of actors, modalities in which the same action is performed by different persons, speed and duration, and points of view from which the event is observed. This variability is reflected in the difficulty of defining effective descriptors and deriving appropriate codebooks for action categorization. In this paper, we propose a novel solution to classify human actions in unconstrained videos. It improves on previous...

10.1109/tmm.2012.2191268 article EN IEEE Transactions on Multimedia 2012-03-19

Human capability to anticipate the near future from visual observations and non-verbal cues is essential for developing intelligent systems that need to interact with people. Several research areas, such as human-robot interaction (HRI), assisted living or autonomous driving, need to foresee future events to avoid crashes or help people. Egocentric scenarios are classic examples where action anticipation is applied due to their numerous applications. Such a challenging task demands capturing and modeling the domain's hidden structure to reduce...

10.1109/icpr48806.2021.9412660 article EN 2020 25th International Conference on Pattern Recognition (ICPR) 2021-01-10

In this paper, we present a novel approach to incrementally learn an Abstract Model of an unknown environment, and show how an agent can reuse the learned model for tackling the Object Goal Navigation task. The Abstract Model is a finite state machine in which each state is an abstraction of the state of the environment, as perceived by the agent in a certain position and orientation. The perceptions are high-dimensional sensory data (e.g., RGB-D images), and the abstraction is reached by exploiting image segmentation and the Taskonomy model bank. The learning of the Abstract Model is accomplished by executing actions, observing the reached state, and updating the abstract model with...

10.1109/cvpr52688.2022.01445 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01