- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Anomaly Detection Techniques and Applications
- Image Retrieval and Classification Techniques
- Video Analysis and Summarization
- Multimodal Machine Learning Applications
- Gait Recognition and Analysis
- Human Mobility and Location-Based Analysis
- Automated Road and Building Extraction
- Face Recognition and Analysis
- Domain Adaptation and Few-Shot Learning
- Information Retrieval and Search Behavior
- Image and Object Detection Techniques
- Sports Analytics and Performance
Tencent (China)
2021-2025
Amazon (Germany)
2023
Person re-identification (ReID) has made impressive progress in recent years. However, occlusion is still a common and challenging problem for ReID methods. Several mainstream methods utilize extra cues (e.g., human pose information) to distinguish human parts from obstacles and alleviate the occlusion problem. Although they achieve inspiring progress, these methods rely heavily on fine-grained extra cues and are sensitive to estimation errors in those cues. In this paper, we show that existing methods may degrade if the extra information is sparse or noisy....
The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team. In 2022, the challenges were composed of 6 vision-based tasks: (1) action spotting, focusing on retrieving action timestamps in long untrimmed videos, (2) replay grounding, focusing on retrieving the live moment of an action shown in a replay, (3) pitch localization, focusing on detecting line and goal part elements, (4) camera calibration, dedicated to retrieving the intrinsic and extrinsic camera parameters, (5) player re-identification, focusing on retrieving the same players across multiple views, and (6) object tracking, focusing on tracking...
Text-based Person Search (TPS) aims to retrieve pedestrians that match text descriptions instead of query images. Recent Vision-Language Pre-training (VLP) models can bring transferable knowledge to downstream TPS tasks, resulting in more efficient performance gains. However, existing TPS methods improved by VLP only utilize pre-trained visual encoders, neglecting the corresponding textual representation and breaking the significant modality alignment learned from large-scale pre-training. In...
Existing deep learning approaches for person re-identification (Re-ID) mostly rely on large-scale and well-annotated training data. However, human-annotated labels are prone to label noise in real-world applications. Previous Re-ID works mainly focus on random label noise, which does not properly reflect the characteristics of label noise in the practical annotation process. In this work, we find that visual ambiguity is a more common and reasonable noise assumption for the annotation of person Re-ID. To handle this kind of noise, we propose a simple and effective robust person Re-ID framework,...
Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard to the performance on samples outside the batch. This will inevitably cause the model to over-fit the data in the dominant position (e.g., head data in imbalanced classes, easy samples, or noisy samples). The latest resampling methods address the issue by designing specific criteria to select samples that train the model to generalize more on a certain type of data (e.g., hard samples, tail data), which is not adaptive to the inconsistent real...
Text-based image retrieval has seen considerable progress in recent years. However, the performance of existing methods suffers in real life, since the user is likely to provide an incomplete description of an image, which often leads to results filled with false positives that fit the incomplete description. In this work, we introduce the partial-query problem and extensively analyze its influence on text-based image retrieval. Previous interactive methods tackle the problem by passively receiving users' feedback to supplement the incomplete query iteratively,...
Person search is a challenging task that involves detecting and retrieving individuals from a large set of un-cropped scene images. Existing person search applications are mostly trained and deployed in same-origin scenarios. However, collecting and annotating training samples for each scene is often difficult due to the limitation of resources and the labor cost. Moreover, large-scale intra-domain training data are generally not legally available to common developers, due to the regulation of privacy and public security. Leveraging easily accessible large-scale User...
Although Person Re-Identification has made impressive progress, difficult cases like occlusion, change of view-point, and similar clothing still bring great challenges. Besides overall visual features, matching and comparing detailed information is also essential for tackling these challenges. This paper proposes two key recognition patterns to better utilize the detail information of pedestrian images, which most existing methods are unable to satisfy. Firstly, Visual Clue Alignment requires the model to select and align decisive...
Person search is a challenging task which aims to achieve joint pedestrian detection and person re-identification (ReID). Previous works have made significant advances under fully and weakly supervised settings. However, existing methods ignore the generalization ability of person search models. In this paper, we take a further step and present Domain Adaptive Person Search (DAPS), which aims to generalize the model from a labeled source domain to the unlabeled target domain. Two major challenges arise under this new setting: one is how to simultaneously solve...
Although Person Re-Identification has made impressive progress, difficult cases like occlusion, change of view-point, and similar clothing still bring great challenges. In order to tackle these challenges, extracting discriminative feature representation is crucial. Most of the existing methods focus on extracting ReID features from individual images separately. However, when matching two images, we propose that the ReID features of a query image should be dynamically adjusted based on the contextual information from the gallery image it matches....
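The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, focusing on retrieving all timestamps related to global actions in soccer, (2) ball action spotting, focusing on retrieving all timestamps related to the soccer ball change of state, and (3) dense video captioning, focusing on describing the broadcast with natural language and anchored timestamps. The second theme, field understanding, relates to the single task...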
Current training objectives of existing person Re-IDentification (ReID) models only ensure that the loss of the model decreases on the selected training batch, with no regard to the performance on samples outside the batch. This will inevitably cause the model to over-fit the data in the dominant position (e.g., head data in imbalanced classes, easy samples, or noisy samples). The latest resampling methods address the issue by designing specific criteria to select samples that train the model to generalize more on a certain type...
This is a technical report for the CVPR 2021 AliProducts Challenge. The AliProducts Challenge is a competition proposed for studying the large-scale and fine-grained commodity image recognition problem encountered by world-leading e-commerce companies. Large-scale product recognition simultaneously meets the challenges of noisy annotations, imbalanced (long-tailed) data distribution, and fine-grained classification. In our solution, we adopt state-of-the-art model architectures of both CNNs and Transformers, including ResNeSt, EfficientNetV2, and DeiT. We found that iterative...