- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Fire Detection and Safety Systems
- Advanced Vision and Imaging
- Advanced Neural Network Applications
- Topic Modeling
- Visual Attention and Saliency Detection
- Image and Object Detection Techniques
- Anomaly Detection Techniques and Applications
- Advanced Image and Video Retrieval Techniques
- Image Enhancement Techniques
- Infrared Target Detection Methodologies
- Adversarial Robustness in Machine Learning
- Face and Expression Recognition
- Advanced Measurement and Detection Methods
- Face recognition and analysis
- Radiomics and Machine Learning in Medical Imaging
- Impact of Light on Environment and Health
- Industrial Vision Systems and Defect Detection
- Expert finding and Q&A systems
- Medical Image Segmentation Techniques
- Text and Document Classification Technologies
- Image Processing and 3D Reconstruction
- Traditional Chinese Medicine Analysis
- Domain Adaptation and Few-Shot Learning
Institute of Automation
2015-2024
Chinese Academy of Sciences
2013-2024
University of Chinese Academy of Sciences
2018-2024
Shandong Institute of Automation
2013-2024
Beijing Academy of Artificial Intelligence
2020-2023
Shanghai Jiao Tong University
2021-2023
Shenyang University of Technology
2021
Lanzhou University
2020
Research Institute of Highway
2020
Bundesministerium für Klimaschutz, Umwelt, Energie, Mobilität, Innovation und Technologie
2020
The Visual Object Tracking challenge 2015, VOT2015, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 62 trackers are presented. The number of tested trackers makes VOT2015 the largest benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the appendix. Features of VOT2015 that go beyond its VOT2014 predecessor are: (i) a new dataset twice as large, with full annotation of targets by rotated bounding boxes and...
Offline training for object tracking has recently shown great potential in balancing tracking accuracy and speed. However, it is still difficult to adapt an offline trained model to a target tracked online. This work presents a Residual Attentional Siamese Network (RASNet) for high performance object tracking. The RASNet model reformulates the correlation filter within a Siamese tracking framework, and introduces different kinds of attention mechanisms to adapt the model without updating the model online. In particular, by exploiting the offline trained general attention, the target adapted residual attention, and the channel...
The Visual Object Tracking challenge VOT2017 is the fifth annual tracker benchmarking activity organized by the VOT initiative. Results of 51 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or journals in recent years. The evaluation included the standard VOT and other popular methodologies and a new "real-time" experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. Performance of the tested trackers typically far exceeds standard baselines. The source...
Visual tracking has attracted significant attention in the last few decades. The recent surge in the number of publications on tracking-related problems has made it almost impossible to follow the developments in the field. One of the reasons is that there is a lack of commonly accepted annotated data-sets and standardized evaluation protocols that would allow objective comparison of different tracking methods. To address this issue, the Visual Object Tracking (VOT) workshop was organized in conjunction with ICCV2013. Researchers from academia as well...
Discriminant Correlation Filters (DCF) based methods have now become a dominant approach to online object tracking. The features used in these methods, however, are either based on hand-crafted features like HoGs, or convolutional features trained independently from other tasks like image classification. In this work, we present an end-to-end lightweight network architecture, namely DCFNet, to learn the convolutional features and perform the correlation tracking process simultaneously. Specifically, we treat DCF as a special correlation filter layer added in a Siamese...
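As a rough illustration of the correlation filter idea this abstract builds on, the following is a minimal sketch of a generic single-channel correlation filter solved in closed form in the Fourier domain. It is not the DCFNet architecture: it works on raw pixel values rather than learned convolutional features, and the function names, the Gaussian label, and the regularization value `lam` are assumptions chosen for the example.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peaked at (0, 0) with cyclic wrap-around."""
    h, w = shape
    ys = np.minimum(np.arange(h), h - np.arange(h))
    xs = np.minimum(np.arange(w), w - np.arange(w))
    d2 = ys[:, None] ** 2 + xs[None, :] ** 2
    return np.exp(-0.5 * d2 / sigma ** 2)


def train_filter(x, y, lam=1e-4):
    """Ridge regression over all cyclic shifts of x, solved per frequency.

    Returns the conjugate filter in the Fourier domain (hypothetical helper).
    """
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    return (Y * np.conj(X)) / (X * np.conj(X) + lam)


def detect(h_conj, z):
    """Correlate the filter with a search patch and return the response peak."""
    response = np.real(np.fft.ifft2(h_conj * np.fft.fft2(z)))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in peak)


# Toy usage: train on a random patch, then locate a cyclically shifted copy.
rng = np.random.default_rng(0)
template = rng.standard_normal((32, 32))
filt = train_filter(template, gaussian_label(template.shape))
shifted = np.roll(template, shift=(3, 5), axis=(0, 1))
estimated_shift = detect(filt, shifted)
```

In this toy setting the response peak lands at the cyclic shift applied to the patch, which is how a tracker of this family reads off target displacement. Because every step is an element-wise operation between Fourier transforms, the solver is differentiable, which is what makes it usable as a network layer in principle.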
These days, physiological signals have been studied more broadly for emotion recognition to realize emotional intelligence in human-computer interaction. However, due to the complexity of emotions and individual differences in physiological responses, how to design reliable and effective models has become an important issue. In this article, we propose a regularized deep fusion framework for emotion recognition based on multimodal physiological signals. After extracting features from different types of physiological signals, we construct ensemble dense embeddings using...
The advent of foundation models (FMs) as an emerging suite of AI techniques has struck a wave of opportunities in computational healthcare. The interactive nature of these models, guided by pre-training data and human instructions, has ignited a data-centric AI paradigm that emphasizes better data characterization, quality, and scale. In healthcare AI, obtaining and processing high-quality clinical data records has been a longstanding challenge, ranging from data quantity and annotation to patient privacy and ethics. In this survey, we investigate a wide...
Efficient and robust grasp pose detection is vital for robotic manipulation. For general 6 DoF grasping, conventional methods treat all points in a scene equally and usually adopt uniform sampling to select grasp candidates. However, we discover that ignoring where to grasp greatly harms the speed and accuracy of current grasp pose detection methods. In this paper, we propose "graspness", a quality based on geometry cues that distinguishes graspable area in cluttered scenes. A look-ahead searching method is proposed for measuring the graspness and statistical...
Recently, sparse representation has been introduced for robust object tracking. By representing the object sparsely, i.e., using only a few templates via L1-norm minimization, these so-called L1-trackers exhibit promising tracking results. In this work, we address the object template building and updating problem in these L1-tracking approaches, which has not been fully studied. We propose to perform template updating, from a new perspective, as an online incremental dictionary learning problem, which is efficiently solved through an online optimization...
Boosted by large and standardized benchmark datasets, visual object tracking has made great progress in recent years and brought about many new trackers. Among these trackers, the correlation filter based tracking schema exhibits impressive robustness and accuracy. In this work, we present a fully functional correlation filter based tracking algorithm which is able to simultaneously model target appearance changes from spatial displacements, scale variations, and rotation transformations. The proposed tracker first represents the exhaustive template...
Open-vocabulary object detection aims to detect novel categories beyond the training set. The advanced open-vocabulary two-stage detectors employ instance-level visual-to-visual knowledge distillation to align the visual space of the detector with the semantic space of the Pre-trained Visual-Language Model (PVLM). However, in the more efficient one-stage detector, the absence of class-agnostic object proposals hinders the knowledge distillation on unseen objects, leading to severe performance degradation. In this paper, we propose a hierarchical visual-language...
An appearance model adaptable to changes in object appearance is critical in visual object tracking. In this paper, we treat an image patch as a two-order tensor which preserves the original image structure. We design two graphs for characterizing the intrinsic local geometrical structure of the tensor samples of the object and the background. Graph embedding is used to reduce the dimensions of the tensors while preserving the structure of the graphs. Then, a discriminant embedding space is constructed. We prove two propositions for finding the transformation matrices which are used to map the original tensor samples to the tensor-based graph embedding space. In order to encode...
Scale adaptation is crucial to object tracking as the visual size of the target changes continuously. Many existing algorithms, however, simply ignore scale changes, either for the consideration of efficiency or for the lack of principled ways of scale estimation. In this work, we present an efficient and effective scale adaptive tracking algorithm by proposing a correlation filter based tracker in the joint spatial and scale space. We find that the exhaustive template searching in the joint space can be well modeled by a block-circulant matrix. With the properties of block-circulant matrices, we prove...
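The property of circulant matrices that makes this kind of modeling efficient can be sketched in one dimension: multiplying by a matrix whose rows are all cyclic shifts of a template is the same as a circular cross-correlation, which the FFT computes without building the matrix. The sketch below shows only this standard 1-D fact; the block-circulant joint spatial and scale formulation of the paper is not reproduced, and the helper names are assumptions for illustration.

```python
import numpy as np


def circulant(x):
    """Matrix whose i-th row is x cyclically shifted by i: C[i, j] = x[(j - i) % n]."""
    return np.stack([np.roll(x, i) for i in range(len(x))])


def circulant_matvec_fft(x, w):
    """Compute circulant(x) @ w in O(n log n) via the FFT, for real x and w."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(w)))


# Toy usage: the dense product and the FFT shortcut agree up to rounding.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
w = rng.standard_normal(8)
dense = circulant(x) @ w
fast = circulant_matvec_fft(x, w)
```

The dense product costs O(n^2) operations while the FFT route costs O(n log n), which is why exhaustive searching over all shifts becomes affordable once it is written in circulant form.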
This work presents a novel end-to-end trainable CNN model for high performance visual object tracking. It learns both low-level fine-grained representations and a high-level semantic embedding space in a mutually reinforced way, and a multi-task learning strategy is proposed to perform the correlation analysis on representations from both levels. In particular, a fully convolutional encoder-decoder network is designed to reconstruct the original visual features from the semantic projections to preserve all the geometric information. Moreover, the correlation filter layer working...
Siamese trackers have recently been shown to be vulnerable to adversarial attacks. However, the existing attack methods craft the perturbations for each video independently, which comes at a non-negligible computational cost. In this paper, we show the existence of universal perturbations that can enable the targeted attack, e.g., forcing a tracker to follow the ground-truth trajectory with specified offsets, to be video-agnostic and free from inference in a network. Specifically, by adding a translucent perturbation to the template image...
Recently, the transformer has enabled speed-oriented trackers to approach state-of-the-art (SOTA) performance at high speed thanks to the smaller input size or the lighter feature extraction backbone, though they still substantially lag behind their corresponding performance-oriented versions. In this paper, we demonstrate that it is possible to narrow or even close this gap while achieving high tracking speed based on the smaller input size. To this end, we non-uniformly resize the cropped image to have a smaller input size while the resolution of the area where the target is more...
Online learning is crucial to robust visual object tracking as it can provide high discrimination power in the presence of background distractors. However, there are two contradictory factors affecting its successful deployment on a real visual tracking platform: the discrimination issue due to the challenges in vanilla gradient descent, which does not guarantee good convergence; and the robustness issue due to over-fitting resulting from excessive update with limited memory size (the oldest samples are discarded). Despite many dedicated techniques proposed...