- Human Pose and Action Recognition
- Video Surveillance and Tracking Methods
- Video Analysis and Summarization
- Advanced Image and Video Retrieval Techniques
- Multimodal Machine Learning Applications
- Music and Audio Processing
- Anomaly Detection Techniques and Applications
- Gait Recognition and Analysis
- Advanced Chemical Sensor Technologies
- Advanced Neural Network Applications
- Robotics and Sensor-Based Localization
- Visual Attention and Saliency Detection
- Automated Road and Building Extraction
- Biometric Identification and Security
- Remote Sensing and LiDAR Applications
- Phonocardiography and Auscultation Techniques
- Face Recognition and Analysis
- COVID-19 Diagnosis Using AI
- Spectroscopy and Chemometric Analyses
- Diverse Aspects of Tourism Research
- Inertial Sensor and Navigation
- Retinal Imaging and Analysis
- Advanced Sensor and Energy Harvesting Materials
- Advanced Control Systems Optimization
- Structural Engineering and Materials Analysis
Nvidia (United States), 2025
Beijing Institute of Technology, 2024
Shanghai Tongji Urban Planning and Design Institute, 2024
Chongqing University of Technology, 2023-2024
Qingdao University, 2024
Shandong University of Science and Technology, 2024
Sichuan University, 2020-2023
Sesame Workshop, 2023
Qilu University of Technology, 2023
Shanghai Jiao Tong University, 2012-2022
Camouflaged object detection (COD) aims to identify objects that are perfectly embedded in their environment, which has various downstream applications in fields such as medicine, art, and agriculture. However, it is an extremely challenging task to spot camouflaged objects with the perception ability of human eyes. Hence, we claim that the goal of COD is not just to mimic the human visual ability in a single RGB domain, but to go beyond human biological vision. We then introduce the frequency domain as an additional clue to better detect camouflaged objects from backgrounds. To well...
Combining multiple low-level visual features is a proven and effective strategy for a range of computer vision tasks. However, limited attention has been paid to combining such features with information from other modalities, such as audio and videotext, for large scale analysis of web videos. In our work, we rigorously analyze and combine a set of low-level features that capture appearance, color, motion, audio, and audio-visual co-occurrence patterns in videos. We also evaluate the utility of high-level (i.e., semantic) visual information obtained from detecting scene, object, action...
Current state-of-the-art systems for visual content analysis require large training sets for each class of interest, and performance degrades rapidly with fewer examples. In this paper, we present a general framework for the zero-shot learning problem of performing high-level event detection with no training exemplars, using only textual descriptions. This task goes beyond the traditional zero-shot framework of adapting a given set of classes with training data to unseen classes. We leverage video and image collections with free-form text descriptions from...
The ABC/2 method is usually applied to evaluate intracerebral hemorrhage (ICH) volume on computed tomography (CT), although it might be inaccurate and not applicable in estimating extradural or subdural hemorrhage (EDH, SDH) due to their irregular hematoma shapes. This study aimed to evaluate a deep framework optimized for the segmentation and quantification of ICH, EDH, and SDH. The training datasets were 3,000 images retrospectively collected from a collaborating hospital (Hospital A) and segmented by the Dense U-Net framework. Three...
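For context, the ABC/2 estimate named in this abstract approximates a hematoma as an ellipsoid: A is the largest diameter on the CT slice with the largest hemorrhage area, B is the diameter perpendicular to A on that slice, and C is the vertical extent (number of slices showing hemorrhage times slice thickness). The following is a minimal sketch of that arithmetic only, not the study's deep segmentation framework; the function names and the example measurements are hypothetical.

```python
import math


def abc2_volume_ml(a_cm: float, b_cm: float, n_slices: int, slice_thickness_cm: float) -> float:
    """Estimate hematoma volume in mL (cm^3) with the ABC/2 rule.

    a_cm: largest hemorrhage diameter on the slice with the largest area
    b_cm: diameter perpendicular to A on the same slice
    n_slices, slice_thickness_cm: used to derive C, the vertical extent
    """
    if a_cm < 0 or b_cm < 0 or n_slices < 0 or slice_thickness_cm < 0:
        raise ValueError("measurements must be non-negative")
    c_cm = n_slices * slice_thickness_cm
    return a_cm * b_cm * c_cm / 2.0


def ellipsoid_volume_ml(a_cm: float, b_cm: float, c_cm: float) -> float:
    """Exact volume of an ellipsoid with full axis lengths A, B, C (pi*A*B*C/6)."""
    return math.pi * a_cm * b_cm * c_cm / 6.0


# Hypothetical measurements, for illustration only.
estimate = abc2_volume_ml(a_cm=4.0, b_cm=3.0, n_slices=6, slice_thickness_cm=0.5)
exact = ellipsoid_volume_ml(4.0, 3.0, 6 * 0.5)
print(f"ABC/2: {estimate:.2f} mL, ellipsoid: {exact:.2f} mL")
```

Because ABC/2 replaces pi/6 with 1/2, it slightly underestimates even a perfect ellipsoid, and it has no way to account for the irregular shapes the abstract mentions for EDH and SDH.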
Emphasizing self-improvement, simulation-driven refinement, and reduced human oversight for autonomous machines development.
Pneumoconiosis staging has been a very challenging task, both for certified radiologists and computer-aided detection algorithms. Although deep learning has shown proven advantages in the detection of pneumoconiosis, it remains challenging in pneumoconiosis staging due to the stage ambiguity and noisy samples caused by misdiagnosis when they are used in training deep learning models. In this article, we propose a fully deep learning pneumoconiosis staging paradigm that comprises a segmentation procedure and a staging procedure. The segmentation procedure extracts lung fields in chest radiographs through an Asymmetric...
A vectorized high-definition (HD) map is essential for autonomous driving, providing detailed and precise environmental information for advanced perception and planning. However, current map vectorization methods often exhibit deviations, and the existing evaluation metric for map vectorization lacks sufficient sensitivity to detect these deviations. To address these limitations, we propose integrating the philosophy of rasterization into map vectorization. Specifically, we introduce a new rasterization-based evaluation metric, which has superior sensitivity and is better...
This paper presents a novel method for traffic sign detection and visibility evaluation from mobile Light Detection and Ranging (LiDAR) point clouds and the corresponding images. Our algorithm involves two steps. Firstly, based on the high retro-reflectivity of the traffic sign, a detection method for MLS point clouds is designed for complicated road scenes. To solve the spatial features of traffic signs, we also create geo-referenced relations between signs and roads according to the normal of the ground. Secondly, we propose a visibility estimation method to evaluate the visibility level based on the combination of visual appearance...
In this paper, we propose a novel deep end-to-end network to automatically learn the spatial-temporal fusion features for video-based person re-identification. Specifically, the proposed network consists of a CNN and an RNN to jointly learn both the spatial and the temporal features of input image sequences. The network is optimized by utilizing the siamese and softmax losses simultaneously to pull the instances of the same person closer and push the instances of different persons apart. Our network is trained on full-body and part-body image sequences respectively to learn complementary representations from holistic and local...
UAV remote sensing has been widely used in emergency rescue, disaster relief, environmental monitoring, urban planning, and so on. Image recognition and image location in environmental monitoring have become an academic hotspot in the field of computer vision. The convolution neural network model is the most commonly used image processing model. Compared with the traditional artificial neural network model, the convolution neural network model has more hidden layers. Its unique convolution and pooling operations have higher efficiency in image processing. It has incomparable advantages in image processing and other forms of two-dimensional...
In this paper, we propose a graph correspondence transfer (GCT) approach for person re-identification. Unlike existing methods, the GCT model formulates person re-identification as an off-line graph matching and on-line correspondence transferring problem. Specifically, during training, the GCT model aims to learn a set of correspondence templates from positive training pairs with various pose-pair configurations via patch-wise graph matching. During testing, for each pair of test samples, we select a few training pairs with the most similar pose-pair configurations as references, and transfer the correspondences of these references to the test pair for feature...
Face recognition using a single sample per person is a challenging problem in computer vision. In this scenario, due to the lack of training samples, it is difficult to distinguish between inter-class variations caused by identity and intra-class variations caused by external factors such as illumination, pose, etc. To address this problem, we propose a scheme to improve the recognition rate by both generating additional samples to enrich the intra-variation and eliminating external factors to extract invariant features. Firstly, a 3D face modeling module is proposed to recover...
Crowd video retrieval is an important problem in surveillance video management in the era of big data, e.g., video indexing and browsing. In this paper, we address this issue from the motion-level perspective by using hand-drawn sketches as queries. Motion sketch based crowd video retrieval naturally suffers from challenges in motion representation. We tackle them by leveraging a motion structure coding algorithm to extract robust and structure-preserved motion descriptors. For video indexing, we use decomposition to separate sub-motion vector fields with typical patterns...