- Human Pose and Action Recognition
- Video Surveillance and Tracking Methods
- Multimodal Machine Learning Applications
- Anomaly Detection Techniques and Applications
- Advanced Image and Video Retrieval Techniques
- Video Analysis and Summarization
- Image Retrieval and Classification Techniques
- Recommender Systems and Techniques
- Auction Theory and Applications
- Blockchain Technology Applications and Security
- Gait Recognition and Analysis
- Fire Detection and Safety Systems
- Image Enhancement Techniques
- Digital Rights Management and Security
- EEG and Brain-Computer Interfaces
- Generative Adversarial Networks and Image Synthesis
- Face and Expression Recognition
- Diabetic Foot Ulcer Assessment and Management
- Advanced Neural Network Applications
- Neuroscience and Neural Engineering
- Image Processing and 3D Reconstruction
- Advanced Bandit Algorithms Research
- Music and Audio Processing
- Advanced Graph Neural Networks
- Domain Adaptation and Few-Shot Learning
Tianjin Haihe Hospital
2024
The University of Tokyo
2019-2023
Wuhan University of Science and Technology
2021-2023
Xi'an University of Architecture and Technology
2023
Meizu (China)
2023
Tiangong University
2023
Qufu Normal University
2023
Huaiyin Institute of Technology
2022-2023
Southwest University
2022
Tokyo University of Information Sciences
2021-2022
In this paper, we propose a self-supervised contrastive learning method to learn video feature representations. In traditional methods, constraints from anchor, positive, and negative data pairs are used to train the model. In such a case, different samplings of the same video are treated as positives, and clips from different videos as negatives. Because spatio-temporal information is important for video representation, we set the temporal constraints more strictly by introducing intra-negative samples. In addition to negative samples from different videos, intra-negative samples are extended by breaking temporal relations in...
Conventional video summarization approaches based on reinforcement learning have the problem that the reward can only be received after the whole summary is generated. This kind of reward is sparse, which makes reinforcement learning hard to converge. Another problem is that labelling each shot is tedious and costly, which usually prohibits the construction of large-scale video summarization datasets. To solve these problems, we propose a weakly supervised hierarchical reinforcement learning framework, which decomposes the whole task into several subtasks to enhance the summarization quality. This framework consists of a manager network and a worker...
We propose a self-supervised method to learn feature representations from videos. A standard approach in traditional methods uses positive-negative data pairs to train with a contrastive learning strategy. In such a case, different modalities of the same video are treated as positives and video clips from a different video are treated as negatives. Because spatio-temporal information is important for video representation, we extend the negative samples by introducing intra-negative samples, which are transformed from the same anchor video by breaking temporal relations in video clips....
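The intra-negative idea in the abstract above can be illustrated with a short sketch. This is not the authors' implementation: it assumes a clip is simply a list of frames and that shuffling frame order is one acceptable way of "breaking temporal relations" (the abstract does not specify the transformation). The function name `make_intra_negative` is hypothetical.

```python
import random


def make_intra_negative(clip, rng=None):
    """Return a copy of `clip` whose frame order differs from the original.

    Assumption: shuffling frame order stands in for "breaking temporal
    relations"; the source abstract does not name the exact transformation.
    """
    if len(clip) < 2:
        raise ValueError("need at least two frames to change the temporal order")
    if rng is None:
        rng = random.Random()
    original_order = list(range(len(clip)))
    shuffled_order = original_order[:]
    # A shuffle can return the identity permutation, so repeat until it changes.
    while shuffled_order == original_order:
        rng.shuffle(shuffled_order)
    return [clip[i] for i in shuffled_order]


# Strings stand in for decoded frames of the anchor clip.
anchor = ["f0", "f1", "f2", "f3"]
intra_negative = make_intra_negative(anchor, random.Random(0))
```

The intra-negative contains exactly the anchor's frames, so only the temporal order separates it from the anchor. In a contrastive setup it would be added to the negatives alongside clips from other videos.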
Recently, 3D convolutional networks have yielded good performance in action recognition. However, an optical flow stream is still needed to ensure better performance, the cost of which is very high. In this paper, we propose a fast but effective way to extract motion features from videos, utilizing residual frames as the input data in 3D ConvNets. By replacing traditional stacked RGB frames with residual ones, improvements of 20.5 and 12.5 percentage points in top-1 accuracy can be achieved on the UCF101 and HMDB51 datasets when trained from scratch....
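A small sketch may help make "residual frames" concrete. It assumes that a residual frame is the pixel-wise difference between two consecutive frames, which the abstract itself does not spell out, and it uses tiny grayscale frames stored as nested lists. The function name `residual_frames` is hypothetical.

```python
def residual_frames(frames):
    """Return frames[t + 1] - frames[t] for each consecutive pair of frames.

    Assumption: a residual frame is the pixel-wise difference of adjacent
    frames. Each frame is a 2D list (rows of pixel intensities).
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to form a residual")
    return [
        [
            [nxt_px - prev_px for prev_px, nxt_px in zip(prev_row, nxt_row)]
            for prev_row, nxt_row in zip(prev, nxt)
        ]
        for prev, nxt in zip(frames, frames[1:])
    ]


# Three 2x2 grayscale frames: one pixel brightens over time, the rest are static.
clip = [
    [[10, 10], [10, 10]],
    [[10, 10], [10, 30]],
    [[10, 10], [10, 60]],
]
residuals = residual_frames(clip)
```

Static pixels cancel to zero and only the changing pixel survives, which is why such differences can act as a cheap motion cue. A clip of T frames yields T - 1 residual frames.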
Recently, 3D convolutional networks have yielded good performance in action recognition. However, an optical flow stream is still needed for motion representation to ensure better performance, whose cost is very high. In this paper, we propose a cheap but effective way to extract motion features from videos, utilizing residual frames as the input data in 3D ConvNets. By replacing traditional stacked RGB frames with residual ones, improvements of 35.6 and 26.6 percentage points in top-1 accuracy can be achieved on the UCF101 and HMDB51 datasets when...
Existing works mainly focus on the crowd and ignore the confusion regions, which contain extremely similar appearance to the crowd in the background, while crowd counting needs to face these two sides at the same time. To address this issue, we propose a novel end-to-end trainable confusion region discriminating and erasing network called CDENet. Specifically, CDENet is composed of two modules: a confusion region mining module (CRM) and a guided erasing module (GEM). CRM consists of a basic density estimation (BDE) network, a confusion region aware bridge and a confusion region mining network. The BDE network first generates a primary density map,...
A recommendation system for tourist spots has very high potential value, including social and economic benefits. Traditional clustering algorithms were usually used to build a recommendation system. However, traditional clustering algorithms have the risk of falling into local minimums, which may decrease the final performance heavily. Few works have focused their research on recommendation systems for tourist spots, and few recommendation systems consider population attributes information for fitting the user implicit preference. To address the problem, we focused our work on designing a novel recommendation system for tourist spots. First a new dataset...
Extracting effective deep features to represent content and style information is the key to universal style transfer. Most existing algorithms use VGG19 as the feature extractor, which incurs a high computational cost and impedes real-time style transfer on high-resolution images. In this work, we propose a lightweight alternative architecture, ArtNet, which is based on GoogLeNet and later pruned by a novel channel pruning method named Zero-channel Pruning, specially designed for style transfer approaches. Besides, we propose a theoretically sound sandwich...
Humans rely profoundly on tactile feedback from fingertips to interact with the environment, whereas most hand prostheses used in clinics provide no tactile feedback. In this study we demonstrate the feasibility of using a tactile display glove that can be worn by a unilateral hand amputee on the remaining healthy hand to display tactile feedback from a hand prosthesis. The main benefit is that users could easily distinguish the feedback for each finger, even without training. The claimed advantage is supported by preliminary tests with subjects. This approach may lead to the development of effective and...
Recently, pretext-task based methods have been proposed one after another in self-supervised video feature learning. Meanwhile, contrastive learning methods also yield good performance. Usually, new methods can beat previous ones, as it is claimed that they could capture "better" temporal information. However, there exist setting differences among them, and it is hard to conclude which is better. The comparison would be much more convincing if these methods had reached as close to their performance limits as possible. In this paper, we...
Malware is becoming a worldwide epidemic. The Artificial Immune System is a self-adaptive method for malware detection. However, scalability and coverage problems reduce the detection efficiency of an Artificial Immune System. To solve these problems, this paper proposes a model called the Collaborative model, in which independent immune bodies in different computers are organized by a virtual structure (Body). The immune bodies can share detectors with each other to improve detection efficiency. A collaborative module was added to every immune body for communication...
This Work has been Retracted by ACM because one or more of the authors of this Work were proven to have known or believed that this Work contained incorrect and/or falsified results prior to publication and violated the anonymity and independence of the review process for their paper "3D-based video recognition acceleration by leveraging temporal locality" in Proceedings of the 46th International Symposium on Computer Architecture (ISCA '19). Association for Computing Machinery, New York, NY, USA, 79-90.
When the traditional collaborative filtering algorithm is applied to drug recommendation, the recommendation effect is not good due to the sparsity of data. In view of the above problems, this paper proposes a collaborative filtering algorithm based on user behavior and drug semantics (UBDS-CF). Firstly, we construct the purchasing matrix of users and drugs, and use weighted cosine similarity to calculate the basic similarity between drugs; then we calculate the category label similarity, and extract the feature vector of the function text using a word model; main together constitute semantic...
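The weighted cosine similarity step mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: each drug is represented by its column of a user-by-drug purchasing matrix, and each user carries a non-negative weight. The abstract does not say how the weights are chosen, so the weights and the function name `weighted_cosine` are hypothetical.

```python
import math


def weighted_cosine(u, v, weights):
    """Weighted cosine similarity between two equal-length vectors.

    Assumption: `weights` holds one non-negative weight per dimension
    (here, per user); the source does not define the weighting scheme.
    """
    if not (len(u) == len(v) == len(weights)):
        raise ValueError("u, v and weights must have the same length")
    dot = sum(w * a * b for w, a, b in zip(weights, u, v))
    norm_u = math.sqrt(sum(w * a * a for w, a in zip(weights, u)))
    norm_v = math.sqrt(sum(w * b * b for w, b in zip(weights, v)))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a drug with no weighted purchases is similar to nothing
    return dot / (norm_u * norm_v)


# Purchase indicators of three users for two drugs (1 = purchased).
drug_a = [1, 0, 1]
drug_b = [1, 1, 0]
uniform = weighted_cosine(drug_a, drug_b, [1, 1, 1])    # plain cosine
emphasised = weighted_cosine(drug_a, drug_b, [2, 1, 1])  # first user counts double
```

With uniform weights this reduces to the ordinary cosine similarity. Raising the weight of the one user who bought both drugs increases the similarity between them.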
Creating impressive video content such as movies and advertisements is a very important yet challenging task in business that requires both a sense of creativity and a lot of experience. Even professionals cannot necessarily invoke the impressions and emotions they have aimed at. Many advertisements are created and then disappear without giving a large impact on viewers. This paper presents a large-scale dataset of television (TV) advertisements that consists of 14,490 videos. The recognition rate and interestingness of each video from the results of questionnaires...
Recently, 3D convolutional networks (3D ConvNets) have yielded good performance in action recognition. However, an optical flow stream is still needed to ensure better performance, the cost of which is very high. In this paper, we propose a fast but effective way to extract motion features from videos, utilizing residual frames as the input data in 3D ConvNets. By replacing traditional stacked RGB frames with residual ones, improvements of 35.6 and 26.6 percentage points in top-1 accuracy can be obtained on the UCF101 and HMDB51 datasets when...