- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Video Surveillance and Tracking Methods
- Image Retrieval and Classification Techniques
- Human Pose and Action Recognition
- Advanced Vision and Imaging
- Adversarial Robustness in Machine Learning
- Anomaly Detection Techniques and Applications
- Robotics and Sensor-Based Localization
- Domain Adaptation and Few-Shot Learning
- Video Analysis and Summarization
- Text and Document Classification Technologies
- Music and Audio Processing
- Face and Expression Recognition
- Multimodal Machine Learning Applications
- Stochastic Gradient Optimization Techniques
- Statistical Methods and Inference
- Distributed Sensor Networks and Detection Algorithms
- Caching and Content Delivery
- Fire Detection and Safety Systems
- Medical Image Segmentation Techniques
- Remote-Sensing Image Classification
- Image and Signal Denoising Methods
- Machine Learning and Data Classification
- Mobile Agent-Based Network Management
Nanjing Audit University
2023
Pingdingshan University
2011-2022
Didi Chuxing (China)
2019-2021
Nanjing University of Posts and Telecommunications
2021
Rice University
2019-2020
University of Helsinki
2020
University of Science and Technology Liaoning
2020
Wuhan Ship Development & Design Institute
2014
Harbin Institute of Technology
2008-2013
Shanghai University
2011-2012
Emotion recognition in user-generated videos plays an important role in human-centered computing. Existing methods mainly employ a traditional two-stage shallow pipeline, i.e., extracting visual and/or audio features and training classifiers. In this paper, we propose to recognize video emotions in an end-to-end manner based on convolutional neural networks (CNNs). Specifically, we develop a deep Visual-Audio Attention Network (VAANet), a novel architecture that integrates spatial, channel-wise, temporal...
State-of-the-art convolutional neural networks (CNNs) yield record-breaking predictive performance, yet at the cost of high-energy-consumption inference, which prohibits their wide deployment in resource-constrained Internet of Things (IoT) applications. We propose a dual dynamic inference (DDI) framework that highlights the following aspects: 1) we integrate both input-dependent and resource-dependent dynamic inference mechanisms under a unified framework in order to fit the varying IoT resource requirements in practice. DDI is able...
Convolutional neural networks (CNNs) have been increasingly deployed to edge devices. Hence, many efforts have been made towards efficient CNN inference in resource-constrained platforms. This paper attempts to explore an orthogonal direction: how to conduct more energy-efficient training of CNNs, so as to enable on-device training. We strive to reduce the energy cost during training by dropping unnecessary computations from three complementary levels: stochastic mini-batch dropping on the data level; selective layer update...
Recent breakthroughs in Deep Neural Networks (DNNs) have fueled a growing demand for DNN chips. However, designing DNN chips is non-trivial because: (1) mainstream DNNs have millions of parameters and operations; (2) the design space is large due to the numerous design choices of dataflows, processing elements, memory hierarchy, etc.; and (3) an algorithm/hardware co-design is needed to allow the same DNN functionality to have a different decomposition, which would require different hardware IPs to meet the application specifications. Therefore, DNN chips take a long time...
While increasingly deep networks are still in general desired for achieving state-of-the-art performance, for many specific inputs a simpler network might already suffice. Existing works exploited this observation by learning to skip convolutional layers in an input-dependent manner. However, we argue that their binary decision scheme, i.e., either fully executing or completely bypassing one layer for a specific input, can be enhanced by introducing finer-grained, “softer” decisions. We therefore propose a Dynamic...
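The binary, input-dependent layer skipping that this abstract attributes to existing works can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method or any published implementation: the function name `run_with_skipping`, the toy layers, and the gate rule are all hypothetical, and the "network" is a plain list of residual functions over Python lists.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]
Gate = Callable[[Vector], bool]


def run_with_skipping(
    x: Vector, layers: Sequence[Layer], gates: Sequence[Gate]
) -> Tuple[Vector, int]:
    """Run residual layers, letting a per-layer gate decide execute vs. skip.

    Hypothetical sketch: each gate sees the current activation and returns
    True (fully execute the layer) or False (bypass it via the identity
    shortcut). This is the binary decision scheme, with no "soft" option.
    """
    executed = 0
    for layer, gate in zip(layers, gates):
        if gate(x):
            # Residual update: output = input + layer(input).
            x = [xi + ri for xi, ri in zip(x, layer(x))]
            executed += 1
        # Otherwise the input passes through unchanged.
    return x, executed


# Toy usage: three identical layers that each add 1.0 to every element,
# gated by an assumed rule "execute only while the activation sum is below 2".
layers = [lambda v: [1.0 for _ in v]] * 3
gates = [lambda v: sum(v) < 2.0] * 3
output, executed = run_with_skipping([0.0], layers, gates)
```

In this toy run the third layer is bypassed because the gate condition no longer holds, so the computation spent depends on the input rather than on the network depth alone.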
Domain adaptation (DA) is a technique that transfers predictive models trained on a labeled source domain to an unlabeled target domain, with the core difficulty of resolving the distributional shift between domains. Currently, the most popular DA algorithms are based on distributional matching (DM). However, in practice, realistic domain shifts (RDS) may violate their basic assumptions, and as a result these methods will fail. In this paper, in order to devise robust DA algorithms, we first systematically analyze the limitations of DM methods,...
The U-Net has become the most popular structure in medical image segmentation in recent years. Although its performance for medical image segmentation is outstanding, a large number of experiments demonstrate that the classical U-Net network architecture seems to be insufficient when the size of the segmentation targets changes and an imbalance arises between target and background in different forms of segmentation. To improve the U-Net network architecture, we develop a new architecture named densely connected U-Net (DenseUNet) in this article. The proposed DenseUNet adopts dense blocks for feature extraction...
Most previous works on video indexing and recommendation were based only on the content of the video itself, without considering the affective analysis of viewers, which is an efficient and important way to reflect viewers' attitudes, feelings and evaluations of videos. In this paper, we propose a novel method to index and recommend videos based on affective analysis, mainly on facial expression recognition of viewers. We first build a facial expression recognition classifier by embedding the process of building compositional Haar-like features into hidden conditional random fields (HCRFs)....
Sentiment analysis of user-generated reviews or comments on products and services in social networks can help enterprises to analyze the feedback from customers and take corresponding actions for improvement. To mitigate large-scale annotations on the target domain, domain adaptation (DA) provides an alternate solution by learning a transferable model from other labeled source domains. Existing multi-source domain adaptation (MDA) methods either fail to extract some discriminative features that are related to sentiment, neglect...
Coal is one of the main energy sources in China. The country attaches great importance to the development of the coal mining industry, and coal production is on the rise. At the same time, coal mine safety accidents are becoming more frequent, and more attention is being paid to such accidents. The underground environment is complex, noisy and uneven, and there will be problems such as occlusion and a high false detection rate during video monitoring. In order to ensure the safety of personnel, moving target tracking based on video monitoring information is of great significance for production. The purpose...
In this paper, we address the problem of vehicle turn-counts by class at multiple intersections, which is greatly challenged by inaccurate detection and tracking results caused by heavy weather, occlusion, illumination variations, background clutter, etc. Therefore, the complexity of the problem calls for an integrated solution that robustly extracts as much visual information as possible and efficiently combines it through sequential feedback cycles. We propose such an algorithm, which effectively combines detection, modeling, tracking,...
Fusion of visual content with textual information is an effective way for both content-based and keyword-based image retrieval. However, the performance of visual & textual fusion is affected greatly by data noise and redundancy in both the textual (such as surrounding text in HTML pages) and visual (such as intra-class diversity) aspects. This paper presents a manifold-based cross-media optimization scheme to achieve visual & textual fusion within a unified framework. A Cross-Media manifold co-training mechanism between the Keyword-based Metric Space and the Vision-Based Metric Space is proposed creatively...
The main purpose of image enhancement technology is to improve the quality of the image to better assist those activities of daily life that are widely dependent on it, like healthcare, industries, education, and surveillance. Due to the influence of complex environments, there are risks of insufficient detail and low contrast in some images. Existing enhancement algorithms are prone to overexposure and improper detail processing. This paper attempts to improve the treatment effect of Phase Stretch Transform (PST) on the information of medium frequencies. For this purpose, an algorithm...
The activated sludge (AS) process is a biological treatment of wastewater used in sewage treatment plants, in which the settling of AS is vitally important for treatment. In AS, however, bulking caused by filamentous bacteria will significantly reduce the settling capacity of the sludge. Traditionally, the physicochemical method has been used to monitor the status or performance of the sludge, while it is a very compromising means of modern digital quality control when image processing and analysis technology is used to determine it in order to avoid the disadvantages of deficient...
Machine vision is an important branch of the rapid development of modern artificial intelligence, and it is a key technology to convert the image information of monitoring targets into digital signals. However, due to the wide range of machine vision applications, this research focuses on its application in video surveillance. In this era, the detection and tracking of moving objects have always been an issue. The simulation of human vision is realized by combining the relevant functions of a computer and an image acquisition device, which enables the ability to recognize...
Texture classification and analysis play an important role in the domain of content-based image retrieval, image segmentation, scene recognition and image/video analysis. This paper proposes a novel texture descriptor robust to variance in rotation, scale and illumination, which combines dominant orientation and multifractal analysis based on the Gabor filter. The dominant orientations are extracted at the corresponding Gaussian scales to handle rotation variance, and then an illumination-invariant multifractal spectrum (MFS) is produced based on multi-scale Gabor filters...
In this paper we propose a novel approach for wide-baseline image mosaicing which integrates MSER and Hessian-Affine detectors. MSER and Hessian-Affine are both robust detectors for wide-baseline stereo matching, and they can be integrated owing to their availability in structured scenes and rich-textured scenes respectively. However, their output shapes are different, so they cannot be directly integrated. We use an affine covariant construction method to unify the output shape. At the same time, we introduce the standard elliptic equation for the ellipse parameters. The axial length...
Bloom Filter is a space-efficient probabilistic data structure for checking the membership of elements in a set. Given multiple sets, the standard Bloom Filter is not sufficient when looking for the items to which an element or a set of input elements belongs. An example case is searching for documents with keywords in a large text corpus, which is essentially a multiple set matching problem where the input is single or multiple keywords, and the result is a set of possible candidate documents. This article solves the multiple set matching problem by proposing two efficient Bloom Multifilters called Bloom Matrix and Bloom Vector, which generalize the standard Bloom Filter. Both structures...
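For readers unfamiliar with the baseline structure, the standard Bloom Filter and the keyword-to-document matching problem described above can be sketched roughly as below. This is a minimal illustration, not the Bloom Matrix or Bloom Vector structures of the article: the class and function names, the SHA-256 salting scheme, and the default sizes are all assumptions chosen for readability. The naive matcher simply probes one filter per document, which is the kind of baseline a multi-set structure would aim to improve on.

```python
import hashlib
from typing import Dict, Iterator, List


class BloomFilter:
    """Minimal Bloom filter: a bit array probed by k salted hash functions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity only

    def _positions(self, item: str) -> Iterator[int]:
        # Derive k positions by salting SHA-256 with the hash index
        # (an assumed scheme; any family of independent hashes would do).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


def candidate_documents(filters: Dict[str, BloomFilter], keywords: List[str]) -> List[str]:
    """Naive multiple set matching: probe every document's filter in turn."""
    return sorted(
        doc for doc, bf in filters.items() if all(k in bf for k in keywords)
    )


# Toy usage with two hypothetical documents.
filters = {"doc1": BloomFilter(), "doc2": BloomFilter()}
for word in ("bloom", "filter"):
    filters["doc1"].add(word)
filters["doc2"].add("bloom")
matches = candidate_documents(filters, ["bloom", "filter"])
```

Because a Bloom filter has no false negatives, every document that truly contains all keywords appears in the result; false positives are possible, so the output is a set of candidates rather than an exact answer.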
Does there exist a compact set of keywords that can completely and effectively cover the image annotation problem by expanding from it? In this paper, we answer this question by presenting a complete framework for image annotation, which is motivated by the existence of semantic ontology. To generate the set, we propose a cross-model optimization strategy from both textual and visual information by topic decomposition, based on the so-called Bipartite LSA model, to minimize multimodal error energy functions in probabilistic Latent Semantic...