- Advanced Neural Network Applications
- Face and Expression Recognition
- Emotion and Mood Recognition
- Human Pose and Action Recognition
- Speech and Audio Processing
- Multimodal Machine Learning Applications
- Traditional Chinese Medicine Studies
- Video Surveillance and Tracking Methods
- Advanced Image and Video Retrieval Techniques
- Domain Adaptation and Few-Shot Learning
- Respiratory and Cough-Related Research
- Face recognition and analysis
- Voice and Speech Disorders
- Robot Manipulation and Learning
- Speech Recognition and Synthesis
- Infant Health and Development
- 3D Shape Modeling and Analysis
- Reinforcement Learning in Robotics
- Image Retrieval and Classification Techniques
- Biomedical Text Mining and Ontologies
- Gait Recognition and Analysis
- Advanced Vision and Imaging
- Advanced Data Compression Techniques
- Text and Document Classification Technologies
- Music and Audio Processing
Tongji University
2016-2025
Shanghai Institute of Computing Technology
2024
Jiangsu Province Hospital
2023
Nanjing Medical University
2023
Nanjing University
2008-2011
Shanghai Dianji University
2009
Zhejiang University
2004-2008
Shanghai University of Engineering Science
2008
Person re-identification (Re-ID) has achieved great improvement with deep learning and a large amount of labelled training data. However, it remains a challenging task to adapt a model trained on source-domain data to a target domain where only unlabelled data are available. In this work, we develop a self-training method with a progressive augmentation framework (PAST) to promote the model performance progressively on the target dataset. Specifically, our PAST consists of two stages, namely, a conservative stage and a promoting stage. The captures local...
To date, instance segmentation is dominated by two-stage methods, as pioneered by Mask R-CNN. In contrast, one-stage alternatives cannot compete with Mask R-CNN in mask AP, mainly due to the difficulty of compactly representing masks, making the design of one-stage methods very challenging. In this work, we propose a simple single-shot instance segmentation framework, termed mask encoding based instance segmentation (MEInst). Instead of predicting the two-dimensional mask directly, MEInst distills it into a compact and fixed-dimensional representation vector, which allows the task to be...
Vehicle instance retrieval (IR) often requires one to recognize the fine-grained visual differences between vehicles. Besides the holistic appearance of vehicles, which is easily affected by viewpoint variation and distortion, vehicle parts also provide crucial cues to differentiate near-identical vehicles. Motivated by these observations, we introduce a Part-Guided Attention Network (PGAN) to pinpoint the prominent part regions and effectively combine the global and local information for discriminative feature learning....
Cough is an essential symptom in respiratory diseases. In the measurement of cough severity, an accurate and objective cough monitor is expected by the respiratory disease society. This paper aims to introduce a better-performing algorithm, a pretrained deep neural network (DNN), to the cough classification problem, which is a key step in the cough monitor. The models are built in two steps, pretraining and fine-tuning, followed by a Hidden Markov Model (HMM) decoder to capture the temporal information of the audio signals. By unsupervised pretraining belief network, good...
Person search aims to localize and identify a specific person from a gallery of images. Recent methods can be categorized into two groups, i.e., two-step and end-to-end approaches. The former views person search as two independent tasks and achieves dominant results using separately trained person detection and re-identification (Re-ID) models. The latter performs person search in an end-to-end fashion. Although the end-to-end approaches yield higher inference efficiency, they largely lag behind their two-step counterparts in terms of accuracy. In this paper, we argue that the gap...
Precisely and automatically detecting the cough sound is of vital clinical importance. Nevertheless, due to privacy protection considerations, transmitting raw audio data to the cloud is not permitted, and therefore there is a great demand for an efficient, accurate, and low-cost solution at the edge device. To address this challenge, we propose a semi-custom software-hardware co-design methodology to help build the cough detection system. Specifically, we first design a scalable and compact convolutional neural network (CNN) structure...
This paper presents an emotion recognition system from clean and noisy speech. Geodesic distance was adopted to preserve the intrinsic geometry of emotional speech. Based on geodesic distance estimation, an enhanced Lipschitz embedding was developed to embed the 64-dimensional acoustic features into a six-dimensional space. In order to avoid the problems brought by noise reduction, emotion recognition from noisy speech was performed directly. Linear discriminant analysis (LDA), principal component analysis (PCA) and feature selection by sequential forward selection (SFS) with support...
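The geodesic distance mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the distance was estimated, so this sketch assumes the common Isomap-style approach: build a k-nearest-neighbour graph and take shortest-path lengths over it. The function names `knn_graph` and `geodesic_distances`, the value of `k`, and the toy points are all hypothetical.

```python
import heapq
import math

def knn_graph(points, k):
    """Connect each point to its k nearest neighbours (Euclidean), symmetrically."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph

def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from `source` over the neighbourhood graph."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy data: five points lying on an L-shaped curve.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
graph = knn_graph(points, k=2)
geo = geodesic_distances(graph, source=0)
straight = math.dist(points[0], points[4])
```

On this toy curve the path length along the graph from the first to the last point exceeds the straight-line distance, which is the property that lets a geodesic estimate follow the shape of the data rather than cut across it.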
Cough detection and assessment have crucial clinical value for respiratory diseases. Subjective assessments are widely adopted in cough measurement nowadays, but they are neither accurate nor reliable. An automatic and objective system for cough assessment is strongly expected. Automatic cough detection from the audio signal has been studied by peer works, but it still faces some difficulties, such as unsatisfactory accuracy or a lack of large-scale validation. In this paper, deep neural networks (DNN) are applied to model acoustic features for detection....
When detecting emotions from music, many features are extracted from the original music data. However, there are redundant or irrelevant features, which will reduce the performance of classification models. Considering the feature problems, we propose an embedded feature selection method, called Multi-label Embedded Feature Selection (MEFS), to improve classification performance by selecting features. MEFS embeds a classifier and considers label correlation. Three other representative multi-label feature selection methods, known as LP-Chi, max and avg, together with...
Generative Adversarial Networks (GAN) have attracted much research attention recently, leading to impressive results for natural image generation. However, to date little success was observed in using GAN-generated images for improving classification tasks. Here we attempt to explore, in the context of car license plate recognition, whether it is possible to generate synthetic training data to improve recognition accuracy. With a carefully designed pipeline, we show that the answer is affirmative. First, a large-scale set...
Cough recognition is a valuable classification problem in healthcare. Generally, feature representation contributes a lot to the overall classification performance. In this paper, a novel feature extraction method, Gammatone Cepstral Coefficients (GTCC), is investigated for cough recognition. The accuracy of GTCC compared with MFCC is evaluated on a designed dataset following 10-fold cross-validation schemes. Considering the imbalance of that dataset, a weighted SVM is applied as the base classifier. The results indicate surpass...
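As an illustration of the class weighting that a weighted SVM relies on, the following minimal sketch computes per-class weights inversely proportional to class frequency. The abstract does not state which weighting scheme was used, so the formula, the function name `balanced_class_weights`, and the toy label counts are assumptions.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}

# Hypothetical toy labels: 2 cough segments versus 8 non-cough segments.
labels = ["cough"] * 2 + ["other"] * 8
weights = balanced_class_weights(labels)
```

With weights of this kind, misclassifying a sample from the rare class costs more in the SVM objective than misclassifying one from the frequent class, which counteracts the imbalance.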
Hypertension is one of the major causes of heart and cerebrovascular diseases. With a good accumulation of hypertension clinical data on hand, research on hypertension's ZHENG differentiation is an important and attractive topic, as Traditional Chinese Medicine (TCM) lies primarily in “treatment based on ZHENG differentiation.” From the view of data mining, ZHENG differentiation is modeled as a classification problem. In this paper, ML-kNN, a multilabel learning model, is used as the classification model for hypertension. Feature-level information fusion also further utilization...
Cough is a common symptom in respiratory diseases. To provide valuable clinical information for cough diagnosis and monitoring, objectively evaluating the quantity and intensity of cough based on cough detection by pattern recognition technologies is needed. Cough detection aims to extract the boundaries of cough events from an audio stream. From spectral visualisation, it is found that the energy spectrum of the cough signal spreads widely over the whole frequency band, which is very different from the speech signal. However, almost all feature extraction methods in previous work...
3D part assembly is a promising task in computer vision and robotics, focusing on assembling 3D parts together by predicting their 6-DoF poses. Like most 3D shape understanding tasks, existing methods primarily address this task by memorizing the poses of parts during the training process, leading to inaccuracies in complex assemblies and poor generalization to novel categories. In order to essentially improve the performance, the structure knowledge of the target is indispensable before assembling, which abstracts potential composition...
There is a long history of coronary heart disease (CHD) diagnosis and treatment in Chinese medicine (CM), but a formalized description of CM knowledge is still unavailable. This study aims to analyze a set of CM clinical data, which is important and urgent. Relative associated density (RAD) was used to analyze the one-way links between symptoms or syndromes or both. RAD results were further used for symptom selection. Analysis of a CM dataset of CHD revealed some significant relationships, not only between symptoms but also between syndromes. Using select based on different...
Deploying Convolutional Neural Network (CNN)-based applications to mobile platforms can be challenging due to the conflict between the restricted computing capacity of mobile devices and the heavy computational overhead of running a CNN. Network quantization is a promising way of alleviating this problem. However, network quantization can result in accuracy degradation, especially in the case of compact CNN architectures that are designed for mobile applications. This paper presents a novel and efficient mixed-precision quantization pipeline, called MBFQuant. It redefines...
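To make the accuracy trade-off of network quantization concrete, here is a minimal sketch of plain uniform (affine) quantization applied to the same values at two bit widths. It is not the MBFQuant method, whose details are not given in the truncated abstract; the function name `quantize_uniform` and the toy weights are hypothetical.

```python
def quantize_uniform(values, num_bits):
    """Affine uniform quantization of a list of floats.

    Returns (codes, dequantized), where codes are integers in
    [0, 2**num_bits - 1] and dequantized are the reconstructed floats.
    """
    lo, hi = min(values), max(values)
    levels = (1 << num_bits) - 1
    # Guard against a constant tensor, where the range would be zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    dequantized = [lo + q * scale for q in codes]
    return codes, dequantized

# Hypothetical toy weights quantized at 8 bits and at 2 bits.
weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
codes8, approx8 = quantize_uniform(weights, 8)
codes2, approx2 = quantize_uniform(weights, 2)
err8 = max(abs(a - w) for a, w in zip(approx8, weights))
err2 = max(abs(a - w) for a, w in zip(approx2, weights))
```

The reconstruction error is bounded by half a quantization step, so the 2-bit version is much coarser than the 8-bit one. A mixed-precision scheme, in general terms, assigns different bit widths to different parts of the network to balance this error against cost.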
In previous systems of speech emotion recognition, supervised learning is frequently employed to train classifiers on lots of labeled examples. However, the labeling of abundant data requires much time and human effort. This paper presents an enhanced co-training algorithm to utilize a large amount of unlabeled speech utterances for building a semi-supervised learning system. It uses two conditionally independent attribute views (i.e. temporal features and statistic features) of unlabeled examples to augment a smaller set of labeled examples. Our...
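The two-view idea in the abstract above can be sketched with a basic co-training loop. This is the textbook scheme, not the enhanced algorithm of the paper, whose details are truncated here. The nearest-centroid classifier, the margin-based confidence, the function names, and the toy data (one scalar per view) are all assumptions made for illustration.

```python
def centroid_fit(xs, ys):
    """Nearest-centroid classifier on scalar features: returns {label: mean}."""
    sums, counts = {}, {}
    for x, label in zip(xs, ys):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def centroid_predict(model, x):
    """Return (label, confidence); confidence is the gap between the two nearest centroids."""
    ranked = sorted((abs(x - c), label) for label, c in model.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else 0.0
    return ranked[0][1], margin

def co_train(labeled, unlabeled, rounds=2, per_round=1):
    """labeled: list of ((view_a, view_b), label); unlabeled: list of (view_a, view_b)."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            # Train on one view only, then label the examples it is most sure about.
            model = centroid_fit(
                [x[view] for x, _ in labeled], [y for _, y in labeled]
            )
            ranked = sorted(
                pool,
                key=lambda x: centroid_predict(model, x[view])[1],
                reverse=True,
            )
            for x in ranked[:per_round]:
                labeled.append((x, centroid_predict(model, x[view])[0]))
                pool.remove(x)
    return labeled

# Hypothetical toy data: two labeled utterances and four unlabeled ones.
seed = [((0.0, 0.0), "calm"), ((1.0, 1.0), "angry")]
pool = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.8, 0.9)]
result = dict(co_train(seed, pool))
```

Each view's classifier adds its most confident predictions to the shared labeled set, so the classifier trained on the other view sees new training examples in the next step.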