- Music and Audio Processing
- Speech and Audio Processing
- Music Technology and Sound Studies
- Time Series Analysis and Forecasting
- Speech Recognition and Synthesis
- Diverse Musicological Studies
- Video Analysis and Summarization
- Digital Media Forensic Detection
- Anomaly Detection Techniques and Applications
- Image Processing and 3D Reconstruction
- Advanced Image and Video Retrieval Techniques
- Blind Source Separation Techniques
- Image Retrieval and Classification Techniques
- Advanced Steganography and Watermarking Techniques
- Neuroscience and Music Perception
- Advanced Text Analysis Techniques
- Complex Systems and Time Series Analysis
- Coding Theory and Cryptography
- Multimodal Machine Learning Applications
- Video Coding and Compression Technologies
- Network Security and Intrusion Detection
- Cryptography and Residue Arithmetic
- Image and Signal Denoising Methods
- Advanced Data Compression Techniques
- Data Management and Algorithms
Fudan University
2016-2025
North China University of Technology
2021-2024
Nissin Kogyo (Japan)
2023
North China Electric Power University
2009-2022
Chinese Academy of Sciences
2011-2021
University of Chinese Academy of Sciences
2021
Aerospace Information Research Institute
2021
Northeast Petroleum University
2021
Suzhou Vocational Institute of Industrial Technology
2020
Hangzhou Dianzi University
2006-2020
Many algorithms have been proposed for the problem of time series classification. However, it is clear that one-nearest-neighbor with Dynamic Time Warping (DTW) distance is exceptionally difficult to beat. This approach has one weakness, however; it is computationally too demanding for many realtime applications. One way to mitigate this problem is to speed up the DTW calculations. Nonetheless, there is a limit to how much this can help. In this work, we propose an additional technique, numerosity reduction, to speed up one-nearest-neighbor DTW. While the idea of numerosity reduction...
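To make the baseline named in the abstract above concrete, here is a minimal sketch of one-nearest-neighbor classification under DTW distance. It is an illustration only, not the paper's code: the function names `dtw_distance` and `classify_1nn` are hypothetical, the local cost is assumed to be squared difference, no warping window is applied, and the numerosity reduction step the paper proposes is not shown.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two numeric sequences.

    Assumptions for this sketch: squared-difference local cost,
    unconstrained warping path, O(len(a) * len(b)) time.
    """
    n, m = len(a), len(b)
    # prev[j] holds the cumulative cost of aligning a[:i-1] with b[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def classify_1nn(query, training_set):
    """Return the label of the training series closest to query under DTW.

    training_set is a list of (series, label) pairs.
    """
    best_label, best_dist = None, math.inf
    for series, label in training_set:
        d = dtw_distance(query, series)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


if __name__ == "__main__":
    training = [
        ([0, 0, 1, 2, 1, 0], "bump"),
        ([2, 2, 1, 0, 1, 2], "dip"),
    ]
    print(classify_1nn([0, 1, 2, 2, 1, 0, 0], training))
```

Every query is compared against every training series, so the cost grows with the size of the training set. That is why shrinking the training set, in addition to speeding up each DTW computation, can reduce classification time.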
Growing volumes of network traffic data pose a great challenge to traditional intrusion detection systems. Detection performance is tightly related to the selected features and classifiers, but traditional feature selection and classification algorithms can't perform well in a massive data environment. Also, the raw traffic data are imbalanced, which has a serious impact on the classification results. In this paper, we propose a novel network intrusion detection model utilizing convolutional neural networks (CNNs). We use a CNN to select traffic features from the raw data set automatically, and set the cost function weight coefficient of...
The problem of time series classification has attracted great interest in the last decade. However, current research assumes the existence of large amounts of labeled training data. In reality, such data may be very difficult or expensive to obtain. For example, it may require the time and expertise of cardiologists, space launch technicians, or other domain specialists. As in many other domains, there are often copious amounts of unlabeled data available. For example, the PhysioBank archive contains gigabytes of ECG data. In this work we propose a semi-supervised technique...
The matching of two-dimensional shapes is an important problem with applications in domains as diverse as biometrics, industry, medicine and anthropology. The distance measure used must be invariant to many distortions, including scale, offset, noise, partial occlusion, etc. Most of these distortions are relatively easy to handle, either in the representation of the data or in the similarity measure used. However, rotation invariance seems to be uniquely difficult. Current approaches typically try to achieve rotation invariance in the representation of the data, at the expense...
Automatic sleep staging methods usually extract hand-crafted features or network trained features from signals recorded by polysomnography (PSG), and then estimate the sleep stages by various classifiers. In this study, we propose a classification approach based on a hierarchical neural network to process multi-channel PSG signals for improving the performance of automatic five-class sleep staging. The proposed hierarchical network contains two stages: a comprehensive feature learning stage and a sequence learning stage. The first stage is used to obtain a feature matrix by fusing hand-crafted features and network trained features. A...
The increasing interest in time series data mining in the last decade has resulted in the introduction of a variety of similarity measures, representations, and algorithms. Surprisingly, this massive research effort has had little impact on real world applications. Real world practitioners who work with time series on a daily basis rarely take advantage of the wealth of tools that the data mining community has made available. In this work, we attempt to address this problem by introducing a simple parameter-light tool that allows users to efficiently navigate through large...
Sleep stage classification is a fundamental but cumbersome task in sleep analysis. To score the sleep stage automatically, this study presents a stage classification method based on a two-stage neural network. The feature learning stage, as the first stage, can fuse network trained features with traditional hand-crafted features. A recurrent neural network (RNN) in the second stage is fully utilized for learning temporal information between sleep epochs and obtaining classification results. To solve the serious sample imbalance problem, a novel pre-training process combined with data augmentation was introduced....
Over the past three decades, there has been a great deal of research on shape analysis, focusing mostly on shape indexing, clustering, and classification. In this work, we introduce the new problem of finding shape discords, the most unusual shapes in a collection. We motivate the problem by considering the utility of shape discords in diverse domains including zoology, anthropology, and medicine. While the brute force search algorithm has quadratic time complexity, we avoid this by using locality-sensitive hashing to estimate similarity between shapes, which enables us...
Automatic target detection in satellite images remains a challenging problem. The main difficulties lie in the cooccurrence of variations in target type, pose, and size in a huge image. In this paper, we propose a new airplane detection approach based on visual saliency computation and symmetry detection. The advantages of the approach are twofold. First, it can perform stably while obtaining location and orientation information. Second, it is independent of target pose and size, and the saliency map is computed only once. This saves a large amount of computational time but does not miss any targets....
Separating singing voice from music accompaniment can be of interest for many applications such as melody extraction, singer identification, lyrics alignment and recognition, and content-based music retrieval. In this paper, a novel algorithm for singing voice separation in monaural mixtures is proposed. The algorithm consists of two stages, where non-negative matrix factorization (NMF) is applied to decompose the mixture spectrograms with long and short windows respectively. A spectral discontinuity thresholding method is devised...
Multi-scale object detection, especially small object detection, is still a challenging task. This paper proposes an improved object detection network based on the single shot multibox detector (SSD), and the network is named SSD-MSN. SSD-MSN can learn richer features of small objects from enlarged areas, which are clipped from the raw image. The extra features contribute to improving detection performance. SSD-MSN includes two subnets: an area proposal network (APN) and a detection network, namely the SSD detector. The APN is used to select proposals containing one or more areas, and the SSD detector is used to predict classification...
While recent advancements in reasoning optimization have significantly enhanced the capabilities of large language models (LLMs), existing efforts to improve reasoning have been limited to solving mathematical problems and focusing on visual graphical inputs, neglecting broader applications in general video understanding. This paper proposes video-SALMONN-o1, the first open-source reasoning-enhanced audio-visual LLM designed for general video understanding tasks. To enhance its reasoning abilities, we develop a reasoning-intensive dataset...
Musical audio is generally composed of three physical properties: frequency, time and magnitude. Interestingly, the human auditory periphery also provides neural codes for each of these dimensions to perceive music. Inspired by these intrinsic characteristics, a frequency-temporal attention network is proposed to mimic the human auditory system for singing melody extraction. In particular, the proposed model contains frequency-temporal attention modules and a selective fusion module corresponding to these three physical properties. The frequency attention module is used to select the same activation frequency bands as the cochlea does, and the temporal...
Density estimation is a costly operation for computing the distribution information of data sets, underlying many important data mining applications such as clustering and biased sampling. However, traditional density estimation methods are inapplicable to streaming data, which arrive continuously in large volume, because they require storage linear in the data size and computation quadratic in it. This shortcoming limits the application of many existing effective algorithms to data streams, making the problem urgent for applications and a challenge for research. In this...
With the advances of machine learning technologies, data-driven feature extraction and sequence modeling approaches are being widely explored for automatic chord recognition tasks. Currently, there is a bottleneck in the amount of annotated data available for training robust acoustic models, as hand-annotating time-synchronized chord labels requires professional musical skills and considerable labor. To cope with this limitation, in this paper, we propose a convolutional neural network (CNN) based deep feature extractor, which...
Singing voice detection, or vocal detection, is a classification task that determines whether a given audio segment contains singing voices. This task plays a very important role in vocal-related music information retrieval tasks, such as singer identification. Although humans can easily distinguish between singing and nonsinging parts, it is still very difficult for machines to do so. Most existing methods focus on feature engineering with classifiers, which rely on the experience of the algorithm designer. In recent years, deep...