Wei Li

ORCID: 0000-0002-4486-8341
Research Areas
  • Music and Audio Processing
  • Speech and Audio Processing
  • Music Technology and Sound Studies
  • Time Series Analysis and Forecasting
  • Speech Recognition and Synthesis
  • Diverse Musicological Studies
  • Video Analysis and Summarization
  • Digital Media Forensic Detection
  • Anomaly Detection Techniques and Applications
  • Image Processing and 3D Reconstruction
  • Advanced Image and Video Retrieval Techniques
  • Blind Source Separation Techniques
  • Image Retrieval and Classification Techniques
  • Advanced Steganography and Watermarking Techniques
  • Neuroscience and Music Perception
  • Advanced Text Analysis Techniques
  • Complex Systems and Time Series Analysis
  • Coding Theory and Cryptography
  • Multimodal Machine Learning Applications
  • Video Coding and Compression Technologies
  • Network Security and Intrusion Detection
  • Cryptography and Residue Arithmetic
  • Image and Signal Denoising Methods
  • Advanced Data Compression Techniques
  • Data Management and Algorithms

Fudan University
2016-2025

North China University of Technology
2021-2024

Nissin Kogyo (Japan)
2023

North China Electric Power University
2009-2022

Chinese Academy of Sciences
2011-2021

University of Chinese Academy of Sciences
2021

Aerospace Information Research Institute
2021

Northeast Petroleum University
2021

Suzhou Vocational Institute of Industrial Technology
2020

Hangzhou Dianzi University
2006-2020

Many algorithms have been proposed for the problem of time series classification. However, it is clear that one-nearest-neighbor with Dynamic Time Warping (DTW) distance is exceptionally difficult to beat. This approach has one weakness, however: it is computationally too demanding for many real-time applications. One way to mitigate this problem is to speed up the DTW calculations. Nonetheless, there is a limit to how much this can help. In this work, we propose an additional technique, numerosity reduction, to speed up one-nearest-neighbor DTW. While the idea of numerosity reduction...

10.1145/1143844.1143974 article EN 2006-01-01
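
The baseline named in the abstract above, one-nearest-neighbor classification under DTW distance, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the squared-difference local cost, and the optional `window` band parameter are assumptions made for the example, and the numerosity reduction step that the paper proposes is not shown because the abstract is truncated before describing it.

```python
import math


def dtw_distance(a, b, window=None):
    """DTW distance between two numeric sequences.

    `window` is a hypothetical optional band width that limits how far
    the warping path may stray from the diagonal (None = unconstrained).
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must be at least as wide as the length difference,
    # otherwise no warping path can reach the final cell.
    window = max(window, abs(n - m))

    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessors.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def classify_1nn(train, query, window=None):
    """Return the label of the training series closest to `query` under DTW.

    `train` is a list of (series, label) pairs.
    """
    best_label, best_dist = None, math.inf
    for series, label in train:
        d = dtw_distance(series, query, window)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


train = [([0, 1, 2, 1, 0], "peak"), ([2, 1, 0, 1, 2], "valley")]
predicted = classify_1nn(train, [0, 0, 1, 2, 1, 0])
```

Each query costs one DTW computation per training series, which is the expense the abstract refers to. Shrinking the training set or narrowing the band are two ways to reduce that cost.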

More and more network traffic data have brought great challenges to traditional intrusion detection systems. Detection performance is tightly related to the selected features and classifiers, but feature selection and classification algorithms cannot perform well in a massive data environment. Also, the raw data are imbalanced, which has a serious impact on the classification results. In this paper, we propose a novel model utilizing convolutional neural networks (CNNs). We use a CNN to select features from the raw data set automatically, and set the cost function weight coefficient of...

10.1109/access.2018.2868993 article EN cc-by-nc-nd IEEE Access 2018-01-01

The problem of time series classification has attracted great interest in the last decade. However, current research assumes the existence of large amounts of labeled training data. In reality, such data may be very difficult or expensive to obtain. For example, it may require the time and expertise of cardiologists, space launch technicians, or other domain specialists. As in many other domains, there are often copious amounts of unlabeled data available. For example, the PhysioBank archive contains gigabytes of ECG data. In this work we propose a semi-supervised technique...

10.1145/1150402.1150498 article EN 2006-08-20

The matching of two-dimensional shapes is an important problem with applications in domains as diverse as biometrics, industry, medicine and anthropology. The distance measure used must be invariant to many distortions, including scale, offset, noise, partial occlusion, etc. Most of these distortions are relatively easy to handle, either in the representation of the data or in the similarity measure used. However, rotation invariance seems to be uniquely difficult. Current approaches typically try to achieve rotation invariance in the representation of the data, at the expense...

10.5555/1182635.1164203 article EN Very Large Data Bases 2006-09-01

Automatic sleep staging methods usually extract hand-crafted features or network-trained features from signals recorded by polysomnography (PSG), and then estimate the sleep stages with various classifiers. In this study, we propose a classification approach based on a hierarchical neural network to process multi-channel PSG signals for improving the performance of automatic five-class sleep staging. The proposed network contains two stages: a comprehensive feature learning stage and a sequence learning stage. The first stage is used to obtain a feature matrix by fusing the hand-crafted and network-trained features. A...

10.1109/jbhi.2019.2937558 article EN IEEE Journal of Biomedical and Health Informatics 2019-08-27

The increasing interest in time series data mining in the last decade has resulted in the introduction of a variety of similarity measures, representations, and algorithms. Surprisingly, this massive research effort has had little impact on real-world applications. Real-world practitioners who work with time series on a daily basis rarely take advantage of the wealth of tools that the data mining community has made available. In this work, we attempt to address this problem by introducing a simple parameter-light tool that allows users to efficiently navigate through large...

10.1137/1.9781611972757.55 article EN 2005-01-09

Sleep stage classification is a fundamental but cumbersome task in sleep analysis. To score the sleep stage automatically, this study presents a classification method based on a two-stage neural network. The feature learning stage, as the first stage, can fuse network-trained features with traditional hand-crafted features. A recurrent neural network (RNN) in the second stage is fully utilized for learning temporal information between sleep epochs and obtaining the classification results. To solve the serious sample imbalance problem, a novel pre-training process combined with data augmentation was introduced....

10.1109/access.2019.2933814 article EN cc-by IEEE Access 2019-01-01

Over the past three decades, there has been a great deal of research on shape analysis, focusing mostly on shape indexing, clustering, and classification. In this work, we introduce the new problem of finding shape discords, the most unusual shapes in a collection. We motivate the problem by considering the utility of shape discords in diverse domains including zoology, anthropology, and medicine. While the brute force search algorithm has quadratic time complexity, we avoid this by using locality-sensitive hashing to estimate similarity between shapes, which enables us...

10.1109/icdm.2006.138 article EN Proceedings 2006-12-01
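
The quadratic brute-force search that the abstract above mentions can be sketched as below. This is an illustration under stated assumptions, not the paper's method: a discord is taken here, following the common definition, to be the item whose distance to its nearest neighbor is largest; shapes are stood in for by plain numeric feature vectors; and the Euclidean distance and all function names are hypothetical choices. The locality-sensitive hashing speed-up is not shown because the abstract is truncated before describing it.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def brute_force_discord(items, dist=euclidean):
    """Return (index, nn_distance) of the most unusual item.

    Every item is compared with every other item, so the number of
    distance computations grows quadratically with the collection size.
    """
    if len(items) < 2:
        raise ValueError("need at least two items to define a discord")
    best_index, best_nn = -1, -1.0
    for i, a in enumerate(items):
        # Distance from item i to its nearest neighbor (excluding itself).
        nn = min(dist(a, b) for j, b in enumerate(items) if j != i)
        if nn > best_nn:
            best_index, best_nn = i, nn
    return best_index, best_nn


shapes = [[0, 0], [0, 1], [1, 0], [10, 10]]  # toy stand-ins for shape descriptors
discord_index, discord_distance = brute_force_discord(shapes)
```

Any distance function can be passed as `dist`, so a rotation-invariant shape measure could replace the Euclidean placeholder without changing the search loop.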

Automatic target detection in satellite images remains a challenging problem. The main difficulties lie in the co-occurrence of variations in target type, pose, and size in a huge image. In this paper, we propose a new airplane detection approach based on visual saliency computation and symmetry detection. The advantages are twofold. First, it performs stably in obtaining location and orientation information. Second, it is independent of target pose and size, and the saliency map is computed only once. This saves a large amount of computational time but does not miss any targets....

10.1109/icip.2011.6116259 article EN 2011-09-01

Separating singing voice from music accompaniment can be of interest for many applications such as melody extraction, singer identification, lyrics alignment and recognition, and content-based music retrieval. In this paper, a novel algorithm for singing voice separation in monaural mixtures is proposed. The algorithm consists of two stages, where non-negative matrix factorization (NMF) is applied to decompose the mixture spectrograms with long and short windows respectively. A spectral discontinuity thresholding method is devised...

10.1109/tasl.2013.2266773 article EN IEEE Transactions on Audio Speech and Language Processing 2013-06-06

10.1007/s11042-017-4456-9 article EN Multimedia Tools and Applications 2017-02-09

Multi-scale object detection, especially small object detection, is still a challenging task. This paper proposes an improved detection network based on the single shot multibox detector (SSD), named SSD-MSN. SSD-MSN can learn richer features of small objects from enlarged areas, which are clipped from the raw image. The extra features contribute to improving detection performance. The network includes two subnets: an area proposal network (APN) and a detection network, namely the SSD detector. The APN is used to select proposals containing one or more areas, and the SSD detector is used to predict classification...

10.1109/access.2019.2923016 article EN cc-by-nc-nd IEEE Access 2019-01-01

While recent advancements in reasoning optimization have significantly enhanced the capabilities of large language models (LLMs), existing efforts to improve reasoning have been limited to solving mathematical problems and focusing on visual graphical inputs, neglecting broader applications in general video understanding. This paper proposes video-SALMONN-o1, the first open-source reasoning-enhanced audio-visual LLM designed for general video understanding tasks. To enhance its reasoning abilities, we develop a reasoning-intensive dataset...

10.48550/arxiv.2502.11775 preprint EN arXiv (Cornell University) 2025-02-17

10.5334/tismir.194 article EN cc-by Transactions of the International Society for Music Information Retrieval 2025-01-01

Musical audio is generally composed of three physical properties: frequency, time and magnitude. Interestingly, the human auditory periphery also provides neural codes for each of these dimensions to perceive music. Inspired by these intrinsic characteristics, a frequency-temporal attention network is proposed to mimic the human auditory periphery for singing melody extraction. In particular, the model contains attention modules and a selective fusion module corresponding to these properties. The frequency attention module is used to select the same activation frequency bands as in the cochlea, and the temporal...

10.1109/icassp39728.2021.9413444 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

Density estimation is a costly operation for computing distribution information of data sets underlying many important data mining applications, such as clustering and biased sampling. However, traditional density estimation methods are inapplicable to streaming data, which arrive continuously in large volume, because they require linear storage and a quadratic amount of calculation. This shortcoming limits the application of many existing effective algorithms to data streams, where the mining problem is urgent for applications and a challenge for research. In this...

10.1109/dasfaa.2003.1192393 article EN 2003-01-01

With the advances of machine learning technologies, data-driven feature extraction and sequence modeling approaches are being widely explored for automatic chord recognition tasks. Currently, there is a bottleneck in the amount of annotated data available for training robust acoustic models, as hand-annotating time-synchronized chord labels requires professional musical skills and considerable labor. To cope with this limitation, in this paper, we propose a convolutional neural network (CNN) based deep feature extractor, which...

10.1109/taslp.2018.2879399 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2018-11-05

Singing voice detection, or vocal detection, is a classification task that determines whether a given audio segment contains singing voices. This task plays a very important role in vocal-related music information retrieval tasks, such as singer identification. Although humans can easily distinguish between singing and non-singing parts, it is still difficult for machines to do so. Most existing methods focus on feature engineering with classifiers, which rely on the experience of the algorithm designer. In recent years, deep...

10.3390/electronics9091458 article EN Electronics 2020-09-07