- Music and Audio Processing
- Topic Modeling
- Music Technology and Sound Studies
- Speech and Audio Processing
- Speech Recognition and Synthesis
- Diverse Musicological Studies
- Radiomics and Machine Learning in Medical Imaging
- Machine Learning in Healthcare
- Innovative Human-Technology Interaction
- Imbalanced Data Classification Techniques
- Advanced Adaptive Filtering Techniques
- Retinal Imaging and Analysis
- Neuroscience and Music Perception
- Digital Imaging for Blood Diseases
- Time Series Analysis and Forecasting
- Video Analysis and Summarization
Queen Mary University of London
2022-2024
ARES (United States)
2022
Cochlear (France)
2020-2021
Seoul National University
2010-2019
Deep learning has enabled remarkable advances in style transfer across various domains, offering new possibilities for creative content generation. However, in the realm of symbolic music, generating controllable and expressive performance-level style transfers for complete musical works remains challenging due to limited datasets, especially for genres such as jazz, and the lack of unified models that can handle multiple music generation tasks. This paper presents ImprovNet, a transformer-based architecture that generates...
Most existing audio fingerprinting systems have limitations when used for high-specificity retrieval at scale. In this work, we generate a low-dimensional representation from a short unit segment of audio, and couple the fingerprint with fast maximum inner-product search. To this end, we present a contrastive learning framework that derives the segment-level search objective. Each update in training uses a batch consisting of a set of pseudo labels, randomly selected original samples, and their augmented replicas. These...
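As a rough illustration of the retrieval step described in this abstract, the sketch below pairs unit-normalised segment embeddings with exhaustive maximum inner-product search. The 64-dimensional fingerprints, database size, and `search` helper are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x):
    # Unit-normalise so that inner product equals cosine similarity.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Hypothetical fingerprint database: one 64-d embedding per short audio segment.
db = l2_normalize(rng.standard_normal((10_000, 64)))

def search(query, top_k=5):
    # Exhaustive maximum inner-product search; a production system would
    # swap in an approximate index at this scale and beyond.
    scores = db @ query
    top = np.argsort(-scores)[:top_k]
    return top, scores[top]

# A lightly perturbed replica of segment 123 should match segment 123 first.
query = l2_normalize(db[123] + 0.05 * rng.standard_normal(64))
top, top_scores = search(query)
```

With unit-norm vectors, the inner product and cosine similarity coincide, which is what lets the segment-level objective and the search metric line up.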
In this paper, we propose a new approach to cover song identification using a convolutional neural network (CNN). Most previous studies extract feature vectors that characterize the relation between a pair of songs and use them to compute the (dis)similarity between the two songs. Based on the observation that there is a meaningful pattern that can be learned, we have reformulated the problem in a machine learning framework. To do this, we first build, as an input, a cross-similarity matrix generated from a pair of songs. We then construct a data set composed of pairs...
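The cross-similarity input described in this abstract can be sketched as follows; the chroma dimensionality, frame counts, and cosine measure are illustrative assumptions, and the paper's actual features and preprocessing may differ.

```python
import numpy as np

def cross_similarity(a, b):
    """Cosine cross-similarity between two frame-wise feature sequences.

    a: (T1, d) features of song A (e.g. 12-bin chroma per frame)
    b: (T2, d) features of song B
    Returns a (T1, T2) matrix that a CNN can consume like an image.
    """
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    return a @ b.T

rng = np.random.default_rng(1)
song_a = rng.random((100, 12))   # 100 frames of hypothetical chroma
song_b = rng.random((80, 12))
S = cross_similarity(song_a, song_b)
```

For a true cover pair, such a matrix tends to show diagonal stripes where the two performances align in time, which is the kind of visual pattern a CNN can learn to detect.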
In this paper, we propose a novel method to search for the precise locations of paired note onsets and offsets in a singing voice signal. In comparison with existing detection algorithms, our approach differs in two key respects. First, we employ correntropy, a generalized correlation function inspired by Rényi's entropy, to capture the instantaneous flux while preserving insensitivity to outliers. Next, a peak-picking algorithm is specially designed for this function. By calculating the fitness of a pre-defined inverse...
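For intuition about why correntropy is insensitive to outliers, here is a minimal sample estimate with a Gaussian kernel; the kernel width and toy signals are illustrative, and the paper's actual flux computation and peak-picking design are not reproduced.

```python
import numpy as np

def correntropy(x, y, sigma=1.0):
    # Sample estimate of V(x, y) = E[k_sigma(x - y)] with a Gaussian kernel.
    # Each term is bounded in (0, 1], so a gross outlier saturates the kernel
    # instead of dominating, unlike an ordinary correlation or MSE measure.
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.mean(np.exp(-d**2 / (2.0 * sigma**2))))

clean = np.array([0.0, 1.0, 2.0, 3.0])
outlier = clean.copy()
outlier[2] += 100.0                      # single gross outlier
v_same = correntropy(clean, clean)       # identical signals -> 1.0
v_out = correntropy(clean, outlier)      # one bad sample costs at most 1/N
```

A squared-error measure would explode by 100² here; the correntropy drops only from 1.0 to about 0.75, since the outlying term merely decays to zero.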
Most of the previous approaches to lyrics-to-audio alignment used a pre-developed automatic speech recognition (ASR) system that innately suffered from several difficulties in adapting the model to individual singers. A significant aspect missing in those works is the self-learnability of the repetitive vowel patterns of the singing voice, where the vowel part is more consistent than the consonant part. Based on this, our system first learns a discriminative subspace of vowel sequences, based on weighted symmetric non-negative matrix factorization (WS-NMF), by...
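As a generic sketch of the factorization family named above (not the paper's exact WS-NMF formulation), the following fits a nonnegative H so that H Hᵀ approximates a weighted symmetric similarity matrix, using damped multiplicative updates:

```python
import numpy as np

def ws_nmf(S, W, rank, iters=200, beta=0.5, seed=0):
    # Minimise || W * (S - H H^T) ||_F^2 over nonnegative H with a damped
    # multiplicative update; beta = 0.5 keeps the steps conservative so the
    # iterates stay nonnegative and the objective decreases in practice.
    rng = np.random.default_rng(seed)
    H = rng.random((S.shape[0], rank))
    for _ in range(iters):
        num = (W * S) @ H
        den = (W * (H @ H.T)) @ H + 1e-9
        H *= (1.0 - beta) + beta * num / den
    return H

rng = np.random.default_rng(1)
H_true = rng.random((30, 4))
S = H_true @ H_true.T            # synthetic symmetric similarity matrix
W = np.ones_like(S)              # uniform weights for the demo
H = ws_nmf(S, W, rank=4)
err = np.linalg.norm(W * (S - H @ H.T))
```

In a lyrics-alignment setting, the rows of H would act as a low-dimensional embedding in which frames sharing the same repeated vowel pattern cluster together.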
In this paper, we propose a cover song identification algorithm using a convolutional neural network (CNN). We first train the CNN model to classify any cover/non-cover relationship by feeding it a cross-similarity matrix generated from a pair of songs as an input. Our main idea is to use the output, the cover probabilities of one song against all other candidate songs, as a new representation vector for measuring the distance between songs. Based on this, we present searches that apply several ranking methods: 1. sorting without...
This paper provides an outline of the algorithms submitted for the WSDM Cup 2019 Spotify Sequential Skip Prediction Challenge (team name: mimbres). In the challenge, complete information, including acoustic features and user interaction logs, for the first half of a listening session is provided. Our goal is to predict whether the individual tracks in the second half will be skipped or not, given only the acoustic features. We proposed two different kinds of algorithms that were based on metric learning and sequence learning. The experimental results showed...
Multi-instrument music transcription aims to convert polyphonic music recordings into musical scores assigned to each instrument. This task is challenging for modeling, as it requires simultaneously identifying multiple instruments and transcribing their pitch and precise timing, and the lack of fully annotated data adds to the training difficulties. This paper introduces YourMT3+, a suite of models for enhanced multi-instrument music transcription based on the recent language token decoding approach of MT3. We strengthen its encoder by adopting...
This paper presents MixPlore, a framework for live digital performance inspired by cocktail mixology. It aims to maximise the pleasure of mixology, presenting it as a plentiful art medium through which people can fully enjoy new synesthetic content created by the integration of bartending and musical creation. For this, we built tangible user interfaces utilising a cup, tin, glass, muddler, costume display, etc. The making of cocktails then follows, including music composition. At the end of every repertoire, the performer presents the resulting...