Liping Chen

ORCID: 0000-0003-1902-0215
Research Areas
  • Speech Recognition and Synthesis
  • Speech and Audio Processing
  • Music and Audio Processing
  • Advanced Computational Techniques and Applications
  • Natural Language Processing Techniques
  • Advanced Algorithms and Applications
  • Web Data Mining and Analysis
  • Advanced Computing and Algorithms
  • Catalytic Cross-Coupling Reactions
  • Winter Sports Injuries and Performance
  • Real-time simulation and control systems
  • Complex Network Analysis Techniques
  • Multi-Agent Systems and Negotiation
  • Digital Media and Visual Art
  • Reproductive biology and impacts on aquatic species
  • Web visibility and informetrics
  • Embedded Systems and FPGA Design
  • Pharmaceutical and Antibiotic Environmental Impacts
  • Topic Modeling
  • Software Engineering and Design Patterns
  • Neural Networks and Applications
  • Educational Technology and Pedagogy
  • Robotic Path Planning Algorithms
  • Seismology and Earthquake Studies
  • Advanced Clustering Algorithms Research

Fujian Normal University
2024

University of Science and Technology of China
2014-2024

Ganzhou People's Hospital
2024

Chaohu University
2009-2022

Microsoft Research Asia (China)
2017-2018

Search
2018

Microsoft (United States)
2018

ZheJiang Academy of Agricultural Sciences
2018

Ocean University of China
2013

Chongqing University
2012

Probabilistic linear discriminant analysis (PLDA) has been shown to be effective for modeling speaker and channel variability in the i-vector space for text-independent speaker verification. This paper shows that the PLDA scoring function could be formulated as a model comparison between an adapted PLDA and the universal PLDA. Based on this formulation, we show that a more robust adaptation could be attained through the use of a minimum divergence estimate of the prior of the latent subspace. Experimental results on the NIST SRE'10 and SRE'12 datasets confirm...

10.1109/icassp.2014.6854354 article EN 2014-05-01

The voice assistant represents one of the most popular and important scenarios for speech recognition. In this paper, we propose two adaptation approaches to customize a multi-style well-trained acoustic model towards its subsidiary domain of the Cortana assistant. First, we present anchor-based speaker adaptation by extracting the speaker information, i-vector or d-vector embeddings, from the anchor segments of 'Hey Cortana'. The anchor embeddings are mapped to layer-wise parameters to control the transformations of both weight matrices and biases of multiple...

10.1109/icassp.2018.8461553 article EN 2018-04-01

Agile development is a kind of iterative software development method. Its basic concept is people-centered. Its estimation and planning methodologies are different from the traditional ones. Most papers at home and abroad about agile development mainly concentrate on the contrast and fusion between agile and traditional methods. However, research on the estimation and planning procedure is scarcer. This paper introduces the methods of agile planning based on the author's practical experience from two aspects: the release plan, and the estimation and planning of the iteration, then lists three popular agile methodologies: XP, Scrum...

10.1109/iccrd.2011.5764064 article EN 2011-03-01

Voice anonymization refers to the goal of suppressing personally identifiable voice attributes in speech. State-of-the-art models based on the voice conversion framework accomplish this by replacing the speaker attributes with those of a pseudo-speaker. This paper proposes to exploit the uncertainty of the pseudo-speaker estimate in voice anonymization. For each target speaker, the pseudo-speaker distribution, characterized by a point estimate and its uncertainty, is estimated from a selected set of cohort speakers. Based on this distribution, a vector is sampled and used to replace the speaker attributes in an anonymized speech. The efficacy...

10.1109/icassp48485.2024.10446573 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18
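
The sampling idea in the voice anonymization abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the pseudo-speaker distribution is a diagonal Gaussian whose point estimate is the cohort mean and whose uncertainty is the per-dimension standard deviation, and the function name `sample_pseudo_speaker` is hypothetical.

```python
import numpy as np


def sample_pseudo_speaker(cohort, rng):
    """Draw one pseudo-speaker vector from a cohort of speaker embeddings.

    cohort: array of shape (n_speakers, dim), with n_speakers >= 2.
    Assumption: a diagonal Gaussian fitted to the cohort stands in for
    the pseudo-speaker distribution described in the abstract.
    """
    cohort = np.asarray(cohort, dtype=float)
    point_estimate = cohort.mean(axis=0)        # centre of the distribution
    uncertainty = cohort.std(axis=0, ddof=1)    # per-dimension spread
    noise = rng.standard_normal(point_estimate.shape)
    return point_estimate + uncertainty * noise
```

With a fixed seed (`np.random.default_rng(0)`) the draw is reproducible; a cohort with no spread collapses the sample onto the point estimate.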

Total variability modeling has been shown to be effective for the text-independent speaker verification task. It provisions a tractable way to estimate the so-called i-vector, which describes the speaker and session variability rendered in an utterance. Due to the low dimensionality of the i-vector, channel compensation techniques such as linear discriminant analysis (LDA) and probabilistic LDA can be applied for the purpose of channel compensation. This paper proposes a local variability modeling technique, the central idea of which is to capture the variability associated with an individual dimension of the acoustic space. We analyze...

10.1109/iscslp.2014.6936577 article EN 2014-09-01

In this paper, given the speaker bottleneck feature vectors extracted with discriminant neural networks, we focus on using the sequential speaker characteristics for text-dependent speaker verification. In each evaluation trial, speaker supervectors are used as representations of the speaker characteristics rendered in the compared speech utterances. To this end, dynamic time warping is used to warp the variable-length feature vector sequences of the utterances to the same length. Thereafter, for every utterance, a supervector can be obtained with the concatenation of its feature vectors. We use Euclidean...

10.1109/icassp.2018.8462467 article EN 2018-04-01
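
The pipeline in the abstract above (warp to a common length, concatenate into a supervector, compare with a Euclidean measure) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the reference template `ref`, the averaging of frames aligned to the same reference frame, and all function names are hypothetical choices.

```python
import numpy as np


def dtw_path(ref, seq):
    """Return the DTW alignment path between two (frames x dims) arrays."""
    n, m = len(ref), len(seq)
    # Pairwise Euclidean distance between every reference and input frame.
    dist = np.linalg.norm(ref[:, None, :] - seq[None, :, :], axis=2)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]
            )
    # Backtrack from the end to recover the aligned index pairs.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]


def supervector(ref, seq):
    """Warp seq to len(ref) frames and concatenate them into one vector."""
    warped = np.zeros(ref.shape, dtype=float)
    counts = np.zeros(len(ref))
    for i, j in dtw_path(ref, seq):
        warped[i] += seq[j]      # accumulate frames aligned to ref frame i
        counts[i] += 1
    warped /= counts[:, None]    # average: one frame per reference frame
    return warped.reshape(-1)


def euclidean_distance(a, b):
    """Euclidean distance between two supervectors (smaller = closer)."""
    return float(np.linalg.norm(a - b))
```

Because every utterance is warped against the same template, all supervectors share one dimensionality (`len(ref) * dims`) and can be compared directly.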

We analyze the i-vector extraction from the perspective of the prior distribution exerted on the mean supervector of the Gaussian mixture model (GMM). To this end, we start off with the analysis of the subspace prior, which leads to the compressed representation in the standard i-vector extraction. We then propose the use of a quasi-factorial prior and show how it impacts the total variability space and its application for i-vector extraction. The quasi-factorial prior could be used in a standalone manner, or in combination with the subspace prior. In the latter context, we found that the performance can be greatly improved when followed by [...] This...

10.1109/lsp.2015.2459059 article EN IEEE Signal Processing Letters 2015-07-22

The classical classifiers are ineffective in dealing with the problem of imbalanced big dataset classification. Resampling the datasets and balancing the sample distribution before training the classifier is one of the most popular approaches to resolve this problem. An effective and simple hybrid sampling method based on data partition (HSDP) is proposed in this paper. First, all the data samples are partitioned into different data regions. Then, the data samples in the noise minority samples region are removed and the samples in the boundary minority samples region are selected as oversampling seeds to generate the synthetic samples....

10.1155/2021/6877284 article EN cc-by Complexity 2021-01-01

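Aiming at the current situation of network embedding research focusing on dynamic homogeneous network embedding and static heterogeneous information network embedding but lacking dynamic information utilization, this paper proposes a dynamic heterogeneous information network embedding method based on the meta-path and the improved Rotate model; this method first uses meta-paths to model the semantic relationships involved in the heterogeneous information network, then uses GCNs to get the local node embedding, and finally uses meta-path-level aggregation mechanisms to aggregate the local representations of nodes, which can solve the heterogeneous information utilization issues. In addition, a temporal processing component based on a time...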

10.3390/app122110898 article EN cc-by Applied Sciences 2022-10-27

Conversational speech synthesis aims to synthesize the speech of an individual speaker based on the history of the conversation. However, most studies in conversational speech synthesis only focus on the synthesis performance of the current speaker's turn and neglect the temporal relationship between the turns of interlocutors. Therefore, we consider the temporal connection between turns for conversational speech synthesis, which is crucial for the naturalness and coherence of conversations. Specifically, this paper formulates a task in which there is no overlap between turns and only one history turn is considered. To complete this task, an acoustic model is proposed that leverages...

10.1109/icassp48485.2024.10448356 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

Any-to-any singing voice conversion (SVC) is an interesting audio editing technique, aiming to convert the singing voice of one singer into that of another, given only a few seconds of singing data. However, during the conversion process, the issue of timbre leakage is inevitable: the converted singing voice still sounds like the original singer's voice. To tackle this, we propose a latent diffusion model for SVC (LDM-SVC) in this work, which attempts to perform SVC in the latent space using an LDM. We pretrain a variational autoencoder structure using the noted open-source So-VITS-SVC project based on...

10.48550/arxiv.2406.05325 preprint EN arXiv (Cornell University) 2024-06-07

Any-to-any singing voice conversion (SVC) is an interesting audio editing technique, aiming to convert the singing voice of one singer into that of another, given only a few seconds of singing data. However, during the conversion process, the issue of timbre leakage is inevitable: the converted singing voice still sounds like the original singer's voice. To tackle this, we propose a latent diffusion model for SVC (LDM-SVC) in this work, which attempts to perform SVC in the latent space using an LDM. We pretrain a variational autoencoder structure using the noted open-source So-VITS-SVC project based on...

10.21437/interspeech.2024-250 article EN Interspeech 2024 2024-09-01

The utilization of Pd(II)-catalyzed oxidation for the transformation of terminal olefins into methyl ketones has emerged as a particularly intriguing and versatile strategy in organic synthesis.

10.1039/d4ra07296k article EN cc-by RSC Advances 2024-01-01

Probabilistic linear discriminant analysis (PLDA) has been shown to be effective for modeling channel variability in the i-vector space for text-independent speaker verification. Speaker verification is a binary hypothesis testing problem. Given a test segment, the verification score could be computed as the log-likelihood ratio between the speaker-adapted PLDA and the universal PLDA model. This work proposes to infer the channel factor specific to each test segment and include the channel factor estimate in the PLDA models, which essentially shifts the scoring function to better match that of the test channel. We also...

10.1109/icassp.2015.7178973 article EN 2015-04-01
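
The log-likelihood ratio mentioned in the abstract above can be written out in generic form. The symbols are assumptions introduced here for illustration and are not taken from the paper: $x$ is the i-vector of the test segment, $\mathcal{M}_{\text{spk}}$ the speaker-adapted PLDA model, $\mathcal{M}_{\text{univ}}$ the universal PLDA model, and $\theta$ a decision threshold.

```latex
s(x) = \log \frac{p(x \mid \mathcal{M}_{\text{spk}})}{p(x \mid \mathcal{M}_{\text{univ}})}
     = \log p(x \mid \mathcal{M}_{\text{spk}}) - \log p(x \mid \mathcal{M}_{\text{univ}}),
\qquad \text{accept the claimed speaker if } s(x) \ge \theta .
```

Under this reading, the binary hypothesis test reduces to comparing one scalar score against the threshold.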

Probabilistic linear discriminant analysis (PLDA) is widely described as an effective model for text-independent speaker verification in the i-vector space. The PLDA scoring function is typically formulated as the likelihood ratio between the speaker-adapted and universal PLDAs. In this case, the adaptation of the PLDA was performed through the speaker factors. In this paper, we show that the channel factors could be equivalently exploited to deal with multi-source conditions. In the proposed method, a PLDA model trained on...

10.1109/icassp.2017.7953184 article EN 2017-03-01