Xuan Dong

ORCID: 0000-0003-0630-0701
Research Areas
  • Speech and Audio Processing
  • Music and Audio Processing
  • Speech Recognition and Synthesis
  • Advanced Adaptive Filtering Techniques
  • Hearing Loss and Rehabilitation
  • Cytomegalovirus and herpesvirus research
  • Underwater Acoustics Research
  • Power Transformer Diagnostics and Insulation
  • Image Retrieval and Classification Techniques
  • Analog and Mixed-Signal Circuit Design
  • EEG and Brain-Computer Interfaces
  • Gas Sensing Nanomaterials and Sensors
  • Mycobacterium research and diagnosis
  • High voltage insulation and dielectric phenomena

Indiana University
2019-2021

Indiana University Bloomington
2018-2020

Huazhong University of Science and Technology
2005-2019

Changzhou University
2011

Tongji Hospital
2005

Computational objective metrics that use reference signals have been shown to be effective forms of speech assessment in simulated environments, since they are correlated with subjective listening studies. Recent efforts dedicated towards reference-less make real-world more practical, but these approaches predict a limited number measures and not evaluated conditions. In this work, we present novel based framework called the attention enhanced multi-task (AMSA) model, which provides reliable...

10.1109/icassp40776.2020.9053366 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

The real-world capabilities of objective speech quality measures are limited since current (1) developed from simulated data that does not adequately model real environments; or they (2) predict scores always strongly correlated with subjective ratings. Additionally, a large dataset signals listener ratings currently exist, which would help facilitate assessment. In this paper, we collect and the perceptual evaluated by human listeners. We first rating conducting crowdsourced listening studies...

10.21437/interspeech.2020-2809 article EN Interspeech 2022 2020-10-25

Speech assessment is crucial for many applications, but current intrusive methods cannot be used in real environments. Data-driven approaches have been proposed, they use simulated speech materials or only estimate objective scores. In this paper, we propose a novel multi-task non-intrusive approach that capable of simultaneously estimating both subjective and scores real-world speech, to help facilitate learning. This enhances our prior work, which estimated mean-opinion scores, where now...

10.1109/icassp39728.2021.9414182 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

Objective metrics, such as the perceptual evaluation of speech quality (PESQ), have become standard measures for evaluating speech. These metrics enable efficient and costless evaluations, where ratings are often computed by comparing a degraded signal to its underlying clean reference signal. Reference-based however, cannot be used evaluate real-world signals that inaccessible references. This project develops nonintrusive framework noisy enhanced We propose an utterance-level...

10.1109/waspaa.2019.8937192 article EN 2019-10-01

Can we detect electric discharge states in gases based on the information visual images? This article proposes a new kind of method where build several detection models for different corona by applying four kinds machine learning algorithms to extract color, brightness, and shape characteristics visible images taken digital camera. Every model is then tested set measure its performance. The are support vector (SVM), K-nearest neighbor regression (KNN), single layer perceptron (SLP), decision...

10.1109/tps.2019.2947289 article EN IEEE Transactions on Plasma Science 2019-11-05

Objective metrics, such as the perceptual evaluation of speech quality (PESQ), short-time objective intelligibility (STOI), and signal-to-distortion ratio (SDR), are often used for evaluating speech. These metrics intrusive since they require a reference (clean) signal to complete evaluation. The need reduces practicality these clean is not typically available during real-world testing. In this paper, two-stage approach presented that estimates score in non-intrusive manner, which enables...

10.1121/10.0002702 article EN publisher-specific-oa The Journal of the Acoustical Society of America 2020-11-01

Recently, vision model pre-training has evolved from relying on manually annotated datasets to leveraging large-scale, web-crawled image-text data. Despite these advances, there is no method that effectively exploits the interleaved data, which very prevalent Internet. Inspired by recent success of compression learning in natural language processing, we propose a novel called Latent Compression Learning (LCL) for This performs latent maximizing mutual information between inputs and outputs...

10.48550/arxiv.2406.07543 preprint EN arXiv (Cornell University) 2024-06-11

The real-world capabilities of objective speech quality measures are limited since current (1) developed from simulated data that does not adequately model real environments; or they (2) predict scores always strongly correlated with subjective ratings. Additionally, a large dataset signals listener ratings currently exist, which would help facilitate assessment. In this paper, we collect and the perceptual evaluated by human listeners. We first rating conducting crowdsourced listening...

10.48550/arxiv.2007.15797 preprint EN other-oa arXiv (Cornell University) 2020-01-01