Xianjun Xia

ORCID: 0000-0001-5277-6634
Research Areas
  • Speech and Audio Processing
  • Music and Audio Processing
  • Speech Recognition and Synthesis
  • Music Technology and Sound Studies
  • HIV-related health complications and treatments
  • Pneumocystis jirovecii pneumonia detection and treatment
  • HIV Research and Treatment
  • Fire effects on ecosystems
  • Winter Sports Injuries and Performance
  • Landslides and related hazards
  • Natural Language Processing Techniques
  • Silicon Carbide Semiconductor Technologies
  • Advanced Text Analysis Techniques
  • Heat shock proteins research
  • Pediatric Hepatobiliary Diseases and Treatments
  • Smart Materials for Construction
  • Gallbladder and Bile Duct Disorders
  • Wound Healing and Treatments
  • Topic Modeling
  • Advanced Data Compression Techniques
  • Thin-Film Transistor Technologies
  • Reconstructive Surgery and Microvascular Techniques
  • Genital Health and Disease
  • Diverse Musicological Studies
  • China's Socioeconomic Reforms and Governance

Tencent (China)
2020-2021

The University of Western Australia
2017-2020

181st Hospital of Chinese People's Liberation Army
2016

University of Science and Technology of China
2012-2014

Shanghai Public Health Clinical Center
2011-2013

Fudan University
2011-2012

Shanghai Academy of Social Sciences
2006

Huzhou Vocational and Technical College
2006

In acoustic event detection, the training data size of some events is often small and imbalanced. To deal with this, this paper proposes generating virtual data categorically using auxiliary classifier generative adversarial networks. Soft labels are first calculated to represent the localization information. The closer the current frame is to the middle of the manually labeled event, the higher the soft label will be, which makes the soft label positively correlated with the localization. Then, class quantized used as input condition networks generate...

10.1109/tmm.2018.2879750 article EN IEEE Transactions on Multimedia 2018-11-05
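The soft-label idea in the abstract above can be illustrated with a minimal sketch. The triangular shape, the function name `soft_labels`, and the inclusive frame-index convention are assumptions made for illustration. The abstract only states that the label grows as a frame gets closer to the middle of the manually labeled event, so this is not the paper's exact formula.

```python
def soft_labels(num_frames, start, end):
    """Return one soft label per frame for a single labeled event.

    Assumed shape (hypothetical): 1.0 at the event midpoint, falling
    linearly to 0.0 at the labeled boundaries, and 0.0 outside the event.
    `start` and `end` are inclusive frame indices of the labeled event.
    """
    if not 0 <= start < end < num_frames:
        raise ValueError("expected 0 <= start < end < num_frames")
    mid = (start + end) / 2.0    # middle of the labeled event
    half = (end - start) / 2.0   # distance from the middle to a boundary
    labels = []
    for t in range(num_frames):
        if start <= t <= end:
            labels.append(1.0 - abs(t - mid) / half)
        else:
            labels.append(0.0)
    return labels


# An event labeled on frames 1..5 of a 7-frame clip.
print(soft_labels(7, 1, 5))
```

Frames near the labeled boundaries get low values, which is one way to encode that the annotation is least informative there.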

In this technical report, we present a joint effort of four groups, namely GT, USTC, Tencent, and UKE, to tackle Task 1 - Acoustic Scene Classification (ASC) in the DCASE 2020 Challenge. Task 1 comprises two different sub-tasks: (i) Task 1a focuses on ASC of audio signals recorded with multiple (real and simulated) devices into ten fine-grained classes, and (ii) Task 1b concerns the classification of data into three higher-level classes using low-complexity solutions. For Task 1a, we propose a novel two-stage system leveraging upon ad-hoc...

10.48550/arxiv.2007.08389 preprint EN cc-by arXiv (Cornell University) 2020-01-01

To improve device robustness, a highly desirable key feature of a competitive data-driven acoustic scene classification (ASC) system, a novel two-stage system based on fully convolutional neural networks (CNNs) is proposed. Our system leverages an ad-hoc score combination of two CNN classifiers: (i) the first classifies inputs into one of three broad classes, and (ii) the second classifies the same inputs into one of ten finer-grained classes. Three different architectures are explored to implement the classifiers, and a frequency sub-sampling scheme...

10.1109/icassp39728.2021.9414835 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13
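The two-stage score combination described in the abstract above can be sketched as follows. The product rule, the function names, and the toy class names and probabilities are all assumptions for illustration. The abstract only says that the scores of a three-class and a ten-class classifier are combined in an ad-hoc way.

```python
def combine_two_stage(broad_probs, fine_probs, parent_of):
    """Combine broad-class and fine-class posteriors (assumed product rule).

    broad_probs: dict mapping broad class -> probability (first classifier)
    fine_probs:  dict mapping fine class  -> probability (second classifier)
    parent_of:   dict mapping fine class  -> its broad class (hypothetical)
    """
    combined = {}
    for fine_class, p_fine in fine_probs.items():
        p_broad = broad_probs[parent_of[fine_class]]
        combined[fine_class] = p_fine * p_broad
    total = sum(combined.values())
    if total == 0:
        raise ValueError("all combined scores are zero")
    # Renormalize so the combined scores sum to one.
    return {c: s / total for c, s in combined.items()}


def predict(scores):
    """Return the class with the highest combined score."""
    return max(scores, key=scores.get)


# Toy example with three fine classes instead of ten, to stay short.
broad = {"indoor": 0.7, "outdoor": 0.2, "transport": 0.1}
fine = {"airport": 0.35, "park": 0.40, "bus": 0.25}
parents = {"airport": "indoor", "park": "outdoor", "bus": "transport"}
scores = combine_two_stage(broad, fine, parents)
print(predict(scores))
```

In this toy case the fine classifier alone would pick `park`, but the broad classifier's confidence in `indoor` shifts the combined decision to `airport`.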

10.1109/icassp49660.2025.10889813 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Background: Surgical site infections (SSI) are the third most frequently reported nosocomial infection, and are common on surgical wards. HIV-infected patients may have an increased possibility of developing SSI after surgery. There are few data on the incidence and preventive measures of SSI in HIV-infected patients. This study was to determine the incidence and the associated risk factors for SSI in HIV-infected patients. We also explored the preventive measures. Methods: A retrospective study was conducted of 242 HIV-infected patients, including 17 who also had hemophilia, from October 2008 to September 2011 at Shanghai Public...

10.1186/1471-2334-12-115 article EN cc-by BMC Infectious Diseases 2012-05-14

This paper deals with acoustic event detection (AED) to improve the detection accuracy of events. The acoustic event detection task is performed by a regression via classification (RvC) based approach along with the random forest technique. A discretization process is used to convert continuous frame positions within events into duration class labels. Outputs of the category-specific classifiers are then reversed back to boundary information. Evaluations on the UPC-TALP database, which consists of highly variable events, demonstrate the efficiency of the proposed...

10.1109/icme.2017.8019452 article EN 2017 IEEE International Conference on Multimedia and Expo (ICME) 2017-07-01

Acoustic event detection deals with acoustic signals to determine the sound type and to estimate the audio event boundaries. Multi-label classification based approaches are commonly used to detect the frame-wise event types, with a median filter applied to determine the happening events. However, multi-label classifiers are trained only on event types, ignoring the frame position within the event. To deal with this, this paper proposes to construct a joint learning based multi-task system. The first task performs event type classification and the second is to predict the position information. By sharing representations between the two tasks, we...

10.1109/tmm.2019.2933330 article EN IEEE Transactions on Multimedia 2019-08-05

In recent years, deep learning based machine learning systems have demonstrated remarkable success for a wide range of tasks in multiple domains such as computer vision, speech recognition and other pattern recognition applications. The purpose of this article is to contribute a timely review and introduction of state-of-the-art deep learning techniques and their effectiveness in speech/acoustic signal processing. Thorough investigations of various deep learning architectures are provided under the categories of discriminative and generative algorithms, including...

10.1109/mcas.2019.2945210 article EN IEEE Circuits and Systems Magazine 2019-01-01

Acoustic event detection, the determination of the acoustic event type and the localisation of the event, has been widely applied in many real-world applications. Many works adopt multi-label classification techniques to perform polyphonic acoustic event detection with a global threshold to detect the active events. However, the global threshold has to be set manually and is highly dependent on the database being tested. To deal with this, we replaced the fixed threshold method with a frame-wise dynamic threshold approach in this paper. Two novel approaches, namely contour regressor based approaches are...

10.21437/interspeech.2017-746 article EN Interspeech 2017 2017-08-16

Acoustic event detection, the determination of the acoustic event type and the localisation of the event, has been widely applied in many real-world applications. Many works adopt a multi-label classification technique to perform polyphonic acoustic event detection with a global threshold to detect the active events. However, manually labeled boundaries are error-prone and cannot always be accurate, especially when the frame length is too short to be accurately labeled by human annotators. To deal with this, confidence assigned each performed using...

10.1109/icassp.2018.8461845 article EN 2018-04-01

This paper presents an approach for acoustic scene classification using the local binary pattern (LBP) and random forest (RF). The audio signal is converted to a Constant-Q transform (CQT) representation, and LBP is used to extract features from this time-frequency representation. The CQT representations are divided into a number of sub-bands to obtain more localized and relevant spectral information. We then use RF to select the most important features of each band from the extracted features. For further performance enhancement, we feature...

10.1109/icme.2018.8486578 article EN 2018 IEEE International Conference on Multimedia and Expo (ICME) 2018-07-01
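As an illustration of the LBP feature extraction mentioned in the abstract above, here is a minimal sketch of the basic 3x3, 8-neighbour local binary pattern applied to a 2D time-frequency array. The neighbour ordering, the `>=` comparison, and the function names are assumptions. The paper's exact LBP variant, the CQT computation, and the sub-band split are not shown.

```python
def lbp_codes(image):
    """Compute basic 8-neighbour LBP codes for the interior of a 2D array.

    `image` is a list of equal-length rows, e.g. magnitudes of a
    time-frequency representation. Border cells are skipped.
    """
    rows, cols = len(image), len(image[0])
    # Neighbours visited clockwise starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least the center.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(codes):
    """Count how often each of the 256 possible LBP codes occurs."""
    hist = [0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    return hist


patch = [[9, 9, 9],
         [9, 5, 9],
         [9, 9, 9]]
print(lbp_codes(patch))
```

A histogram of such codes per sub-band gives a fixed-length feature vector that a classifier such as a random forest can consume.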

Sound event detection (SED) has been widely applied in real world applications. Convolutional recurrent neural network based SED approaches have achieved state-of-the-art performance. However, the convolution process is typically performed by using a fixed sized kernel, which adversely affects the detection accuracy, especially when the acoustic features of different sound event classes are characterized by high variations. To deal with this, this article proposes a sound event detection technique based on a convolutional recurrent neural network framework with multiple kernels...

10.1109/taslp.2020.2998298 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2020-01-01

This paper deals with random forest regression based acoustic event detection (AED) by combining acoustic features with bottleneck (BN) features. The BN features have a good reputation of being inherently discriminative in acoustic signal processing. To deal with the unstructured and complex real-world acoustic events, an AED system is constructed using the combined features. Evaluations were carried out on the UPC-TALP and ITC-Irst databases, which consist of highly variable acoustic events. Experimental results demonstrate usefulness low-dimensional relative 5.33% 5.51%...

10.1109/icme.2017.8019418 article EN 2017 IEEE International Conference on Multimedia and Expo (ICME) 2017-07-01

We propose a novel neural model compression strategy combining data augmentation, knowledge transfer, pruning, and quantization for device-robust acoustic scene classification (ASC). Specifically, we tackle the ASC task in a low-resource environment, leveraging a recently proposed advanced neural network pruning mechanism, namely the Lottery Ticket Hypothesis (LTH), to find a sub-network associated with a small amount of non-zero parameters. The effectiveness of LTH for low-complexity modeling is assessed by...

10.48550/arxiv.2107.01461 preprint EN cc-by-sa arXiv (Cornell University) 2021-01-01

This paper presents an improved unit selection and waveform concatenation speech synthesis method by gathering and utilizing human feedback on synthetic speech. Firstly, a set of texts are synthesized by the baseline system. Each prosodic word within the synthetic speech is then evaluated as a natural one or an unnatural one by listeners. In our proposed method, these segments are treated as virtual candidate units to extend the original corpus for unit selection. A new system is constructed using this extended corpus. An error detector based on SVM...

10.1109/iscslp.2012.6423524 article EN 2012-12-01

This paper introduces our repairing and denoising network (RaD-Net) for the ICASSP 2024 Speech Signal Improvement (SSI) Challenge. We extend our previous framework based on a two-stage network and propose an upgraded model. Specifically, we replace with COM-Net from TEA-PSE. In addition, multi-resolution discriminators and multi-band discriminators are adopted in the training stage. Finally, we use a three-step strategy to optimize and submit two models with different sets of parameters to meet the RTF requirement of the tracks. According to the official results,...

10.48550/arxiv.2401.04389 preprint EN other-oa arXiv (Cornell University) 2024-01-01