Qiaoxi Zhu

ORCID: 0000-0003-3942-0945
Research Areas
  • Speech and Audio Processing
  • Music and Audio Processing
  • Acoustic Wave Phenomena Research
  • Anomaly Detection Techniques and Applications
  • Hearing Loss and Rehabilitation
  • Advanced Adaptive Filtering Techniques
  • Speech Recognition and Synthesis
  • Multimodal Machine Learning Applications
  • Time Series Analysis and Forecasting
  • Underwater Acoustics Research
  • Traffic Prediction and Management Techniques
  • Antenna Design and Optimization
  • Video Analysis and Summarization
  • Gait Recognition and Analysis
  • Emotion and Mood Recognition
  • Video Surveillance and Tracking Methods
  • Aerodynamics and Acoustics in Jet Flows
  • Internet of Things and Social Network Interactions
  • Recycling and Utilization of Industrial and Municipal Waste in Materials Production
  • Topology Optimization in Engineering
  • Innovations in Concrete and Construction Materials
  • Blind Source Separation Techniques
  • Microwave Engineering and Waveguides
  • Antenna Design and Analysis
  • Water Systems and Optimization

University of Technology Sydney
2018-2025

Information Technology University
2023

Institute of Acoustics
2016-2017

Chinese Academy of Sciences
2016-2017

University of Chinese Academy of Sciences
2017

Unsupervised anomalous sound detection aims to detect unknown abnormal sounds of machines from normal sounds. However, the state-of-the-art approaches are not always stable and perform dramatically differently even for machines of the same type, making them impractical for general applications. This paper proposes a spectral-temporal fusion based self-supervised method to model the features of normal sound, which improves stability and performance consistency for individual machines, even of the same type. Experiments on DCASE 2020 Challenge Task 2...

10.1109/icassp43922.2022.9747868 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Existing contrastive learning methods for anomalous sound detection refine the audio representation of each sample by using contrast between samples' augmentations (e.g., with time or frequency masking). However, they might be biased by the augmented data, due to the lack of physical properties of machine sound, thereby limiting detection performance. This paper uses contrastive learning to refine audio representations for each machine ID, rather than for each audio sample. The proposed two-stage method uses contrastive learning to pretrain the audio representation model by incorporating machine ID, and a self-supervised ID classifier to fine-tune the learnt...

10.1109/icassp49357.2023.10096054 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Anomalous sound detection (ASD) encounters difficulties with domain shift, where the sounds of machines in target domains differ significantly from those in source domains due to varying operating conditions. Existing methods typically employ domain classifiers to enhance detection performance, but they often overlook the influence of domain-unrelated information. This oversight can hinder the model's ability to clearly distinguish between domains, thereby weakening its capacity to differentiate normal from abnormal sounds. In this paper, we...

10.48550/arxiv.2501.01604 preprint EN arXiv (Cornell University) 2025-01-02

Purpose: Hempcrete has the potential to reduce both CO2 emissions and energy usage in buildings. Hempcrete has a high sound absorption capacity, is an excellent moisture regulator and has outstanding thermal insulation properties. However, hempcrete traditionally uses lime-based binders, which are carbon-intensive materials. The use of low-carbon binders to increase the sustainability of hempcrete is a current research gap. Geopolymer binders are composed of aluminosilicate precursors dissolved in a high-alkalinity solution. This study investigated the suitability of calcined...

10.1108/bepam-03-2024-0056 article EN Built Environment Project and Asset Management 2025-02-05

10.1109/icassp49660.2025.10889535 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

10.1109/icassp49660.2025.10889513 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

10.1109/icassp49660.2025.10889208 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

10.1109/icassp49660.2025.10888266 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

This letter introduces a database of Room Impulse Responses (RIRs) measured in seven different rooms for multizone sound field reproduction research in various acoustic environments. A circular array of 60 loudspeakers was installed in each room, with two microphone arrays placed sequentially at five zones inside the loudspeaker array. A total of 260 400 RIRs were measured to establish the database. As a demonstration of the application of the database in multizone sound field reproduction, simulations were performed on the pressure matching and acoustic contrast control methods...

10.1121/10.0014958 article EN The Journal of the Acoustical Society of America 2022-10-01

Personal audio systems generate a local sound field for a listener while attenuating the sound energy at pre-defined quiet zones. In practice, system performance is sensitive to errors in the acoustic transfer functions between the sources and the zones. Regularization is commonly used to improve robustness; however, selecting a regularization parameter is not always straightforward. In this paper, a design framework for robust reproduction is proposed, combining transfer function and error modeling. The framework allows a physical perspective on the required...

10.17743/jaes.2017.0016 article EN cc-by Journal of the Audio Engineering Society 2017-06-27
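
The role of the regularization parameter mentioned in the abstract above can be illustrated with a minimal sketch. It is not the paper's design framework: it assumes plain Tikhonov-regularized least squares, w = (G^H G + lam I)^-1 G^H p, for two sources, and the names `regularized_weights`, `effort`, `G`, `p_target` and all numeric values are hypothetical. The 2x2 system is solved in closed form to keep the example dependency-free.

```python
def regularized_weights(G, p_target, lam):
    """Tikhonov-regularized least-squares source weights for two sources.

    G: rows [g1, g2] of complex transfer functions, one row per control point.
    p_target: desired complex pressure at each control point.
    lam: regularization parameter (>= 0).
    Solves (G^H G + lam*I) w = G^H p_target.
    """
    # Normal-equation matrix A = G^H G + lam*I (2x2, Hermitian).
    a11 = sum(abs(row[0]) ** 2 for row in G) + lam
    a22 = sum(abs(row[1]) ** 2 for row in G) + lam
    a12 = sum(row[0].conjugate() * row[1] for row in G)
    a21 = a12.conjugate()
    # Right-hand side b = G^H p_target.
    b1 = sum(row[0].conjugate() * p for row, p in zip(G, p_target))
    b2 = sum(row[1].conjugate() * p for row, p in zip(G, p_target))
    # Closed-form inverse of the 2x2 matrix.
    det = a11 * a22 - a12 * a21
    w1 = (a22 * b1 - a12 * b2) / det
    w2 = (a11 * b2 - a21 * b1) / det
    return [w1, w2]


def effort(w):
    """Array effort: sum of squared source weight magnitudes."""
    return sum(abs(x) ** 2 for x in w)


# Hypothetical transfer functions: 3 control points x 2 sources.
G = [[1 + 0j, 0.5j], [0.8 + 0j, 0.9 + 0.1j], [0.3 - 0.2j, 1 + 0j]]
# Unit pressure at the first two points, silence at the third.
p_target = [1 + 0j, 1 + 0j, 0j]

for lam in (0.0, 0.1, 1.0):
    w = regularized_weights(G, p_target, lam)
    print(lam, round(effort(w), 4))
```

Increasing `lam` lowers the array effort at the cost of reproduction accuracy, which is why the choice of parameter matters for robustness.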

Although deep learning is the mainstream method in unsupervised anomalous sound detection, a Gaussian Mixture Model (GMM) with a statistical audio frequency representation as input can achieve comparable results with much lower model complexity and fewer parameters. Existing statistical frequency representations, e.g., the log-Mel spectrogram's average or maximum over time, do not always work well for different machines. This paper presents the Time-Weighted Frequency Domain Representation (TWFR) with the GMM method (TWFR-GMM) for anomalous sound detection. The TWFR...

10.1109/icassp49357.2023.10096356 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05
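
The abstract above is truncated before it defines TWFR, so the following is not the paper's method. It is a hypothetical sketch of one way a single parameter can interpolate between the average and the maximum of a log-Mel spectrogram over time: values in each frequency bin are sorted in descending order and weighted by `r**rank`. The function name `rank_weighted_pool`, the parameter `r` and the toy values are assumptions made for illustration.

```python
def rank_weighted_pool(spectrogram, r):
    """Pool a [frames][bins] log-Mel spectrogram over time into one vector.

    For each frequency bin, the frame values are sorted in descending order
    and weighted by r**rank, so r = 1 gives the time average and r = 0 gives
    the maximum over time.
    """
    n_frames = len(spectrogram)
    n_bins = len(spectrogram[0])
    # Python defines 0 ** 0 == 1, so r = 0 puts all weight on the largest value.
    weights = [r ** k for k in range(n_frames)]
    norm = sum(weights)
    pooled = []
    for b in range(n_bins):
        column = sorted((frame[b] for frame in spectrogram), reverse=True)
        pooled.append(sum(w * v for w, v in zip(weights, column)) / norm)
    return pooled


# Toy log-Mel values in dB: 3 frames x 2 frequency bins (hypothetical).
spec = [[-10.0, -30.0], [-20.0, -5.0], [-40.0, -25.0]]
print(rank_weighted_pool(spec, 1.0))  # time average per bin
print(rank_weighted_pool(spec, 0.0))  # time maximum per bin
print(rank_weighted_pool(spec, 0.5))  # something in between
```

A pooled vector of this kind could then serve as the fixed-length input to a GMM, with `r` chosen per machine type.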

Self-supervised learning methods have achieved promising performance for anomalous sound detection (ASD) under domain shift by incorporating the metadata of machine types and machine attributes in feature learning. However, the relation between domain shifts has yet to be fully utilised, despite its potential benefits in characterising the shifts. This paper presents a hierarchical information constrained self-supervised ASD method, where hierarchical information (section IDs) is constructed and used as constraints to improve the representation. In...

10.1109/icassp48485.2024.10446044 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

The directional loudspeaker array has various applications due to its capability to direct sound generation towards the target listener and reduce noise pollution. Differential beamforming has recently been applied to the loudspeaker line array to produce a broadside frequency-invariant radiation pattern. However, existing methods cannot achieve a compromise between robustness and broadband beampattern preservation. This paper proposes a robust differential beamforming design that allows the line array to radiate frequency-invariant patterns with robustness. Specifically, we propose...

10.3390/app14146383 article EN cc-by Applied Sciences 2024-07-22

Automated audio captioning aims to describe audio data with captions using natural language. Existing methods often employ an encoder-decoder structure, where the attention-based decoder (e.g., Transformer decoder) is widely used and achieves state-of-the-art performance. Although this method effectively captures global information within audio data via the self-attention mechanism, it may ignore events with short time duration, due to its limitation in capturing local information in an audio signal, leading to inaccurate prediction of captions...

10.1109/lsp.2022.3189536 article EN IEEE Signal Processing Letters 2022-01-01

State-of-the-art audio captioning methods typically use the encoder-decoder structure with pretrained audio neural networks (PANNs) as encoders for feature extraction. However, the convolution operation used in PANNs is limited in capturing long-time dependencies within an audio signal, thereby leading to potential performance degradation in audio captioning. This letter presents a novel method using graph attention (GraphAC) for encoder-decoder based audio captioning. In the encoder, a graph attention module is introduced after the PANNs to learn contextual association (i.e. the dependency among...

10.1109/lsp.2023.3266114 article EN IEEE Signal Processing Letters 2023-01-01

Unsupervised anomalous sound detection (ASD) aims to detect unknown anomalous sounds of devices when only normal sound data is available. The autoencoder (AE) and self-supervised learning based methods are two mainstream methods. However, the AE-based methods could be limited, as the feature learned from normal sounds can also fit with anomalous sounds, reducing the ability of the model in detecting anomalies from sound. The self-supervised methods are not always stable and perform differently, even for machines of the same type. In addition, the anomalous sound may be short-lived, making it harder to distinguish from normal sound. This...

10.1186/s13636-023-00308-4 article EN cc-by EURASIP Journal on Audio Speech and Music Processing 2023-10-13

This paper addresses a two-dimensional multizone sound field reproduction approach using a wave-domain method. The desired sound fields in the bright and dark zones are described as orthogonal expansions of basis functions over the regions. The loudspeaker weights are obtained by maximizing the contrast among multiple zones in the wave domain. Simulation results demonstrate that, compared with the conventional acoustic contrast control approach, the proposed method improves the contrast level and array gain over the entire region and is less sensitive to the selection...

10.1121/1.5054079 article EN The Journal of the Acoustical Society of America 2018-09-01
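
For readers unfamiliar with the quantity being maximized, acoustic contrast is commonly expressed as the ratio of mean squared pressure in the bright zone to that in the dark zone, in decibels. The sketch below computes only this metric from sampled complex pressures; it does not implement the wave-domain method of the paper, and the function name `acoustic_contrast_db` and the sample values are hypothetical.

```python
import math


def acoustic_contrast_db(p_bright, p_dark):
    """Acoustic contrast in dB between two sets of complex pressure samples.

    Defined here as 10*log10 of the ratio of the mean squared pressure
    magnitude in the bright zone to that in the dark zone.
    """
    energy_bright = sum(abs(p) ** 2 for p in p_bright) / len(p_bright)
    energy_dark = sum(abs(p) ** 2 for p in p_dark) / len(p_dark)
    return 10.0 * math.log10(energy_bright / energy_dark)


# Hypothetical pressures sampled at two microphones per zone.
p_bright = [1 + 0j, 1j]
p_dark = [0.1 + 0j, 0.1j]
print(acoustic_contrast_db(p_bright, p_dark))
```

With these toy values the dark-zone pressures are ten times smaller in magnitude than the bright-zone pressures, which corresponds to a contrast of about 20 dB.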

Personal audio generates sound zones in a shared space to provide private and personalized listening experiences with minimized interference between consumers. Regularization has been commonly used to increase the robustness of such systems against potential perturbations in the sound reproduction. However, the performance is limited by the system geometry, such as the number and location of the loudspeakers and controlled zones. This paper proposes a geometry optimization method to find the most geometrically robust approach for personal audio amongst all...

10.1109/taslp.2018.2889927 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2018-12-27

This paper proposes a three-dimensional wave-domain acoustic contrast control method to reproduce a multizone sound field using a circular loudspeaker array. In this method, the analysis is based on spherical harmonic decomposition, and the loudspeaker weights are obtained by maximizing the energy contrast between the predefined bright zone and dark zone. Simulation results show that the proposed method provides good separation performance over a large spatial region and requires lower-order harmonics, resulting in a much lower number of microphones...

10.1121/1.5110746 article EN The Journal of the Acoustical Society of America 2019-06-01

Personal audio provides private and personalized listening experiences by generating sound zones in a shared space with minimal interference between zones. One challenge of the design is to achieve the best performance with a limited number of microphones and loudspeakers. In this paper, two modal domain methods for personal audio reproduction are compared. One is the spatial harmonic decomposition (SHD) based method and the other is the singular value decomposition (SVD) based method. It is demonstrated that the SVD based method is more efficient than the SHD based method for a 2.5 dimensional design...

10.1121/10.0000474 article EN The Journal of the Acoustical Society of America 2020-01-01

Immersive and spatial sound reproduction has been widely studied using loudspeaker arrays. However, flat-panel loudspeakers that utilize thin flat panels with force actuators are a promising alternative to traditional coaxial loudspeakers for practical applications, with benefits in low-visual profiles and diffuse radiation. Literature has addressed the sound quality and applications of flat-panel loudspeakers in three-dimensional sound reproduction, such as wave field synthesis and sound zones. This paper revisits the perception of flat-panel loudspeakers, specifically localization...

10.1121/10.0020827 article EN The Journal of the Acoustical Society of America 2023-09-01

The directional loudspeaker array generating a sound beam to the target listener is highly demanded in application. Null-constraint-based differential beamforming has recently been applied to the loudspeaker line array to produce a broadside frequency-invariant radiation pattern. However, its effective frequency range is limited since it only pursues pressure matching in a few directions. In this paper, we develop a modal domain approach of the null-constrained method to control the radiation pattern better. Specifically, we derive the modal domain from information about...

10.23919/eusipco58844.2023.10290084 article EN 2023-09-04

Keywords: Linear differential arrays; beam steering; frequency-invariant beamforming.

10.20944/preprints202408.2253.v1 preprint EN 2024-09-01

First-shot (FS) unsupervised anomalous sound detection (ASD) is a brand-new task introduced in DCASE 2023 Challenge Task 2, where the anomalous sounds for the target machine types are unseen in training. Existing methods often rely on the availability of normal and abnormal sound data from the target machines. However, due to the lack of anomalous sound data for the target machine types, it becomes challenging when adapting the existing ASD methods to the first-shot task. In this paper, we propose a new framework for first-shot unsupervised ASD, where metadata-assisted audio generation is used to estimate unknown anomalies, by utilising...

10.1109/icassp48485.2024.10448451 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18