- Speech and Audio Processing
- Music and Audio Processing
- Acoustic Wave Phenomena Research
- Anomaly Detection Techniques and Applications
- Hearing Loss and Rehabilitation
- Advanced Adaptive Filtering Techniques
- Speech Recognition and Synthesis
- Multimodal Machine Learning Applications
- Time Series Analysis and Forecasting
- Underwater Acoustics Research
- Traffic Prediction and Management Techniques
- Antenna Design and Optimization
- Video Analysis and Summarization
- Gait Recognition and Analysis
- Emotion and Mood Recognition
- Video Surveillance and Tracking Methods
- Aerodynamics and Acoustics in Jet Flows
- Internet of Things and Social Network Interactions
- Recycling and Utilization of Industrial and Municipal Waste in Materials Production
- Topology Optimization in Engineering
- Innovations in Concrete and Construction Materials
- Blind Source Separation Techniques
- Microwave Engineering and Waveguides
- Antenna Design and Analysis
- Water Systems and Optimization
University of Technology Sydney
2018-2025
Information Technology University
2023
Institute of Acoustics
2016-2017
Chinese Academy of Sciences
2016-2017
University of Chinese Academy of Sciences
2017
Unsupervised anomalous sound detection aims to detect unknown abnormal sounds of machines from normal sounds. However, the state-of-the-art approaches are not always stable and perform dramatically differently even for machines of the same type, making them impractical for general applications. This paper proposes a spectral-temporal fusion based self-supervised method to model the feature of the normal sound, which improves the stability and performance consistency in detecting anomalous sounds from individual machines, even of the same type. Experiments on DCASE 2020 Challenge Task 2...
Existing contrastive learning methods for anomalous sound detection refine the audio representation of each audio sample by using the contrast between the samples' augmentations (e.g., with time or frequency masking). However, they might be biased by the augmented data, due to the lack of physical properties of machine sound, thereby limiting detection performance. This paper uses contrastive learning to refine audio representations for each machine ID, rather than for each audio sample. The proposed two-stage method uses contrastive learning to pretrain the audio representation model by incorporating machine ID and a self-supervised ID classifier to fine-tune the learnt...
Anomalous sound detection (ASD) encounters difficulties with domain shift, where the sounds of machines in target domains differ significantly from those in source domains due to varying operating conditions. Existing methods typically employ domain classifiers to enhance detection performance, but they often overlook the influence of domain-unrelated information. This oversight can hinder the model's ability to clearly distinguish between domains, thereby weakening its capacity to differentiate normal from abnormal sounds. In this paper, we...
Purpose: Hempcrete has the potential to reduce both CO2 emissions and energy usage in buildings. It has a high sound absorption capacity, is an excellent moisture regulator, and has outstanding thermal insulation properties. However, hempcrete traditionally uses lime-based binders, which are carbon-intensive materials. Low-carbon binders that increase the sustainability of hempcrete are a current research gap. Geopolymer binders are composed of aluminosilicate precursors dissolved in a high-alkalinity solution. This study investigated the suitability of calcined...
This letter introduces a database of Room Impulse Responses (RIRs) measured in seven different rooms for multizone sound field reproduction research in various acoustic environments. A circular array of 60 loudspeakers was installed in each room, with two microphone arrays placed sequentially in five zones inside the loudspeaker array. A total of 260 400 RIRs were measured to establish the database. As a demonstration of its application to sound field reproduction, simulations were performed on the pressure matching and acoustic contrast control methods...
Personal audio systems generate a local sound field for a listener while attenuating the sound energy at pre-defined quiet zones. In practice, system performance is sensitive to errors in the acoustic transfer functions between the sources and the zones. Regularization is commonly used to improve robustness; however, selecting the regularization parameter is not always straightforward. In this paper, a design framework for robust reproduction is proposed, combining transfer function and error modeling. The framework allows a physical perspective on the required...
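The role of the regularization parameter mentioned in this abstract can be illustrated with a minimal sketch of Tikhonov-regularized pressure matching. This is a generic textbook formulation, not the framework proposed in the paper; the transfer matrix `G`, the target pressures `p_target`, the array sizes, and the two regularization values are all made-up assumptions for illustration.

```python
import numpy as np

def regularized_weights(G, p_target, reg):
    """Tikhonov-regularized least squares.

    Minimises ||G w - p_target||^2 + reg * ||w||^2 over the source weights w.
    G is a (n_mics, n_sources) matrix of acoustic transfer functions.
    """
    n_src = G.shape[1]
    A = G.conj().T @ G + reg * np.eye(n_src)
    return np.linalg.solve(A, G.conj().T @ p_target)

# Hypothetical random transfer functions (assumption, not measured data).
rng = np.random.default_rng(0)
n_mics, n_src = 12, 8
G = rng.standard_normal((n_mics, n_src)) + 1j * rng.standard_normal((n_mics, n_src))

# Target: unit pressure at the first 6 points (listening zone),
# zero pressure at the last 6 points (quiet zone).
p_target = np.concatenate([np.ones(6), np.zeros(6)]).astype(complex)

w_small = regularized_weights(G, p_target, 1e-6)
w_large = regularized_weights(G, p_target, 10.0)

# Array effort (squared norm of the weights) for each setting.
effort_small = float(np.linalg.norm(w_small) ** 2)
effort_large = float(np.linalg.norm(w_large) ** 2)

# Reproduction error for each setting.
err_small = float(np.linalg.norm(G @ w_small - p_target) ** 2)
err_large = float(np.linalg.norm(G @ w_large - p_target) ** 2)
```

In this sketch a larger `reg` lowers the array effort at the cost of a larger reproduction error, which is the trade-off that makes the choice of parameter non-trivial.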
Although deep learning is the mainstream method in unsupervised anomalous sound detection, a Gaussian Mixture Model (GMM) with a statistical audio frequency representation as input can achieve comparable results with much lower model complexity and fewer parameters. Existing statistical frequency representations, e.g. the log-Mel spectrogram's average or maximum over time, do not always work well for different machines. This paper presents the Time-Weighted Frequency Domain Representation (TWFR) with GMM (TWFR-GMM) for anomalous sound detection. The TWFR...
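As a small illustration of the two existing representations named in this abstract (average and maximum over time), the sketch below pools a toy spectrogram along the time axis. It does not implement TWFR, whose definition is not given in the text shown here; the function name `time_pooled_features` and the toy values are assumptions.

```python
import numpy as np

def time_pooled_features(log_mel):
    """Pool a (n_mels, n_frames) log-Mel spectrogram over time.

    Returns the per-band mean and the per-band maximum, each of shape (n_mels,).
    """
    return log_mel.mean(axis=1), log_mel.max(axis=1)

# Toy spectrogram (assumption): band 0 contains a short transient in the
# last frame, band 1 is stationary.
log_mel = np.array([
    [0.0, 1.0, 2.0, 9.0],
    [3.0, 3.0, 3.0, 3.0],
])

mean_feat, max_feat = time_pooled_features(log_mel)
# mean_feat is [3.0, 3.0]: averaging hides the transient in band 0.
# max_feat is [9.0, 3.0]: the maximum keeps it but discards the rest.
```

The two pooled vectors differ in what they preserve, which is one way to see why neither choice suits every machine type.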
Self-supervised learning methods have achieved promising performance for anomalous sound detection (ASD) under domain shift by incorporating the metadata of machine types and machine attributes in feature learning. However, the relation between shifts has yet to be fully utilised, despite its potential benefits for characterising shifts. This paper presents a hierarchical information constrained self-supervised ASD method, where hierarchical information (section IDs) is constructed and used as constraints to improve the representation. In...
The directional loudspeaker array has various applications due to its capability to direct sound generation towards the target listener and reduce noise pollution. Differential beamforming has recently been applied to the loudspeaker line array to produce a broadside frequency-invariant radiation pattern. However, existing methods cannot achieve a compromise between robustness and broadband beampattern preservation. This paper proposes a robust differential beamforming design that allows the array to radiate frequency-invariant patterns with robustness. Specifically, we propose...
Automated audio captioning aims to describe audio data with captions using natural language. Existing methods often employ an encoder-decoder structure, where the attention-based decoder (e.g., Transformer decoder) is widely used and achieves state-of-the-art performance. Although this method effectively captures global information within audio data via the self-attention mechanism, it may ignore events of short time duration, due to its limitation in capturing local information in an audio signal, leading to inaccurate prediction of captions....
State-of-the-art audio captioning methods typically use the encoder-decoder structure with pretrained audio neural networks (PANNs) as encoders for feature extraction. However, the convolution operation used in PANNs is limited in capturing long-time dependencies within an audio signal, thereby leading to potential performance degradation in audio captioning. This letter presents a novel method using graph attention (GraphAC) for encoder-decoder based audio captioning. In the encoder, a graph attention module is introduced after the PANNs to learn contextual association (i.e. the dependency among...
Unsupervised anomalous sound detection (ASD) aims to detect unknown anomalous sounds of devices when only normal sound data is available. The autoencoder (AE) and self-supervised learning based methods are two mainstream methods. However, the AE-based methods could be limited, as the feature learned from normal sounds can also fit with anomalous sounds, reducing the ability of the model in detecting anomalies from sound. The self-supervised methods are not always stable and perform differently, even for machines of the same type. In addition, the anomalous sound may be short-lived, making it harder to distinguish from normal sound. This...
This paper addresses a two-dimensional multizone sound field reproduction approach using the wave-domain method. The desired sound fields in the bright and dark zones are described as orthogonal expansions of basis functions over the regions. The loudspeaker weights are obtained by maximizing the contrast among multiple zones in the wave domain. Simulation results demonstrate that, compared with the conventional acoustic contrast control approach, the proposed method improves the contrast level and array gain over the entire region and is less sensitive to the selection...
Personal audio generates sound zones in a shared space to provide private and personalized listening experiences with minimized interference between consumers. Regularization has been commonly used to increase the robustness of such systems against potential perturbations in sound reproduction. However, the performance is limited by the system geometry, such as the number and location of the loudspeakers and controlled zones. This paper proposes an optimization method to find the most geometrically robust approach for personal audio amongst all...
This paper proposes a three-dimensional wave-domain acoustic contrast control method to reproduce a multizone sound field using a circular loudspeaker array. In this method, the analysis is based on spherical harmonic decomposition, and the loudspeaker weights are obtained by maximizing the energy contrast between the predefined bright zone and dark zone. Simulation results show that the proposed method provides good separation performance over a large spatial region and requires lower-order harmonics, resulting in a much lower number of microphones...
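The idea of obtaining loudspeaker weights by maximizing the energy contrast between a bright zone and a dark zone can be sketched as a generalized eigenvalue problem. The sketch below works in the conventional pressure domain with random transfer matrices; it is not the wave-domain, spherical-harmonic formulation of the paper, and the names `acc_weights` and `contrast_db`, the array sizes, and the regularization value are assumptions.

```python
import numpy as np

def acc_weights(G_bright, G_dark, reg=1e-6):
    """Acoustic contrast control in the pressure domain.

    Maximises (w^H R_b w) / (w^H (R_d + reg I) w), where R_b and R_d are the
    spatial correlation matrices of the bright and dark zones. The maximiser
    is the dominant eigenvector of (R_d + reg I)^{-1} R_b.
    """
    R_b = G_bright.conj().T @ G_bright
    R_d = G_dark.conj().T @ G_dark
    n_src = R_d.shape[0]
    M = np.linalg.solve(R_d + reg * np.eye(n_src), R_b)
    eigvals, eigvecs = np.linalg.eig(M)
    w = eigvecs[:, np.argmax(eigvals.real)]
    return w / np.linalg.norm(w)

def contrast_db(w, G_bright, G_dark):
    """Ratio of mean squared pressure in the bright zone to the dark zone, in dB."""
    e_bright = np.mean(np.abs(G_bright @ w) ** 2)
    e_dark = np.mean(np.abs(G_dark @ w) ** 2)
    return float(10 * np.log10(e_bright / e_dark))

# Hypothetical transfer functions (assumption): 6 loudspeakers,
# 10 control points in each zone.
rng = np.random.default_rng(1)
n_src, n_pts = 6, 10
G_bright = rng.standard_normal((n_pts, n_src)) + 1j * rng.standard_normal((n_pts, n_src))
G_dark = rng.standard_normal((n_pts, n_src)) + 1j * rng.standard_normal((n_pts, n_src))

w_acc = acc_weights(G_bright, G_dark)
w_uniform = np.ones(n_src, dtype=complex) / np.sqrt(n_src)

contrast_acc = contrast_db(w_acc, G_bright, G_dark)
contrast_uniform = contrast_db(w_uniform, G_bright, G_dark)
```

Comparing `contrast_acc` with `contrast_uniform` shows how much separation the optimized weights gain over driving all loudspeakers equally in this toy setup.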
Personal audio provides private and personalized listening experiences by generating sound zones in a shared space with minimal interference between zones. One challenge of the design is to achieve the best performance with a limited number of microphones and loudspeakers. In this paper, two modal domain methods for personal audio reproduction are compared. One is the spatial harmonic decomposition (SHD) based method and the other is the singular value decomposition (SVD) based method. It is demonstrated that the SVD based method is more efficient than the SHD based method for 2.5 dimensional design....
Immersive and spatial sound reproduction has been widely studied using loudspeaker arrays. However, flat-panel loudspeakers that utilize thin flat panels with force actuators are a promising alternative to traditional coaxial loudspeakers for practical applications, with benefits in low visual profiles and diffuse radiation. The literature has addressed the quality and applications of flat-panel loudspeakers in three-dimensional sound reproduction, such as wave field synthesis and sound zones. This paper revisits the perception of flat-panel loudspeakers, specifically localization...
The directional loudspeaker array generating a sound beam to the target listener is highly demanded in application. The null-constraint-based differential beamforming has recently been applied to the loudspeaker line array to produce a broadside frequency-invariant radiation pattern. However, its effective frequency range is limited since it only pursues pressure matching in a few directions. In this paper, we develop a modal domain approach of the null-constrained method to control the radiation pattern better. Specifically, we derive the modal domain from information about...
Keywords: Linear differential arrays; beam steering; frequency-invariant beamforming.
First-shot (FS) unsupervised anomalous sound detection (ASD) is a brand-new task introduced in DCASE 2023 Challenge Task 2, where the anomalous sounds for the target machine types are unseen in training. Existing methods often rely on the availability of normal and abnormal sound data from the target machines. However, due to the lack of data for the target machine types, it becomes challenging to adapt the existing ASD methods to the first-shot task. In this paper, we propose a new framework for first-shot unsupervised ASD, where metadata-assisted audio generation is used to estimate unknown anomalies, by utilising...