- Speech and Audio Processing
- Music and Audio Processing
- Music Technology and Sound Studies
- Advanced Adaptive Filtering Techniques
- Acoustic Wave Phenomena Research
- Hearing Loss and Rehabilitation
- Indoor and Outdoor Localization Technologies
- Underwater Acoustics Research
- Image and Signal Denoising Methods
- Structural Health Monitoring Techniques
- Speech Recognition and Synthesis
- Aerodynamics and Acoustics in Jet Flows
- Diverse Musicological Studies
- Blind Source Separation Techniques
- Model Reduction and Neural Networks
- Target Tracking and Data Fusion in Sensor Networks
- Advanced Vision and Imaging
- Neuroscience and Music Perception
- Advanced Database Systems and Queries
- Digital Media Forensic Detection
- Wood Treatment and Properties
- Computer Graphics and Visualization Techniques
- Semantic Web and Ontologies
- Natural Language Processing Techniques
- Direction-of-Arrival Estimation Techniques
Politecnico di Milano
2016-2025
Consorzio di Bioingegneria e Informatica Medica
2017-2024
Leonardo (Italy)
2023
Institute of Electrical and Electronics Engineers
2021
University of Catania
2021
Polytechnic University of Bari
2018
Ingegneria dei Sistemi (Italy)
2013
Instituto Politécnico Nacional
2011
Sapienza University of Rome
1988
IBM (Italy)
1985
This paper describes an audio-based video surveillance system which automatically detects anomalous audio events in a public square, such as screams or gunshots, and localizes the position of the acoustic source, in such a way that a video camera is steered accordingly. The system employs two parallel GMM classifiers for discriminating screams from noise and gunshots from noise, respectively. Each classifier is trained using different features, chosen from a set of both conventional and innovative features. The location of the source that has produced the sound...
Wireless acoustic sensor networks (WASNs) are formed by a distributed group of acoustic-sensing devices featuring audio playing and recording capabilities. Current mobile computing platforms offer great possibilities for the design of audio-related applications involving such nodes. In this context, source localization is one of the application domains that have attracted the most attention from the research community over the last decades. In general terms, the localization of sources can be achieved by studying energy, temporal and/or directional...
Acoustic scene reconstruction is a process that aims to infer characteristics of the environment from acoustic measurements. We investigate the problem of locating planar reflectors in rooms, such as walls and furniture, from signals obtained using distributed microphones. Specifically, localization of multiple two-dimensional (2-D) reflectors is achieved by estimation of the time of arrival (TOA) of reflected signals through analysis of acoustic impulse responses (AIRs). The estimated TOAs are converted into elliptical constraints about the location of the line...
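The elliptical constraint mentioned in this abstract can be sketched as follows. The symbols are assumptions introduced here for illustration, not notation taken from the paper: a source at position s, a microphone at position m, a speed of sound c, and a first-order reflection with TOA τ. The reflected path has total length cτ, so the reflection point p lies on an ellipse with foci s and m:

```latex
\[
\|\mathbf{p}-\mathbf{s}\| + \|\mathbf{p}-\mathbf{m}\| = c\,\tau
\]
```

Under these assumptions the line containing the reflector is tangent to this ellipse, so each source-microphone pair contributes one tangency constraint on the reflector line.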
Recently, deep learning and machine learning approaches have been widely employed for various applications in acoustics. Nonetheless, in the area of sound field processing and reconstruction, classic methods based on the solutions of the wave equation are still widespread. Lately, physics-informed neural networks have been proposed as a paradigm for solving partial differential equations that govern physical phenomena, bridging the gap between purely data-driven and model-based methods. In this study, we exploit physics-informed neural networks to reconstruct the early part...
We propose a method for localizing an acoustic source with distributed microphone networks. Time Differences of Arrival (TDOAs) of signals pertaining to the same sensor are estimated through Generalized Cross-Correlation. After a TDOA filtering stage that discards potentially unreliable measurements, localization is performed by minimizing a fourth-order polynomial that combines hyperbolic constraints from multiple sensors. The algorithm turns out to exhibit a significantly lower computational cost compared...
The generalized cross-correlation (GCC) is regarded as the most popular approach for estimating the time difference of arrival (TDOA) between the signals received at two sensors. Time delay estimates are obtained by maximizing the GCC output, where the direct-path delay is usually observed as a prominent peak. Moreover, GCCs also play an important role in steered response power (SRP) localization algorithms, where the SRP functional can be written as an accumulation of the GCCs computed from multiple sensor pairs. Unfortunately, the accuracy of TDOA...
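As an illustration of the GCC-based delay estimation described in this abstract, here is a minimal sketch using the PHAT weighting. The choice of PHAT, the function name `gcc_phat`, and its parameters are assumptions made for this example; the abstract does not say which weighting the authors use.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay of x1 relative to x2, in seconds, with GCC-PHAT.

    Hypothetical helper for illustration only. A positive result means
    that x1 is a delayed copy of x2.
    """
    n = len(x1) + len(x2)               # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)            # cross-power spectrum
    cross /= np.abs(cross) + 1e-12      # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)       # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.roll(cc, max_shift)[: 2 * max_shift + 1]
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs
```

The delay estimate is the lag that maximizes the GCC output, which matches the "prominent peak" described above. Passing `max_tau` restricts the search to physically plausible delays, for example the sensor spacing divided by the speed of sound.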
Several methods for synthetic audio speech generation have been developed in the literature through the years. With the great technological advances brought by deep learning, many novel synthetic speech techniques achieving incredibly realistic results have been recently proposed. As these methods generate convincing fake human voices, they can be used in a malicious way to negatively impact today’s society (e.g., people impersonation, fake news spreading, opinion formation). For this reason, the ability of detecting whether...
Recent advances in deep learning and computer vision have spawned a new class of media forgeries known as deepfakes, which typically consist of artificially generated human faces or voices. The creation and distribution of deepfakes raise many legal and ethical concerns. As a result, the ability to distinguish between deepfakes and authentic media is vital. While deepfakes can create plausible video and audio, it may be challenging for them to generate content that is consistent in terms of high-level semantic features, such as emotions. Unnatural...
Of all the characteristics of a violin, those that concern its shape are probably the most important ones, as the violin maker has complete control over them. Contemporary violin making, however, is still based more on tradition than on understanding, and a definitive scientific study of the specific relations that exist between shape and vibrational properties is yet to come and sorely missed. In this article, using standard statistical learning tools, we show that the modal frequencies of violin tops can, in fact, be predicted from geometric...
In recent years, audio and video deepfake technology has advanced relentlessly, severely impacting people's reputation and reliability. Several factors have facilitated the growing deepfake threat. On the one hand, the hyper-connected society of social and mass media enables the spread of multimedia content worldwide in real time, facilitating the dissemination of counterfeit material. On the other hand, neural network-based techniques have made deepfakes easier to produce and more difficult to detect, showing that the analysis of low-level features is no longer...
This paper describes an audio event detection system which automatically classifies an audio event as ambient noise, scream or gunshot. The classification system uses two parallel GMM classifiers for discriminating screams from noise and gunshots from noise. Each classifier is trained using different features, appropriately chosen from a set of 47 audio features, which are selected according to a 2-step process. First, feature subsets of increasing size are assembled using filter selection heuristics. Then, a classifier is trained and tested with each feature subset. The obtained classification performance is used...
The curse of outlier measurements in estimation problems is a well-known issue in a variety of fields. Therefore, outlier removal procedures, which enable the identification of spurious measurements within a set, have been developed for many different scenarios and applications. In this paper, we propose a statistically motivated outlier removal algorithm for time differences of arrival (TDOAs), or equivalently range differences (RD), acquired at sensor arrays. The method exploits the TDOA-space formalism and works by only knowing the relative sensor positions. As the proposed...
In this paper, we propose a data-driven approach for the reconstruction of unknown room impulse responses (RIRs) based on the deep prior paradigm. We formulate RIR reconstruction as an inverse problem. More specifically, a convolutional neural network (CNN) is employed as a prior, in order to obtain a regularized solution to the RIR reconstruction problem for uniform linear arrays. This allows us to avoid the assumptions on sound wave propagation, acoustic environment, or measuring setting made in state-of-the-art RIR reconstruction algorithms. Moreover, differently from...
In this paper we propose a method for reconstructing the 2D geometry of the surrounding environment based on the signals acquired by a fixed microphone, when a series of acoustic stimuli are produced in different positions in space. After estimating the Times Of Arrival (TOAs) of the reflective paths, we turn each TOA into a projective geometric constraint that can be used for determining the locations of the reflectors. The result consists of a collection of planar surfaces that correspond to the reflectors' locations. We present the whole processing chain and...
In this paper, we propose a robust and low-complexity acoustic source localization technique based on time differences of arrival (TDOA), which addresses the scenario of distributed sensor networks in 3D environments. Network nodes are assumed to be unsynchronized, i.e., TDOAs between microphones belonging to different nodes are not available. We begin by showing how to select feasible TDOAs for each sensor node, exploiting both geometrical considerations and a characterization of the overall generalized cross correlation (GCC)...
In this paper we consider the well-established problem of time differences of arrival (TDOA)-based source localization and propose a comprehensive analysis of its solution for arbitrary sensor measurement and placement. More specifically, we define the TDOA map from the physical space of source locations to the space of range measurements (TDOAs), in the specific case of three receivers in 2D space. We then study the identifiability of the model, giving a complete analytical characterization of the image of this map and its invertibility. This analysis has been conducted in a completely...
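A map from source locations to TDOAs, as described in this abstract, can be sketched in a few lines. The function name `tdoa_map`, the choice of the first receiver as the reference, and the default speed of sound are assumptions made for this example; the paper's own definition may use a different convention.

```python
import numpy as np

def tdoa_map(x, receivers, c=343.0):
    """Map a 2D source position to its TDOAs (seconds) w.r.t. the first receiver.

    Hypothetical helper for illustration: `receivers` is an (N, 2) array of
    receiver positions in metres and `c` is an assumed speed of sound in m/s.
    With N = 3 receivers the result has two entries.
    """
    x = np.asarray(x, dtype=float)
    receivers = np.asarray(receivers, dtype=float)
    d = np.linalg.norm(receivers - x, axis=1)   # source-to-receiver distances
    return (d[1:] - d[0]) / c                   # range differences over c
```

By the triangle inequality, each output component is bounded in magnitude by the corresponding receiver spacing divided by `c`, so the image of the map is a bounded subset of the TDOA plane. Characterizing that image and the invertibility of the map is the kind of question the abstract refers to.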
We present a spatial filtering approach to first-order steerable Differential Microphone Arrays (DMAs) with arbitrary planar geometry. In particular, the design of the spatial filter is based on a recently proposed frequency-domain methodology that approximates, in the least-squares sense, a target beampattern using the Jacobi-Anger expansion involving Bessel functions. Despite the generality of the approach, however, its computational cost turns out to be excessive when working with limited processing resources. The beamforming...
In this manuscript, we describe a novel methodology for nearfield acoustic holography (NAH). The proposed technique is based on convolutional neural networks, with an autoencoder architecture, to reconstruct the pressure and velocity fields on the surface of the vibrating structure using the sampled soundfield on the holographic plane as input. The loss function used for training the network is based on a combination of two components. The first component is the error in the reconstructed velocity. The second component is the error between the sound pressure and its estimate obtained from forward...
Head-Related Transfer Functions (HRTFs) have fundamental applications for realistic rendering in immersive audio scenarios. However, they are strongly subject-dependent, as they vary considerably depending on the shape of the ears, head and torso. Thus, personalization procedures are required for accurate binaural rendering. Recently, Denoising Diffusion Probabilistic Models (DDPMs), a class of generative learning techniques, have been applied to solve a variety of signal processing-related problems. In this paper, we...
In this study, we investigate the influence of the thickness profile of the top plate and of the bracing pattern on the vibrational and acoustic properties of archtop guitars through finite element simulations. Starting from a laser scan of a real guitar, we develop a fully parametric three-dimensional model of the guitar body. The thickness profile is parametrically controlled by adjusting the lower surface of the top plate while maintaining a fixed upper surface. Both geometric and numerical modeling techniques are used to analyze the mechanical behavior of the guitar, including modal analysis,...
We propose a spatial filtering method for linear arrays of First-Order Steerable Differential Microphones (FOSDMs), which operates in two layers. In the former, the signals acquired by the individual microphones are locally filtered to produce the outputs of the FOSDMs. In the latter, the outputs of the FOSDMs are processed by another spatial filter. We analyse different design methodologies and study the conditions under which the two layers can be decoupled. The proposed two-layer filter can be flexibly controlled with a single scalar parameter, which can be chosen, for example, to maximize the White...