Danilo Comminiello

ORCID: 0000-0003-4067-4504
Research Areas
  • Speech and Audio Processing
  • Advanced Adaptive Filtering Techniques
  • Music and Audio Processing
  • Blind Source Separation Techniques
  • Image and Signal Denoising Methods
  • Neural Networks and Applications
  • Generative Adversarial Networks and Image Synthesis
  • Machine Learning and ELM
  • Control Systems and Identification
  • Digital Filter Design and Implementation
  • Music Technology and Sound Studies
  • Speech Recognition and Synthesis
  • Model Reduction and Neural Networks
  • Domain Adaptation and Few-Shot Learning
  • Advanced Image Processing Techniques
  • Hearing Loss and Rehabilitation
  • AI in cancer detection
  • Acoustic Wave Phenomena Research
  • Neural Networks and Reservoir Computing
  • Image Retrieval and Classification Techniques
  • Human Pose and Action Recognition
  • Advanced Neural Network Applications
  • Emotion and Mood Recognition
  • Medical Image Segmentation Techniques
  • Advanced MRI Techniques and Applications

Sapienza University of Rome
2016-2025

European University of Rome
2021

Institute of Electrical and Electronics Engineers
2019

Canadian Standards Association
2019

Weatherford College
2014

Henan Tianguan Group (China)
2014

Institute of Electronics, Computer and Telecommunication Engineering
2010

This paper introduces a new class of nonlinear adaptive filters, whose structure is based on the Hammerstein model. Such filters derive from the functional link adaptive filter (FLAF) model, defined by a nonlinear input expansion, which enhances the representation of the input signal through a projection in a higher dimensional space, and a subsequent adaptive filtering. In particular, two robust FLAF-based architectures are proposed, designed ad hoc to tackle nonlinearities in acoustic echo cancellation (AEC). The simplest architecture is the split FLAF,...

10.1109/tasl.2013.2255276 article EN IEEE Transactions on Audio Speech and Language Processing 2013-03-27

The extreme learning machine (ELM) was recently proposed as a unifying framework for different families of learning algorithms. The classical ELM model consists of a linear combination of a fixed number of nonlinear expansions of the input vector. Learning in ELM is hence equivalent to finding the optimal weights that minimize the error on a dataset. The update works in batch mode, either with explicit feature mappings or with implicit mappings defined by kernels. Although an online version has been proposed for the former, no work has been done up to this point for the latter, and whether...

10.1109/tnnls.2014.2382094 article EN IEEE Transactions on Neural Networks and Learning Systems 2014-12-31

Breast cancer is the most widespread neoplasm among women, and early detection of this disease is critical. Deep learning techniques have become of great interest to improve diagnostic performance. However, distinguishing between malignant and benign masses in whole mammograms poses a challenge, as they appear nearly identical to an untrained eye, and the region of interest (ROI) constitutes only a small fraction of the entire image. In this paper, we propose a framework, parameterized hypercomplex attention maps (PHAM), to overcome these...

10.1016/j.patrec.2024.04.014 article EN cc-by Pattern Recognition Letters 2024-04-18

In this paper, two novel nonlinear cascade adaptive architectures, here called sandwich models, suitable for the identification of general nonlinear systems are presented. The proposed architectures rely on the combination of structural blocks, each one implementing a linear filter or a memoryless nonlinear function. All the nonlinear functions involved in the adaptation process are based on spline functions and can be easily modified during learning using gradient-based techniques. In particular, a simple form of the on-line adaptation algorithms is derived. In addition, we...

10.1109/tcsi.2015.2423791 article EN IEEE Transactions on Circuits and Systems I Regular Papers 2015-06-16

The L3DAS22 Challenge is aimed at encouraging the development of machine learning strategies for 3D speech enhancement and 3D sound localization and detection in office-like environments. This challenge improves and extends the tasks of the L3DAS21 edition. We generated a new dataset, which maintains the same general characteristics of the L3DAS21 datasets, but with an extended number of data points and adding constraints that improve the baseline model's efficiency and overcome the major difficulties encountered by the participants of the previous challenge....

10.1109/icassp43922.2022.9746872 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

The modeling of human emotion expression in speech signals is an important, yet challenging task. The high resource demand of speech emotion recognition models, combined with the general scarcity of emotion-labelled data, are obstacles to the development and application of effective solutions in this field. In this paper, we present an approach to jointly circumvent these difficulties. Our method, named RH-emo, is a novel semi-supervised architecture aimed at extracting quaternion embeddings from real-valued monaural spectrograms, enabling...

10.1109/taslp.2023.3250840 article EN cc-by IEEE/ACM Transactions on Audio Speech and Language Processing 2023-01-01

Semantic communications represent a significant breakthrough with respect to the current communication paradigm, as they focus on recovering the meaning behind the transmitted sequence of symbols, rather than the symbols themselves. In semantic communications, the scope of the destination is not to recover a list of symbols identical to the transmitted ones, but to recover a message that is semantically equivalent to the one emitted by the source. This paradigm shift introduces many degrees of freedom in the encoding and decoding rules that can be exploited to make communication systems much more efficient....

10.1109/mcom.005.2200829 article EN IEEE Communications Magazine 2023-11-01

Recently, a new class of nonlinear adaptive filtering architectures has been introduced based on the functional link adaptive filter (FLAF) model. Here we focus specifically on the split FLAF (SFLAF) architecture, which separates the adaptation of linear and nonlinear coefficients using two different adaptive filters in parallel. This property makes the SFLAF a well-suited method for problems like nonlinear acoustic echo cancellation (NAEC), in which the separation of filtering tasks brings some performance improvement. Although flexibility is one of the main features of the SFLAF,...

10.1109/taslp.2014.2324175 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2014-05-14

Recently, a novel class of nonlinear adaptive filters, called spline adaptive filters (SAFs), has been introduced and demonstrated to be very effective in many practical applications. The learning rules of these architectures are based on the least mean square (LMS) algorithm. In order to provide a theoretical foundation for the SAF, in this paper we provide a steady-state performance evaluation. In particular, after a stochastic analysis of the behavior of the SAF approach under the Gaussian assumption, an analytical derivation of the excess mean square error (EMSE)...

10.1109/tsp.2015.2493986 article EN IEEE Transactions on Signal Processing 2015-10-26

Learning from data in the quaternion domain enables us to exploit internal dependencies of 4D signals and to treat them as a single entity. One of the models that perfectly suits quaternion-valued processing is represented by 3D acoustic signals in their spherical harmonics decomposition. In this paper, we address the problem of localizing and detecting sound events in the spatial sound field by using quaternion-valued data processing. In particular, we consider the spherical harmonic components of the signals captured by a first-order ambisonic microphone and process them with convolutional neural...

10.1109/icassp.2019.8682711 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-17

Hypercomplex neural networks have proven to reduce the overall number of parameters while ensuring valuable performance by leveraging the properties of Clifford algebras. Recently, hypercomplex linear layers have been further improved by involving efficient parameterized Kronecker products. In this article, we define the parameterization of hypercomplex convolutional layers and introduce the family of parameterized hypercomplex neural networks (PHNNs) that are lightweight and efficient large-scale models. Our method grasps the convolution rules and the filter organization directly from data without...

10.1109/tnnls.2022.3226772 article EN cc-by IEEE Transactions on Neural Networks and Learning Systems 2022-12-13

Semantic communication is expected to be one of the cores of next-generation AI-based communications. One of the possibilities offered by semantic communication is the capability to regenerate, at the destination side, images or videos semantically equivalent to the transmitted ones, without necessarily recovering the transmitted sequence of bits. The current solutions still lack the ability to build complex scenes from the received partial information. Clearly, there is an unmet need to balance the effectiveness of generation methods and the complexity of the transmitted information, possibly taking...

10.48550/arxiv.2306.04321 preprint EN cc-by-nc-nd arXiv (Cornell University) 2023-01-01

In this paper, we propose a Deep Recurrent Neural Network (DRNN) approach based on Long-Short Term Memory (LSTM) units for the classification of audio signals recorded in construction sites. Five classes of multiple vehicles and tools, normally used in construction sites, have been considered. The input provided to the DRNN consists of the concatenation of several spectral features, like MFCCs, mel-scaled spectrogram, chroma and spectral contrast. The proposed architecture and the feature extraction are described. Some experimental results, obtained by...

10.23919/eusipco47968.2020.9287802 article EN 2020 28th European Signal Processing Conference (EUSIPCO) 2020-12-18

The L3DAS21 Challenge is aimed at encouraging and fostering collaborative research on machine learning for 3D audio signal processing, with particular focus on speech enhancement (SE) and sound localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour corpus, accompanied by a Python API that facilitates data usage and the results submission stage. Usually, approaches to these tasks are based on single-perspective Ambisonics recordings or on arrays of single-capsule microphones. We propose,...

10.1109/mlsp52302.2021.9596248 preprint EN 2021-10-25

While deep generative models are showing exciting abilities in computer vision and natural language processing, their adoption in communication frameworks is still far underestimated. These methods are demonstrated to evolve solutions to classic communication problems such as denoising, restoration, or compression. Nevertheless, generative models can unveil their real potential in semantic communication frameworks, in which the receiver is not asked to recover the sequence of bits used to encode the transmitted (semantic) message, but only to regenerate content that...

10.48550/arxiv.2401.06803 preprint EN cc-by arXiv (Cornell University) 2024-01-01

Directly sending audio signals from a transmitter to a receiver across a noisy channel may absorb consistent bandwidth and be prone to errors when trying to recover the transmitted bits. On the contrary, the recent semantic communication approach proposes to send the semantics and then regenerate semantically consistent content at the receiver without exactly recovering the bitstream. In this paper, we propose a generative framework that faces the problem as an inverse problem, therefore being robust to different corruptions. Our method transmits...

10.1109/icassp48485.2024.10447612 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

Deep probabilistic generative models have achieved incredible success in many fields of application. Among such models, variational autoencoders (VAEs) have proved their ability in modeling a generative process by learning a latent representation of the input. In this paper, we propose a novel VAE defined in the quaternion domain, which exploits the properties of quaternion algebra to improve performance while significantly reducing the number of parameters required by the network. The success of the proposed VAE with respect to traditional VAEs relies on the ability to leverage the internal...

10.1109/icassp39728.2021.9413859 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13

This paper addresses the problem of translating night-time thermal infrared images, which are the most adopted image modalities to analyze night-time scenes, to daytime color images (NTIT2DC), which provide better perceptions of objects. We introduce a novel model that focuses on enhancing the quality of the target generation without merely colorizing it. The proposed structural aware (StawGAN) enables the translation of better-shaped and high-definition objects in the target domain. We test our model on the aerial DroneVehicle dataset containing RGB-IR paired...

10.1109/iscas46773.2023.10181838 article EN 2023 IEEE International Symposium on Circuits and Systems (ISCAS) 2023-05-21