Anastasios Alexandridis

ORCID: 0000-0003-0682-6009
Research Areas
  • Speech and Audio Processing
  • Music and Audio Processing
  • Speech Recognition and Synthesis
  • Advanced Adaptive Filtering Techniques
  • Indoor and Outdoor Localization Technologies
  • Acoustic Wave Phenomena Research
  • Corporate Finance and Governance
  • Flow Measurement and Analysis
  • Hearing Loss and Rehabilitation
  • Underwater Acoustics Research
  • Music Technology and Sound Studies
  • Natural Language Processing Techniques
  • Speech and dialogue systems
  • Topic Modeling
  • Auditing, Earnings Management, Governance
  • QR Code Applications and Technologies
  • Corruption and Economic Development
  • International Business and FDI
  • Mobile Crowdsensing and Crowdsourcing
  • Direction-of-Arrival Estimation Techniques
  • Accounting Theory and Financial Reporting
  • Global Financial Crisis and Policies
  • IoT and Edge/Fog Computing

Amazon (United States)
2022-2023

Amazon (Germany)
2022

Foundation for Research and Technology Hellas
2013-2018

University of Crete
2011-2018

FORTH Institute of Computer Science
2015-2017

FORTH Institute of Electronic Structure and Laser
2014-2017

Wireless acoustic sensor networks (WASNs) are formed by a distributed group of acoustic-sensing devices featuring audio playing and recording capabilities. Current mobile computing platforms offer great possibilities for the design of audio-related applications involving acoustic-sensing nodes. In this context, acoustic source localization is one of the application domains that have attracted the most attention from the research community over the last decades. In general terms, the localization of acoustic sources can be achieved by studying energy and temporal and/or directional...

10.1155/2017/3956282 article EN cc-by Wireless Communications and Mobile Computing 2017-01-01

In this paper, we consider the data-association problem for the localization of multiple sound sources in a wireless acoustic sensor network, where each node is a microphone array, using direction of arrival (DOA) estimates. The data-association problem arises because the central node that receives the DOA estimates from the nodes cannot know to which source they belong. Hence, the DOAs from the different nodes that correspond to the same source must be found in order to perform accurate localization. We present a method to identify the correct association of DOAs to the sources and thus accurately estimate their...

10.1109/taslp.2017.2772831 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2017-01-01
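
The data-association problem described in the abstract above can be illustrated with a brute-force sketch. This is not the method of the paper, whose details are not given here; it only assumes noise-free 2-D bearings, known node positions, and the same number of DOAs at every node. The function names `triangulate` and `associate` are hypothetical. Each candidate association is scored by how well its bearing lines intersect, and the association with the smallest total residual is kept.

```python
import math
from itertools import permutations, product


def triangulate(nodes, doas):
    """Least-squares intersection of bearing lines; returns ((x, y), residual)."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    lines = []
    for (px, py), theta in zip(nodes, doas):
        # Unit normal of the bearing line through (px, py) at angle theta.
        nx, ny = -math.sin(theta), math.cos(theta)
        c = nx * px + ny * py
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * c
        b2 += ny * c
        lines.append((nx, ny, c))
    det = a11 * a22 - a12 * a12  # zero if all bearings are parallel
    x = (a22 * b1 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    # Sum of squared distances from (x, y) to each bearing line.
    residual = sum((nx * x + ny * y - c) ** 2 for nx, ny, c in lines)
    return (x, y), residual


def associate(nodes, doas_per_node):
    """Try every association and keep the one with the smallest residual.

    perms[m][k] is the index, in node m's DOA list, assigned to source k.
    Node 0's order is fixed to avoid counting relabelled duplicates.
    """
    num_sources = len(doas_per_node[0])
    identity = tuple(range(num_sources))
    best_cost, best_perms, best_positions = float("inf"), None, None
    candidates = product(permutations(range(num_sources)), repeat=len(nodes) - 1)
    for combo in candidates:
        perms = (identity,) + combo
        positions, cost = [], 0.0
        for k in range(num_sources):
            doas = [doas_per_node[m][perms[m][k]] for m in range(len(nodes))]
            position, residual = triangulate(nodes, doas)
            positions.append(position)
            cost += residual
        if cost < best_cost:
            best_cost, best_perms, best_positions = cost, perms, positions
    return best_perms, best_positions
```

The search grows factorially with the number of sources and exponentially with the number of nodes, so it is only practical for very small networks. It also treats bearings as full lines rather than rays, which a practical method would need to handle.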

This paper proposes a real-time method for capturing and reproducing spatial audio based on a circular microphone array. Following a different approach than other recently proposed array-based methods for spatial audio, the proposed method estimates the directions of arrival of the active sound sources on a per time-frame basis and performs source separation with a fixed superdirective beamformer, which results in more accurate modelling and reproduction of the recorded acoustic environment. The separated source signals are downmixed into one monophonic audio signal,...

10.1155/2013/718574 article EN cc-by Journal of Electrical and Computer Engineering 2013-01-01

In this work, we consider the multiple sound source location estimation and counting problem in a wireless acoustic sensor network, where each node consists of a microphone array. Our method is based on inferring a location estimate for each frequency of the captured signals. A clustering approach, where the number of clusters (i.e., sources) is also an unknown parameter, is then employed to decide on the number of sources and their locations. The efficiency of our proposed method is evaluated through simulations and real recordings in scenarios with up to three simultaneous...

10.1109/waspaa.2015.7336895 article EN 2015-10-01

Neural contextual biasing for end-to-end neural ASR transducers has shown significant improvements in the recognition of named entities, such as contact names or device names. However, it comes with the cost of increased compute, as the biasing layers (which are usually based on cross-attention) add complexity to the transducers. In this paper, we propose gated biasing models that can estimate at runtime when biasing is needed and toggle it on and off. That way, biasing does not run for every audio frame, but only for frames where biasing can be helpful to correct...

10.1109/icassp49357.2023.10095322 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

In this work we propose a grid-based method to estimate the location of multiple sources in a wireless acoustic sensor network, where each node contains a microphone array and only transmits direction-of-arrival (DOA) estimates in each time interval, minimizing the transmissions to the central processing node. We present new work on modeling the DOA estimation error in such a scenario. Through extensive, realistic simulations, we show that our method outperforms other state-of-the-art methods, in both accuracy and complexity. Localization results...

10.5281/zenodo.44173 article EN 2014-11-13
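
As a rough illustration of what grid-based localization from DOA estimates can look like, the sketch below scans candidate points and keeps the one whose bearings from the nodes best match the reported DOAs. It is a simplified single-source version under assumed 2-D geometry and known node positions, not the method of the paper, and it does not model the DOA estimation error. The names `angular_diff` and `grid_localize` are hypothetical.

```python
import math


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)


def grid_localize(nodes, doas, x_range, y_range, step):
    """Return the grid point whose bearings from the nodes best match the DOAs."""
    num_x = int(round((x_range[1] - x_range[0]) / step)) + 1
    num_y = int(round((y_range[1] - y_range[0]) / step)) + 1
    best_point, best_cost = None, float("inf")
    for i in range(num_x):
        x = x_range[0] + i * step
        for j in range(num_y):
            y = y_range[0] + j * step
            # Sum of squared angular errors between the bearing from each
            # node to this grid point and the DOA that node reported.
            cost = sum(
                angular_diff(math.atan2(y - py, x - px), doa) ** 2
                for (px, py), doa in zip(nodes, doas)
            )
            if cost < best_cost:
                best_point, best_cost = (x, y), cost
    return best_point
```

The cost of the scan grows with the number of grid points, so the step size trades accuracy against computation.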

In this paper, we consider the data association problem that arises when localizing multiple sound sources using direction of arrival (DOA) estimates from multiple microphone arrays. In such a scenario, the association of the DOAs across the arrays that correspond to the same source is unknown and must be found for accurate localization. We present an association algorithm that finds the correct DOA association based on features extracted for each source that we propose. Our method results in high localization accuracy in scenarios with missed detections, reverberation, and noise and outperforms...

10.1109/eusipco.2015.7362644 article EN 2015-08-01

We present the design of a digital microphone array comprised of MEMS microphones and evaluate its potential for spatial audio capturing and direction-of-arrival (DOA) estimation, which is an essential part of encoding the soundscape. The device is a cheaper and more compact alternative to analog microphone arrays, which require external, and usually expensive, analog-to-digital converters and sound cards. However, the performance of such an array for DOA estimation and audio acquisition has not been investigated. In this work, the efficiency of the digital array is evaluated and compared to a typical analog array of the same geometry....

10.1109/eusipco.2016.7760321 article EN 2016 24th European Signal Processing Conference (EUSIPCO) 2016-08-01

We propose a real-time method for coding an acoustic environment based on estimating the Direction-of-Arrival (DOA) and reproducing it using an arbitrary loudspeaker configuration or headphones. We encode the sound field with the use of one audio signal and side-information. The audio signal can be further encoded with an MP3 encoder to reduce the bitrate. We investigate how such coding can affect the spatial impression and quality of the reproduction. Also, we propose a lossless and efficient compression scheme. Our method is compared to other recently proposed microphone array methods...

10.1109/icassp.2013.6637656 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2013-05-01

We present dual-attention neural biasing, an architecture designed to boost Wake Words (WW) recognition and improve inference time latency on speech recognition tasks. This architecture enables a dynamic switch for its runtime compute paths by exploiting WW spotting to select which branch of its attention networks to execute for an input audio frame. With this approach, we effectively improve WW spotting accuracy while saving runtime compute cost as defined by floating point operations (FLOPs). Using an in-house de-identified dataset, we demonstrate that the proposed dual-attention network...

10.1109/icassp49357.2023.10096075 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Speaker localization and counting in real-life conditions remains a challenging task. The computational burden, transmission usage and synchronization issues pose several limitations. Moreover, the physical characteristics of real speakers in terms of directivity pattern and orientation, as well as restrictions in microphone array positioning, since arrays commonly have to be placed close to walls, deteriorate the performance. In this paper, we propose a method that accounts for adjacent wall reflections and evaluate it using...

10.1109/icassp.2017.7953336 article EN 2017-03-01

The Forthroid is a location-based system that "augments" physical objects with multimedia information and enables users to receive information about or request services related to physical objects. It employs computer-vision techniques and Quick Response codes (QR-codes). We have implemented a prototype on Android platforms and evaluated its performance with systems metrics and subjective tests. We discuss our findings and challenges in prototyping on the Android OS. The analysis indicates that the network and server are the main sources of delay, while the CPU load may vary...

10.1109/lanman.2011.6076933 article EN 2011-10-01

To achieve robust far-field automatic speech recognition (ASR), existing techniques typically employ an acoustic front end (AFE) cascaded with a neural transducer (NT) ASR model. The AFE output, however, could be unreliable, as when the beamforming output in the AFE is steered to the wrong direction. A promising way to address this issue is to exploit the microphone signals before the beamforming stage and after the acoustic echo cancellation (post-AEC) in the AFE. We argue that both the post-AEC and AFE outputs are complementary and that it is possible to leverage the redundancy...

10.48550/arxiv.2303.00692 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Recently, wireless acoustic sensor networks (WASNs) have received significant attention from the research community and a variety of methods have been proposed for numerous applications, such as location estimation and speech enhancement. The lack of publicly available datasets with signals recorded in WASNs presents difficulties in obtaining consistent performance indicators across the different approaches. In this paper, we present the release of a dataset of real signals recorded in an outdoor WASN comprised of four microphone arrays. Our...

10.1109/mmsp.2018.8547105 article EN 2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) 2018-08-01

We introduce Caching Networks (CachingNets), a speech recognition network architecture capable of delivering faster, more accurate decoding by leveraging common speech patterns. By explicitly incorporating select sentences unique to each user into the network's design, we show how to train the model as an extension of the popular sequence transducer architecture through a multitask learning procedure. We further propose and experiment with different phrase caching policies, which are effective for virtual voice-assistant (VA)...

10.1109/icassp43922.2022.9747770 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

On-device spoken language understanding (SLU) offers the potential for significant latency savings compared to cloud-based processing, as the audio stream does not need to be transmitted to a server. We present Tiny Signal-to-interpretation (TinyS2I), an end-to-end on-device SLU approach which is focused on heavily resource constrained devices. TinyS2I brings latency reduction without accuracy degradation, by exploiting use cases when the distribution of utterances that users speak to a device is largely heavy-tailed....

10.1109/icassp43922.2022.9747245 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Narrowband direction-of-arrival (DOA) estimates for each time-frequency (TF) point offer a parametric spatial modeling of the acoustic environment which is very commonly used in many applications, such as source separation, dereverberation, and spatial audio. However, irrespective of the narrowband DOA estimation method used, many TF-points suffer from erroneous DOA estimates due to noise and reverberation. We propose a novel technique to yield more accurate DOA estimates in the TF-domain, through statistical modeling of each TF-point with a complex Watson distribution....

10.1109/eusipco.2016.7760492 article EN 2016 24th European Signal Processing Conference (EUSIPCO) 2016-08-01