NFDI4DS | UHH-SEMS - Publication Details

Paweł Świętojański

ORCID: 0000-0001-5896-4505

About

Contact & Profiles

OpenAlex author ID: A5084169592

Research Areas

Speech Recognition and Synthesis
Music and Audio Processing
Speech and Audio Processing
Natural Language Processing Techniques
Topic Modeling
Speech and dialogue systems
Animal Vocal Communication and Behavior
Hydrocarbon exploration and reservoir analysis
Enhanced Oil Recovery Techniques
Hydraulic Fracturing and Reservoir Analysis
Marine animal studies overview
Drilling and Well Engineering
Seismic Imaging and Inversion Techniques
Advanced Image Processing Techniques
Image and Signal Denoising Methods
AI in Service Interactions
Mineral Processing and Grinding
Context-Aware Activity Recognition Systems
Metaheuristic Optimization Algorithms Research
Non-Destructive Testing Techniques
Intelligent Tutoring Systems and Adaptive Learning
Underwater Acoustics Research
Evolutionary Algorithms and Applications
Subtitles and Audiovisual Media
Text and Document Classification Technologies

Apple (United Kingdom)
2020-2023

UNSW Sydney
2018-2022

Emotech (United Kingdom)
2018-2019

University of Edinburgh
2012-2017

Akademia Tarnowska
2007-2012

Convolutional Neural Networks for Distant Speech Recognition

OPENALEX - Publications

Paweł Świętojański, Arnab Ghoshal, Steve Renals

We investigate convolutional neural networks (CNNs) for large vocabulary distant speech recognition, trained using speech recorded from a single distant microphone (SDM) and multiple distant microphones (MDM). In the MDM case we explore a beamformed signal as input representation compared with the direct use of multiple acoustic channels as a parallel input to the CNN. We have explored different weight sharing approaches, and propose a channel-wise convolution with two-way pooling. Our experiments, using the AMI meeting corpus, found that CNNs improve the word error rate...

10.1109/lsp.2014.2325781 article EN IEEE Signal Processing Letters 2014-05-20

Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models

Paweł Świętojański, Steve Renals

This paper proposes a simple yet effective model-based neural network speaker adaptation technique that learns speaker-specific hidden unit contributions given adaptation data, without requiring any form of speaker-adaptive training, or labelled adaptation data. An additional amplitude parameter is defined for each hidden unit; the amplitude parameters are tied for each speaker, and are learned using unsupervised adaptation. We conducted experiments on the TED talks data, as used in the International Workshop on Spoken Language Translation (IWSLT)...

10.1109/slt.2014.7078569 article EN 2014 IEEE Spoken Language Technology Workshop (SLT) 2014-12-01

Multi-Task Self-Supervised Learning for Robust Speech Recognition

Mirco Ravanelli, Jianyuan Zhong, Santiago Pascual, Paweł Świętojański, João Monteiro, and 2 more

Despite the growing interest in unsupervised learning, extracting meaningful knowledge from unlabelled audio remains an open challenge. To take a step in this direction, we recently proposed a problem-agnostic speech encoder (PASE), that combines a convolutional encoder followed by multiple neural networks, called workers, tasked to solve self-supervised problems (i.e., ones that do not require manual annotations as ground truth). PASE was shown to capture relevant speech information, including speaker voice-print and...

10.1109/icassp40776.2020.9053569 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

Multilingual training of deep neural networks

Arnab Ghoshal, Paweł Świętojański, Steve Renals

We investigate multilingual modeling in the context of a deep neural network (DNN) - hidden Markov model (HMM) hybrid, where the DNN outputs are used as the HMM state likelihoods. By viewing neural networks as a cascade of feature extractors followed by a logistic regression classifier, we hypothesise that the hidden layers, which act as feature extractors, will be transferable between languages. As a corollary, we propose that training the hidden layers on multiple languages makes them more suitable for such cross-lingual transfer. We experimentally confirm...

10.1109/icassp.2013.6639084 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2013-05-01

Machine learning for predicting properties of porous media from 2d X-ray images

Naif Alqahtani, Fatimah Alzubaidi, Ryan T. Armstrong, Paweł Świętojański, Peyman Mostaghimi

10.1016/j.petrol.2019.106514 article EN Journal of Petroleum Science and Engineering 2019-09-26

Automated lithology classification from drill core images using convolutional neural networks

Fatimah Alzubaidi, Peyman Mostaghimi, Paweł Świętojański, Stuart Clark, Ryan T. Armstrong

10.1016/j.petrol.2020.107933 article EN Journal of Petroleum Science and Engineering 2020-09-13

Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR

Paweł Świętojański, Arnab Ghoshal, Steve Renals

We investigate the use of cross-lingual acoustic data to initialise deep neural network (DNN) acoustic models by means of unsupervised restricted Boltzmann machine (RBM) pre-training. DNNs for German are pretrained using one or all of German, Portuguese, Spanish and Swedish. The DNNs are used in a tandem configuration, where the network outputs are used as features for a hidden Markov model (HMM) whose emission densities are modeled by Gaussian mixture models (GMMs), as well as in a hybrid configuration, where the network outputs are used as the HMM state likelihoods. The experiments show that unsupervised pretraining is more crucial...

10.1109/slt.2012.6424230 article EN 2012 IEEE Spoken Language Technology Workshop (SLT) 2012-12-01

Hybrid acoustic models for distant and multichannel large vocabulary speech recognition

Paweł Świętojański, Arnab Ghoshal, Steve Renals

We investigate the application of deep neural network (DNN)-hidden Markov model (HMM) hybrid acoustic models for far-field speech recognition of meetings recorded using microphone arrays. We show that the hybrid models achieve significantly better accuracy than conventional systems based on Gaussian mixture models (GMMs). We observe up to 8% absolute word error rate (WER) reduction from a discriminatively trained GMM baseline when using a single distant microphone, and between 4-6% absolute WER reduction when using beamforming on various combinations of array...

10.1109/asru.2013.6707744 article EN 2013-12-01

SLURP: A Spoken Language Understanding Resource Package

Emanuele Bastianelli, Andrea Vanzo, Paweł Świętojański, Verena Rieser

Spoken Language Understanding (SLU) infers semantic meaning directly from audio data, and thus promises to reduce error propagation and misunderstandings in end-user applications. However, publicly available SLU resources are limited. In this paper, we release SLURP, a new SLU package containing the following: (1) A new challenging dataset in English spanning 18 domains, which is substantially bigger and linguistically more diverse than existing datasets; (2) Competitive baselines based on state-of-the-art NLU and ASR...

10.18653/v1/2020.emnlp-main.588 article EN cc-by 2020-01-01

Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation

Paweł Świętojański, Jinyu Li, Steve Renals

This work presents a broad study on the adaptation of neural network acoustic models by means of learning hidden unit contributions (LHUC) - a method that linearly re-combines hidden units in a speaker- or environment-dependent manner using small amounts of unsupervised adaptation data. We also extend LHUC to a speaker adaptive training (SAT) framework that leads to a more adaptable DNN acoustic model, working both in a speaker-dependent and a speaker-independent manner, without the requirements to maintain auxiliary speaker-dependent feature extractors or to introduce...

10.1109/taslp.2016.2560534 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2016-04-29

A study of speaker adaptation for DNN-based speech synthesis

Zhizheng Wu, Paweł Świętojański, Christophe Veaux, Steve Renals, Simon King

A major advantage of statistical parametric speech synthesis (SPSS) over unit-selection speech synthesis is its adaptability and controllability in changing speaker characteristics and speaking style. Recently, several studies using deep neural networks (DNNs) as acoustic models for SPSS have shown promising results. However, the adaptability of DNNs in SPSS has not been systematically studied. In this paper, we conduct an experimental analysis of speaker adaptation for DNN-based speech synthesis at different levels. In particular, we augment a low-dimensional...

10.7488/ds/259 article EN 2015-09-06

Benchmarking Natural Language Understanding Services for building Conversational Agents

Xingkun Liu, Arash Eshghi, Paweł Świętojański, Verena Rieser

We have recently seen the emergence of several publicly available Natural Language Understanding (NLU) toolkits, which map user utterances to structured, but more abstract, Dialogue Act (DA) or Intent specifications, while making this process accessible to the lay developer. In this paper, we present the first wide coverage evaluation and comparison of some of the most popular NLU services, on a large, multi-domain (21 domains) dataset of 25K user utterances that we have collected and annotated with Intent and Entity Type specifications and which will be released as...

10.48550/arxiv.1903.05566 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Digital Rock Segmentation for Petrophysical Analysis With Reduced User Bias Using Convolutional Neural Networks

Yufu Niu, Peyman Mostaghimi, Mehdi Shabaninejad, Paweł Świętojański, Ryan T. Armstrong

Pore‐scale digital images are usually obtained from microcomputed tomography data that has been segmented into void and grain space. Image segmentation is a crucial step in the process of digital rock analysis that can influence pore‐scale characterization studies and/or the numerical simulation of petrophysical properties. This is concerning since all segmentation methods have user‐selected parameters that result in biases. Convolutional neural networks (CNNs) provide a way forward since once trained, a CNN can provide consistent and reliable image...

10.1029/2019wr026597 article EN Water Resources Research 2020-01-30

Multi-Modal Sequence Fusion via Recursive Attention for Emotion Recognition

Rory Beard, Ritwik Das, Raymond W. M. Ng, Pavithra Gopalakrishnan, Luka Eerens, and 2 more

Rory Beard, Ritwik Das, Raymond W. M. Ng, P. G. Keerthana Gopalakrishnan, Luka Eerens, Pawel Swietojanski, Ondrej Miksik. Proceedings of the 22nd Conference on Computational Natural Language Learning. 2018.

10.18653/v1/k18-1025 article EN cc-by 2018-01-01

Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview

Peter Bell, Joachim Fainberg, Ondřej Klejch, Jinyu Li, Steve Renals, and 1 more

We present a structured overview of adaptation algorithms for neural network-based speech recognition, considering both hybrid hidden Markov model / neural network systems and end-to-end neural network systems, with a focus on speaker adaptation, domain adaptation, and accent adaptation. The overview characterizes adaptation algorithms as based on embeddings, model parameter adaptation, or data augmentation. We present a meta-analysis of the performance of speech recognition adaptation algorithms, based on relative error rate reductions as reported in the literature.

10.1109/ojsp.2020.3045349 article EN cc-by IEEE Open Journal of Signal Processing 2020-12-17

An Innovative Application of Generative Adversarial Networks for Physically Accurate Rock Images With an Unprecedented Field of View

Yufu Niu, Ying Da Wang, Peyman Mostaghimi, Paweł Świętojański, Ryan T. Armstrong

High‐resolution X‐ray microcomputed tomography (micro‐CT) data are used for the accurate determination of rock petrophysical properties. High‐resolution data, however, result in a small field of view, and thus, the representativeness of a simulation domain can be brought into question when dealing with geophysical applications. This paper applies a cycle‐in‐cycle generative adversarial network (CinCGAN) to improve the resolution of 3‐D micro‐CT data and create a super‐resolution image using unpaired training images. Effective...

10.1029/2020gl089029 article EN Geophysical Research Letters 2020-11-10

Flow-Based Characterization of Digital Rock Images Using Deep Learning

Naif Alqahtani, Traiwit Chung, Ying Da Wang, Ryan T. Armstrong, Paweł Świętojański, and 1 more

X-ray imaging of porous media has revolutionized the interpretation of various microscale phenomena in subsurface systems. The volumetric images acquired from this technology, known as digital rocks (DR), make it a suitable candidate for machine learning and computer-vision applications. The current routine DR frameworks involving image processing and modeling are susceptible to user bias and expensive computation requirements, especially for large domains. In comparison, the inference with trained...

10.2118/205376-pa article EN SPE Journal 2021-03-24

Automated Rock Quality Designation Using Convolutional Neural Networks

Fatimah Alzubaidi, Peyman Mostaghimi, Guangyao Si, Paweł Świętojański, Ryan T. Armstrong

Mineral and hydrocarbon exploration relies heavily on geological and geotechnical information extracted from drill cores. Traditional drill-core characterization is based purely on the subjective expertise of a geologist. New technologies can provide automatic mineral analysis and high-resolution drill core images in a non-destructive manner. However, automated rock mass characterization presents a significant challenge due to its lack of generalization and robustness. To date, the automated estimation of rock quality designation (RQD), a key...

10.1007/s00603-022-02805-y article EN cc-by Rock Mechanics and Rock Engineering 2022-02-26

A study of speaker adaptation for DNN-based speech synthesis

Zhizheng Wu, Paweł Świętojański, Christophe Veaux, Steve Renals, Simon King

A major advantage of statistical parametric speech synthesis (SPSS) over unit-selection speech synthesis is its adaptability and controllability in changing speaker characteristics and speaking style. Recently, several studies using deep neural networks (DNNs) as acoustic models for SPSS have shown promising results. However, the adaptability of DNNs in SPSS has not been systematically studied. In this paper, we conduct an experimental analysis of speaker adaptation for DNN-based speech synthesis at different levels. In particular, we augment a low-dimensional...

10.21437/interspeech.2015-270 article EN Interspeech 2015 2015-09-06

Combining in-domain and out-of-domain speech data for automatic recognition of disordered speech

Heidi Christensen, M. B. Aniol, Peter Bell, Phil Green, Thomas Hain, and 2 more

Recently there has been increasing interest in ways of using out-of-domain (OOD) data to improve automatic speech recognition performance in domains where only limited data is available. This paper focuses on one such domain, namely that of disordered speech for which only very small databases exist, but where normal speech can be considered OOD. Standard approaches for handling small domains use adaptation from OOD models into the target domain, but here we investigate an alternative approach with its focus on the feature extraction stage: OOD data is used to train...

10.21437/interspeech.2013-324 article EN Interspeech 2013 2013-08-25

Investigation of maxout networks for speech recognition

Paweł Świętojański, Jinyu Li, Jui-Ting Huang

We explore the use of the maxout neuron in various aspects of acoustic modelling for large vocabulary speech recognition systems, including low-resource scenarios and multilingual knowledge transfers. Through experiments on voice search and short message dictation datasets, we found that maxout networks are around three times faster to train and offer lower or comparable word error rates on several tasks, when compared with networks with logistic nonlinearity. We also present a detailed study of the maxout unit internal behaviour suggesting...

10.1109/icassp.2014.6855088 article EN 2014-05-01

Revisiting hybrid and GMM-HMM system combination techniques

Paweł Świętojański, Arnab Ghoshal, Steve Renals

In this paper we investigate techniques to combine hybrid HMM-DNN (hidden Markov model - deep neural network) and tandem HMM-GMM (hidden Markov model - Gaussian mixture model) acoustic models using: (1) model averaging, and (2) lattice combination with Minimum Bayes Risk decoding. We have performed experiments on the "TED Talks" task following the protocol of the IWSLT-2012 evaluation. Our experimental results suggest that DNN-based and GMM-based acoustic models are complementary, with error rates being reduced by up to 8% relative when the DNN and GMM systems...

10.1109/icassp.2013.6638967 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2013-05-01

Neural networks for distant speech recognition

Steve Renals, Paweł Świętojański

Distant conversational speech recognition is challenging owing to the presence of multiple, overlapping talkers, additional non-speech acoustic sources, and the effects of reverberation. In this paper we review work on distant speech recognition, with an emphasis on approaches which combine multichannel signal processing and acoustic modelling, and investigate the use of hybrid neural network / hidden Markov model acoustic models for distant speech recognition of meetings recorded using microphone arrays. In particular we investigate the use of convolutional and fully-connected neural networks with different...

10.1109/hscma.2014.6843274 article EN 2014-05-01
