Matthew Wiesner
- Speech Recognition and Synthesis
- Natural Language Processing Techniques
- Gamma-ray bursts and supernovae
- Speech and Audio Processing
- Topic Modeling
- Music and Audio Processing
- Pulsars and Gravitational Waves Research
- Speech and dialogue systems
- Astronomy and Astrophysical Research
- Geophysics and Gravity Measurements
- Galaxies: Formation, Evolution, Phenomena
- Stellar, planetary, and galactic studies
- History and Developments in Astronomy
- Adaptive optics and wavefront sensing
- Phonetics and Phonology Research
- Dispute Resolution and Class Actions
- Linguistic Variation and Morphology
- Gaussian Processes and Bayesian Inference
- Domain Adaptation and Few-Shot Learning
- Scientific Research and Discoveries
- Astrophysical Phenomena and Observations
- CCD and CMOS Imaging Sensors
- Language, Discourse, Communication Strategies
- Astrophysics and Cosmic Phenomena
- Statistical and numerical algorithms
Johns Hopkins University
2018-2025
Benedictine University
2020-2024
Medical College of Wisconsin
2021
Purdue University West Lafayette
2015-2020
Fermi National Accelerator Laboratory
2014-2020
Institute for Language and Speech Processing
2018
Language Technology Centre
2018
McGill University
2015
Northern Illinois University
2010-2014
The sequence-to-sequence (seq2seq) approach to low-resource ASR is a relatively new direction in speech research. The approach benefits from performing model training without using a lexicon or alignments. However, this poses the problem of requiring more data than conventional DNN-HMM systems. In this work, we attempt to use data from 10 BABEL languages to build a multilingual seq2seq model as a prior model, and then port it to 4 other languages using a transfer learning approach. We also explore different architectures for improving the model...
Oliver Adams, Matthew Wiesner, Shinji Watanabe, David Yarowsky. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.
We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages. The corpus is a collection of audio recordings from TEDx talks in 8 source languages. We segment transcripts into sentences and align them to the source-language audio and target-language translations. The corpus is released along with open-sourced code enabling extension to new talks and languages as they become available. Our corpus creation methodology can be applied to more languages than previous work, and creates...
Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner. Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021). 2021.
We describe the simulated sky survey underlying the second data challenge (DC2) carried out in preparation for analysis of the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) by the LSST Dark Energy Science Collaboration (LSST DESC). Significant connections across multiple science domains will be a hallmark of LSST; the DC2 program represents a unique modeling effort that stresses this interconnectivity in a way that has not been attempted before. This effort encompasses a full end-to-end approach: starting...
This paper introduces a new open source platform for end-to-end speech processing named ESPnet. ESPnet mainly focuses on end-to-end automatic speech recognition (ASR), and adopts widely-used dynamic neural network toolkits, Chainer and PyTorch, as its main deep learning engine. ESPnet also follows the Kaldi ASR toolkit style for data processing, feature extraction/format, and recipes to provide a complete setup for speech recognition and other speech processing experiments. This paper explains the major architecture of this software platform, several important functionalities, which...
We address the problem of optimally identifying all kilonovae detected via gravitational wave emission in the upcoming LIGO/Virgo/KAGRA Collaboration observing run, O4, which is expected to be sensitive to a factor of $\sim 7$ more Binary Neutron Star alerts than previously. Electromagnetic follow-up of all but the brightest of these new events will require $>1$ meter telescopes, for which limited time is available. We present an optimized observing strategy for the Dark Energy Camera during O4. We base our study on simulations for O4 and wide-prior...
The Dark Energy Survey (DES) is a next generation optical survey aimed at understanding the accelerating expansion of the universe using four complementary methods: weak gravitational lensing, galaxy cluster counts, baryon acoustic oscillations, and Type Ia supernovae. To perform the 5000 sq-degree wide field and 30 sq-degree supernova surveys, the DES Collaboration built the Dark Energy Camera (DECam), a 3 square-degree, 570-Megapixel CCD camera that was installed at the prime focus of the Blanco 4-meter telescope at the Cerro Tololo Inter-American...
We present a new end-to-end architecture for automatic speech recognition (ASR) that can be trained using symbolic input in addition to the traditional acoustic input. This architecture utilizes two separate encoders: one for acoustic input and another for symbolic input, both sharing the attention and decoder parameters. We call this architecture a multi-modal data augmentation network (MMDA), as it can support multi-modal (acoustic and symbolic) input and enables seamless mixing of large text datasets with significantly smaller transcribed speech corpora during training. We study...
Zero-shot voice conversion has recently made substantial progress, but many models still depend on external supervised systems to disentangle speaker identity and linguistic content. Furthermore, current methods often use parallel conversion, where the converted speech inherits the source utterance's temporal structure, restricting speaker similarity and privacy. To overcome these limitations, we introduce GenVC, a generative zero-shot voice conversion model. GenVC learns to disentangle content and style in a self-supervised manner, eliminating...
Driven by advances in self-supervised learning for speech, state-of-the-art synthetic speech detectors have achieved low error rates on popular benchmarks such as ASVspoof. However, prior benchmarks do not address the wide range of real-world variability in speech. Are the reported error rates realistic in real-world conditions? To assess detector failure modes and robustness under controlled distribution shifts, we introduce ShiftySpeech, a benchmark with more than 3000 hours of synthetic speech from 7 domains, 6 TTS systems, 12 vocoders, and 3 languages...
We present optical follow-up imaging obtained with the Katzman Automatic Imaging Telescope, Las Cumbres Observatory Global Telescope Network, Nickel Telescope, Swope Telescope, and Thacher Telescope of the LIGO/Virgo gravitational wave (GW) signal from the neutron star–black hole (NSBH) merger GW190814. We searched the GW190814 localization region (19 deg$^2$ for the 90th percentile best localization), covering a total of 51 deg$^2$ and 94.6% of the two-dimensional localization region. Analyzing the properties of 189 transients that we consider as candidate counterparts to...
We investigate the ability of human ‘expert’ classifiers to identify strong gravitational lens candidates in Dark Energy Survey-like imaging. We recruited a total of 55 people who completed more than 25 per cent of the project. During the classification task, we presented the participants with 1489 images. The sample contains a variety of data including lens simulations, real lenses, non-lens examples, and unlabelled data. We find that experts are extremely good at finding bright, well-resolved Einstein rings, while arcs with...
In this work, we seek to build effective code-switched (CS) automatic speech recognition (ASR) systems under the zero-shot setting where no transcribed CS speech data is available for training. Previously proposed frameworks which conditionally factorize the bilingual task into its constituent monolingual parts are a promising starting point for leveraging monolingual data efficiently. However, these methods require the monolingual modules to perform language segmentation. That is, each monolingual module has to simultaneously detect CS points and transcribe...
On 2019 August 14 at 21:10:39 UTC, the LIGO/Virgo Collaboration (LVC) detected a possible neutron star–black hole merger (NSBH), the first ever identified. An extensive search for an optical counterpart of this event, designated GW190814, was undertaken using the Dark Energy Camera on the 4 m Victor M. Blanco Telescope at the Cerro Tololo Inter-American Observatory. Target of Opportunity interrupts were issued on eight separate nights to observe 11 candidates using the 4.1 m Southern Astrophysical Research (SOAR)...
Connectionist temporal classification (CTC) models are known to have peaky output distributions. Such behavior is not a problem for automatic speech recognition (ASR), but it can cause inaccurate forced alignments (FA), especially at finer granularity, e.g., phoneme level. This paper aims at alleviating the peaky behavior of CTC and improving its suitability for forced alignment generation, by leveraging label priors, so that the scores of alignment paths containing fewer blanks are boosted and maximized during training. As a result, our CTC model...
Data Challenge 1 (DC1) is the first synthetic data set produced by the Rubin Observatory Legacy Survey of Space and Time (LSST) Dark Energy Science Collaboration (DESC). DC1 is designed to develop and validate data reduction and analysis and to study the impact of systematic effects that will affect the LSST data set. DC1 is comprised of r-band observations of 40 deg$^2$ to 10 yr depth. We present each stage of the simulation and analysis process: (a) generation, by synthesizing sources from cosmological N-body simulations in individual sensor-visit images with...
We report the discovery of seven new, very bright gravitational lens systems from our ongoing gravitational lens search, the Sloan Bright Arcs Survey (SBAS). Two of the systems are confirmed to have high source redshifts z=2.19 and z=2.94. Three other systems lie at intermediate redshift with z=1.33, 1.82, 1.93, and two systems are at low redshift z=0.66, 0.86. The lensed source galaxies in all of these systems are bright, with i-band magnitudes ranging from 19.73 to 22.06. We present the spectrum of each of the source galaxies along with estimates of the Einstein radius for each system. The foreground lens in most systems is identified by a red sequence based cluster...