Stephan Vogel

ORCID: 0000-0003-2630-4881
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Cognitive and developmental aspects of mathematical skills
  • Speech and dialogue systems
  • Speech Recognition and Synthesis
  • Neuroscience, Education and Cognitive Function
  • Text Readability and Simplification
  • Reading and Literacy Development
  • Algorithms and Data Compression
  • Mathematics Education and Teaching Techniques
  • Semantic Web and Ontologies
  • Handwritten Text Recognition Techniques
  • Biomedical Text Mining and Ontologies
  • Multimodal Machine Learning Applications
  • Translation Studies and Practices
  • Speech and Audio Processing
  • Machine Learning and Algorithms
  • Neuroscience and Music Perception
  • Education Methods and Practices
  • Creativity in Education and Neuroscience
  • Music and Audio Processing
  • Education, Achievement, and Giftedness
  • Psychology, Coaching, and Therapy
  • Visual and Cognitive Learning Processes
  • Transcranial Magnetic Stimulation Studies

University of Graz
2015-2024

TU Dortmund University
2023

Leibniz Research Centre for Working Environment and Human Factors
2023

Chinese University of Hong Kong
2023

Czech Academy of Sciences, Institute of Psychology
2020

Neuroscience Institute
2020

Qatar Airways (Qatar)
2015-2018

Western University
2012-2017

Hamad bin Khalifa University
2015-2017

University College London
2017

In this paper, we describe a new model for word alignment in statistical translation and present experimental results. The idea of the model is to make the alignment probabilities dependent on the differences in the alignment positions rather than on the absolute positions. To achieve this goal, the approach uses a first-order Hidden Markov model (HMM) for the word alignment problem, as they are used successfully in speech recognition for the time alignment problem. The difference to the time alignment HMM is that there is no monotony constraint for the possible word orderings. We describe the details of the model and test the model on several bilingual corpora.

10.3115/993268.993313 article EN 1996-01-01

Training word alignment models on large corpora is a very time-consuming process. This paper describes two parallel implementations of GIZA++ that accelerate this word alignment process. One of the implementations runs on computer clusters, the other runs on a multi-processor system using multi-threading technology. Results show a near-linear speed-up according to the number of CPUs used, and alignment quality is preserved.

10.3115/1622110.1622119 article EN 2008-01-01

Fahim Dalvi, Nadir Durrani, Hassan Sajjad, Stephan Vogel. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). 2018.

10.18653/v1/n18-2079 article EN cc-by 2018-01-01

In this paper we present a recipe and language resources for training and testing Arabic speech recognition systems using the KALDI toolkit. We built a prototype broadcast news system using 200 hours of GALE data that is publicly available through LDC. We describe in detail the decisions made in building the system: using the MADA toolkit for text normalization and vowelization; why we use 36 phonemes; how we generate pronunciations; how we build the language model. We report results using state-of-the-art modeling and decoding techniques. The scripts are released on QCRI's...

10.1109/slt.2014.7078629 article EN 2014 IEEE Spoken Language Technology Workshop (SLT) 2014-12-01

This paper describes the Arabic MGB-3 Challenge - Arabic Speech Recognition in the Wild. Unlike last year's Arabic MGB-2 Challenge, for which the recognition task was based on more than 1,200 hours of broadcast TV news recordings from Aljazeera Arabic TV programs, MGB-3 emphasises dialectal Arabic using a multi-genre collection of Egyptian YouTube videos. Seven genres were used for the data collection: comedy, cooking, family/kids, fashion, drama, sports, and science (TEDx). A total of 16 hours of videos, split evenly across the different genres, were divided into...

10.1109/asru.2017.8268952 article EN 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2017-12-01

Functional magnetic resonance imaging (fMRI) studies investigating the neural mechanisms underlying developmental dyscalculia are scarce and results are thus far inconclusive. The main aim of the present study is to investigate the neural correlates of nonsymbolic number magnitude processing in children with and without developmental dyscalculia. Eighteen children (9 with dyscalculia) were asked to solve a non-symbolic number magnitude comparison task (finger patterns) during brain scanning. For the spatial control task identical stimuli were employed, instructions varying only (judgment...

10.1186/1744-9081-5-35 article EN cc-by Behavioral and Brain Functions 2009-01-01

The way the human brain constructs representations of numerical symbols is poorly understood. While increasing evidence from neuroimaging studies has indicated that the intraparietal sulcus (IPS) becomes increasingly specialized for symbolic numerical magnitude representation over developmental time, the extent to which these changes are associated with age-related differences in symbolic numerical magnitude representation or in non-numerical processes, such as response selection, remains to be uncovered. To address these outstanding questions we investigated...

10.1016/j.dcn.2014.12.001 article EN cc-by-nc-nd Developmental Cognitive Neuroscience 2014-12-10

The ability to process the numerical magnitude of sets of items has been characterized in many animal species. Neuroimaging data have associated this ability to represent nonsymbolic numerical magnitudes (e.g., arrays of dots) with activity in the bilateral parietal lobes. Yet the quantitative abilities of humans are not limited to processing the numerical magnitude of nonsymbolic sets. Humans have used this quantitative sense as the foundation for symbolic systems for the representation of numerical magnitude. Although numerical symbol use is widespread in human cultures, the brain regions involved in processing of numerical symbols are just beginning to be understood....

10.1162/jocn_a_00323 article EN Journal of Cognitive Neuroscience 2012-11-19

In this paper we explore the challenges in crowdsourcing the task of translation over the web, in which remotely located translators work on providing translations independent of each other. We then propose a collaborative workflow for crowdsourcing translation to address some of these challenges. In our pipeline model, the translators are working in phases where output from earlier phases can be enhanced in the subsequent phases. We also highlight some of the novel contributions of the model like assistive translation and translation synthesis that can leverage monolingual and bilingual speakers alike. We evaluate our approach...

10.1145/2145204.2145382 article EN 2012-02-11

In this paper a robust, adaptive approach for mining parallel sentences from a bilingual comparable news collection is described. Sentence length models and lexicon-based models are combined under a maximum likelihood criterion. Specific models are proposed to handle insertions and deletions that are frequent in bilingual data collected from the web. The approach is adaptive, updating the translation lexicon iteratively using the mined parallel data to get better vocabulary coverage and translation probability parameter estimation. Experiments are carried out on 10 years of the Xinhua bilingual news collection....

10.1109/icdm.2002.1184044 article EN 2003-06-26

We explore unsupervised language model adaptation techniques for Statistical Machine Translation. The hypotheses from the machine translation output are converted into queries at different levels of representation power and used to extract similar sentences from a very large monolingual text collection. Specific language models are then built from the retrieved data and interpolated with a general background model. Experiments show significant improvements when translating with these adapted language models.

10.3115/1220355.1220414 article EN 2004-01-01