- Natural Language Processing Techniques
- Speech Recognition and Synthesis
- Topic Modeling
- Speech and Audio Processing
- Speech and Dialogue Systems
- Sentiment Analysis and Opinion Mining
- Music and Audio Processing
- Text Readability and Simplification
- Infant Health and Development
- Semantic Web and Ontologies
- Network Packet Processing and Optimization
- Advanced Chemical Sensor Technologies
- Multimodal Machine Learning Applications
- Authorship Attribution and Profiling
- Advanced Text Analysis Techniques
- Library Science and Information Systems
- Mathematics, Computing, and Information Processing
- Imbalanced Data Classification Techniques
- Infrastructure Maintenance and Monitoring
- Seismology and Earthquake Studies
- Neural Networks and Applications
- Text and Document Classification Technologies
- Machine Learning and Data Classification
- Biomedical Text Mining and Ontologies
Mohamed bin Zayed University of Artificial Intelligence
2023-2025
IT University of Copenhagen
2023
Tokyo Institute of Technology
2023
United Arab Emirates University
2019-2022
George Washington University
2015-2019
Hanan Aldarmaki, Mona Diab. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.
With the growing influence of Large Language Models (LLMs), there is increasing interest in integrating speech representations with them to enable more seamless multi-modal processing and understanding. This study introduces a novel approach that leverages self-supervised speech representations in combination with instruction-tuned LLMs for speech-to-text translation. The proposed modality adapter aligns extracted speech features with the LLM's embedding space using English-language data. Our experiments demonstrate that this method effectively preserves semantic...
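As a rough illustration of the kind of adapter described above, the sketch below (PyTorch) projects self-supervised speech features into an LLM's embedding space with temporal downsampling; all names and dimensions are hypothetical assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class ModalityAdapter(nn.Module):
    """Maps self-supervised speech features into an LLM's embedding space.

    Assumed dimensions: 1024-dim speech features (e.g., a wav2vec-style
    encoder) projected to a 4096-dim LLM embedding space, with strided
    downsampling to shorten the speech sequence.
    """
    def __init__(self, speech_dim=1024, llm_dim=4096, downsample=4):
        super().__init__()
        # Strided 1-D convolution reduces the frame rate before projection.
        self.downsample = nn.Conv1d(speech_dim, speech_dim,
                                    kernel_size=downsample, stride=downsample)
        self.proj = nn.Sequential(
            nn.Linear(speech_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, speech_feats):
        # speech_feats: (batch, frames, speech_dim)
        x = self.downsample(speech_feats.transpose(1, 2)).transpose(1, 2)
        return self.proj(x)  # (batch, frames // downsample, llm_dim)

# The adapter output would be prepended to the text-prompt embeddings and
# fed to the (frozen) instruction-tuned LLM during training on English data.
adapter = ModalityAdapter()
feats = torch.randn(2, 100, 1024)
prefix = adapter(feats)   # (2, 25, 4096)
```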
Most existing methods for automatic bilingual dictionary induction rely on prior alignments between the source and target languages, such as parallel corpora or seed dictionaries. For many language pairs, such supervised signals are not readily available. We propose an unsupervised approach for learning a bilingual dictionary for a pair of languages given their independently-learned monolingual word embeddings. The proposed method exploits local and global structures in the vector spaces to align them such that similar words are mapped to each other....
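A minimal sketch of the alignment family this abstract belongs to: an iterative self-learning loop that induces a dictionary from mutual nearest neighbors and re-fits an orthogonal (Procrustes) map. The paper's actual use of local and global structure is more involved; this is an assumed baseline, not the proposed method.

```python
import numpy as np

def procrustes(X, Y):
    """Orthogonal map W minimizing ||X W - Y||_F (closed form via SVD)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def refine_alignment(src, tgt, n_iters=5):
    """Self-learning loop: align, re-induce a dictionary from mutual
    nearest neighbors, re-align. `src` and `tgt` are row-normalized
    (n, d) monolingual embedding matrices (a sketch; no empty-match
    or convergence handling)."""
    W = np.eye(src.shape[1])
    for _ in range(n_iters):
        sims = (src @ W) @ tgt.T                  # cosine sims (rows normalized)
        fwd = sims.argmax(axis=1)                 # best target for each source
        bwd = sims.argmax(axis=0)                 # best source for each target
        pairs = np.array([(i, j) for i, j in enumerate(fwd) if bwd[j] == i])
        W = procrustes(src[pairs[:, 0]], tgt[pairs[:, 1]])
    return W
```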
Nada Almarwani, Hanan Aldarmaki, Mona Diab. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
Text word embeddings that encode distributional semantics work by modeling the contextual similarities of frequently occurring words. Acoustic word embeddings, on the other hand, typically encode low-level phonetic similarities. Semantic embeddings for spoken words have been previously explored using algorithms analogous to Word2Vec, but the resulting vectors still mainly encoded phonetic rather than semantic features. In this paper, we examine the assumptions and architectures used in previous works and show experimentally how shallow...
We evaluated various compositional models, from bag-of-words representations to RNN-based models, on several extrinsic supervised and unsupervised evaluation benchmarks. Our results confirm that weighted vector averaging can outperform context-sensitive models in most benchmarks, but structural features encoded by RNNs can also be useful in certain classification tasks. We analyzed some of the evaluation datasets to identify the aspects of meaning they measure and the characteristics that explain their performance variance.
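For concreteness, one common form of weighted vector averaging is smooth-inverse-frequency-style weighting, sketched below; the exact weighting scheme evaluated in the paper may differ, and all names here are illustrative.

```python
import numpy as np

def weighted_average_embedding(tokens, vectors, word_freq, a=1e-3):
    """Sentence vector as a frequency-weighted average of word vectors,
    with smooth-inverse-frequency-style weights a / (a + p(w)).

    tokens:    list of word strings in the sentence
    vectors:   dict word -> (d,) numpy array
    word_freq: dict word -> corpus count
    """
    total = sum(word_freq.values())
    vecs, weights = [], []
    for tok in tokens:
        if tok in vectors:
            p = word_freq.get(tok, 0) / total   # unigram probability
            vecs.append(vectors[tok])
            weights.append(a / (a + p))         # rare words weigh more
    if not vecs:
        return None
    return np.average(np.array(vecs), axis=0, weights=weights)
```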
We present a new and improved part-of-speech tagger for Arabic text that incorporates a set of novel features and constraints. This framework is presented within the MADAMIRA software suite, a state-of-the-art toolkit for Arabic language processing. Starting from a linear SVM model with basic lexical features, we add a range of features derived from morphological analysis and clustering methods. We show that using these features significantly improves part-of-speech tagging accuracy, especially for unseen words, which results in better generalization across...
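A minimal scikit-learn sketch of a linear SVM tagger with the kind of basic lexical features the paragraph starts from; the morphological-analysis and clustering features, and all feature names used here, are assumptions for illustration.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def token_features(sent, i):
    """Basic lexical features for token i; the paper's morphological and
    word-cluster features would be added to this dict."""
    w = sent[i]
    return {
        "word": w,
        "prefix2": w[:2],
        "suffix2": w[-2:],
        "prev": sent[i - 1] if i > 0 else "<s>",
        "next": sent[i + 1] if i < len(sent) - 1 else "</s>",
    }

def train_tagger(train_sents, train_tags):
    """train_sents: list of token lists; train_tags: parallel tag lists."""
    X = [token_features(s, i) for s in train_sents for i in range(len(s))]
    y = [tag for tags in train_tags for tag in tags]
    model = make_pipeline(DictVectorizer(), LinearSVC())
    model.fit(X, y)
    return model
```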
Lexical ambiguity, a challenging phenomenon in all natural languages, is particularly prevalent for languages with diacritics that tend to be omitted in writing, such as Arabic. Omitting diacritics leads to an increase in the number of homographs: different words with the same spelling. Diacritic restoration could theoretically help disambiguate these words, but in practice, the overall sparsity of diacritized text leads to performance degradation in NLP applications. In this paper, we propose approaches for automatically marking a subset of words for diacritic restoration,...
We present a matrix factorization model for learning cross-lingual representations of sentences. Using sentence-aligned corpora, the proposed model learns distributed representations by factoring the given data into language-dependent factors and one shared factor. As a result, input sentences from both languages can be mapped into fixed-length vectors and then compared directly using the cosine similarity measure, which achieves 0.8 Pearson correlation on Spanish-English semantic textual similarity.
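One way to realize such a factorization is alternating least squares over two sentence-aligned data matrices with a shared factor, sketched below; the dimensions, regularization, and solver are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def shared_factorization(X1, X2, d=100, n_iters=20, reg=1e-2):
    """Factor sentence-aligned matrices X1 (n x v1) and X2 (n x v2) into
    one shared sentence factor S (n x d) plus language-dependent factors
    W1 (d x v1) and W2 (d x v2), via alternating ridge regressions."""
    n = X1.shape[0]
    rng = np.random.default_rng(0)
    S = rng.normal(scale=0.1, size=(n, d))
    I = reg * np.eye(d)
    for _ in range(n_iters):
        # Solve for each language-dependent factor given the shared S.
        W1 = np.linalg.solve(S.T @ S + I, S.T @ X1)
        W2 = np.linalg.solve(S.T @ S + I, S.T @ X2)
        # Solve for the shared factor given both languages' factors.
        W = np.concatenate([W1, W2], axis=1)          # d x (v1 + v2)
        X = np.concatenate([X1, X2], axis=1)          # n x (v1 + v2)
        S = np.linalg.solve(W @ W.T + I, W @ X.T).T   # n x d
    return S, W1, W2

# New sentences are mapped into the shared space via their language's
# factor (least squares against W1 or W2), then compared with cosine
# similarity.
```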
Recently, large pre-trained multilingual speech models have shown potential in scaling Automatic Speech Recognition (ASR) to many low-resource languages. Some of these models employ language adapters in their formulation, which helps improve monolingual performance and avoids some of the drawbacks of multi-lingual modeling on resource-rich languages. However, this formulation restricts the usability of these models on code-switched speech, where two languages are mixed together in the same utterance. In this work, we propose ways to effectively fine-tune...
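One plausible (assumed) way to adapt language adapters to code-switched speech is to interpolate two frozen monolingual adapters with a learned per-frame gate, sketched below in PyTorch; this is not necessarily the fine-tuning strategy proposed in the paper, and all names and dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Standard bottleneck adapter with a residual connection."""
    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

class MixedAdapter(nn.Module):
    """Interpolates two frozen monolingual adapters with a learned gate,
    so each frame can lean toward either language."""
    def __init__(self, adapter_a, adapter_b, dim=768):
        super().__init__()
        self.a, self.b = adapter_a, adapter_b
        self.gate = nn.Linear(dim, 1)   # per-frame mixing weight

    def forward(self, x):
        g = torch.sigmoid(self.gate(x))            # (batch, frames, 1)
        return g * self.a(x) + (1 - g) * self.b(x)

layer_out = torch.randn(2, 50, 768)                # hidden states from one layer
mixed = MixedAdapter(Adapter(), Adapter())
y = mixed(layer_out)                               # (2, 50, 768)
```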
We develop and investigate several cross-lingual alignment approaches for neural sentence embedding models, such as the supervised inference classifier, InferSent, and sequential encoder-decoder models. We evaluate three alignment frameworks applied to these models: joint modeling, representation transfer learning, and sentence mapping, using parallel text to guide the alignment. Our results support a scalable approach for the modular alignment of embeddings, where we observe better performance compared to joint models in intrinsic and extrinsic evaluations,...
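Of the three frameworks, the mapping approach is the simplest to sketch: fit a least-squares linear map between the two embedding spaces using parallel sentences. The function names below are hypothetical.

```python
import numpy as np

def fit_mapping(src_embs, tgt_embs):
    """Least-squares linear map M such that src_embs @ M ~ tgt_embs,
    fitted on embeddings of parallel (sentence-aligned) text.

    src_embs, tgt_embs: (n, d) matrices of sentence embeddings produced
    by each language's independently trained encoder.
    """
    M, *_ = np.linalg.lstsq(src_embs, tgt_embs, rcond=None)
    return M

# Usage: fit M once on parallel data, then map any new source-language
# sentence embedding into the target space: mapped = new_src_emb @ M
```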
We present a matrix factorization model for learning cross-lingual representations. Using sentence-aligned corpora, the proposed model learns distributed representations by factoring the given data into language-dependent factors and one shared factor. Moreover, the model can quickly learn representations for more than two languages without undermining the quality of the monolingual components. The model achieves an accuracy of 88% on English to German document classification, and 0.8 Pearson correlation on Spanish-English semantic textual similarity. While...
This paper introduces Mixat: a dataset of Emirati speech code-mixed with English. Mixat was developed to address the shortcomings of current speech recognition resources when applied to Emirati speech, and in particular, to bilingual Emirati speakers who often mix and switch between their local dialect and English. The data set consists of 15 hours of speech derived from two public podcasts featuring native Emirati speakers, one of which is in the form of conversations between a host and a guest. Therefore, the collection contains examples of Emirati-English code-switching in both formal and natural...
Neural multi-channel speech enhancement models, in particular those based on the U-Net architecture, demonstrate promising performance and generalization potential. These models typically encode input channels independently and integrate them during later stages of the network. In this paper, we propose a novel modification of these models by incorporating relative information from the outset, where each channel is processed in conjunction with a reference channel through stacking. This strategy exploits comparative differences...
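A small PyTorch sketch of the stacking idea: pair every microphone channel with a reference channel before encoding, so the network sees relative information from the first layer. The shapes and the reference-channel choice are assumptions.

```python
import torch

def stack_with_reference(channels, ref_idx=0):
    """Pair each microphone channel with a reference channel along a new
    axis, so downstream layers can exploit comparative differences.

    channels: spectrogram tensor (batch, n_mics, freq, time)
    returns:  (batch, n_mics, 2, freq, time), each mic stacked with the ref
    """
    ref = channels[:, ref_idx:ref_idx + 1]            # (B, 1, F, T)
    ref = ref.expand(-1, channels.shape[1], -1, -1)   # broadcast to (B, M, F, T)
    return torch.stack([channels, ref], dim=2)        # (B, M, 2, F, T)

x = torch.randn(4, 6, 257, 100)       # 6-mic magnitude spectrograms
paired = stack_with_reference(x)      # (4, 6, 2, 257, 100)
```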
Speech recognition and speech synthesis models are typically trained separately, each with its own set of learning objectives, training data, and model parameters, resulting in two distinct large networks. We propose a parameter-efficient approach to learning ASR and TTS jointly via a multi-task objective and shared parameters. Our evaluation demonstrates that the performance of our joint model is comparable to that of models trained individually, while significantly saving computational and memory costs (~50% reduction in the total number of parameters required...
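A schematic PyTorch sketch of shared-parameter multi-task training: one shared encoder with task-specific ASR and TTS heads. The architecture, dimensions, and losses are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class JointSpeechModel(nn.Module):
    """Shared transformer trunk with two small task heads: a token
    classifier for ASR and a mel-spectrogram regressor for TTS.
    Inputs are assumed to be pre-embedded frame sequences of size dim."""
    def __init__(self, dim=512, vocab=100, n_mels=80):
        super().__init__()
        self.shared = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=8,
                                       batch_first=True),
            num_layers=4)
        self.asr_head = nn.Linear(dim, vocab)    # frame-level token logits
        self.tts_head = nn.Linear(dim, n_mels)   # frame-level mel frames

    def forward(self, x, task):
        h = self.shared(x)                       # shared parameters for both tasks
        return self.asr_head(h) if task == "asr" else self.tts_head(h)

# Multi-task step (schematic): sum or alternate the two task losses so
# gradients from both objectives update the shared trunk, e.g.
#   loss = ctc_loss(model(speech_feats, "asr"), text_targets) \
#        + l1_loss(model(text_feats, "tts"), mel_targets)
model = JointSpeechModel()
out = model(torch.randn(2, 40, 512), task="asr")   # (2, 40, 100)
```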
Developing robust automatic speech recognition (ASR) systems for Arabic, a language characterized by its rich dialectal diversity and often considered low-resource in speech technology, demands effective strategies to manage its complexity. This study explores three critical factors influencing ASR performance: the role of dialectal coverage in pre-training, the effectiveness of dialect-specific fine-tuning compared to a multi-dialectal approach, and the ability to generalize to unseen dialects. Through extensive experiments across...