NFDI4DS | UHH-SEMS - Publication Details

Jade Abbott

ORCID: 0000-0001-6061-0888

About

Contact & Profiles

A5087385370

Research Areas

Natural Language Processing Techniques
Topic Modeling
Health, Environment, Cognitive Aging
Biomedical and Engineering Education
Genetics, Bioinformatics, and Biomedical Research
Metaheuristic Optimization Algorithms Research
Translation Studies and Practices
Multimodal Machine Learning Applications
Text Readability and Simplification
Software Engineering Research
Wikis in Education and Collaboration
Speech Recognition and Synthesis
Social and Intergroup Psychology
Robotics and Automated Systems
Image Processing and 3D Reconstruction
Semantic Web and Ontologies
Climate Change Communication and Perception
ICT in Developing Communities
Language, Linguistics, Cultural Analysis
Computational and Text Analysis Methods
Digital Humanities and Scholarship
Text and Document Classification Technologies
Insect and Arachnid Ecology and Behavior
Language, Metaphor, and Cognition
Advanced Multi-Objective Optimization Algorithms

Carnegie Mellon University
2022-2023

The University of Melbourne
2022-2023

Applied Mathematics (United States)
2023

Karlsruhe Institute of Technology
2023

Minzu University of China
2023

Dublin City University
2023

Fondazione Bruno Kessler
2023

University of Trento
2023

University of the Witwatersrand
2023

SIL International
2022

MasakhaNER: Named Entity Recognition for African Languages

OPENALEX - Publications

David Ifeoluwa Adelani Jade Abbott Graham Neubig Daniel D’souza Julia Kreutzer and 56 more

We take a step towards addressing the under-representation of the African continent in NLP research by bringing together different stakeholders to create the first large, publicly available, high-quality dataset for named entity recognition (NER) in ten African languages. We detail the characteristics of these languages to help researchers and practitioners better understand the challenges they pose for NER tasks. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both...

10.1162/tacl_a_00416 article EN cc-by Transactions of the Association for Computational Linguistics 2021-01-01

Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages

Wilhelmina Nekoto Vukosi Marivate Tshinondiwa Matsila Timi Fasubaa Taiwo Fagbohungbe and 42 more

Wilhelmina Nekoto, Vukosi Marivate, Tshinondiwa Matsila, Timi Fasubaa, Taiwo Fagbohungbe, Solomon Oluwole Akinola, Shamsuddeen Muhammad, Salomon Kabongo Kabenamualu, Salomey Osei, Freshia Sackey, Rubungo Andre Niyongabo, Ricky Macharm, Perez Ogayo, Orevaoghene Ahia, Musie Meressa Berhe, Mofetoluwa Adeyemi, Masabata Mokgesi-Selinga, Lawrence Okegbemi, Laura Martinus, Kolawole Tajudeen, Kevin Degila, Kelechi Ogueji, Kathleen Siminyu, Julia Kreutzer, Jason Webster, Jamiil Toure Ali, Jade...

10.18653/v1/2020.findings-emnlp.195 article EN cc-by 2020-01-01

A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation

David Ifeoluwa Adelani Jesujoba O. Alabi Angela Fan Julia Kreutzer Xiaoyu Shen and 40 more

David Adelani, Jesujoba Alabi, Angela Fan, Julia Kreutzer, Xiaoyu Shen, Machel Reid, Dana Ruiter, Dietrich Klakow, Peter Nabende, Ernie Chang, Tajuddeen Gwadabe, Freshia Sackey, Bonaventure F. P. Dossou, Chris Emezue, Colin Leong, Michael Beukman, Shamsuddeen Muhammad, Guyo Jarso, Oreen Yousuf, Andre Niyongabo Rubungo, Gilles Hacheme, Eric Wairagala, Muhammad Umair Nasir, Benjamin Ajibade, Tunde Ajayi, Yvonne Gitau, Jade Abbott, Mohamed Ahmed, Millicent Ochieng, Anuoluwapo Aremu, Perez...

10.18653/v1/2022.naacl-main.223 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01

A Focus on Neural Machine Translation for African Languages

Laura Martinus Jade Abbott

African languages are numerous, complex and low-resourced. The datasets required for machine translation are difficult to discover, and existing research is hard to reproduce. Minimal attention has been given to machine translation for African languages, so there is scant research regarding the problems that arise when using machine translation techniques. To begin addressing these problems, we trained models to translate English to five of the official South African languages (Afrikaans, isiZulu, Northern Sotho, Setswana, Xitsonga), making use of modern neural machine translation techniques. The results obtained show the promise of using neural machine translation techniques...

10.48550/arxiv.1906.05685 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Masakhane -- Machine Translation For Africa

Iroro Orife Julia Kreutzer Blessing Sibanda Daniel Whitenack Kathleen Siminyu and 20 more

Africa has over 2000 languages. Despite this, African languages account for a small portion of available resources and publications in Natural Language Processing (NLP). This is due to multiple factors, including: a lack of focus from government and funding, discoverability, a lack of community, sheer language complexity, difficulty in reproducing papers and no benchmarks to compare techniques. To begin to address the identified problems, MASAKHANE, an open-source, continent-wide, distributed, online research effort...

10.48550/arxiv.2003.11529 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Participatory Detection of Language Barriers towards Multilingual Sustainability(ies) in Africa

Gabriela Litre Fabrice Hirsch Patrick Caron Alexander Andrason Nathalie Bonnardel and 8 more

After decades of political, economic, and scientific efforts, humanity has not gotten any closer to global sustainability. With less than a decade to reach the UN Sustainable Development Goals (SDGs) deadline of the 2030 Agenda, we show that development agendas may be getting lost in translation, from their initial formulation to their final implementation. Sustainability science does not “speak” most of the 2000 languages of Africa, where the lack of indigenous terminology hinders efforts such as the COVID-19 pandemic fight....

10.3390/su14138133 article EN Sustainability 2022-07-04

Consultative engagement of stakeholders toward a roadmap for African language technologies

Kathleen Siminyu Jade Abbott Kọ́lá Túbọ̀sún Anuoluwapo Aremu Blessing K. Sibanda and 7 more

There has been a rise in natural language processing (NLP) communities across the African continent (Masakhane, AfricaNLP workshops). With this momentum noted, and given the existing power asymmetries that plague the African continent, there is an urgent need to ensure that these technologies move toward shared goals between organizations and stakeholders, not only to improve the representation of African languages in cutting-edge NLP research but also to ensure that NLP research enables technological advances toward human dignity, well-being, and equity...

10.1016/j.patter.2023.100820 article EN cc-by Patterns 2023-08-01

AI4D -- African Language Program

Kathleen Siminyu Godson Kalipe Davor Orlić Jade Abbott Vukosi Marivate and 13 more

Advances in speech and language technologies enable tools such as voice-search, text-to-speech, speech recognition and machine translation. These are however only available for high resource languages like English, French or Chinese. Without foundational digital resources for African languages, which are considered low-resource in the digital context, these advanced tools remain out of reach. This work details the AI4D - African Language Program, a 3-part project that 1) incentivised the crowd-sourcing, collection and curation of language datasets through an...

10.48550/arxiv.2104.02516 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Benchmarking Neural Machine Translation for Southern African Languages

Laura Martinus Jade Abbott

Unlike major Western languages, most African languages are very low-resourced. Furthermore, the resources that do exist are often scattered and difficult to obtain and discover. As a result, the data and code for existing research has rarely been shared. This has led to a struggle to reproduce reported results, and few publicly available benchmarks for African machine translation models exist. To start to address these problems, we trained neural machine translation models for 5 Southern African languages on publicly-available datasets. Code is provided for training the models and to evaluate them on a newly...

10.48550/arxiv.1906.10511 preprint EN cc-by arXiv (Cornell University) 2019-01-01

Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages

Wilhelmina Nekoto Vukosi Marivate Tshinondiwa Matsila Timi Fasubaa Kolawole Tajudeen and 43 more

Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to low-resourced languages has not yet been adequately solved. "Low-resourced"-ness is a complex problem going beyond data availability and reflects systemic problems in society. In this paper, we focus on the task of Machine Translation (MT), that plays a crucial role for information accessibility and communication worldwide. Despite immense improvements in MT over the past decade, MT is centered around a few high-resourced languages. As...

10.48550/arxiv.2010.02353 preprint EN other-oa arXiv (Cornell University) 2020-01-01

MasakhaNER: Named entity recognition for African languages

David Ifeoluwa Adelani Jade Abbott Graham Neubig Daniel D’souza Julia Kreutzer and 56 more

10.1162/tacl article EN other-oa 2021-06-14

InkubaLM: A small language model for low-resource African languages

Atnafu Lambebo Tonja Bonaventure F. P. Dossou Jessica Ojo Jenalea Rajab Fadel Thior and 6 more

High-resource language models often fall short in the African context, where there is a critical need for models that are efficient, accessible, and locally relevant, even amidst significant computing and data constraints. This paper introduces InkubaLM, a small language model with 0.4 billion parameters, which achieves performance comparable to models with significantly larger parameter counts and more extensive training data on tasks such as machine translation, question-answering, AfriMMLU, and the AfriXnli task. Notably, InkubaLM...

10.48550/arxiv.2408.17024 preprint EN arXiv (Cornell University) 2024-08-30

Unsupervised Cross-lingual Word Embedding Representation for English-isiZulu

Derwin Ngomane Rooweither Mabuya Jade Abbott Vukosi Marivate

In this study, we investigate the effectiveness of using cross-lingual word embeddings for zero-shot transfer learning between a language with an abundant resource, English, and a language with limited resource, isiZulu. IsiZulu is a part of the South African Nguni language family, which is characterised by complex agglutinating morphology. We use VecMap, an open source tool, to obtain cross-lingual word embeddings. To perform an extrinsic evaluation of the effectiveness of the embeddings, we train a news classifier on labelled English data in order to categorise unlabelled isiZulu...

10.18653/v1/2023.rail-1.2 article EN cc-by 2023-01-01
