NFDI4DS | UHH-SEMS - Publication Details

Ji Ma

ORCID: 0009-0009-2102-8209


About

Contact & Profiles

A5101540966

Research Areas

Natural Language Processing Techniques
Topic Modeling
Multimodal Machine Learning Applications
Advanced Data Compression Techniques
EFL/ESL Teaching and Learning
Educational and Psychological Assessments
Text and Document Classification Technologies
Speech Recognition and Synthesis
Innovative Education and Learning Practices
Speech and Audio Processing
Multilingual Education and Policy
Educator Training and Historical Pedagogy

Google (United Kingdom)
2023

RankT5: Fine-Tuning T5 for Text Ranking with Ranking Losses

OPENALEX - Publications

Honglei Zhuang, Zhen Qin, Rolf Jagerman, Kai Hui, Ji Ma, and 4 more

Pretrained language models such as BERT have been shown to be exceptionally effective for text ranking. However, there are limited studies on how to leverage more powerful sequence-to-sequence models such as T5. Existing attempts usually formulate text ranking as a classification problem and rely on postprocessing to obtain a ranked list. In this paper, we propose RankT5 and study two T5-based ranking model structures, an encoder-decoder and an encoder-only one, so that they not only can directly output ranking scores for each query-document pair, but...

10.1145/3539618.3592047 article EN Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023-07-18

Speech recognition in multiple languages and domains: the 2003 BBN/LIMSI EARS system

OPENALEX - Publications

Richard Schwartz, Thomas Colthurst, Nicolae Duta, H. Gish, Rishabh Iyer, and 16 more

We report on the results of the first evaluations for the BBN/LIMSI system under the new DARPA EARS program. The evaluations were carried out for conversational telephone speech (CTS) and broadcast news (BN) in three languages: English, Mandarin, and Arabic. In addition to providing system descriptions and evaluation results, the paper highlights methods that worked well across the two domains and those few that worked well on one domain but not the other. For the BN evaluations, which had to be run under 10 times real-time, we demonstrated that a joint BBN/LIMSI system with that time constraint achieved better...

10.1109/icassp.2004.1326654 preprint EN IEEE International Conference on Acoustics Speech and Signal Processing 2004-09-28

DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments

OPENALEX - Publications

Ji Ma, Hongming Dai, Yao Mu, Pengying Wu, Hao Wang, and 4 more

Zero-Shot Object Navigation (ZSON) requires agents to autonomously locate and approach unseen objects in unfamiliar environments and has emerged as a particularly challenging task within the domain of Embodied AI. Existing datasets for developing ZSON algorithms lack consideration of dynamic obstacles, object attribute diversity, and scene texts, thus exhibiting noticeable discrepancies from real-world situations. To address these issues, we propose a Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments (DOZE)...

10.48550/arxiv.2402.19007 preprint EN arXiv (Cornell University) 2024-02-29

QAmeleon: Multilingual QA with Only 5 Examples

OPENALEX - Publications

Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, and 4 more

The availability of large, high-quality datasets has been a major driver of recent progress in question answering (QA). Such annotated datasets, however, are difficult and costly to collect, and rarely exist in languages other than English, rendering QA technology inaccessible to underrepresented languages. An alternative to building large monolingual training datasets is to leverage pre-trained language models (PLMs) under a few-shot learning setting. Our approach, QAmeleon, uses a PLM to automatically generate...

10.1162/tacl_a_00625 article EN cc-by Transactions of the Association for Computational Linguistics 2023-01-01
