- Topic Modeling
- Natural Language Processing Techniques
- Speech and dialogue systems
- Video Surveillance and Tracking Methods
- Anomaly Detection Techniques and Applications
- Multimodal Machine Learning Applications
- Educational Technology Systems
- Text Readability and Simplification
- Evacuation and Crowd Dynamics
- Speech Recognition and Synthesis
- Traffic Prediction and Management Techniques
- Explainable Artificial Intelligence (XAI)
- Machine Learning and Data Classification
- Machine Learning in Healthcare
- Hallucinations in medical conditions
- Bayesian Modeling and Causal Inference
- Advanced Neural Network Applications
- Digital Storytelling and Education
- Culinary Culture and Tourism
- Text and Document Classification Technologies
- Digital Mental Health Interventions
- Mental Health via Writing
- Fire Detection and Safety Systems
- Artificial Intelligence in Healthcare and Education
- Neural Networks and Applications
University of Hong Kong
2022-2023
Hong Kong University of Science and Technology
2022-2023
University of Tasmania
2023
Badan Penelitian dan Pengembangan Kesehatan
2023
Bandung Institute of Technology
2018-2021
Yejin Bang, Samuel Cahyawijaya, Nayeon Lee, Wenliang Dai, Dan Su, Bryan Wilie, Holy Lovenia, Ziwei Ji, Tiezheng Yu, Willy Chung, Quyet V. Do, Yan Xu, Pascale Fung. Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.
This paper proposes a framework for quantitatively evaluating interactive LLMs such as ChatGPT using publicly available data sets. We carry out an extensive technical evaluation of ChatGPT using 23 data sets covering 8 different common NLP application tasks. We evaluate the multitask, multilingual and multi-modal aspects of ChatGPT based on these data sets and a newly designed multimodal dataset. We find that ChatGPT outperforms LLMs with zero-shot learning on most tasks and even outperforms fine-tuned models on some tasks, and that it is better at understanding non-Latin script languages...
Although Indonesian is known to be the fourth most frequently used language over the internet, research progress on this language in natural language processing (NLP) is slow-moving due to a lack of available resources. In response, we introduce the first-ever vast resource for training, evaluating, and benchmarking Indonesian natural language understanding (IndoNLU) tasks. IndoNLU includes twelve tasks, ranging from single sentence classification to pair-sentence sequence labeling with different levels of complexity. The datasets for the tasks lie in different domains and styles...
Samuel Cahyawijaya, Genta Indra Winata, Bryan Wilie, Karissa Vincentio, Xiaohong Li, Adhiguna Kuncoro, Sebastian Ruder, Zhi Yuan Lim, Syafri Bahar, Masayu Khodra, Ayu Purwarianti, Pascale Fung. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021.
Samuel Cahyawijaya, Holy Lovenia, Alham Fikri Aji, Genta Winata, Bryan Wilie, Fajri Koto, Rahmad Mahendra, Christian Wibisono, Ade Romadhony, Karissa Vincentio, Jennifer Santoso, David Moeljadi, Cahya Wirawan, Frederikus Hudi, Muhammad Satrio Wicaksono, Ivan Parmonangan, Ika Alfina, Ilham Firdausi Putra, Samsul Rahmadani, Yulianti Oenang, Ali Septiandri, James Jaya, Kaustubh Dhole, Arie Suryani, Rifki Afina Putri, Dan Su, Keith Stevens, Made Nindyatama Nityasya, Adilazuarda, Ryan Hadiwijaya,...
Dialogue systems can leverage large pre-trained language models and knowledge to generate fluent and informative responses. However, these models are still prone to produce hallucinated responses not supported by the input source, which greatly hinders their application. The heterogeneity between external knowledge and dialogue context challenges representation learning and source integration, which further contributes to unfaithfulness. To handle this challenge and generate more faithful responses, this paper presents RHO (ρ) utilizing...
Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina Mcmillan-major, Anna Shvets, Ashish Upadhyay, Bernd Bohnet, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna...
Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Dea Adhista, Emmanuel Dave, Sarah Oktavianti, Salsabil Akbar, Jhonson Lee, Nuur Shadieq, Tjeng Wawan Cenggoro, Hanung Linuwih, Bryan Wilie, Galih Muridan, Genta Winata, David Moeljadi, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung. Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.
Question rewriting (QR) is a subtask of conversational question answering (CQA) aiming to ease the challenges of understanding dependencies among dialogue history by reformulating questions in a self-contained form. Despite seeming plausible, little evidence is available to justify QR as a mitigation method for CQA. To verify the effectiveness of QR in CQA, we investigate a reinforcement learning approach that integrates QR and CQA tasks and does not require corresponding QR datasets for targeted CQA. We find, however, that the RL method is on par with...
Large language models (LLMs) have been used for diverse tasks in natural language processing (NLP), yet remain under-explored for task-oriented dialogue systems (TODS), especially end-to-end TODS. We present InstructTODS, a novel off-the-shelf framework for zero-shot end-to-end TODS that can adapt to diverse domains without fine-tuning. By leveraging LLMs, InstructTODS generates a proxy belief state that seamlessly translates user intentions into dynamic queries for efficient interaction with any KB. Our extensive experiments demonstrate...
Resolving dependencies among dialogue history is one of the main obstacles in research on conversational question answering (QA). The question rewrites (QR) task has been shown to be effective in solving this problem by reformulating questions in a self-contained form. However, QR datasets are limited, and existing methods tend to depend on the assumption that a corresponding QR dataset exists for every CQA dataset. This paper proposes a reinforcement learning approach that integrates QR and CQA tasks without corresponding labeled QR datasets. We train the model based...
Samuel Cahyawijaya, Bryan Wilie, Holy Lovenia, Huan Zhong, MingQian Zhong, Yuk-Yu Nancy Ip, Pascale Fung. Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis (LOUHI). 2022.
Large language models (LLMs) show remarkable human-like capability in various domains and languages. However, a notable quality gap arises in low-resource languages, e.g., Indonesian indigenous languages, rendering them ineffective and inefficient in such linguistic contexts. To bridge this gap, we introduce Cendol, a collection of Indonesian LLMs encompassing both decoder-only and encoder-decoder architectures across a range of model sizes. We highlight Cendol's effectiveness across a diverse array of tasks, attaining a 20% improvement,...
The widespread application of Large Language Models (LLMs) across various tasks and fields has necessitated the alignment of these models with human values and preferences. Given the various approaches to value alignment, ranging from Reinforcement Learning from Human Feedback (RLHF) to constitutional learning, etc., there is an urgent need to understand the scope and nature of the values injected into these models before their release. There is also a need for model alignment without a costly large-scale human annotation effort. We propose UniVaR, a high-dimensional representation...
The capability to reason from text is crucial for real-world NLP applications. Real-world scenarios often involve incomplete or evolving data. In response, individuals update their beliefs and understandings accordingly. However, most existing evaluations assume that language models (LMs) operate with consistent information. We introduce Belief-R, a new dataset designed to test LMs' belief revision ability when presented with new evidence. Inspired by how humans suppress prior inferences, this task...
The hallucination problem of Large Language Models (LLMs) significantly limits their reliability and trustworthiness. Humans have a self-awareness process that allows us to recognize what we don't know when faced with queries. Inspired by this, our paper investigates whether LLMs can estimate their own hallucination risk before response generation. We analyze the internal mechanisms of LLMs broadly, both in terms of training data sources and across 15 diverse Natural Language Generation (NLG) tasks, spanning over 700 datasets. Our...
Vision Language Models (VLMs) often struggle with culture-specific knowledge, particularly in languages other than English and in underrepresented cultural contexts. To evaluate their understanding of such knowledge, we introduce WorldCuisines, a massive-scale benchmark for multilingual and multicultural, visually grounded language understanding. This benchmark includes a visual question answering (VQA) dataset with text-image pairs across 30 languages and dialects, spanning 9 language families and featuring over 1 million data points, making it the...
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables comparison on an equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model evaluation practices easier, we introduce GEMv2. The new version...
At the center of the underlying issues that halt the advancement of Indonesian natural language processing (NLP) research, we find data scarcity. Resources in Indonesian languages, especially the local ones, are extremely scarce and underrepresented. Many researchers do not publish their datasets. Furthermore, the few public datasets that exist are scattered across different platforms, which makes performing reproducible, data-centric NLP research even more arduous. Rising to this challenge, we initiate the first crowdsourcing effort, NusaCrowd....