- Natural Language Processing Techniques
- Topic Modeling
- Gastric Cancer Management and Outcomes
- Speech Recognition and Synthesis
- Colorectal Cancer Screening and Detection
- Gastrointestinal Tumor Research and Treatment
- Metastasis and Carcinoma Case Studies
- Gastrointestinal Disorders and Treatments
- Esophageal and GI Pathology
- Multimodal Machine Learning Applications
- Neurotransmitter Receptor Influence on Behavior
- Gastrointestinal Bleeding Diagnosis and Treatment
- Receptor Mechanisms and Signaling
- Liver Disease Diagnosis and Treatment
- Pharmacological Receptor Mechanisms and Effects
- Speech and Dialogue Systems
- Radiomics and Machine Learning in Medical Imaging
- Diverticular Disease and Complications
- Esophageal Cancer Research and Treatment
- Biliary and Gastrointestinal Fistulas
- Head and Neck Cancer Studies
- Cancer Diagnosis and Treatment
- Colorectal Cancer Surgical Treatments
- Parkinson's Disease Mechanisms and Treatments
- Clinical Nutrition and Gastroenterology
National Defense Medical Center
2014-2024
Tri-Service General Hospital
2014-2024
Temple University
2020-2023
Baylor College of Medicine
2023
Institut national de recherche en informatique et en automatique
2023
Carnegie Mellon University
2023
Johns Hopkins University
2022
City University of Hong Kong
2022
Meta (United States)
2022
Meta (Israel)
2019-2021
Francisco Guzmán, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, Philipp Koehn, Vishrav Chaudhary, Marc'Aurelio Ranzato. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the Flores-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated into 101...
Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.
Creating the Babel Fish, a tool that helps individuals translate speech between any two languages, requires advanced technological innovation and linguistic expertise. Although conventional speech-to-speech translation systems composed of multiple subsystems performing in a cascaded fashion exist [1–3], scalable and high-performing unified systems [4,5] remain underexplored. To address this gap, here we introduce SEAMLESSM4T (Massively Multilingual and Multimodal Machine Translation), a single model that supports...
Recent work demonstrates the potential of multilingual pretraining for creating one model that can be used for various tasks in different languages. Previous work in multilingual pretraining has demonstrated that machine translation systems can be created by finetuning on bitext. In this work, we show that multilingual translation models can be created through multilingual finetuning. Instead of finetuning on one direction, a pretrained model is finetuned on many directions at the same time. Compared to multilingual models trained from scratch, starting from pretrained models incorporates the benefits of large quantities of unlabeled monolingual data, which is particularly important...
Van der Waals heterobilayers of transition metal dichalcogenides with spin–valley coupling of carriers in different layers have emerged as a new platform for exploring spin/valleytronic applications. The interlayer coupling was predicted to exhibit subtle changes with the atomic registry. Manually stacked heterobilayers, however, are incommensurate with the inevitable twist and/or lattice mismatch, where the properties associated with atomic registry are difficult to access by optical means. Here, we unveil the distinct polarization...
One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the FLORES-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated into 101 languages by...
Recent work demonstrates the potential of training one model for multilingual machine translation. In parallel, denoising pretraining using unlabeled monolingual data as a starting point for finetuning bitext machine translation systems has demonstrated strong performance gains. However, little has been explored on the potential to combine denoising pretraining with multilingual machine translation in a single model. In this work, we fill this gap by studying how multilingual translation models can be created through multilingual finetuning. Finetuning from pretrained models incorporates the benefits of large quantities of unlabeled monolingual data, which is...
Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Pino, Jiatao Gu, Wei-Ning Hsu. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.
Background: Although cold snare polypectomy (CSP) is considered effective in reducing delayed postpolypectomy bleeding risk, direct evidence supporting its safety in the general population remains lacking. Objective: To clarify whether CSP would reduce delayed bleeding risk after polypectomy compared with hot snare polypectomy (HSP) in the general population. Design: Multicenter randomized controlled study. (ClinicalTrials.gov: NCT03373136) Setting: 6 sites in Taiwan, July 2018 through July 2020. Participants: Participants aged 40 years or older with polyps of 4 to 10...
BACKGROUND: Artificial intelligence (AI) research in colonoscopy is progressing rapidly but widespread clinical implementation is not yet a reality. We aimed to identify the top implementation research priorities. METHODS: An established modified Delphi approach for research priority setting was used. Fifteen international experts, including endoscopists and translational computer scientists/engineers, from nine countries participated in an online survey over 9 months. Questions related to AI implementation in colonoscopy were generated as a long-list in the first round,...
Hirofumi Inaguma, Sravya Popuri, Ilia Kulikov, Peng-Jen Chen, Changhan Wang, Yu-An Chung, Yun Tang, Ann Lee, Shinji Watanabe, Juan Pino. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.
Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, Yu-An Chung, Paden Tomasello, Paul-Ambroise Duquenne, Holger Schwenk, Hongyu Gong, Hirofumi Inaguma, Sravya Popuri, Changhan Wang, Juan Pino, Wei-Ning Hsu, Ann Lee. Findings of the Association for Computational Linguistics: ACL 2023.
Large-scale automatic speech translation systems today lack key features that help machine-mediated communication feel seamless when compared to human-to-human dialogue. In this work, we introduce a family of models that enable end-to-end expressive and multilingual translations in a streaming fashion. First, we contribute an improved version of the massively multilingual and multimodal SeamlessM4T model, SeamlessM4T v2. This newer model, incorporating an updated UnitY2 framework, was trained on more low-resource language...
INTRODUCTION: Concerns regarding bleeding remain in cold snare polypectomy (CSP) for small pedunculated (0-Ip) polyps. The aim of this study was to compare the risk of bleeding in CSP and hot snare polypectomy (HSP) for such lesions. METHODS: Data on 0-Ip colorectal polyps ≤10 mm were extracted from a large, pragmatic, randomized trial. Immediate postpolypectomy bleeding (IPPB), defined as the perioperative use of a clip for bleeding, was evaluated through polyp-level analysis. Delayed postpolypectomy bleeding (DPPB), occurring within 2 weeks postoperatively, was assessed at...
Objectives: Gilbert's syndrome is a congenital, nonhemolytic, unconjugated hyperbilirubinemia. The most common genotype of Gilbert's syndrome is the homozygous polymorphism, A(TA)7TAA, in the promoter of the gene for UDP-glucuronosyltransferase 1A1 (UGT1A1), with a thymine adenine insertion in the TATA-box-like sequence, which results in a decrease in UGT1A1 activity. The mechanism responsible for this decrease in activity, however, has not been elucidated. To clarify the mechanism underlying the deficiency of UGT1A1 activity in patients with Gilbert's syndrome. Methods: Assay using wild-type A(TA)6TAA or...
This paper describes Facebook AI's submission to the WMT20 shared news translation task. We focus on the low resource setting and participate in two language pairs, Tamil <-> English and Inuktitut <-> English, where there are limited out-of-domain bitext and monolingual data. We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target news domain. We explore techniques that leverage bitext and monolingual data from all languages, such as self-supervised model pretraining, multilingual models, data augmentation,...