Peng-Jen Chen

ORCID: 0000-0001-5400-905X
Research Areas
  • Natural Language Processing Techniques
  • Topic Modeling
  • Gastric Cancer Management and Outcomes
  • Speech Recognition and Synthesis
  • Colorectal Cancer Screening and Detection
  • Gastrointestinal Tumor Research and Treatment
  • Metastasis and carcinoma case studies
  • Gastrointestinal disorders and treatments
  • Esophageal and GI Pathology
  • Multimodal Machine Learning Applications
  • Neurotransmitter Receptor Influence on Behavior
  • Gastrointestinal Bleeding Diagnosis and Treatment
  • Receptor Mechanisms and Signaling
  • Liver Disease Diagnosis and Treatment
  • Pharmacological Receptor Mechanisms and Effects
  • Speech and dialogue systems
  • Radiomics and Machine Learning in Medical Imaging
  • Diverticular Disease and Complications
  • Esophageal Cancer Research and Treatment
  • Biliary and Gastrointestinal Fistulas
  • Head and Neck Cancer Studies
  • Cancer Diagnosis and Treatment
  • Colorectal Cancer Surgical Treatments
  • Parkinson's Disease Mechanisms and Treatments
  • Clinical Nutrition and Gastroenterology

National Defense Medical Center
2014-2024

Tri-Service General Hospital
2014-2024

Temple University
2020-2023

Baylor College of Medicine
2023

Institut national de recherche en informatique et en automatique
2023

Carnegie Mellon University
2023

Johns Hopkins University
2022

City University of Hong Kong
2022

Meta (United States)
2022

Meta (Israel)
2019-2021

Francisco Guzmán, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, Philipp Koehn, Vishrav Chaudhary, Marc'Aurelio Ranzato. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1632 article EN cc-by 2019-01-01

One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the Flores-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated in 101...

10.1162/tacl_a_00474 article EN cc-by Transactions of the Association for Computational Linguistics 2022-01-01

Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

10.18653/v1/2022.acl-long.235 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

Creating the Babel Fish, a tool that helps individuals translate speech between any two languages, requires advanced technological innovation and linguistic expertise. Although conventional speech-to-speech translation systems composed of multiple subsystems performing translation in a cascaded fashion exist [1–3], scalable and high-performing unified systems [4,5] remain underexplored. To address this gap, here we introduce SEAMLESSM4T (Massively Multilingual and Multimodal Machine Translation), a single model that supports...

10.1038/s41586-024-08359-z article EN cc-by-nc-nd Nature 2025-01-15

Recent work demonstrates the potential of multilingual pretraining for creating one model that can be used for various tasks in different languages. Previous work in multilingual pretraining has demonstrated that machine translation systems can be created by finetuning on bitext. In this work, we show that multilingual translation models can be created through multilingual finetuning. Instead of finetuning on one direction, a pretrained model is finetuned on many directions at the same time. Compared to multilingual models trained from scratch, starting from pretrained models incorporates the benefits of large quantities of unlabeled monolingual data, which is particularly important...

10.48550/arxiv.2008.00401 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Van der Waals heterobilayers of transition metal dichalcogenides with spin–valley coupling of carriers in different layers have emerged as a new platform for exploring spin/valleytronic applications. The interlayer coupling was predicted to exhibit subtle changes with the interlayer atomic registry. Manually stacked heterobilayers, however, are incommensurate with the inevitable interlayer twist and/or lattice mismatch, where the properties associated with atomic registry are difficult to access by optical means. Here, we unveil the distinct polarization...

10.1038/s41467-018-03869-7 article EN cc-by Nature Communications 2018-04-10

One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the FLORES-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated in 101 languages by...

10.48550/arxiv.2106.03193 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Recent work demonstrates the potential of training one model for multilingual machine translation. In parallel, denoising pretraining using unlabeled monolingual data as a starting point for finetuning bitext machine translation systems has demonstrated strong performance gains. However, little has been explored on how to combine denoising pretraining with multilingual machine translation in a single model. In this work, we fill this gap by studying how multilingual translation models can be created through multilingual finetuning. Finetuning from a pretrained model incorporates the benefits of large quantities of unlabeled monolingual data, which is...

10.18653/v1/2021.findings-acl.304 article EN cc-by 2021-01-01

Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Pino, Jiatao Gu, Wei-Ning Hsu. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.

10.18653/v1/2022.naacl-main.63 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01

Background: Although cold snare polypectomy (CSP) is considered effective in reducing delayed postpolypectomy bleeding risk, direct evidence supporting its safety in the general population remains lacking. Objective: To clarify whether CSP would reduce the risk for delayed bleeding after polypectomy compared with hot snare polypectomy (HSP) in the general population. Design: Multicenter randomized controlled study. (ClinicalTrials.gov: NCT03373136) Setting: 6 sites in Taiwan, July 2018 through July 2020. Participants: Participants aged 40 years or older with polyps of 4 to 10...

10.7326/m22-2189 article EN Annals of Internal Medicine 2023-02-20

BACKGROUND: Artificial intelligence (AI) research in colonoscopy is progressing rapidly but widespread clinical implementation is not yet a reality. We aimed to identify the top implementation research priorities. METHODS: An established modified Delphi approach for research priority setting was used. Fifteen international experts, including endoscopists and translational computer scientists/engineers, from nine countries participated in an online survey over 9 months. Questions related to AI implementation in colonoscopy were generated as a long-list in the first round,...

10.1055/a-1306-7590 article EN cc-by Endoscopy 2020-11-09

Hirofumi Inaguma, Sravya Popuri, Ilia Kulikov, Peng-Jen Chen, Changhan Wang, Yu-An Chung, Yun Tang, Ann Lee, Shinji Watanabe, Juan Pino. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

10.18653/v1/2023.acl-long.872 article EN cc-by 2023-01-01

Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, Yu-An Chung, Paden Tomasello, Paul-Ambroise Duquenne, Holger Schwenk, Hongyu Gong, Hirofumi Inaguma, Sravya Popuri, Changhan Wang, Juan Pino, Wei-Ning Hsu, Ann Lee. Findings of the Association for Computational Linguistics: ACL 2023.

10.18653/v1/2023.findings-acl.307 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2023 2023-01-01

Large-scale automatic speech translation systems today lack key features that help machine-mediated communication feel seamless when compared to human-to-human dialogue. In this work, we introduce a family of models that enable end-to-end expressive and multilingual translations in a streaming fashion. First, we contribute an improved version of the massively multilingual and multimodal SeamlessM4T model, SeamlessM4T v2. This newer model, incorporating an updated UnitY2 framework, was trained on more low-resource language...

10.48550/arxiv.2312.05187 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

INTRODUCTION: Concerns regarding bleeding remain in cold snare polypectomy (CSP) for small pedunculated (0-Ip) polyps. The aim of this study was to compare the risk of bleeding in CSP and hot snare polypectomy (HSP) for such lesions. METHODS: Data on 0-Ip colorectal polyps ≤10 mm were extracted from a large, pragmatic, randomized trial. Immediate postpolypectomy bleeding (IPPB), defined as the perioperative use of a clip for bleeding, was evaluated through polyp-level analysis. Delayed postpolypectomy bleeding (DPPB), defined as bleeding occurring within 2 weeks postoperatively, was assessed at...

10.14309/ajg.0000000000002847 article EN cc-by-nc-nd The American Journal of Gastroenterology 2024-05-09

Objectives: Gilbert's syndrome is a congenital, nonhemolytic, unconjugated hyperbilirubinemia. The most common genotype of Gilbert's syndrome is the homozygous polymorphism, A(TA)7TAA, in the promoter of the gene for UDP-glucuronosyltransferase 1A1 (UGT1A1), with a thymine adenine insertion in the TATA-box-like sequence, which results in a decrease in UGT1A1 activity. The mechanism responsible for this decrease in activity, however, has not been elucidated. To clarify the mechanism underlying the deficiency in UGT1A1 activity in patients with Gilbert's syndrome. Methods: assay using wild-type A(TA)6TAA or...

10.1097/fpc.0b013e328012d0da article EN Pharmacogenetics and Genomics 2007-04-01

This paper describes Facebook AI's submission to the WMT20 shared news translation task. We focus on the low resource setting and participate in two language pairs, Tamil <-> English and Inuktitut <-> English, where there are limited out-of-domain bitext and monolingual data. We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target domain. We explore techniques that leverage bitext and monolingual data from all languages, such as self-supervised model pretraining, multilingual models, data augmentation,...

10.48550/arxiv.2011.08298 preprint EN other-oa arXiv (Cornell University) 2020-01-01