NFDI4DS | UHH-SEMS - Publication Details

Peng-Jen Chen

ORCID: 0000-0001-5400-905X

About

Contact & Profiles

A5072019513

Research Areas

Natural Language Processing Techniques
Topic Modeling
Gastric Cancer Management and Outcomes
Speech Recognition and Synthesis
Colorectal Cancer Screening and Detection
Gastrointestinal Tumor Research and Treatment
Metastasis and carcinoma case studies
Gastrointestinal disorders and treatments
Esophageal and GI Pathology
Multimodal Machine Learning Applications
Neurotransmitter Receptor Influence on Behavior
Gastrointestinal Bleeding Diagnosis and Treatment
Receptor Mechanisms and Signaling
Liver Disease Diagnosis and Treatment
Pharmacological Receptor Mechanisms and Effects
Speech and dialogue systems
Radiomics and Machine Learning in Medical Imaging
Diverticular Disease and Complications
Esophageal Cancer Research and Treatment
Biliary and Gastrointestinal Fistulas
Head and Neck Cancer Studies
Cancer Diagnosis and Treatment
Colorectal Cancer Surgical Treatments
Parkinson's Disease Mechanisms and Treatments
Clinical Nutrition and Gastroenterology

National Defense Medical Center
2014-2024

Tri-Service General Hospital
2014-2024

Temple University
2020-2023

Baylor College of Medicine
2023

Institut national de recherche en informatique et en automatique
2023

Carnegie Mellon University
2023

Johns Hopkins University
2022

City University of Hong Kong
2022

Meta (United States)
2022

Meta (Israel)
2019-2021

Accurate Classification of Diminutive Colorectal Polyps Using Computer-Aided Analysis

OPENALEX - Publications

Peng-Jen Chen, Meng-Chiung Lin, Mei-Ju Lai, Jung-Chun Lin, Henry Horng-Shing Lu, and 1 more

10.1053/j.gastro.2017.10.010 article EN Gastroenterology 2017-10-16

The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English

Francisco Guzmán, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, and 3 more

Francisco Guzmán, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, Philipp Koehn, Vishrav Chaudhary, Marc'Aurelio Ranzato. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1632 article EN cc-by 2019-01-01

The Flores-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation

Naman Goyal, Cynthia Gao, Vishrav Chaudhary, Peng-Jen Chen, Guillaume Wenzek, and 5 more

One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the Flores-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated in 101...

10.1162/tacl_a_00474 article EN cc-by Transactions of the Association for Computational Linguistics 2022-01-01

Direct Speech-to-Speech Translation With Discrete Units

Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, and 7 more

Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

10.18653/v1/2022.acl-long.235 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

Joint speech and text machine translation for up to 100 languages

Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, and 62 more

Creating the Babel Fish, a tool that helps individuals translate speech between any two languages, requires advanced technological innovation and linguistic expertise. Although conventional speech-to-speech translation systems composed of multiple subsystems performing translation in a cascaded fashion exist [1–3], scalable and high-performing unified systems [4,5] remain underexplored. To address this gap, here we introduce SEAMLESSM4T (Massively Multilingual and Multimodal Machine Translation), a single model that supports...

10.1038/s41586-024-08359-z article EN cc-by-nc-nd Nature 2025-01-15

Multilingual Translation with Extensible Multilingual Pretraining and Finetuning

Yuqing Tang, Chau Tran, Xian Li, Peng-Jen Chen, Naman Goyal, and 3 more

Recent work demonstrates the potential of multilingual pretraining for creating one model that can be used for various tasks in different languages. Previous work in multilingual pretraining has demonstrated that machine translation systems can be created by finetuning on bitext. In this work, we show that multilingual translation models can be created through multilingual finetuning. Instead of finetuning on one direction, a pretrained model is finetuned on many directions at the same time. Compared to multilingual models trained from scratch, starting from pretrained models incorporates the benefits of large quantities of unlabeled monolingual data, which is particularly important...

10.48550/arxiv.2008.00401 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Negative circular polarization emissions from WSe2/MoSe2 commensurate heterobilayers

Wei-Ting Hsu, Li-Syuan Lu, Po-Hsun Wu, Ming-Hao Lee, Peng-Jen Chen, and 6 more

Van der Waals heterobilayers of transition metal dichalcogenides with spin–valley coupling of carriers in different layers have emerged as a new platform for exploring spin/valleytronic applications. The interlayer coupling was predicted to exhibit subtle changes with the interlayer atomic registry. Manually stacked heterobilayers, however, are incommensurate with the inevitable interlayer twist and/or lattice mismatch, where the properties associated with atomic registry are difficult to access by optical means. Here, we unveil the distinct polarization...

10.1038/s41467-018-03869-7 article EN cc-by Nature Communications 2018-04-10

The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation

Naman Goyal, Cynthia Gao, Vishrav Chaudhary, Peng-Jen Chen, Guillaume Wenzek, and 5 more

One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the FLORES-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated in 101 languages by...

10.48550/arxiv.2106.03193 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Multilingual Translation from Denoising Pre-Training

Yuqing Tang, Chau Tran, Xian Li, Peng-Jen Chen, Naman Goyal, and 3 more

Recent work demonstrates the potential of training one model for multilingual machine translation. In parallel, denoising pretraining using unlabeled monolingual data as a starting point for finetuning bitext machine translation systems has demonstrated strong performance gains. However, little has been explored on the potential to combine denoising pretraining with multilingual machine translation in a single model. In this work, we fill this gap by studying how multilingual translation models can be created through multilingual finetuning. Finetuning from a pretrained model incorporates the benefits of large quantities of unlabeled monolingual data, which is...

10.18653/v1/2021.findings-acl.304 article EN cc-by 2021-01-01

Textless Speech-to-Speech Translation on Real Data

Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, and 6 more

Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Pino, Jiatao Gu, Wei-Ning Hsu. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.

10.18653/v1/2022.naacl-main.63 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01

Cold Versus Hot Snare Polypectomy for Small Colorectal Polyps

Li-Chun Chang, Chi-Yang Chang, Chi-Yi Chen, Cheng-Hao Tseng, Peng-Jen Chen, and 12 more

Background: Although cold snare polypectomy (CSP) is considered effective in reducing delayed postpolypectomy bleeding risk, direct evidence supporting its safety in the general population remains lacking. Objective: To clarify whether CSP would reduce the risk of delayed bleeding after polypectomy compared with hot snare polypectomy (HSP) in the general population. Design: Multicenter randomized controlled study. (ClinicalTrials.gov: NCT03373136) Setting: 6 sites in Taiwan, July 2018 through 2020. Participants: Participants aged 40 years or older with polyps of 4 to 10...

10.7326/m22-2189 article EN Annals of Internal Medicine 2023-02-20

Establishing key research questions for the implementation of artificial intelligence in colonoscopy: a modified Delphi method

Omer F. Ahmad, Yuichi Mori, Masashi Misawa, Toyoki Kudo, J. Anderson, and 18 more

Background: Artificial intelligence (AI) research in colonoscopy is progressing rapidly, but widespread clinical implementation is not yet a reality. We aimed to identify the top research priorities. Methods: An established modified Delphi approach for research priority setting was used. Fifteen international experts, including endoscopists and translational computer scientists/engineers, from nine countries participated in an online survey over 9 months. Questions related to AI in colonoscopy were generated as a long-list in the first round,...

10.1055/a-1306-7590 article EN cc-by Endoscopy 2020-11-09

Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation

Sravya Popuri, Peng-Jen Chen, Changhan Wang, Juan Pino, Yossi Adi, and 3 more

10.21437/interspeech.2022-11032 article EN Interspeech 2022 2022-09-16

UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units

Hirofumi Inaguma, Sravya Popuri, Ilia Kulikov, Peng-Jen Chen, Changhan Wang, and 5 more

Hirofumi Inaguma, Sravya Popuri, Ilia Kulikov, Peng-Jen Chen, Changhan Wang, Yu-An Chung, Yun Tang, Ann Lee, Shinji Watanabe, Juan Pino. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

10.18653/v1/2023.acl-long.872 article EN cc-by 2023-01-01

Speech-to-Speech Translation for a Real-world Unwritten Language

Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, and 11 more

Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, Yu-An Chung, Paden Tomasello, Paul-Ambroise Duquenne, Holger Schwenk, Hongyu Gong, Hirofumi Inaguma, Sravya Popuri, Changhan Wang, Juan Pino, Wei-Ning Hsu, Ann Lee. Findings of the Association for Computational Linguistics: ACL 2023.

10.18653/v1/2023.findings-acl.307 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2023 2023-01-01

Seamless: Multilingual Expressive and Streaming Speech Translation

Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David C. Dale, and 60 more

Large-scale automatic speech translation systems today lack key features that help machine-mediated communication feel seamless when compared to human-to-human dialogue. In this work, we introduce a family of models that enable end-to-end expressive and multilingual translations in a streaming fashion. First, we contribute an improved version of the massively multilingual and multimodal SeamlessM4T model, SeamlessM4T v2. This newer model, incorporating an updated UnitY2 framework, was trained on more low-resource language...

10.48550/arxiv.2312.05187 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

Bleeding Risk of Cold Versus Hot Snare Polypectomy for Pedunculated Colorectal Polyps Measuring 10 mm or Less: Subgroup Analysis of a Large Randomized Controlled Trial

Cheng-Hao Tseng, Li-Chun Chang, Jia-Ling Wu, Chi-Yang Chang, Chi-Yi Chen, and 14 more

INTRODUCTION: Concerns regarding bleeding remain in cold snare polypectomy (CSP) for small pedunculated (0-Ip) polyps. The aim of this study was to compare the risk of bleeding after CSP and hot snare polypectomy (HSP) for such lesions. METHODS: Data on 0-Ip colorectal polyps ≤10 mm were extracted from a large, pragmatic, randomized trial. Immediate postpolypectomy bleeding (IPPB), defined as the perioperative use of a clip for bleeding, was evaluated through polyp-level analysis. Delayed postpolypectomy bleeding (DPPB), occurring within 2 weeks postoperatively, was assessed at...

10.14309/ajg.0000000000002847 article EN cc-by-nc-nd The American Journal of Gastroenterology 2024-05-09

Molecular pathogenesis of Gilbert's syndrome: decreased TATA-binding protein binding affinity of UGT1A1 gene promoter

Tsai-Yuan Hsieh, Tzu-Yue Shiu, Shih-Ming Huang, Hsuan-Hwai Lin, Tai-Chi Lee, and 6 more

Objectives: Gilbert's syndrome is a congenital, nonhemolytic, unconjugated hyperbilirubinemia. The most common genotype of Gilbert's syndrome is the homozygous polymorphism, A(TA)7TAA, in the promoter of the gene for UDP-glucuronosyltransferase 1A1 (UGT1A1), with a thymine adenine insertion in the TATA-box-like sequence, which results in a decrease in UGT1A1 activity. The mechanism responsible for this decrease in activity, however, has not been elucidated. To clarify the mechanism underlying the deficiency of UGT1A1 activity in patients with Gilbert's syndrome. Methods: assay using wild-type A(TA)6TAA or...

10.1097/fpc.0b013e328012d0da article EN Pharmacogenetics and Genomics 2007-04-01

Endoscopic submucosal dissection with the pulley method for early-stage gastric cancer (with video)

Chung-Hsien Li, Peng-Jen Chen, Heng-Cheng Chu, Tien-Yu Huang, Yu-Lueng Shih, and 2 more

10.1016/j.gie.2010.08.041 article EN Gastrointestinal Endoscopy 2010-10-28

Effect of surface potential on epithelial cell adhesion, proliferation and morphology

Hsun-Yun Chang, Wei-Lun Kao, Yun-Wen You, Yi-Hsuan Chu, Kuo-Jui Chu, and 4 more

10.1016/j.colsurfb.2016.01.049 article EN Colloids and Surfaces B Biointerfaces 2016-01-29

Facebook AI's WMT20 News Translation Task Submission

Peng-Jen Chen, Ann Lee, Changhan Wang, Naman Goyal, Angela Fan, and 2 more

This paper describes Facebook AI's submission to the WMT20 shared news translation task. We focus on the low resource setting and participate in two language pairs, Tamil <-> English and Inuktitut <-> English, where there are limited out-of-domain bitext and monolingual data. We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target domain. We explore techniques that leverage bitext and monolingual data from all languages, such as self-supervised model pretraining, multilingual models, data augmentation,...

10.48550/arxiv.2011.08298 preprint EN other-oa arXiv (Cornell University) 2020-01-01