Jason Phang

ORCID: 0000-0003-3522-1869
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Radiomics and Machine Learning in Medical Imaging
  • AI in cancer detection
  • Speech Recognition and Synthesis
  • Explainable Artificial Intelligence (XAI)
  • Multimodal Machine Learning Applications
  • Advanced Text Analysis Techniques
  • Speech and dialogue systems
  • Adversarial Robustness in Machine Learning
  • Software Engineering Research
  • Colorectal Cancer Screening and Detection
  • Hate Speech and Cyberbullying Detection
  • Machine Learning and Data Classification
  • Text Readability and Simplification
  • Biomedical Text Mining and Ontologies
  • Image Retrieval and Classification Techniques
  • Medical Imaging and Analysis
  • Scientific Computing and Data Management
  • Intelligent Tutoring Systems and Adaptive Learning
  • Domain Adaptation and Few-Shot Learning
  • Mobile Crowdsensing and Crowdsourcing
  • Neurobiology of Language and Bilingualism
  • Multi-Agent Systems and Negotiation
  • Innovative Teaching and Learning Methods

New York University
2018-2024

Hong Kong Polytechnic University
2023

Bangalore University
2023

University of the Basque Country
2023

Nokia (United Kingdom)
2023

Cape Eleuthera Institute
2022

Machine Intelligence Research Labs
2022

The University of Tokyo
2022

German Research Centre for Artificial Intelligence
2022

Allen Institute
2021

We present a deep convolutional neural network for breast cancer screening exam classification, trained and evaluated on over 200,000 exams (over 1,000,000 images). Our network achieves an AUC of 0.895 in predicting the presence of cancer in the breast, when tested on the screening population. We attribute the high accuracy to a few technical advances. 1) Our network's novel two-stage architecture and training procedure, which allows us to use a high-capacity patch-level network to learn from pixel-level labels alongside a network learning macroscopic breast-level labels. 2) A...

10.1109/tmi.2019.2945514 article EN cc-by IEEE Transactions on Medical Imaging 2019-10-07
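
The two-stage idea above is easier to see as code. The sketch below is only a schematic rendering, with made-up layer sizes and module names (PatchClassifier, BreastLevelClassifier), of how patch-level heatmaps can feed a breast-level classifier; it is not the paper's actual architecture.

```python
# Schematic PyTorch sketch of the two-stage idea: a patch-level classifier trained on
# pixel-level labels produces saliency heatmaps, and a breast-level classifier consumes
# the image concatenated with those heatmaps. All sizes are illustrative.
import torch
import torch.nn as nn

class PatchClassifier(nn.Module):
    """High-capacity network applied to small patches (stage 1, pixel-level labels)."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, n_classes)

    def forward(self, patches):                      # (N, 1, 256, 256)
        return self.head(self.features(patches).flatten(1))   # per-patch logits

class BreastLevelClassifier(nn.Module):
    """Network that sees the full image plus the stage-1 heatmaps (stage 2)."""
    def __init__(self, n_heatmaps: int = 2):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1 + n_heatmaps, 16, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, image, heatmaps):              # (N, 1, H, W), (N, n_heatmaps, H, W)
        x = torch.cat([image, heatmaps], dim=1)
        return self.head(self.backbone(x).flatten(1))   # breast-level malignancy logit

image = torch.randn(2, 1, 512, 512)
heatmaps = torch.rand(2, 2, 512, 512)                # stand-in for stage-1 heatmaps
print(BreastLevelClassifier()(image, heatmaps).shape)   # torch.Size([2, 1])
```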

Recent work has demonstrated that increased training dataset diversity improves general cross-domain knowledge and downstream generalization capability for large-scale language models. With this in mind, we present the Pile: an 825 GiB English text corpus targeted at training large-scale language models. The Pile is constructed from 22 diverse high-quality subsets -- both existing and newly constructed -- many of which derive from academic or professional sources. Our evaluation of the untuned performance of GPT-2 and GPT-3 on the Pile shows that these models...

10.48550/arxiv.2101.00027 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Sidney Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, Usvsn Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach. Proceedings of BigScience Episode #5 -- Workshop on Challenges & Perspectives in Creating Large Language Models. 2022.

10.18653/v1/2022.bigscience-1.9 article EN cc-by 2022-01-01

Pretraining sentence encoders with language modeling and related unsupervised tasks has recently been shown to be very effective for language understanding tasks. By supplementing language model-style pretraining with further training on data-rich supervised tasks, such as natural language inference, we obtain additional performance improvements on the GLUE benchmark. Applying supplementary training to BERT (Devlin et al., 2018), we attain a GLUE score of 81.8---the state of the art (as of 02/24/2019) and a 1.4 point improvement over BERT. We also observe...

10.48550/arxiv.1811.01088 preprint EN other-oa arXiv (Cornell University) 2018-01-01
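
As a rough illustration of the STILTs recipe (intermediate fine-tuning followed by target-task fine-tuning), here is a minimal PyTorch sketch with a stand-in encoder and hypothetical data loaders; the published experiments use BERT on GLUE tasks, not these toy modules.

```python
# Minimal sketch of supplementary training on an intermediate labeled-data task:
# fine-tune a pretrained encoder on a data-rich task (e.g. NLI), then swap the
# classification head and fine-tune again on the target task.
import torch
import torch.nn as nn

def finetune(encoder, head, loader, epochs=3, lr=2e-5):
    opt = torch.optim.AdamW(list(encoder.parameters()) + list(head.parameters()), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(head(encoder(x)), y)
            loss.backward()
            opt.step()

encoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU())   # stand-in for a pretrained encoder
nli_head = nn.Linear(256, 3)        # intermediate task: 3-way NLI
target_head = nn.Linear(256, 2)     # target task: binary classification

# Phase 1: supplementary training on the intermediate task (nli_loader is hypothetical).
# finetune(encoder, nli_head, nli_loader)
# Phase 2: keep the adapted encoder, attach a fresh head, fine-tune on the target task.
# finetune(encoder, target_head, target_loader)
```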

Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.

10.18653/v1/2020.acl-main.467 article EN cc-by 2020-01-01

Medical images differ from natural images in significantly higher resolutions and smaller regions of interest. Because of these differences, neural network architectures that work well for natural images might not be applicable to medical image analysis. In this work, we propose a novel model to address these unique properties of medical images. This model first uses a low-capacity, yet memory-efficient, network on the whole image to identify the most informative regions. It then applies another higher-capacity network to collect details from the chosen regions. Finally, it employs a fusion...

10.1016/j.media.2020.101908 article EN cc-by-nc-nd Medical Image Analysis 2020-12-17
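
A schematic of the global-then-local pattern the abstract describes might look like the following; the layer sizes, crop selection, and fusion step are invented for illustration and do not reproduce the published model.

```python
# Illustrative sketch: a cheap global network scores a coarse saliency grid, the
# top-scoring locations are re-read at higher resolution by a higher-capacity network,
# and the two streams are fused into one prediction.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalLocalModel(nn.Module):
    def __init__(self, k_crops: int = 4, crop: int = 128):
        super().__init__()
        self.k, self.crop = k_crops, crop
        self.global_net = nn.Conv2d(1, 1, 15, stride=8, padding=7)    # coarse saliency map
        self.local_net = nn.Sequential(                               # higher-capacity per crop
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 16))
        self.fusion = nn.Linear(16 + 1, 1)

    def forward(self, image):                          # (N, 1, H, W)
        saliency = self.global_net(image)              # (N, 1, h, w)
        global_score = saliency.mean(dim=(1, 2, 3)).unsqueeze(1)
        n, _, h, w = saliency.shape
        idx = saliency.flatten(1).topk(self.k, dim=1).indices   # top-k coarse cells
        crops = []
        for b in range(n):
            for i in idx[b]:
                cy, cx = (i // w).item() * 8, (i % w).item() * 8
                crops.append(self._crop(image[b], cy, cx))
        local_feats = self.local_net(torch.stack(crops)).view(n, self.k, -1).mean(1)
        return self.fusion(torch.cat([local_feats, global_score], dim=1))

    def _crop(self, img, cy, cx):
        c = self.crop
        padded = F.pad(img, (c, c, c, c))              # pad so crops near edges stay in bounds
        return padded[:, cy:cy + c, cx:cx + c]

print(GlobalLocalModel()(torch.randn(2, 1, 512, 512)).shape)   # torch.Size([2, 1])
```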

Alex Warstadt, Yu Cao, Ioana Grosu, Wei Peng, Hagen Blix, Yining Nie, Anna Alsop, Shikha Bordia, Haokun Liu, Alicia Parrish, Sheng-Fu Wang, Jason Phang, Anhad Mohananey, Phu Mon Htut, Paloma Jeretic, Samuel R. Bowman. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1286 article EN cc-by 2019-01-01

We investigate the extent to which individual attention heads in pretrained transformer language models, such as BERT and RoBERTa, implicitly capture syntactic dependency relations. We employ two methods---taking the maximum attention weight and computing the maximum spanning tree---to extract implicit dependency relations from the attention weights of each layer/head, and compare them to ground-truth Universal Dependency (UD) trees. We show that, for some UD relation types, there exist heads that can recover the relation type significantly better than baselines on parsed...

10.48550/arxiv.1911.12246 preprint EN other-oa arXiv (Cornell University) 2019-01-01
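
The "maximum attention weight" extraction method has a very small core, sketched below with toy attention weights and toy gold heads; the actual study also extracts maximum spanning trees and evaluates against UD treebanks.

```python
# For each token, predict its head as the token receiving the largest attention weight,
# then score the predicted arcs against gold heads (unlabeled attachment). Toy data only.
import numpy as np

def arcs_from_attention(attn: np.ndarray) -> np.ndarray:
    """attn[i, j] = attention from token i to token j for one layer/head; returns predicted heads."""
    attn = attn.copy()
    np.fill_diagonal(attn, -np.inf)      # a token cannot be its own head
    return attn.argmax(axis=1)

def uas(predicted: np.ndarray, gold: np.ndarray) -> float:
    return float((predicted == gold).mean())

rng = np.random.default_rng(0)
attn = rng.random((6, 6))                 # toy attention map for a 6-token sentence
gold_heads = np.array([1, 3, 1, 3, 3, 4]) # toy gold head indices
pred_heads = arcs_from_attention(attn)
print(pred_heads, uas(pred_heads, gold_heads))
```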

Yada Pruksachatkun, Phil Yeres, Haokun Liu, Jason Phang, Phu Mon Htut, Alex Wang, Ian Tenney, Samuel R. Bowman. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. 2020.

10.18653/v1/2020.acl-demos.15 preprint EN 2020-01-01

It is well documented that NLP models learn social biases, but little work has been done on how these biases manifest in model outputs for applied tasks like question answering (QA). We introduce the Bias Benchmark for QA (BBQ), a dataset of question-sets constructed by the authors that highlight attested biases against people belonging to protected classes along nine dimensions relevant to U.S. English-speaking contexts. Our task evaluates model responses at two levels: (i) given an under-informative context, we test...

10.18653/v1/2022.findings-acl.165 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2022 2022-01-01

Language models (LMs) are pretrained to imitate internet text, including content that would violate human preferences if generated by an LM: falsehoods, offensive comments, personally identifiable information, low-quality or buggy code, and more. Here, we explore alternative objectives for pretraining LMs in a way that also guides them to generate text aligned with human preferences. We benchmark five objectives for pretraining with human feedback across three tasks and study how they affect the trade-off between alignment and capabilities of LMs...

10.48550/arxiv.2302.08582 preprint EN cc-by arXiv (Cornell University) 2023-01-01
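
One objective in this family is conditional training, where pretraining text is prefixed with a control token reflecting a preference score; the sketch below uses illustrative token names and a hypothetical reward threshold, not the paper's exact setup.

```python
# Conditional training sketch: each pretraining segment is tagged with a control token
# based on a reward score, so the LM learns p(text | token) and can be steered toward
# the preferred token at inference time. Token names and threshold are illustrative.
def tag_segment(text: str, reward: float, threshold: float = 0.0) -> str:
    control = "<|good|>" if reward >= threshold else "<|bad|>"
    return f"{control}{text}"

corpus = [("def add(a, b): return a + b", 0.9),
          ("def add(a, b): return a - b  # buggy", -0.7)]
pretraining_stream = [tag_segment(text, r) for text, r in corpus]
print(pretraining_stream)
# The tagged stream is then trained with the ordinary next-token prediction loss.
```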

Humans possess an extraordinary ability to create and utilize tools, allowing them to overcome physical limitations and explore new frontiers. With the advent of foundation models, AI systems have the potential to be equally adept in tool use as humans. This paradigm, i.e., tool learning with foundation models, combines the strengths of specialized tools and foundation models to achieve enhanced accuracy, efficiency, and automation in problem-solving. Despite its immense potential, there is still a lack of a comprehensive understanding of key challenges,...

10.48550/arxiv.2304.08354 preprint EN other-oa arXiv (Cornell University) 2023-01-01

While pretrained models such as BERT have shown large gains across natural language understanding tasks, their performance can be improved by further training the model on a data-rich intermediate task, before fine-tuning it on a target task. However, it is still poorly understood when and why intermediate-task training is beneficial for a given target task. To investigate this, we perform a large-scale study on RoBERTa with 110 intermediate-target task combinations. We evaluate all trained models on 25 probing tasks meant to reveal...

10.48550/arxiv.2005.00628 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Richard Yuanzhe Pang, Alicia Parrish, Nitish Joshi, Nikita Nangia, Jason Phang, Angelica Chen, Vishakh Padmakumar, Johnny Ma, Jana Thompson, He He, Samuel Bowman. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.

10.18653/v1/2022.naacl-main.391 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01

While large pretrained Transformer models have proven highly capable at tackling natural language tasks, handling long sequence inputs still poses a significant challenge. One such task is long input summarization, where inputs are longer than the maximum context of most models. Through an extensive set of experiments, we investigate what model architectural changes and pretraining paradigms most efficiently adapt a pretrained Transformer for long input summarization. We find that a staggered, block-local Transformer with global encoder tokens strikes a good...

10.18653/v1/2023.emnlp-main.240 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01
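
The block-local attention with global tokens mentioned above can be pictured as a boolean attention mask; the snippet below builds a toy mask with illustrative block and global-token counts (staggering, i.e. shifting block boundaries between layers, is not shown).

```python
# Toy block-local attention mask with a few global tokens: local tokens attend within
# their own block, while global tokens attend to (and are attended by) all positions.
import numpy as np

def block_local_mask(seq_len: int, block: int, n_global: int) -> np.ndarray:
    """mask[i, j] = True where token i may attend to token j."""
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for start in range(0, seq_len, block):           # local attention within each block
        end = min(start + block, seq_len)
        mask[start:end, start:end] = True
    mask[:n_global, :] = True                         # global tokens attend everywhere
    mask[:, :n_global] = True                         # and everyone attends to them
    return mask

print(block_local_mask(seq_len=16, block=4, n_global=2).astype(int))
```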

In sentence compression, the task of shortening sentences while retaining the original meaning, models tend to be trained on large corpora containing pairs of verbose and compressed sentences. To remove the need for paired corpora, we emulate a summarization task: we add noise to extend sentences and train a denoising auto-encoder to recover the original, constructing an end-to-end training regime without any examples of compressed sentences. We conduct a human evaluation of our model on a standard text summarization dataset and show that it performs comparably to a supervised baseline based...

10.18653/v1/k18-1040 article EN cc-by 2018-01-01
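
As a toy illustration of the noising step, the function below pads a sentence with sampled filler words and applies a light shuffle, yielding (noised, original) pairs for a denoising auto-encoder; the paper's actual noise model differs in its details.

```python
# Toy noising for unsupervised sentence compression: extend a sentence with sampled
# words and swap one adjacent pair, then train a seq2seq model to map noised -> original.
import random

def add_noise(sentence: str, vocabulary: list[str], n_insert: int = 3, seed: int = 0) -> str:
    rng = random.Random(seed)
    tokens = sentence.split()
    for _ in range(n_insert):                                  # insert sampled filler words
        tokens.insert(rng.randrange(len(tokens) + 1), rng.choice(vocabulary))
    i = rng.randrange(max(len(tokens) - 1, 1))                 # one local swap as light shuffling
    tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    return " ".join(tokens)

original = "the cat sat on the mat"
noised = add_noise(original, vocabulary=["quickly", "green", "however", "seven"])
print(noised)   # training pair: (noised -> original) for the denoising auto-encoder
```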

We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive language model trained on the Pile, whose weights will be made freely and openly available to the public through a permissive license. It is, to the best of our knowledge, the largest dense autoregressive model that has publicly available weights at the time of submission. In this work, we describe GPT-NeoX-20B's architecture and training and evaluate its performance on a range of language-understanding, mathematics, and knowledge-based tasks. We find that GPT-NeoX-20B is a particularly powerful few-shot reasoner and gains far...

10.48550/arxiv.2204.06745 preprint EN cc-by arXiv (Cornell University) 2022-01-01
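
For readers who want to try the released weights, the usual route is the Hugging Face checkpoint at EleutherAI/gpt-neox-20b, as in the sketch below; note the full model needs tens of gigabytes of memory even in half precision, so this is a sketch rather than something to run on a laptop.

```python
# Load the publicly released GPT-NeoX-20B weights via Hugging Face transformers and
# generate a short continuation. device_map="auto" requires the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b", device_map="auto")

inputs = tokenizer("GPT-NeoX-20B is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```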

Teven Le Scao, Thomas Wang, Daniel Hesslow, Stas Bekman, M Saiful Bari, Stella Biderman, Hady Elsahar, Niklas Muennighoff, Jason Phang, Ofir Press, Colin Raffel, Victor Sanh, Sheng Shen, Lintang Sutawika, Jaesung Tae, Zheng Xin Yong, Julien Launay, Iz Beltagy. Findings of the Association for Computational Linguistics: EMNLP 2022.

10.18653/v1/2022.findings-emnlp.54 article EN cc-by 2022-01-01

Intermediate-task training—fine-tuning a pretrained model on an intermediate task before fine-tuning again on the target task—often improves performance substantially on language understanding tasks in monolingual English settings. We investigate whether English intermediate-task training is still helpful for non-English target tasks. Using nine intermediate language-understanding tasks, we evaluate intermediate-task transfer in a zero-shot cross-lingual setting on the XTREME benchmark. We see large improvements from intermediate training on the BUCC and Tatoeba sentence retrieval tasks and moderate...

10.18653/v1/2020.aacl-main.56 article EN 2020-01-01

Julian Michael, Ari Holtzman, Alicia Parrish, Aaron Mueller, Alex Wang, Angelica Chen, Divyam Madaan, Nikita Nangia, Richard Yuanzhe Pang, Jason Phang, Samuel R. Bowman. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.

10.18653/v1/2023.acl-long.903 article EN cc-by 2023-01-01

Summarization datasets are often assembled either by scraping naturally occurring public-domain summaries—which are nearly always in difficult-to-work-with technical domains—or by using approximate heuristics to extract them from everyday text—which frequently yields unfaithful summaries. In this work, we turn to a slower but more straightforward approach to developing summarization benchmark data: We hire highly-qualified contractors to read stories and write original summaries from scratch. To amortize reading...

10.18653/v1/2022.emnlp-main.75 article EN cc-by Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022-01-01