Hoifung Poon

ORCID: 0000-0002-9067-0918
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Biomedical Text Mining and Ontologies
  • Bayesian Modeling and Causal Inference
  • Semantic Web and Ontologies
  • Radiomics and Machine Learning in Medical Imaging
  • Multimodal Machine Learning Applications
  • Machine Learning in Healthcare
  • Artificial Intelligence in Healthcare and Education
  • AI in cancer detection
  • Advanced Graph Theory Research
  • Bioinformatics and Genomic Networks
  • Cancer Genomics and Diagnostics
  • Computational Drug Discovery Methods
  • Machine Learning and Data Classification
  • Text Readability and Simplification
  • Limits and Structures in Graph Theory
  • Data Quality and Management
  • Domain Adaptation and Few-Shot Learning
  • Graph Labeling and Dimension Problems
  • Adversarial Robustness in Machine Learning
  • Genetic Associations and Epidemiology
  • Advanced Database Systems and Queries
  • Neural Networks and Applications
  • Genomics and Rare Diseases

Microsoft (United States)
2015-2025

Microsoft Research (United Kingdom)
2013-2024

Microsoft (Finland)
2020

University of Washington
2006-2016

West Virginia University
2000-2005

Singapore Polytechnic
1989-1991

Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on general-domain corpora, such as newswire and Web text. A prevailing assumption is that even domain-specific pretraining can benefit by starting from general-domain language models. In this article, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models. To facilitate this investigation, we compile...

10.1145/3458754 article EN ACM Transactions on Computing for Healthcare 2021-10-15

Models that learn to represent textual and knowledge base relations in the same continuous latent space are able to perform joint inferences among the two kinds of relations and obtain high accuracy on knowledge base completion (Riedel et al., 2013). In this paper we propose a model that captures the compositional structure of textual relations, and jointly optimizes entity, knowledge base, and textual relation representations. The proposed model significantly improves performance over a model that does not share parameters among relations with common sub-structure.

10.18653/v1/d15-1174 article EN cc-by Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing 2015-01-01
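To make "compositional structure" of relation paths concrete, here is a minimal sketch of an additive (TransE-style) composition scheme — a deliberately simpler stand-in for the model the paper proposes; all embeddings and names below are hypothetical toy values, not learned parameters.

```python
# Hedged sketch: compose relations along a path by vector addition
# (TransE-style; simpler than the paper's actual model).
# All embeddings are illustrative toy values.

def compose(path_relations):
    """Sum relation vectors along a path: r_path = r1 + r2 + ..."""
    out = [0.0] * len(path_relations[0])
    for r in path_relations:
        out = [a + b for a, b in zip(out, r)]
    return out

def score(head, path_relations, tail):
    """Higher is better: -||head + r_path - tail||^2."""
    pred = [h + p for h, p in zip(head, compose(path_relations))]
    return -sum((a - b) ** 2 for a, b in zip(pred, tail))

# A two-step path whose composed relation exactly links head to tail.
head, tail = [1.0, 0.0], [2.0, 0.0]
r1, r2 = [0.5, 0.5], [0.5, -0.5]
print(abs(score(head, [r1, r2], tail)))  # 0.0 (perfect match)
```

Sharing the composition function across paths is what lets parameters be shared among relations with common sub-structure, as the abstract describes.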

Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain. Among the two main branches of pre-trained language models, i.e. BERT (and its variants) and GPT (and its variants), the first one has been extensively studied in the biomedical domain, with models such as BioBERT and PubMedBERT. While they have achieved great success on a variety of discriminative downstream tasks, their lack of generation ability constrains their application scope. In this paper, we propose BioGPT, a domain-specific generative Transformer...

10.1093/bib/bbac409 article EN Briefings in Bioinformatics 2022-09-24

Past work in relation extraction has focused on binary relations in single sentences. Recent NLP inroads in high-value domains have sparked interest in the more general setting of extracting n-ary relations that span multiple sentences. In this paper, we explore a general framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction. The graph formulation provides a unified way of exploring different LSTM approaches and incorporating various intra-sentential and inter-sentential...

10.1162/tacl_a_00049 article EN cc-by Transactions of the Association for Computational Linguistics 2017-12-01

The key limiting factor in graphical model inference and learning is the complexity of the partition function. We thus ask the question: what are the most general conditions under which the partition function is tractable? The answer leads to a new kind of deep architecture, which we call sum-product networks (SPNs) and will present in this abstract. The key idea behind SPNs is to compactly represent the partition function by introducing multiple layers of hidden variables. An SPN is a rooted directed acyclic graph with variables as leaves, sums and products as internal nodes, and weighted edges.

10.1109/iccvw.2011.6130310 article EN 2011-11-01
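The SPN structure described above — indicator leaves, product and weighted-sum internal nodes, one bottom-up pass per query — can be made concrete with a minimal sketch. This is not the paper's code; the toy network, weights, and variable names are illustrative.

```python
# Minimal sketch of a sum-product network (illustrative structure and
# weights, not the paper's code). Leaves are indicator functions;
# internal nodes are products and weighted sums; evaluation is a
# single bottom-up pass over the DAG.

def indicator(var, value):
    """Leaf: 1.0 if the assignment sets `var` to `value`, else 0.0."""
    return lambda assignment: 1.0 if assignment[var] == value else 0.0

def product(*children):
    """Product node: multiply the values of its children."""
    def node(assignment):
        result = 1.0
        for child in children:
            result *= child(assignment)
        return result
    return node

def weighted_sum(weighted_children):
    """Sum node: weighted sum of its children's values."""
    return lambda assignment: sum(w * c(assignment)
                                  for w, c in weighted_children)

# A toy SPN over two binary variables:
#   P(X1, X2) = 0.6 * [X1=1][X2=1] + 0.4 * [X1=0][X2=0]
spn = weighted_sum([
    (0.6, product(indicator("X1", 1), indicator("X2", 1))),
    (0.4, product(indicator("X1", 0), indicator("X2", 0))),
])

print(spn({"X1": 1, "X2": 1}))  # 0.6
```

Because each query is one pass over the network, inference cost is linear in the number of edges — the tractability property the abstract highlights.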

We present the first unsupervised approach to the problem of learning a semantic parser, using Markov logic. Our USP system transforms dependency trees into quasi-logical forms, recursively induces lambda forms from these, and clusters them to abstract away syntactic variations of the same meaning. The MAP semantic parse of a sentence is obtained by recursively assigning its parts to lambda-form clusters and composing them. We evaluate our approach by using it to extract a knowledge base from biomedical abstracts and answer questions. USP substantially outperforms TextRunner, DIRT...

10.3115/1699510.1699512 article EN 2009-01-01

The growing demand for structured knowledge has led to great interest in relation extraction, especially in cases with limited supervision. However, existing distant supervision approaches only extract relations expressed in single sentences. In general, cross-sentence relation extraction is under-explored, even in the supervised-learning setting. In this paper, we propose the first approach for applying distant supervision to cross-sentence relation extraction. At the core of our approach is a graph representation that can incorporate both standard dependencies and...

10.18653/v1/e17-1110 article EN cc-by 2017-01-01

The study and understanding of human behaviour is relevant to computer science, artificial intelligence, neural computation, cognitive science, philosophy, psychology, and several other areas. Presupposing cognition as the basis of behaviour, among the most prominent tools in the modelling of behaviour are computational-logic systems, connectionist models of cognition, and models of uncertainty. Recent studies in cognitive science, artificial intelligence, and psychology have produced a number of cognitive models of reasoning, learning, and language that are underpinned by computation. In addition, efforts in computer science research...

10.48550/arxiv.1711.03902 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles 1–3. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context 4. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000...

10.1038/s41586-024-07441-w article EN cc-by Nature 2024-05-22

Large neural language models have transformed modern natural language processing (NLP) applications. However, fine-tuning such models for specific tasks remains challenging as model size increases, especially with small labeled datasets, which are common in biomedical NLP. We conduct a systematic study on fine-tuning stability, show that performance may be sensitive to pretraining settings, and present an exploration of techniques for addressing this instability. These techniques can substantially improve fine-tuning performance in low-resource biomedical NLP. Specifically, freezing...

10.1016/j.patter.2023.100729 article EN cc-by-nc-nd Patterns 2023-04-01

Conversational generative AI has demonstrated remarkable promise for empowering biomedical practitioners, but current investigations focus on unimodal text. Multimodal conversational AI has seen rapid progress by leveraging billions of image-text pairs from the public web, but such general-domain vision-language models still lack sophistication in understanding and conversing about biomedical images. In this paper, we propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research...

10.48550/arxiv.2306.00890 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Generalist foundation models such as GPT-4 have displayed surprising capabilities in a wide variety of domains and tasks. Yet, there is a prevalent assumption that they cannot match the specialist capabilities of fine-tuned models. For example, most explorations to date on medical competency benchmarks have leveraged domain-specific training, as exemplified by efforts on BioGPT and Med-PaLM. We build on a prior study of GPT-4's capabilities on medical challenge benchmarks in the absence of special training. Rather than using simple prompting to highlight the model's out-of-the-box...

10.48550/arxiv.2311.16452 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Biomedical data is inherently multimodal, comprising physical measurements and natural language narratives. A generalist biomedical AI model needs to simultaneously process different modalities of data, including text and images. Therefore, training an effective generalist biomedical model requires high-quality multimodal data, such as parallel image-text pairs. Here, we present PMC-15M, a novel dataset that is two orders of magnitude larger than existing biomedical multimodal datasets such as MIMIC-CXR, and spans a diverse range of biomedical image types. PMC-15M contains 15 million...

10.48550/arxiv.2303.00915 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Machine learning approaches to coreference resolution are typically supervised, and require expensive labeled data. Some unsupervised approaches have been proposed (e.g., Haghighi and Klein (2007)), but they are less accurate. In this paper, we present the first unsupervised approach that is competitive with supervised ones. This is made possible by performing joint inference across mentions, in contrast to the pairwise classification used in supervised methods, and by using Markov logic as a representation language, which enables us to easily express...

10.3115/1613715.1613796 article EN 2008-01-01

Morphological segmentation breaks words into morphemes (the basic semantic units). It is a key component for natural language processing systems. Unsupervised morphological segmentation is attractive, because in every language there are virtually unlimited supplies of text, but very few labeled resources. However, most existing model-based systems for unsupervised morphological segmentation use directed generative models, making it difficult to leverage arbitrary overlapping features that are potentially helpful to learning. In this paper, we present...

10.3115/1620754.1620785 article EN 2009-01-01

Modeling relation paths has offered significant gains in embedding models for knowledge base (KB) completion. However, enumerating paths between two entities is very expensive, and existing approaches typically resort to approximation with a sampled subset. This problem is particularly acute when text is jointly modeled with KB relations and used to provide direct evidence for facts mentioned in it. In this paper, we propose the first exact dynamic programming algorithm which enables efficient incorporation of all relation paths of bounded...

10.18653/v1/p16-1136 article EN cc-by Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016-01-01
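The core idea above — aggregating over all bounded-length paths without enumerating them — can be illustrated with a generic dynamic program. This is a hedged sketch, not the paper's algorithm: the toy graph, edge weights, and sum-of-products aggregation are illustrative choices.

```python
# Hedged sketch: sum the weights of ALL paths of length <= max_len
# between two nodes by dynamic programming over path endpoints,
# instead of enumerating paths (which can be exponential in length).
# Graph and weights below are toy values.

def sum_path_weights(adj, src, dst, max_len):
    """adj: dict node -> list of (neighbor, weight).
    Returns the total weight (sum over paths of the product of edge
    weights) of all src->dst paths with length <= max_len."""
    # score[v] = summed weight of all length-k paths ending at v
    score = {src: 1.0}
    total = 0.0
    for _ in range(max_len):
        nxt = {}
        for v, s in score.items():
            for u, w in adj.get(v, []):
                nxt[u] = nxt.get(u, 0.0) + s * w
        total += nxt.get(dst, 0.0)
        score = nxt
    return total

adj = {"A": [("B", 0.5), ("C", 0.5)], "B": [("C", 1.0)]}
print(sum_path_weights(adj, "A", "C", 2))  # 0.5 + 0.5*1.0 = 1.0
```

Each length step visits every edge once, so the DP runs in O(max_len × |E|) time while still accounting for every bounded-length path exactly.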

Past work in relation extraction has focused on binary relations in single sentences. Recent NLP inroads in high-value domains have sparked interest in the more general setting of extracting n-ary relations that span multiple sentences. In this paper, we explore a general framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction. The graph formulation provides a unified way of exploring different LSTM approaches and incorporating various intra-sentential and inter-sentential...

10.48550/arxiv.1708.03743 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Robin Jia, Cliff Wong, Hoifung Poon. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1370 article EN 2019-01-01

Translating the genetic and epigenetic heterogeneity underlying human cancers into therapeutic strategies is an ongoing challenge. Large-scale sequencing efforts have uncovered a spectrum of mutations in many hematologic malignancies, including acute myeloid leukemia (AML), suggesting that combinations of agents will be required to treat these diseases effectively. Combinatorial approaches will also be critical for combating the emergence of genetically heterogeneous subclones, rescue signals...

10.1073/pnas.1703094114 article EN Proceedings of the National Academy of Sciences 2017-08-07