- Topic Modeling
- Natural Language Processing Techniques
- Multimodal Machine Learning Applications
- Advanced Image and Video Retrieval Techniques
- Domain Adaptation and Few-Shot Learning
- Adversarial Robustness in Machine Learning
- Image Retrieval and Classification Techniques
- Speech and Dialogue Systems
- Anomaly Detection Techniques and Applications
- Data Management and Algorithms
- Biomedical Text Mining and Ontologies
- Chaos Control and Synchronization
- Advanced Clustering Algorithms Research
- Face and Expression Recognition
- Sentiment Analysis and Opinion Mining
- Neural Networks and Reservoir Computing
- Advanced Neural Network Applications
- Machine Learning and ELM
- Nonlinear Dynamics and Pattern Formation
- Human Pose and Action Recognition
- Advanced Statistical Methods and Models
- Random Lasers and Scattering Media
- Bayesian Modeling and Causal Inference
- Advanced Graph Neural Networks
- AI in Service Interactions
Nanyang Technological University
2020-2024
Jinan University
2024
State Key Laboratory of Cryptology
2024
University of Illinois Urbana-Champaign
2023
Microsoft (United States)
2021-2023
Carnegie Mellon University
2021-2023
Nankai University
2021-2022
Southwest University
2021-2022
Civil Aviation Management Institute of China
2022
Microsoft (Finland)
2020-2022
Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on general domain corpora, such as newswire and Web. A prevailing assumption is that even domain-specific pretraining can benefit by starting from general-domain language models. In this article, we challenge this assumption by showing that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains over continual pretraining of general-domain language models. To facilitate this investigation, we compile...
Jacob Devlin, Hao Cheng, Hao Fang, Saurabh Gupta, Li Deng, Xiaodong He, Geoffrey Zweig, Margaret Mitchell. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2015.
Large neural language models have transformed modern natural language processing (NLP) applications. However, fine-tuning such models for specific tasks remains challenging as model size increases, especially with small labeled datasets, which are common in biomedical NLP. We conduct a systematic study on fine-tuning stability, show that fine-tuning performance may be sensitive to pretraining settings, and present an exploration of techniques for addressing this instability. We show that these techniques can substantially improve fine-tuning for low-resource biomedical NLP. Specifically, freezing...
It is well known that deep neural networks (DNNs) are vulnerable to adversarial attacks, which are implemented by adding crafted perturbations onto benign examples. Min-max robust optimization based adversarial training can provide a notion of security against such attacks. However, adversarial robustness requires a significantly larger capacity of the network than natural training with only benign examples. This paper proposes a framework of concurrent adversarial training and weight pruning that enables model compression while still preserving adversarial robustness, and essentially tackles the dilemma...
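To make the min-max formulation concrete, here is a minimal PGD-style adversarial training sketch in PyTorch. It assumes a generic image classifier; the perturbation budget, step size, and iteration count are illustrative defaults rather than the paper's settings, and the weight-pruning component is omitted.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=8/255, alpha=2/255, num_steps=10):
    """Inner maximization of the min-max objective: find a bounded
    perturbation delta that maximizes the classification loss."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(num_steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            # Gradient-ascent step, then project back into the L-inf ball.
            delta += alpha * delta.grad.sign()
            delta.clamp_(-epsilon, epsilon)
        delta.grad.zero_()
    return (x + delta).detach()

def adversarial_training_step(model, optimizer, x, y):
    """Outer minimization: train the network on adversarial examples."""
    x_adv = pgd_attack(model, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the concurrent-pruning setting the abstract describes, the outer step would additionally enforce a sparsity constraint on the weights; that projection is left out here for brevity.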
Generalization and robustness are both key desiderata for designing machine learning methods. Adversarial training can enhance robustness, but past work often finds that it hurts generalization. In natural language processing (NLP), pre-training large neural language models such as BERT has demonstrated impressive gains in generalization on a variety of tasks, with further improvement from adversarial fine-tuning. However, these models are still vulnerable to adversarial attacks. In this paper, we show that adversarial pre-training can improve both generalization and robustness. We...
Task-oriented conversational systems often use dialogue state tracking to represent the user's intentions, which involves filling in values of pre-defined slots. Many approaches have been proposed, often using task-specific architectures with special-purpose classifiers. Recently, good results have been obtained using more general architectures based on pretrained language models. Here, we introduce a new variation of the language modeling approach that uses schema-driven prompting to provide task-aware history encoding that is used for both...
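As a rough illustration of schema-driven prompting for dialogue state tracking, the toy sketch below concatenates the dialogue history with a natural-language description of each slot and leaves value decoding to a pretrained LM. The schema, slot names, and prompt format are invented for illustration and are not taken from the paper.

```python
# Toy schema-driven prompting for dialogue state tracking: the dialogue
# history plus a natural-language slot description form one prompt per slot,
# and a language model would decode the slot value as free text.
# The schema below is hypothetical, not from the paper.
SCHEMA = {
    "restaurant-area": "area of the city where the restaurant is located",
    "restaurant-food": "type of cuisine the user wants to eat",
}

def build_prompt(history: list[str], slot: str) -> str:
    dialogue = "\n".join(history)
    return f"{dialogue}\n[slot] {slot}: {SCHEMA[slot]}\n[value]"

history = [
    "user: I'd like to find a cheap Italian place.",
    "system: Sure, which part of town?",
    "user: Somewhere in the centre, please.",
]
for slot in SCHEMA:
    print(build_prompt(history, slot))
    # Each prompt would be fed to a pretrained LM, which generates e.g.
    # "centre" for restaurant-area and "italian" for restaurant-food.
```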
We review the EfficientQA competition from NeurIPS 2020. The competition focused on open-domain question answering (QA), where systems take natural language questions as input and return natural language answers. The aim of the competition was to build systems that can predict correct answers while also satisfying strict on-disk memory budgets. These budgets were designed to encourage contestants to explore the trade-off between storing retrieval corpora or the parameters of learned models. In this report, we describe the motivation and organization of the competition, the best...
Few-shot classification aims to learn a discriminative feature representation to recognize unseen classes with few labeled support samples. While most few-shot learning methods focus on exploiting the spatial information of image samples, the frequency representation has also been proven essential in classification tasks. In this paper, we investigate the effect of different frequency components. To enhance the performance and generalizability of few-shot learning methods, we propose a novel Frequency-Guided Few-shot Learning framework (dubbed FGFL), which leverages task-specific...
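One plausible reading of "frequency components" here is a low/high-band split of the image spectrum. The sketch below performs such a split with a circular mask in the 2D Fourier domain; the cutoff radius is an arbitrary assumption, and FGFL's actual task-specific masking is not reproduced.

```python
import numpy as np

def split_frequency(image: np.ndarray, radius: int = 8):
    """Split a grayscale image into low- and high-frequency parts
    using a circular mask in the centered 2D Fourier spectrum."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    low_mask = dist <= radius
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high

image = np.random.rand(64, 64)  # stand-in for a support-set image
low, high = split_frequency(image)
assert np.allclose(low + high, image)  # the two bands sum back to the input
```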
Existing large language models (LLMs) can only afford fixed-size inputs due to the input length limit, preventing them from utilizing rich long-context information from past inputs. To address this, we propose a framework, Language Models Augmented with Long-Term Memory (LongMem), which enables LLMs to memorize long history. We design a novel decoupled network architecture with the original backbone LLM frozen as a memory encoder and an adaptive residual side-network as a memory retriever and reader. Such a decoupled memory design can easily cache and update...
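A heavily simplified sketch of the cached-memory idea: hidden states produced by a frozen encoder are stored, and the entries most similar to the current query are retrieved for the reader. The class below is an assumption-laden toy, not the paper's actual side-network or fusion mechanism.

```python
import numpy as np

class LongTermMemory:
    """Toy cache-and-retrieve memory in the spirit of LongMem."""

    def __init__(self, dim: int):
        self.keys = np.empty((0, dim))
        self.values = np.empty((0, dim))

    def cache(self, keys: np.ndarray, values: np.ndarray):
        # Append key/value states produced by the frozen memory encoder.
        self.keys = np.vstack([self.keys, keys])
        self.values = np.vstack([self.values, values])

    def retrieve(self, query: np.ndarray, k: int = 4) -> np.ndarray:
        # Dot-product similarity against all cached keys, top-k lookup.
        scores = self.keys @ query
        top = np.argsort(scores)[-k:]
        return self.values[top]

memory = LongTermMemory(dim=16)
memory.cache(np.random.randn(128, 16), np.random.randn(128, 16))
retrieved = memory.retrieve(np.random.randn(16))
print(retrieved.shape)  # (4, 16): memory entries handed to the reader
```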
Learning a generalizable feature representation is critical to few-shot image classification. While recent works exploited task-specific feature embedding using meta-tasks for few-shot learning, they are limited in many challenging tasks as they are distracted by excursive features such as the background, domain, and style of the image samples. In this work, we propose a novel disentangled feature representation framework, dubbed DFR, for few-shot learning applications. DFR can adaptively decouple the discriminative features that are modeled by the classification branch, from...
Hao Fang, Hao Cheng, Maarten Sap, Elizabeth Clark, Ari Holtzman, Yejin Choi, Noah A. Smith, Mari Ostendorf. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations. 2018.
Recent advances in Graph Neural Networks (GNNs) have achieved superior results on many challenging tasks, such as few-shot learning. Despite its capacity to learn and generalize a model from only a few annotated samples, GNN is limited in scalability, as deep GNN models usually suffer from severe over-fitting and over-smoothing. In this work, we propose a novel GNN framework with a triple-attention mechanism,...
Most of today's AI systems focus on using self-attention mechanisms and transformer architectures on large amounts of diverse data to achieve impressive performance gains. In this paper, we propose to augment the transformer architecture with an external attention mechanism to bring external knowledge and context to bear. By integrating external information into the prediction process, we hope to reduce the need for ever-larger models and increase the democratization of AI systems. We find that the proposed external attention mechanism can significantly improve the performance of existing AI systems, allowing...
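In spirit, external attention lets the model attend over its own context plus encoded outside knowledge. A minimal sketch, assuming plain dot-product attention and pre-encoded knowledge vectors (both hypothetical simplifications):

```python
import numpy as np

def external_attention(queries, keys, values, ext_keys, ext_values):
    """Attend jointly over the input's own key/value states and an
    external pool of knowledge key/value states (e.g. encoded KB text)."""
    K = np.vstack([keys, ext_keys])        # (n + m, d)
    V = np.vstack([values, ext_values])
    scores = queries @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

d = 8
out = external_attention(
    np.random.randn(4, d),                              # 4 query tokens
    np.random.randn(4, d), np.random.randn(4, d),       # self context
    np.random.randn(16, d), np.random.randn(16, d),     # external knowledge
)
print(out.shape)  # (4, 8): each token mixes self and external information
```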
Data clustering is a difficult problem due to the complex and heterogeneous natures of multidimensional data. To improve the clustering accuracy, we propose a scheme to capture the local correlation structures: associate each cluster with an independent weighting vector and embed it in the subspace spanned by an adaptive combination of the dimensions. Our algorithm takes advantage of known pairwise instance-level constraints. The data points in the constraint set are divided into groups through inference; each group is assigned to a feasible cluster which...
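A loose sketch of the locally weighted idea: each cluster carries its own dimension-weighting vector, so point-to-cluster distance is computed in that cluster's weighted subspace. The constraint-group inference described above is omitted, and all numbers are illustrative.

```python
import numpy as np

def weighted_distance(x: np.ndarray, center: np.ndarray, w: np.ndarray) -> float:
    """Distance of point x to a cluster center under that cluster's own
    dimension weights, so each cluster lives in its own weighted subspace."""
    return float(np.sum(w * (x - center) ** 2))

# Two clusters, each with an independent weighting vector over 3 dimensions.
centers = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
weights = np.array([[0.8, 0.1, 0.1],   # cluster 0 mostly uses dimension 0
                    [0.1, 0.1, 0.8]])  # cluster 1 mostly uses dimension 2

x = np.array([0.5, 4.0, 0.2])
assignment = int(np.argmin([
    weighted_distance(x, c, w) for c, w in zip(centers, weights)
]))
print(assignment)  # the cluster whose weighted subspace best fits x
```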
Out-of-vocabulary name errors in speech recognition create significant problems for downstream language processing, but the fact that they are rare poses challenges for automatic detection, particularly in an open-domain scenario. To address this problem, a multi-task recurrent neural network model for sentence-level name detection is proposed for use in combination with out-of-vocabulary word detection. The model is also effective at leveraging external text data. Experiments show a 26% improvement in name-error F-score over a system...
We develop a novel bi-directional attention model for dependency parsing, which learns to agree on headword predictions from the forward and backward parsing directions. The parsing procedure for each direction is formulated as sequentially querying a memory component that stores continuous headword embeddings. The proposed parser makes use of soft headword embeddings, allowing the model to implicitly capture high-order parsing history without dramatically increasing computational complexity. We conduct experiments on English, Chinese, and 12 other...
Hao Cheng, Yelong Shen, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.
Although deep neural networks (DNNs) have achieved great success in various computer vision tasks, it has recently been found that they are vulnerable to adversarial attacks. In this paper, we focus on the so-called backdoor attack, which injects a backdoor trigger into a small portion of the training data (also known as data poisoning) such that the trained DNN induces misclassification when facing examples with this trigger. To be specific, we carefully study the effect of both real and synthetic backdoor attacks on the internal response...
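To illustrate the poisoning setup described above, the toy sketch below stamps a small trigger patch on a random fraction of training images and flips their labels to an attacker-chosen target. Patch size, placement, and poisoning rate are illustrative assumptions, not settings from the paper.

```python
import numpy as np

def poison(images: np.ndarray, labels: np.ndarray,
           target_label: int, rate: float = 0.05, seed: int = 0):
    """Toy backdoor poisoning: stamp a small white patch in the corner of a
    random fraction of training images and relabel them to the target."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -3:, -3:] = 1.0   # 3x3 trigger in the bottom-right corner
    labels[idx] = target_label    # mislabel so the DNN learns the shortcut
    return images, labels, idx

clean = np.random.rand(100, 28, 28)
y = np.random.randint(0, 10, size=100)
poisoned, y_poisoned, idx = poison(clean, y, target_label=7)
print(len(idx), "of", len(clean), "examples carry the trigger")
```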
Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, Emmanuel Awa, Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. 2020.
Hao Cheng, Xiaodong Liu, Lis Pereira, Yaoliang Yu, Jianfeng Gao. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021.
The retrieval model is an indispensable component for real-world knowledge-intensive tasks, e.g., open-domain question answering (ODQA). As separate retrieval skills are annotated in different datasets, recent work focuses on customized methods, limiting model transferability and scalability. In this work, we propose a modular retriever where individual modules correspond to key skills that can be reused across datasets. Our approach supports flexible skill configurations based on the target domain to boost performance...
Extracting patient information from unstructured text is a critical task in health decision-support and clinical research. Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning, in contrast to supervised learning, which requires much more costly human annotations. However, despite drastic advances in modern LLMs such as GPT-4, they still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health. Here, we...
Hao Cheng, Hao Fang, Mari Ostendorf. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.