Houfeng Wang

ORCID: 0000-0001-7130-1589
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Text Analysis Techniques
  • Text and Document Classification Technologies
  • Sentiment Analysis and Opinion Mining
  • Multimodal Machine Learning Applications
  • Speech and dialogue systems
  • Biomedical Text Mining and Ontologies
  • Text Readability and Simplification
  • Spam and Phishing Detection
  • Semantic Web and Ontologies
  • Advanced Graph Neural Networks
  • Domain Adaptation and Few-Shot Learning
  • Web Data Mining and Analysis
  • Data Quality and Management
  • Recommender Systems and Techniques
  • Adversarial Robustness in Machine Learning
  • Advanced Computational Techniques and Applications
  • Language, Metaphor, and Cognition
  • Expert finding and Q&A systems
  • Advanced Image and Video Retrieval Techniques
  • Speech Recognition and Synthesis
  • Algorithms and Data Compression
  • Face and Expression Recognition
  • Machine Learning in Bioinformatics

University of Chinese Academy of Sciences
2024-2025

Peking University
2015-2024

Microsoft Research Asia (China)
2016-2022

Fujian Agriculture and Forestry University
2021-2022

South China Institute of Collaborative Innovation
2015-2018

Singapore University of Technology and Design
2014

Institute of Linguistics
2014

University of Science and Technology of China
2007

Central China Normal University
1991-1996

Aspect-level sentiment classification aims at identifying the sentiment polarity of a specific target in its context. Previous approaches have realized the importance of targets in sentiment classification and developed various methods with the goal of precisely modeling their contexts via generating target-specific representations. However, these studies always ignore the separate modeling of targets. In this paper, we argue that both targets and contexts deserve special treatment and need to learn their own representations via interactive learning. Then, we propose the interactive attention networks...

10.24963/ijcai.2017/568 preprint EN 2017-07-28

Lianzhe Huang, Dehong Ma, Sujian Li, Xiaodong Zhang, Houfeng Wang. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1345 article EN cc-by 2019-01-01

Multi-label classification is an important yet challenging task in natural language processing. It is more complex than single-label classification in that the labels tend to be correlated. Existing methods tend to ignore the correlations between labels. Besides, different parts of the text can contribute differently to predicting different labels, which is not considered by existing models. In this paper, we propose to view the multi-label classification task as a sequence generation problem, and apply a sequence generation model with a novel decoder structure to solve it. Extensive...

10.48550/arxiv.1806.04822 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Relation classification is an important semantic processing task in the field of natural language processing (NLP). In this paper, we present a novel model, BRCNN, to classify the relation of two entities in a sentence. Some state-of-the-art systems concentrate on modeling the shortest dependency path (SDP) between two entities, leveraging convolutional or recurrent neural networks. We further explore how to make full use of the dependency relations information in the SDP, by combining convolutional neural networks and two-channel recurrent neural networks with long short term memory (LSTM) units. We propose...

10.18653/v1/p16-1072 article EN cc-by Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016-01-01

Jingjing Xu, Xu Sun, Qi Zeng, Xiaodong Zhang, Xuancheng Ren, Houfeng Wang, Wenjie Li. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2018.

10.18653/v1/p18-1090 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01

Yang Liu, Furu Wei, Sujian Li, Heng Ji, Ming Zhou, Houfeng Wang. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2015.

10.3115/v1/p15-2047 preprint EN cc-by 2015-01-01

Hierarchical text classification is a challenging subtask of multi-label classification due to its complex label hierarchy. Existing methods encode text and label hierarchy separately and mix their representations for classification, where the hierarchy remains unchanged for all input text. Instead of modeling them separately, in this work, we propose Hierarchy-guided Contrastive Learning (HGCLR) to directly embed the hierarchy into a text encoder. During training, HGCLR constructs positive samples for input text under the guidance of the label hierarchy. By pulling together the input text and its positive sample, the text encoder can learn...

10.18653/v1/2022.acl-long.491 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

Ziqiang Cao, Furu Wei, Sujian Li, Wenjie Li, Ming Zhou, Houfeng Wang. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2015.

10.3115/v1/p15-2136 article EN cc-by 2015-01-01

Aspect term extraction (ATE) aims at identifying all aspect terms in a sentence and is usually modeled as a sequence labeling problem. However, sequence labeling based methods cannot make full use of the overall meaning of the whole sentence and are limited in processing dependencies between labels. To tackle these problems, we first explore formalizing ATE as a sequence-to-sequence (Seq2Seq) learning task where the source sequence and target sequence are composed of words and labels respectively. At the same time, to make Seq2Seq learning suit ATE, where labels correspond to words one by one, we design gated unit...

10.18653/v1/p19-1344 article EN cc-by 2019-01-01

Microblogging services, such as Twitter, have become popular channels for people to express their opinions towards a broad range of topics. Twitter generates a huge volume of instant messages (i.e. tweets) carrying users' sentiments and attitudes every minute, which both necessitates automatic opinion summarization and poses great challenges to the summarization system. In this paper, we study the problem of opinion summarization for entities, such as celebrities and brands, in Twitter. We propose an entity-centric topic-based opinion summarization framework, which aims to produce opinion summaries...

10.1145/2339530.2339592 article EN 2012-08-12

Previous work introduced transition-based algorithms to form a unified architecture of parsing rhetorical structures (including span, nuclearity and relation), but did not achieve satisfactory performance. In this paper, we propose that a transition-based model is more appropriate for parsing the naked discourse tree (i.e., identifying span and nuclearity) due to data sparsity. At the same time, we argue that relation labeling can benefit from the naked tree structure and should be treated elaborately with consideration of three kinds of relations including...

10.18653/v1/p17-2029 article EN cc-by 2017-01-01

Answer selection plays a key role in community question answering (CQA). Previous research on answer selection usually ignores the problems of redundancy and noise prevalent in CQA. In this paper, we propose to treat different text segments differently and design a novel attentive interactive neural network (AI-NN) to focus on those text segments useful to answer selection. The representations of question and answer are first learned by convolutional neural networks (CNNs) or other neural network architectures. Then AI-NN learns interactions of each paired segments of the two texts. Row-wise...

10.1609/aaai.v31i1.11006 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2017-02-12

Large language models (LLMs) often contain misleading content, emphasizing the need to align them with human values to ensure secure AI systems. Reinforcement learning from human feedback (RLHF) has been employed to achieve this alignment. However, it has two main drawbacks: (1) RLHF exhibits complexity, instability, and sensitivity to hyperparameters in contrast to SFT. (2) Despite massive trial-and-error, multiple sampling is reduced to pair-wise contrast, thus lacking contrasts from a macro perspective. In...

10.1609/aaai.v38i17.29865 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

Aspect-level sentiment classification aims at identifying the sentiment polarity of a specific target in its context. Previous approaches have realized the importance of targets in sentiment classification and developed various methods with the goal of precisely modeling their contexts via generating target-specific representations. However, these studies always ignore the separate modeling of targets. In this paper, we argue that both targets and contexts deserve special treatment and need to learn their own representations via interactive learning. Then, we propose the interactive attention networks (IAN)...

10.48550/arxiv.1709.00893 preprint EN other-oa arXiv (Cornell University) 2017-01-01

We propose a simple yet effective technique for neural network learning. The forward propagation is computed as usual. In back propagation, only a small subset of the full gradient is computed to update the model parameters. The gradient vectors are sparsified in such a way that only the top-$k$ elements (in terms of magnitude) are kept. As a result, only $k$ rows or columns (depending on the layout) of the weight matrix are modified, leading to a linear reduction ($k$ divided by the vector dimension) in the computational cost. Surprisingly, experimental results demonstrate that we...

10.48550/arxiv.1706.06197 preprint EN other-oa arXiv (Cornell University) 2017-01-01
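
The top-$k$ sparsification described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `top_k_sparsify` and `weight_gradient`, the single linear layer y = W x, and the toy numbers are all assumptions made for illustration.

```python
from typing import List


def top_k_sparsify(grad: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries of grad and zero out the rest."""
    if k <= 0:
        return [0.0] * len(grad)
    if k >= len(grad):
        return list(grad)
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


def weight_gradient(grad_out: List[float], x: List[float]) -> List[List[float]]:
    """Gradient of y = W x with respect to W: the outer product of grad_out and x."""
    return [[g * xi for xi in x] for g in grad_out]


# Hypothetical output gradient and layer input (toy values).
grad_out = [0.1, -2.0, 0.05, 1.5]
x = [1.0, 2.0, 3.0]

sparse = top_k_sparsify(grad_out, k=2)
dW = weight_gradient(sparse, x)

# Only the rows of dW that match kept gradient entries are non-zero.
nonzero_rows = [i for i, row in enumerate(dW) if any(v != 0.0 for v in row)]
print(sparse, nonzero_rows)
```

With k = 2 out of 4 output dimensions, only two rows of the weight gradient are non-zero. A real implementation would skip the zeroed rows entirely, which is where the linear cost reduction mentioned in the abstract would come from; this sketch still computes the full outer product for clarity.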

Targeted sentiment analysis (TSA) aims at extracting targets and classifying their sentiment classes. Previous works only exploit word embeddings as features and do not explore more potentials of neural networks when jointly learning the two tasks. In this paper, we carefully design the hierarchical stack bidirectional gated recurrent units (HSBi-GRU) model to learn abstract features for both tasks, and we propose a HSBi-GRU based joint model which allows the target label to have influence on the sentiment label. Experimental results on two datasets show...

10.18653/v1/d18-1504 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

Aspect-level sentiment classification aims to distinguish the sentiment polarities over aspect terms in a sentence. Existing approaches mostly focus on modeling the relationship between the given aspect words and their contexts with attention, and ignore the use of more elaborate knowledge implicit in the context. In this paper, we bring syntactic awareness to the model through a graph attention network on the dependency tree structure, and external pre-training knowledge through the BERT language model, which helps to model the interaction between the context and aspect words better. And the subwords of BERT are integrated into...

10.18653/v1/2020.coling-main.69 article EN cc-by Proceedings of the 28th International Conference on Computational Linguistics 2020-01-01

The encoder-decoder framework has shown recent success in image captioning. Visual attention, which is good at detailedness, and semantic attention, which is good at comprehensiveness, have been separately proposed to ground the caption on the image. In this paper, we propose the Stepwise Image-Topic Merging Network (simNet) that makes use of the two kinds of attention at the same time. At each time step when generating the caption, the decoder adaptively merges the attentive information in the extracted topics and the image according to the generated context, so that the visual information and the semantic information can be...

10.18653/v1/d18-1013 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

Grammatical error correction (GEC) is a promising natural language processing (NLP) application, whose goal is to change sentences with grammatical errors into correct ones. Neural machine translation (NMT) approaches have been widely applied to this translation-like task. However, such methods need a fairly large parallel corpus of error-annotated sentence pairs, which is not easy to get, especially in the field of Chinese grammatical error correction. In this paper, we propose a simple yet effective method to improve the NMT-based GEC...

10.1609/aaai.v34i01.5476 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Hierarchical text classification (HTC) is a challenging subtask of multi-label classification due to its complex label hierarchy. Recently, pretrained language models (PLMs) have been widely adopted in HTC through a fine-tuning paradigm. However, in this paradigm, there exists a huge gap between the classification tasks with sophisticated label hierarchy and the masked language model (MLM) pretraining tasks of PLMs, and thus the potential of PLMs cannot be fully tapped. To bridge the gap, in this paper, we propose HPT, a Hierarchy-aware Prompt Tuning method to handle HTC from an MLM...

10.18653/v1/2022.emnlp-main.246 article EN cc-by Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing 2022-01-01

This paper proposes a method to extract product features from user reviews and generate a review summary. This method relies only on product specifications, which usually are easy to obtain. Other resources like a segmenter, POS tagger or parser are not required. At the feature extraction stage, multiple specifications are clustered to extend the vocabulary of product features. Hierarchy structure information and unit of measurement information are mined from the specification to improve the accuracy of feature extraction. At the summary generation stage, hierarchy information in the specifications is used to provide a natural conceptual view.

10.3115/1667583.1667637 article EN 2009-01-01