Ting Liu

ORCID: 0000-0002-9091-7757
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Text Analysis Techniques
  • Speech and Dialogue Systems
  • Multimodal Machine Learning Applications
  • Sentiment Analysis and Opinion Mining
  • Text Readability and Simplification
  • Text and Document Classification Technologies
  • Web Data Mining and Analysis
  • Speech Recognition and Synthesis
  • Semantic Web and Ontologies
  • Domain Adaptation and Few-Shot Learning
  • Adversarial Robustness in Machine Learning
  • Complex Network Analysis Techniques
  • Explainable Artificial Intelligence (XAI)
  • Biomedical Text Mining and Ontologies
  • Advanced Graph Neural Networks
  • Expert Finding and Q&A Systems
  • Software Engineering Research
  • Handwritten Text Recognition Techniques
  • Rough Sets and Fuzzy Logic
  • Spam and Phishing Detection
  • Bayesian Modeling and Causal Inference
  • Multi-Agent Systems and Negotiation
  • Recommender Systems and Techniques

Harbin Institute of Technology
2015-2024

Peng Cheng Laboratory
2020-2021

Microsoft Research (United Kingdom)
2020

Shanghai Jiao Tong University
2020

Heilongjiang Institute of Technology
2008-2017

Johns Hopkins University
2016

Baidu (China)
2016

University of Toronto
2015

University at Albany, State University of New York
2003-2012

Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou. Findings of the Association for Computational Linguistics: EMNLP 2020.

10.18653/v1/2020.findings-emnlp.139 article EN cc-by 2020-01-01

We present a method that learns word embedding for Twitter sentiment classification in this paper. Most existing algorithms for learning continuous word representations typically only model the syntactic context of words but ignore the sentiment of text. This is problematic for sentiment analysis as they usually map words with similar syntactic context but opposite sentiment polarity, such as good and bad, to neighboring word vectors. We address this issue by learning sentiment-specific word embedding (SSWE), which encodes sentiment information in the continuous representation of words. Specifically, we develop three neural networks...

10.3115/v1/p14-1146 article EN 2014-01-01
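
The hybrid training idea behind SSWE can be made concrete with a toy sketch. Below is a minimal, assumption-laden Python rendering of an SSWE_u-style hinge loss: the network is assumed to emit a (syntactic, sentiment) score pair for an original ngram and a corrupted one, and alpha balances the two objectives; the real architecture and corruption scheme follow the paper.

def sswe_u_loss(score_orig, score_corr, polarity, alpha=0.5):
    # score_*: (syntactic_score, sentiment_score) pairs emitted by the
    # embedding network for the original and the corrupted ngram.
    # polarity: +1 for a positive tweet, -1 for a negative one.
    delta = 1.0 if polarity > 0 else -1.0
    loss_syntactic = max(0.0, 1.0 - score_orig[0] + score_corr[0])
    loss_sentiment = max(0.0, 1.0 - delta * score_orig[1] + delta * score_corr[1])
    return alpha * loss_syntactic + (1.0 - alpha) * loss_sentiment

print(sswe_u_loss((0.9, 0.7), (0.2, -0.1), polarity=+1))  # small loss: scores already separated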

We introduce a deep memory network for aspect-level sentiment classification. Unlike feature-based SVM and sequential neural models such as LSTM, this approach explicitly captures the importance of each context word when inferring the sentiment polarity of an aspect. Such importance degree and text representation are calculated with multiple computational layers, each of which is a neural attention model over an external memory. Experiments on laptop and restaurant datasets demonstrate that our approach performs comparably to state-of-the-art feature-based...

10.18653/v1/d16-1021 article EN cc-by Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing 2016-01-01
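
To make the layered attention concrete, here is a minimal numpy sketch of one attention hop over the external memory (the stack of context word vectors); the real model adds a learned linear transform of the aspect vector between hops, which is simplified to an identity below.

import numpy as np

def attention_hop(memory, aspect, W, b):
    # memory: (n, d) context word vectors; aspect: (d,) aspect representation.
    # Score each memory slot against the aspect, softmax, return weighted sum.
    n = memory.shape[0]
    g = np.tanh(np.concatenate([memory, np.tile(aspect, (n, 1))], axis=1) @ W + b)
    alpha = np.exp(g - g.max())
    alpha /= alpha.sum()
    return (alpha * memory).sum(axis=0)

rng = np.random.default_rng(0)
memory, aspect = rng.normal(size=(6, 4)), rng.normal(size=4)
W, b = rng.normal(size=(8, 1)), 0.0
for _ in range(3):                       # three computational layers (hops)
    aspect = attention_hop(memory, aspect, W, b) + aspect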

Bidirectional Encoder Representations from Transformers (BERT) has shown marvelous improvements across various NLP tasks, and consecutive variants have been proposed to further improve the performance of pre-trained language models. In this paper, we target on revisiting Chinese pre-trained language models to examine their effectiveness in a non-English language and release the Chinese pre-trained language model series to the community. We also propose a simple but effective model called MacBERT, which improves upon RoBERTa in several ways, especially the masking strategy that...

10.18653/v1/2020.findings-emnlp.58 preprint EN cc-by 2020-01-01
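
MacBERT's "MLM as correction" replaces a token with a similar word rather than an artificial [MASK] symbol, and the model learns to restore the original. A hypothetical sketch, where the `similar` lookup stands in for the word-similarity tool the paper relies on:

import random

def mac_style_mask(tokens, similar, mask_rate=0.15, seed=0):
    # Replace selected tokens with near-synonyms instead of [MASK];
    # `targets` records the original words the model must recover.
    rng = random.Random(seed)
    corrupted, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            candidates = similar.get(tok)
            corrupted[i] = rng.choice(candidates) if candidates else rng.choice(tokens)
            targets[i] = tok
    return corrupted, targets

tokens = "we replace words with similar words instead of mask tokens".split()
print(mac_style_mask(tokens, {"similar": ["related", "alike"], "mask": ["cover"]}, mask_rate=0.3))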

Bidirectional Encoder Representations from Transformers (BERT) has shown marvelous improvements across various NLP tasks, and its consecutive variants have been proposed to further improve the performance of the pre-trained language models. In this paper, we first aim to introduce the whole word masking (wwm) strategy for Chinese BERT, along with a series of Chinese pre-trained models. Then we also propose a simple but effective model called MacBERT, which improves upon RoBERTa in several ways. Especially, we propose a new MLM as correction...

10.1109/taslp.2021.3124365 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2021-01-01
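
Whole word masking is easy to state in code: once a Chinese segmenter has grouped characters into words, every character of a chosen word is masked together rather than independently. A minimal sketch, with segmentation assumed as input:

import random

def whole_word_mask(words, mask_rate=0.15, seed=0):
    # words: output of a Chinese word segmenter, e.g. ["哈尔滨", "工业", "大学"].
    # All characters of a selected word are masked as a unit.
    rng = random.Random(seed)
    out = []
    for word in words:
        if rng.random() < mask_rate:
            out.extend(["[MASK]"] * len(word))
        else:
            out.extend(list(word))
    return out

print(whole_word_mask(["哈尔滨", "工业", "大学"], mask_rate=0.5))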

Cloze-style reading comprehension is a representative problem in mining the relationship between document and query. In this paper, we present a simple but novel model called attention-over-attention reader for better solving the cloze-style reading comprehension task. The proposed model aims to place another attention mechanism over the document-level attention and induces "attended attention" for final answer predictions. One advantage of our model is that it is simpler than related works while giving excellent performance. In addition to the primary model, we also...

10.18653/v1/p17-1055 preprint EN cc-by Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2017-01-01
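
The attention-over-attention computation itself is only a few matrix operations. A numpy sketch, assuming precomputed document and query hidden states:

import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_over_attention(doc, query):
    # doc: (|D|, h) document states; query: (|Q|, h) query states.
    M = doc @ query.T                        # pairwise matching scores, (|D|, |Q|)
    alpha = softmax(M, axis=0)               # document attention per query word
    beta = softmax(M, axis=1).mean(axis=0)   # "attended attention" over query words
    return alpha @ beta                      # per-document-token weights, sum to 1

doc, query = np.random.rand(7, 16), np.random.rand(4, 16)
print(attention_over_attention(doc, query).sum())  # 1.0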

Duyu Tang, Bing Qin, Ting Liu. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2015.

10.3115/v1/p15-1098 article EN cc-by 2015-01-01

We propose learning sentiment-specific word embeddings, dubbed sentiment embeddings, in this paper. Existing word embedding learning algorithms typically only use the contexts of words but ignore the sentiment of texts. It is problematic for sentiment analysis because words with similar contexts but opposite sentiment polarity, such as good and bad, are mapped to neighboring word vectors. We address this issue by encoding sentiment information...

10.1109/tkde.2015.2489653 article EN IEEE Transactions on Knowledge and Data Engineering 2015-10-13

In this paper, we develop a deep learning system for message-level Twitter sentiment classification. Among the 45 submitted systems including the SemEval 2013 participants, ours (Coooolll) is ranked 2nd on the Twitter2014 test set of SemEval 2014 Task 9. Coooolll is built in a supervised learning framework by concatenating the sentiment-specific word embedding (SSWE) features with the state-of-the-art hand-crafted features. We develop a neural network with a hybrid loss function to learn SSWE, which encodes the sentiment information of tweets in continuous...

10.3115/v1/s14-2033 article EN cc-by Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) 2014-01-01
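
The system design reduces to feature concatenation ahead of a standard supervised classifier. A toy scikit-learn sketch with random stand-ins for the pooled SSWE vectors and the hand-crafted features (Coooolll's actual classifier and feature sets differ):

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
sswe_feats = rng.normal(size=(200, 50))   # stand-in for pooled SSWE features
hand_feats = rng.normal(size=(200, 10))   # stand-in for lexicon/ngram features
labels = rng.integers(0, 3, size=200)     # positive / negative / neutral

X = np.hstack([sswe_feats, hand_feats])   # concatenate the two feature views
clf = LinearSVC().fit(X, labels)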

Sentiment analysis (also known as opinion mining) is an active research area in natural language processing. It aims at identifying, extracting and organizing sentiments from user-generated texts in social networks, blogs or product reviews. Many studies in the literature exploit machine learning approaches to solve sentiment analysis tasks from different perspectives over the past 15 years. Since the performance of a machine learner heavily depends on the choice of data representation, many studies devote to building a powerful feature extractor...

10.1002/widm.1171 article EN Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery 2015-10-23

In this paper, we explore slot tagging with only a few labeled support sentences (a.k.a. few-shot). Few-shot slot tagging faces a unique challenge compared to other few-shot classification problems as it calls for modeling the dependencies between labels. But it is hard to apply previously learned label dependencies to an unseen domain, due to the discrepancy of label sets. To tackle this, we introduce a collapsed dependency transfer mechanism into the conditional random field (CRF) to transfer abstract label dependency patterns as transition scores. In the few-shot setting, the emission...

10.18653/v1/2020.acl-main.128 article EN cc-by 2020-01-01
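
The collapsed dependency transfer can be pictured as learning a handful of abstract transition scores and expanding them into a full transition matrix for whatever label set the target domain has. A simplified sketch (the paper distinguishes more transition patterns than the four shown):

import numpy as np

def expand_transitions(labels, abstract):
    # Expand abstract transition scores into a (n_labels, n_labels) matrix
    # for an unseen domain's BIO label set.
    n = len(labels)
    T = np.zeros((n, n))
    for i, a in enumerate(labels):
        for j, b in enumerate(labels):
            if b == "O":
                key = "any->O"
            elif b.startswith("B-"):
                key = "any->B"
            elif a != "O" and a[2:] == b[2:]:
                key = "same->I"          # I continues the entity a started
            else:
                key = "other->I"
            T[i, j] = abstract[key]
    return T

labels = ["O", "B-food", "I-food", "B-price", "I-price"]
print(expand_transitions(labels, {"any->O": 0.2, "any->B": 0.1, "same->I": 0.9, "other->I": -1.0}))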

Yiming Cui, Ting Liu, Wanxiang Che, Li Xiao, Zhipeng Chen, Wentao Ma, Shijin Wang, Guoping Hu. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1600 preprint EN cc-by 2019-01-01

We present CodeBERT, a bimodal pre-trained model for programming language (PL) and natural language (NL). CodeBERT learns general-purpose representations that support downstream NL-PL applications such as natural language code search, code documentation generation, etc. We develop CodeBERT with a Transformer-based neural architecture, and train it with a hybrid objective function that incorporates the pre-training task of replaced token detection, which is to detect plausible alternatives sampled from generators. This enables us to utilize...

10.48550/arxiv.2002.08155 preprint EN other-oa arXiv (Cornell University) 2020-01-01
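
Replaced token detection turns pre-training into per-token binary classification: a generator proposes plausible substitutes, and the discriminator flags which tokens were swapped. A toy sketch with a hypothetical `generator_sample` callback:

import random

def rtd_corrupt(tokens, generator_sample, replace_rate=0.15, seed=0):
    # Corrupt some positions with generator proposals and emit 0/1 labels:
    # 1 means the discriminator should mark the token as replaced.
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < replace_rate:
            alt = generator_sample(tok)
            corrupted.append(alt)
            labels.append(int(alt != tok))
        else:
            corrupted.append(tok)
            labels.append(0)
    return corrupted, labels

code = "def add ( a , b ) : return a + b".split()
print(rtd_corrupt(code, lambda t: "sub" if t == "add" else t, replace_rate=0.3))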

Intent detection and slot filling are two main tasks for building a spoken language understanding (SLU) system. The two tasks are closely related, and the information of one task can benefit the other. Previous studies either implicitly model the two tasks with a multi-task framework or only explicitly consider the single information flow from intent to slot. None of the prior approaches models the bidirectional connection between the two tasks simultaneously in a unified framework. In this paper, we propose a Co-Interactive Transformer which considers the cross-impact between the two tasks. Instead...

10.1109/icassp39728.2021.9414110 article EN ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021-05-13
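
The bidirectional flow can be sketched as two cross-attention directions applied in the same layer, so slot states read from intent states and vice versa (the actual model wraps this inside a Transformer block with the usual projections and normalization):

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_interact(H_slot, H_intent):
    # H_slot: (n, d) token states for slot filling; H_intent: (m, d) for intent.
    A = softmax(H_slot @ H_intent.T, axis=-1)   # slot -> intent attention
    B = softmax(H_intent @ H_slot.T, axis=-1)   # intent -> slot attention
    return H_slot + A @ H_intent, H_intent + B @ H_slot

H_slot, H_intent = np.random.rand(8, 32), np.random.rand(1, 32)
new_slot, new_intent = co_interact(H_slot, H_intent)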

Event detection remains a challenge due to the difficulty of encoding word semantics in various contexts. Previous approaches heavily depend on language-specific knowledge and pre-existing natural language processing (NLP) tools. However, compared to English, not all languages have such resources and tools available. A more promising approach is to automatically learn effective features from data, without relying on language-specific resources. In this paper, we develop a hybrid neural network to capture both sequence and chunk...

10.18653/v1/p16-2011 article EN cc-by 2016-01-01
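
A hybrid of sequence and chunk features can be approximated with a BiLSTM branch and a convolutional branch whose outputs are concatenated per token before trigger classification. A minimal PyTorch sketch under those assumptions (layer sizes are arbitrary):

import torch
import torch.nn as nn

class HybridEventDetector(nn.Module):
    def __init__(self, d_emb=64, d_hid=32, n_classes=34):
        super().__init__()
        self.bilstm = nn.LSTM(d_emb, d_hid, batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(d_emb, 2 * d_hid, kernel_size=3, padding=1)
        self.out = nn.Linear(4 * d_hid, n_classes)

    def forward(self, x):                                  # x: (batch, seq, d_emb)
        seq_feats, _ = self.bilstm(x)                      # sequence features
        chunk_feats = self.conv(x.transpose(1, 2)).transpose(1, 2)  # local chunk features
        return self.out(torch.cat([seq_feats, chunk_feats], dim=-1))

logits = HybridEventDetector()(torch.randn(2, 10, 64))    # (2, 10, 34) trigger scores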

Deep pretrained language models have achieved great success in the way of pretraining first and then fine-tuning. But such a sequential transfer learning paradigm often confronts the catastrophic forgetting problem and leads to sub-optimal performance. To fine-tune with less forgetting, we propose a recall and learn mechanism, which adopts the idea of multi-task learning and jointly learns pretraining tasks and downstream tasks. Specifically, we introduce a Pretraining Simulation mechanism to recall the knowledge from pretraining tasks without data, and an Objective Shifting mechanism to focus...

10.18653/v1/2020.emnlp-main.634 article EN 2020-01-01
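
The two mechanisms compose into a single annealed objective. In this rough sketch, Pretraining Simulation is reduced to a quadratic pull toward the pretrained weights (the paper uses a more careful approximation), and Objective Shifting is the schedule on `lam`:

import torch

def recall_and_learn_loss(task_loss, params, pretrained, step, total_steps, k=0.1):
    # lam shifts from 0 (mostly recall pretraining) to 1 (mostly learn the task).
    lam = step / total_steps
    recall = sum(((p - p0) ** 2).sum() for p, p0 in zip(params, pretrained))
    return lam * task_loss + (1 - lam) * k * recall

w = torch.nn.Parameter(torch.ones(3))
w0 = torch.ones(3).detach()                 # snapshot of the pretrained weights
loss = recall_and_learn_loss(torch.tensor(0.5), [w], [w0], step=10, total_steps=100)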

Recent work has shown success in using continuous word embeddings learned from unlabeled data as features to improve supervised NLP systems, which is regarded as a simple semi-supervised learning mechanism. However, fundamental problems on effectively incorporating the word embedding features within the framework of linear models remain. In this study, we investigate and analyze three different approaches, including a newly proposed distributional prototype approach, for utilizing the embedding features. The presented approaches...

10.3115/v1/d14-1012 article EN 2014-01-01
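
One way to read the distributional prototype idea: give each label a prototype vector (for example, the average embedding of words typical of that label) and feed a word's similarities to the prototypes into the linear model instead of raw dense values. A hedged numpy sketch:

import numpy as np

def prototype_features(word_vec, prototypes):
    # prototypes: {label: prototype vector}; returns cosine-similarity features.
    feats = {}
    for label, proto in prototypes.items():
        denom = np.linalg.norm(word_vec) * np.linalg.norm(proto) + 1e-9
        feats[label] = float(word_vec @ proto / denom)
    return feats

protos = {"PER": np.random.rand(50), "LOC": np.random.rand(50)}
print(prototype_features(np.random.rand(50), protos))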

State-of-the-art studies have demonstrated the superiority of joint modeling over pipeline implementation for medical named entity recognition and normalization due to the mutual benefits between the two processes. To exploit these benefits in a more sophisticated way, we propose a novel deep neural multi-task learning framework with explicit feedback strategies to jointly model recognition and normalization. On one hand, our method benefits from the general representations of both tasks provided by multi-task learning. On the other hand, our method successfully converts...

10.1609/aaai.v33i01.3301817 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

This paper describes our system (HIT-SCIR) submitted to the CoNLL 2018 shared task on Multilingual Parsing from Raw Text to Universal Dependencies. We base our submission on Stanford's winning system for the 2017 shared task and make two effective extensions: 1) incorporating deep contextualized word embeddings into both the part-of-speech tagger and the parser; 2) ensembling parsers trained with different initialization. We also explore different ways of concatenating treebanks for further improvements. Experimental results on the development data show...

10.48550/arxiv.1807.03121 preprint EN other-oa arXiv (Cornell University) 2018-01-01

We introduce N-LTP, an open-source neural language technology platform supporting six fundamental Chinese NLP tasks: lexical analysis (Chinese word segmentation, part-of-speech tagging, and named entity recognition), syntactic parsing (dependency parsing), and semantic parsing (semantic dependency parsing and semantic role labeling). Unlike the existing state-of-the-art toolkits, such as Stanza, that adopt an independent model for each task, N-LTP adopts the multi-task framework by using a shared pre-trained model, which has...

10.18653/v1/2021.emnlp-demo.6 preprint EN cc-by 2021-01-01
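
The shared-encoder multi-task layout can be sketched generically: one encoder (standing in for the shared pre-trained model) feeds several light task-specific heads. This is an illustrative stub, not N-LTP's actual code or head dimensions:

import torch
import torch.nn as nn

class SharedEncoderMultiTask(nn.Module):
    def __init__(self, vocab=1000, d=128, tasks=("cws", "pos", "ner", "dep", "sdp", "srl")):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.encoder = nn.LSTM(d, d, batch_first=True)   # stand-in for the shared PLM
        self.heads = nn.ModuleDict({t: nn.Linear(d, 16) for t in tasks})

    def forward(self, ids):                              # ids: (batch, seq)
        h, _ = self.encoder(self.embed(ids))
        return {t: head(h) for t, head in self.heads.items()}

outputs = SharedEncoderMultiTask()(torch.randint(0, 1000, (2, 8)))  # six per-task outputs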

Most pre-trained language models (PLMs) construct word representations at subword level with Byte-Pair Encoding (BPE) or its variations, by which OOV (out-of-vocab) words are almost avoidable. However, those methods split a word into subword units and make the representation incomplete and fragile. In this paper, we propose a character-aware pre-trained language model named CharBERT, improving on previous methods (such as BERT, RoBERTa) to tackle these problems. We first construct the contextual word embedding for each token from the sequential character...

10.18653/v1/2020.coling-main.4 article EN cc-by Proceedings of the 28th International Conference on Computational Linguistics 2020-01-01
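
The character-aware part of the idea reduces to building a per-token embedding from the token's characters and fusing it with the subword embedding. A PyTorch sketch under that reading (CharBERT's real fusion module is more elaborate):

import torch
import torch.nn as nn

class CharAwareEmbedding(nn.Module):
    def __init__(self, n_chars=128, n_subwords=30000, d=64):
        super().__init__()
        self.char_embed = nn.Embedding(n_chars, d)
        self.char_gru = nn.GRU(d, d // 2, batch_first=True, bidirectional=True)
        self.subword_embed = nn.Embedding(n_subwords, d)

    def forward(self, char_ids, subword_id):             # char_ids: (1, chars_in_token)
        _, h = self.char_gru(self.char_embed(char_ids))  # h: (2, 1, d // 2)
        char_repr = torch.cat([h[0], h[1]], dim=-1)      # both directions, (1, d)
        return torch.cat([char_repr, self.subword_embed(subword_id)], dim=-1)

emb = CharAwareEmbedding()(torch.randint(0, 128, (1, 5)), torch.tensor([42]))  # (1, 128)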

10.1007/s13042-020-01069-8 article EN International Journal of Machine Learning and Cybernetics 2020-02-17

Research into the area of multiparty dialog has grown considerably over recent years. We present the Molweni dataset, a machine reading comprehension (MRC) dataset with discourse structure built over multiparty dialog. Molweni's source samples from the Ubuntu Chat Corpus, including 10,000 dialogs comprising 88,303 utterances. We annotate 30,066 questions on this corpus, with both answerable and unanswerable questions. Molweni also uniquely contributes discourse dependency annotations in a modified Segmented Discourse Representation Theory...

10.18653/v1/2020.coling-main.238 article EN cc-by Proceedings of the 28th International Conference on Computational Linguistics 2020-01-01

Xiachong Feng, Xiaocheng Feng, Libo Qin, Bing Qin, Ting Liu. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.

10.18653/v1/2021.acl-long.117 article EN cc-by 2021-01-01