Yankai Lin

ORCID: 0000-0002-0151-6178
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Graph Neural Networks
  • Multimodal Machine Learning Applications
  • Advanced Text Analysis Techniques
  • Speech Recognition and Synthesis
  • Semantic Web and Ontologies
  • Software Engineering Research
  • Domain Adaptation and Few-Shot Learning
  • Recommender Systems and Techniques
  • Speech and Dialogue Systems
  • Adversarial Robustness in Machine Learning
  • Data Quality and Management
  • Neural Networks and Applications
  • Text Readability and Simplification
  • Text and Document Classification Technologies
  • Scientific Computing and Data Management
  • Advanced Malware Detection Techniques
  • Complex Network Analysis Techniques
  • Ferroelectric and Negative Capacitance Devices
  • Online Learning and Analytics
  • Artificial Intelligence in Healthcare and Education
  • Machine Learning in Healthcare
  • Intelligent Tutoring Systems and Adaptive Learning
  • Sentiment Analysis and Opinion Mining

Affiliations

Renmin University of China
2022-2024

Tencent (China)
2019-2023

Tsinghua University
2015-2023

Peng Cheng Laboratory
2023

Beijing Institute of Big Data Research
2022-2023

Group Image (Poland)
2022

Peking University
2019-2021

Université de Montréal
2019

Beijing Academy of Artificial Intelligence
2018-2019

National University of Singapore
2019

Publications

Knowledge graph completion aims to perform link prediction between entities. In this paper, we consider the approach of knowledge graph embeddings. Recently, models such as TransE and TransH build entity and relation embeddings by regarding a relation as a translation from head entity to tail entity. We note that these models simply put both entities and relations within the same semantic space. In fact, an entity may have multiple aspects, and various relations focus on different aspects of entities, which makes a common space insufficient for modeling. In this paper, we propose TransR to build entity and relation embeddings in...

10.1609/aaai.v29i1.9491 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2015-02-19
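
For readers who want the core mechanics, here is a minimal numpy sketch of the TransR scoring idea described above; the function name, variable names, and toy dimensions are illustrative, not taken from the paper's released code.

    import numpy as np

    def transr_score(h, t, r, M_r):
        """TransR-style score: project head/tail entity vectors into the
        relation-specific space via M_r, then check how well the relation
        vector r translates the projected head to the projected tail.
        Lower scores indicate more plausible triples."""
        h_r = M_r @ h                          # head in relation space
        t_r = M_r @ t                          # tail in relation space
        return np.linalg.norm(h_r + r - t_r)  # ||h_r + r - t_r||

    # toy usage: entity dim 4, relation dim 3 (separate spaces)
    rng = np.random.default_rng(0)
    h, t = rng.normal(size=4), rng.normal(size=4)
    r = rng.normal(size=3)
    M_r = rng.normal(size=(3, 4))
    print(transr_score(h, t, r, M_r))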

Distant supervised relation extraction has been widely used to find novel relational facts from text. However, distant supervision is inevitably accompanied by the wrong labelling problem, and these noisy data will substantially hurt the performance of relation extraction. To alleviate this issue, we propose a sentence-level attention-based model for relation extraction. In this model, we employ convolutional neural networks to embed the semantics of sentences. Afterwards, we build sentence-level attention over multiple instances, which is expected to...

10.18653/v1/p16-1200 article EN cc-by Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016-01-01
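
The selective attention described above reduces to a softmax over the sentence embeddings within one entity-pair bag; below is a simplified numpy rendition, where relation_query stands in for the learned relation representation.

    import numpy as np

    def selective_attention(sentence_reprs, relation_query):
        """Weight the CNN sentence embeddings of an entity-pair bag by
        their compatibility with the relation query, so noisy
        (mislabelled) sentences receive low weight in the bag vector."""
        scores = sentence_reprs @ relation_query   # (n_sentences,)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                   # softmax over the bag
        return weights @ sentence_reprs            # attention-weighted bag vector

    rng = np.random.default_rng(0)
    bag = rng.normal(size=(5, 8))   # 5 sentences, 8-dim CNN embeddings
    query = rng.normal(size=8)
    print(selective_attention(bag, query))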

Graph Neural Networks (GNNs) have achieved promising performance on a wide range of graph-based tasks. Despite their success, one severe limitation of GNNs is the over-smoothing issue (indistinguishable representations of nodes in different classes). In this work, we present a systematic and quantitative study of the over-smoothing issue in GNNs. First, we introduce two quantitative metrics, MAD and MADGap, to measure the smoothness and over-smoothness of graph representations, respectively. Then, we verify that smoothing is the essential nature of GNNs, and that the critical factor leading to over-smoothness is the low...

10.1609/aaai.v34i04.5747 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03
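
MAD is, at heart, an average of pairwise cosine distances between node representations; a small numpy sketch follows, with the mask argument standing in for the neighbour/remote pair selections the paper uses to form MADGap.

    import numpy as np

    def mad(H, mask):
        """Mean Average Distance over the node pairs selected by `mask`
        (e.g. neighbour pairs vs. remote pairs); smaller MAD means
        smoother, i.e. more over-smoothed, representations."""
        Hn = H / np.linalg.norm(H, axis=1, keepdims=True)
        D = (1.0 - Hn @ Hn.T) * mask        # masked pairwise cosine distance
        row_cnt = mask.sum(axis=1)
        valid = row_cnt > 0                  # nodes with at least one pair
        return (D.sum(axis=1)[valid] / row_cnt[valid]).mean()

    rng = np.random.default_rng(0)
    H = rng.normal(size=(6, 8))              # 6 node representations
    mask = 1.0 - np.eye(6)                   # all pairs except self-pairs
    print(mad(H, mask))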

Representation learning of knowledge bases aims to embed both entities and relations into a low-dimensional space. Most existing methods only consider direct relations in representation learning. We argue that multiple-step relation paths also contain rich inference patterns between entities, and propose a path-based representation learning model. This model considers relation paths as translations between entities for representation learning, and addresses two key challenges: (1) Since not all relation paths are reliable, we design a path-constraint resource allocation algorithm to measure the...

10.18653/v1/d15-1082 article EN cc-by Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing 2015-01-01
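
The path-constraint resource allocation step can be pictured as flowing one unit of resource from the head entity along the path's relations; the sketch below is a toy rendition, and its graph encoding ((entity, relation) -> successor list) is an assumption made for illustration.

    def pcra_reliability(path, graph, head):
        """Flow resource 1.0 from the head entity along each relation in
        the path, splitting it evenly among successors; the resource that
        reaches the tail measures the reliability of that relation path."""
        resource = {head: 1.0}
        for relation in path:
            nxt = {}
            for entity, amount in resource.items():
                successors = graph.get((entity, relation), [])
                if not successors:
                    continue
                share = amount / len(successors)
                for s in successors:
                    nxt[s] = nxt.get(s, 0.0) + share
            resource = nxt
        return resource  # resource[tail] ~ reliability of head -> tail

    # toy graph: (entity, relation) -> list of tail entities
    graph = {("a", "r1"): ["b", "c"], ("b", "r2"): ["d"], ("c", "r2"): ["d"]}
    print(pcra_reliability(["r1", "r2"], graph, "a"))  # {'d': 1.0}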

Yuan Yao, Deming Ye, Peng Li, Xu Han, Yankai Lin, Zhenghao Liu, Zhiyuan Liu, Lixin Huang, Jie Zhou, Maosong Sun. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.

10.18653/v1/p19-1074 preprint EN 2019-01-01

Document-level sentiment classification aims to predict a user's overall sentiment in a document about a product. However, most existing methods only focus on local text information and ignore the global user preference and product characteristics. Even though some works take such information into account, they usually suffer from high model complexity and only consider word-level preference rather than semantic levels. To address this issue, we propose a hierarchical neural network to incorporate global user and product information into sentiment classification. Our model first builds a hierarchical LSTM model to generate...

10.18653/v1/d16-1171 article EN cc-by Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing 2016-01-01

We release an open toolkit for knowledge embedding (OpenKE), which provides a unified framework and various fundamental models to embed knowledge graphs into a continuous low-dimensional space. OpenKE prioritizes operational efficiency to support quick model validation and large-scale knowledge representation learning. Meanwhile, OpenKE maintains sufficient modularity and extensibility to easily incorporate new models into the framework. Besides the toolkit, the embeddings of some existing large-scale knowledge graphs pre-trained by OpenKE are also available, which can be directly applied in many...

10.18653/v1/d18-2024 article EN cc-by 2018-01-01
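
Among the fundamental models OpenKE ships are translation-based scorers such as TransE; the snippet below re-implements that score in plain numpy purely for illustration, and deliberately avoids guessing at the toolkit's actual API (consult the OpenKE repository for real usage).

    import numpy as np

    def transe_score(h, r, t):
        """TransE: a triple (h, r, t) is plausible when tail ~ head +
        relation, so this L2 distance should be small for true facts."""
        return np.linalg.norm(h + r - t)

    rng = np.random.default_rng(0)
    h, r = rng.normal(size=50), rng.normal(size=50)
    t_true = h + r + 0.01 * rng.normal(size=50)   # near-perfect translation
    t_false = rng.normal(size=50)                 # unrelated tail
    print(transe_score(h, r, t_true) < transe_score(h, r, t_false))  # True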

Autonomous agents have long been a research focus in academic and industry communities. Previous research often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and makes the agents hard to achieve human-like decisions. Recently, through the acquisition of vast amounts of Web knowledge, large language models (LLMs) have shown potential in human-level intelligence, leading to a surge in LLM-based autonomous agents. In this paper, we present...

10.1007/s11704-024-40231-1 article EN cc-by Frontiers of Computer Science 2024-03-22

Distantly supervised open-domain question answering (DS-QA) aims to find answers in collections of unlabeled text. Existing DS-QA models usually retrieve related paragraphs from a large-scale corpus and apply reading comprehension techniques to extract answers from the most relevant paragraph. They ignore the rich information contained in other paragraphs. Moreover, distant supervision data inevitably comes with the wrong labeling problem, and these noisy data will substantially degrade the performance of DS-QA. To address...

10.18653/v1/p18-1161 article EN cc-by Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018-01-01
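
The aggregation idea, marginalising answer evidence over a paragraph selector rather than trusting a single top paragraph, looks roughly like the sketch below; the selector logits and per-paragraph answer probabilities are made-up toy numbers.

    import numpy as np

    def dsqa_answer_probability(par_scores, ans_probs_per_par):
        """Aggregate evidence across ALL retrieved paragraphs:
        P(answer) = sum_i P(paragraph_i) * P(answer | paragraph_i)."""
        weights = np.exp(par_scores - par_scores.max())
        weights /= weights.sum()              # paragraph selector (softmax)
        return weights @ ans_probs_per_par    # marginalise over paragraphs

    par_scores = np.array([2.0, 0.5, -1.0])   # selector logits, 3 paragraphs
    ans_probs = np.array([[0.9, 0.1],         # P(candidate | paragraph)
                          [0.4, 0.6],
                          [0.2, 0.8]])
    print(dsqa_answer_probability(par_scores, ans_probs))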

Neural models have achieved remarkable success on relation extraction (RE) benchmarks. However, there is no clear understanding of what information in text affects existing RE models to make decisions and how to further improve the performance of these models. To this end, we empirically study the effect of two main information sources in text: textual context and entity mentions (names). We find that (i) while context is the main source to support the predictions, RE models also heavily rely on the information from entity mentions, most of which is type information, and (ii) existing datasets may leak shallow...

10.18653/v1/2020.emnlp-main.298 article EN 2020-01-01
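
A typical probe behind finding (i) is to mask out entity names and measure the accuracy drop; here is a minimal sketch of that manipulation, with [E1]/[E2] as assumed placeholder tokens rather than the paper's exact scheme.

    def mask_entity_mentions(sentence, head, tail):
        """Replace the two entity mentions with untyped placeholders; a
        large accuracy drop on the masked inputs (vs. the originals)
        indicates heavy reliance on entity names over context."""
        return sentence.replace(head, "[E1]").replace(tail, "[E2]")

    print(mask_entity_mentions(
        "Steve Jobs co-founded Apple in 1976.", "Steve Jobs", "Apple"))
    # -> "[E1] co-founded [E2] in 1976."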

Language representation models such as BERT could effectively capture contextual semantic information from plain text, and have been proved to achieve promising results in lots of downstream NLP tasks with appropriate fine-tuning. However, most existing language representation models cannot explicitly handle coreference, which is essential to the coherent understanding of the whole discourse. To address this issue, we present CorefBERT, a novel language representation model that can capture the coreferential relations in context. The experimental results show that...

10.18653/v1/2020.emnlp-main.582 article EN cc-by 2020-01-01

In this paper, we propose a novel graph neural network with generated parameters (GP-GNNs). The parameters in the propagation module, i.e. the transition matrices used in the message passing procedure, are produced by a generator taking natural language sentences as inputs. We verify GP-GNNs on relation extraction from text, both on bag- and instance-settings. Experimental results on a human-annotated dataset and two distantly supervised datasets show that our multi-hop reasoning mechanism yields significant improvements. We also...

10.18653/v1/p19-1128 preprint EN 2019-01-01
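
The distinguishing move in GP-GNNs is that edge transition matrices are generated from text rather than stored as fixed parameters; below is a toy numpy sketch assuming a simple linear generator (the paper uses learned sentence encoders).

    import numpy as np

    def generated_transition(sentence_encoding, W_gen, d):
        """The transition matrix is *generated* from the natural-language
        context of an entity pair, here by a linear map reshaped to d x d."""
        return (W_gen @ sentence_encoding).reshape(d, d)

    def propagate(node_states, edges, encodings, W_gen):
        """One message-passing step where each edge's transition matrix is
        produced from the sentence describing that entity pair."""
        d = node_states.shape[1]
        new_states = node_states.copy()
        for (src, dst), enc in zip(edges, encodings):
            A = generated_transition(enc, W_gen, d)
            new_states[dst] += np.tanh(A @ node_states[src])
        return new_states

    rng = np.random.default_rng(0)
    nodes = rng.normal(size=(3, 4))              # 3 entities, d = 4
    W_gen = rng.normal(size=(16, 10)) * 0.1      # generator: 10-dim enc -> 4x4
    edges = [(0, 1), (1, 2)]
    encodings = rng.normal(size=(2, 10))         # one encoding per edge
    print(propagate(nodes, edges, encodings, W_gen))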

Recent entity and relation extraction works focus on investigating how to obtain a better span representation from the pre-trained encoder. However, a major limitation of existing works is that they ignore the interrelation between spans (pairs). In this work, we propose a novel span representation approach, named Packed Levitated Markers (PL-Marker), to consider the interrelation between spans (pairs) by strategically packing the markers in the encoder. In particular, we propose a neighborhood-oriented packing strategy, which considers the neighbor spans integrally to better model the entity boundary information. Furthermore,...

10.18653/v1/2022.acl-long.337 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01
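
The packing of levitated markers can be sketched at the token level as below; the marker token strings and the simplified position handling are assumptions, and the real model additionally needs the directed attention mask described in the paper.

    def pack_levitated_markers(tokens, spans):
        """Append one start/end marker pair per candidate span after the
        sentence; each marker reuses its span's position id, so many spans
        can be scored in a single encoder pass instead of one pass each."""
        packed = list(tokens)
        positions = list(range(len(tokens)))
        for start, end in spans:
            packed += ["[O_s]", "[O_e]"]
            positions += [start, end]   # markers share the span's positions
        return packed, positions

    tokens = ["Steve", "Jobs", "founded", "Apple"]
    print(pack_levitated_markers(tokens, [(0, 1), (3, 3)]))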

Autonomous agents have long been a prominent research focus in both academic and industry communities. Previous research in this field often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and thus makes the agents hard to achieve human-like decisions. Recently, through the acquisition of vast amounts of web knowledge, large language models (LLMs) have demonstrated remarkable potential in achieving human-level intelligence. This has sparked an...

10.48550/arxiv.2308.11432 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Xiaozhi Wang, Ziqi Wang, Xu Han, Wangyi Jiang, Rong Han, Zhiyuan Liu, Juanzi Li, Peng Li, Yankai Lin, Jie Zhou. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.

10.18653/v1/2020.emnlp-main.129 article EN cc-by 2020-01-01

Relation extraction has been widely used for finding unknown relational facts from plain text. Most existing methods focus on exploiting mono-lingual data for relation extraction, ignoring the massive information in the texts of various languages. To address this issue, we introduce a multi-lingual neural relation extraction framework, which employs mono-lingual attention to utilize the information within mono-lingual texts and further proposes cross-lingual attention to consider the information consistency and complementarity among cross-lingual texts. Experimental results on real-world datasets show that our model...

10.18653/v1/p17-1004 article EN cc-by Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2017-01-01

Qiu Ran, Yankai Lin, Peng Li, Jie Zhou, Zhiyuan Liu. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1251 article EN cc-by 2019-01-01

Distantly supervised relation extraction has been widely used to find novel relational facts from plain text. To predict the relation between a pair of two target entities, existing methods solely rely on those direct sentences containing both entities. In fact, there are also many sentences containing only one of the target entities, which provide rich and useful information but are not yet employed by relation extraction. To address this issue, we build inference chains between two target entities via intermediate entities, and propose a path-based neural relation extraction model to encode the relational semantics from both direct sentences and inference chains...

10.18653/v1/d17-1186 article EN cc-by Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing 2017-01-01

Ziqi Wang, Xiaozhi Wang, Xu Han, Yankai Lin, Lei Hou, Zhiyuan Liu, Peng Li, Juanzi Li, Jie Zhou. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.

10.18653/v1/2021.acl-long.491 article EN cc-by 2021-01-01

Yujia Qin, Yankai Lin, Ryuichi Takanobu, Zhiyuan Liu, Peng Li, Heng Ji, Minlie Huang, Maosong Sun, Jie Zhou. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.

10.18653/v1/2021.acl-long.260 article EN cc-by 2021-01-01

Continual relation learning aims to continually train a model on new data to learn incessantly emerging novel relations while avoiding catastrophically forgetting old relations. Some pioneering work has proved that storing a handful of historical examples in episodic memory and replaying them in subsequent training is an effective solution for such a challenging problem. However, these memory-based methods usually suffer from overfitting the few memorized examples of old relations, which may gradually cause...

10.18653/v1/2020.acl-main.573 article EN cc-by 2020-01-01
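
Episodic memory replay, the mechanism this work builds on, amounts to mixing a few stored old-relation examples into every batch of new-relation training data; a schematic sketch follows, with a hypothetical batch size and mixing ratio.

    import random

    def replay_batches(new_data, memory, batch_size=4, mix_ratio=0.25):
        """Yield training batches that mix stored examples of old relations
        into the new-relation data, so the model keeps rehearsing what it
        would otherwise catastrophically forget."""
        n_mem = max(1, int(batch_size * mix_ratio))
        random.shuffle(new_data)
        step = batch_size - n_mem
        for i in range(0, len(new_data), step):
            chunk = new_data[i:i + step]
            yield chunk + random.sample(memory, min(n_mem, len(memory)))

    memory = [("Paris", "capital_of", "France")]
    new = [("Turing", "born_in", "London"), ("Curie", "won", "Nobel Prize"),
           ("Everest", "located_in", "Nepal")]
    for batch in replay_batches(new, memory):
        print(batch)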

This open access book provides an overview of the recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It will also benefit related domains such as machine learning, social network analysis, the Semantic Web, information retrieval, data mining and computational biology.

10.1007/978-981-15-5573-2 preprint EN cc-by 2020-01-01

Non-autoregressive neural machine translation (NAT) generates each target word in parallel and has achieved promising inference acceleration. However, existing NAT models still have a big gap in translation quality compared to autoregressive models due to the multimodality problem: the target words may come from multiple feasible translations. To address this problem, we propose a novel NAT framework named ReorderNAT which explicitly models reordering information to guide the decoding of NAT. Specially, ReorderNAT utilizes deterministic and non-deterministic...

10.1609/aaai.v35i15.17618 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

In recent years, pre-trained language models (PLMs) have been shown to capture factual knowledge from massive texts, which encourages the proposal of PLM-based knowledge graph completion (KGC) models. However, these models are still quite behind the SOTA KGC models in terms of performance. In this work, we find two main reasons for the weak performance: (1) Inaccurate evaluation setting. The evaluation setting under the closed-world assumption (CWA) may underestimate the PLM-based KGC models since they introduce more external knowledge; (2) Inappropriate utilization...

10.18653/v1/2022.findings-acl.282 article EN cc-by Findings of the Association for Computational Linguistics: ACL 2022 2022-01-01
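
PLM-based KGC typically scores a candidate triple by verbalising it into text that the language model can judge; the toy sketch below illustrates that template step only, with invented relation templates that are not the paper's actual prompts.

    def triple_to_prompt(head, relation, tail):
        """Verbalise a candidate triple so a PLM can score its
        plausibility; templates here are purely illustrative."""
        templates = {
            "born_in": "{h} was born in {t}.",
            "capital_of": "{h} is the capital of {t}.",
        }
        return templates[relation].format(h=head, t=tail)

    print(triple_to_prompt("Alan Turing", "born_in", "London"))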