Qikai Cheng

ORCID: 0000-0003-3904-8901
Research Areas
  • Advanced Text Analysis Techniques
  • Topic Modeling
  • Biomedical Text Mining and Ontologies
  • scientometrics and bibliometrics research
  • Semantic Web and Ontologies
  • Scientific Computing and Data Management
  • Data Quality and Management
  • Complex Network Analysis Techniques
  • Knowledge Management and Sharing
  • Information Retrieval and Search Behavior
  • Natural Language Processing Techniques
  • Artificial Intelligence in Healthcare and Education
  • Web Data Mining and Analysis
  • Advanced Computational Techniques and Applications
  • COVID-19 and healthcare impacts
  • Ethics in Clinical Research
  • Data Visualization and Analytics
  • COVID-19 epidemiological studies
  • AI in Service Interactions
  • Computational and Text Analysis Methods
  • Bioinformatics and Genomic Networks
  • Speech and dialogue systems
  • Software Engineering Research
  • Explainable Artificial Intelligence (XAI)
  • Multimodal Machine Learning Applications

Wuhan University
2012-2024

Demonstration ordering, which is an important strategy for in-context learning (ICL), can significantly affect the performance of large language models (LLMs). However, most current approaches to ordering require additional knowledge and similarity calculation. We advocate few-shot in-context curriculum learning (ICCL), a simple but effective demonstration ordering method for ICL, which implies gradually increasing the complexity of prompt demonstrations during the inference process. Then we design three experiments to discuss the effectiveness...

10.48550/arxiv.2402.10738 preprint EN arXiv (Cornell University) 2024-02-16
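
The ordering idea in the abstract above can be illustrated with a minimal sketch. The function name `build_curriculum_prompt`, the `Q:`/`A:` prompt layout, and the `complexity` callback are all hypothetical; the abstract does not say how complexity is measured, so the scoring function is left to the caller.

```python
def build_curriculum_prompt(demos, query, complexity):
    """Order few-shot demonstrations from simplest to most complex.

    demos: list of (question, answer) pairs.
    complexity: caller-supplied scoring function (an assumption here,
    not the paper's own complexity measure).
    """
    # Sort demonstrations so complexity increases toward the query.
    ordered = sorted(demos, key=complexity)
    blocks = [f"Q: {q}\nA: {a}" for q, a in ordered]
    # The unanswered query always comes last.
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)


demos = [
    ("Prove that the sum of two even numbers is even.", "Let a=2m, b=2n ..."),
    ("2+2?", "4"),
]
# Question length is used only as a stand-in complexity proxy.
prompt = build_curriculum_prompt(demos, "3+5?", complexity=lambda d: len(d[0]))
```

Passing a different `complexity` function (for example, a human-assigned difficulty label) changes the curriculum without touching the prompt assembly.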

Each section header of an article has its distinct communicative function. Citations from different sections may differ in citing motivation. In this paper, we grouped headers with similar functions as a structural function and defined the distribution of citations for a paper as its citation structure. We aim to explore the relationship between this structure and the future impact of a publication, and to disclose the relative importance among functions. Specifically, we proposed two counting methods and life cycle identification...

10.1002/asi.24610 article EN Journal of the Association for Information Science and Technology 2021-12-16

Informal knowledge constantly transitions into the formal domain in the dynamic knowledge base. This article focuses on an integrative understanding of the role of this transition from the perspective of codification. The process is characterized by several dynamics involving a variety of bibliometric entities, such as authors, keywords, institutions, and venues. We thereby designed a series of temporal and cumulative indicators to respectively explore the possibility (whether new knowledge could be transitioned into formal knowledge) and pace (how long it...

10.1162/qss_a_00221 article EN cc-by Quantitative Science Studies 2022-01-01

The unprecedented COVID-19 outbreak at the end of 2019 has produced a worldwide health crisis. Scientific research, especially international research collaboration, is crucial to deal successfully with the epidemic. This article aims to review the response modes, and collaboration characteristics, of the academic community to similar public health events in the past. Based on relevant studies of four major emergencies in the past, these emergencies were regarded as ‘new knowledge’ in the field. By using knowledge diffusion indicators, such as breadth and speed...

10.1177/01655515211030866 article EN Journal of Information Science 2021-07-29

Purpose: Our study proposes a bootstrapping-based method to automatically extract data-usage statements from academic texts. Design/methodology/approach: The method for extraction starts with seed entities and iteratively learns patterns from unlabeled text. In each iteration, new patterns are constructed and added to the pattern list based on their calculated score. Three seed-selection strategies are also proposed in this paper. Findings: The performance of the method is verified by means of experiments on real data collected from computer...

10.20309/jdis.201606 article EN Journal of Data and Information Science 2016-02-01
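
The seed-and-pattern loop described in the abstract above can be sketched roughly as follows. This is a toy illustration, not the paper's method: the pattern form (one word on each side of an entity), the scoring rule (raw match count against `min_score`), and the names `bootstrap`, `min_score`, and `iterations` are all assumptions.

```python
import re


def bootstrap(sentences, seeds, iterations=2, min_score=2):
    """Toy bootstrapping: seeds -> context patterns -> new entities."""
    entities = set(seeds)
    patterns = set()
    for _ in range(iterations):
        # Step 1: collect (left word, right word) contexts around known entities.
        counts = {}
        for sentence in sentences:
            for entity in entities:
                regex = rf"(\w+) {re.escape(entity)} (\w+)"
                for m in re.finditer(regex, sentence):
                    ctx = (m.group(1), m.group(2))
                    counts[ctx] = counts.get(ctx, 0) + 1
        # Step 2: keep only contexts whose score reaches the threshold
        # (assumed scoring: number of times the context was observed).
        patterns |= {ctx for ctx, n in counts.items() if n >= min_score}
        # Step 3: apply accepted patterns to the text to find new entities.
        for left, right in patterns:
            regex = rf"\b{re.escape(left)} (\w+) {re.escape(right)}\b"
            for sentence in sentences:
                for m in re.finditer(regex, sentence):
                    entities.add(m.group(1))
    return entities, patterns


sentences = [
    "we use MNIST for training",
    "we use CIFAR for training",
    "we use ImageNet for evaluation",
    "they use MNIST for testing",
]
entities, patterns = bootstrap(sentences, seeds={"MNIST"})
```

Here the single seed yields the context ("use", "for"), which in turn surfaces the other two dataset names. A real system would need a more careful score to keep noisy patterns from drifting.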

Author identifier (ID) is essential for many downstream tasks, such as co-author network and scientist mobility analysis. Although PubMed is a widely used database, its author ID is not officially provided by the National Institutes of Health (NIH), which restricts some identifier-based research or systems. This study exploited three open bibliographic databases, Aminer, Microsoft Academic Graph (MAG), and Semantic Scholar (S2), to associate author IDs with PubMed. For this purpose, paper linking was performed in order to mine links...

10.1109/qrs-c51114.2020.00043 article EN 2020-12-01

Purpose: This paper aims to identify data set entities in scientific literature. To address poor recognition caused by a lack of training corpora in existing studies, a distant supervised learning-based approach is proposed to identify them automatically from large-scale literature in an open domain. Design/methodology/approach: Firstly, the authors use a dictionary combined with a bootstrapping strategy to create a labelled corpus and apply supervised learning. Secondly, a bidirectional encoder representation from transformers (BERT)-based neural...

10.1108/el-10-2020-0301 article EN The Electronic Library 2021-07-26

TextRank is a variant of PageRank typically used in graphs that represent documents, where vertices denote terms and edges denote relations between terms. Quite often the relation is simple term co-occurrence within a fixed window of k terms. The output of TextRank, when applied iteratively, is a score for each vertex, i.e. a term weight, that can be used for information retrieval (IR) just like conventional term frequency based weights.

10.1145/2348283.2348478 preprint EN Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval 2012-08-12
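
To make the description above concrete, here is a minimal sketch of TextRank term weighting over a fixed co-occurrence window. The function names, the damping factor of 0.85, and the fixed iteration count are illustrative assumptions rather than settings taken from the paper.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, k=2):
    """Undirected term graph: link terms that co-occur within a window of k tokens."""
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        # The window covers the current token plus the next k - 1 tokens.
        for other in tokens[i + 1 : i + k]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    return dict(neighbors)


def textrank(neighbors, damping=0.85, iterations=50):
    """Iteratively score each vertex with the (unnormalised) PageRank update."""
    scores = {term: 1.0 for term in neighbors}
    for _ in range(iterations):
        # Each term receives an equal share of every neighbour's current score.
        scores = {
            term: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[term])
            for term in neighbors
        }
    return scores


graph = cooccurrence_graph("a b a c".split(), k=2)
weights = textrank(graph)
```

In this toy document the term "a" co-occurs with both "b" and "c", so it ends up with the largest weight; the resulting scores could stand in for term frequency in a retrieval function.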

This brief communication finds a clear and universal inequality of authors’ reference reuse behaviour. We observe that a few references are reused many times in an author’s oeuvre, while most of his or her references only occur in the reference list a quite limited number of times. A power law distribution depicts such inequality. We particularly utilise the value, [Formula: see text], to characterise the nuanced difference in inequalities. A pilot study based upon Microsoft Academic Graph (MAG) shows that [Formula: see text] tends to be normally distributed,...

10.1177/01655515221111062 article EN Journal of Information Science 2022-07-23

This study quantifies and analyzes individual-level abilities of scientists in utilizing either an exploration or exploitation strategy. Specifically, we present a Research Strategy Q model, which untangles the coupling effect of scientists’ research ability (Qα) and strategy ability (Eαπ) on research performance. Qα indicates the fundamental ability to publish high-quality papers, while Eαπ indicates proficiency in terms of strategies. Five strategies proposed by our previous study are employed. We generate synthetic data and collect...

10.1162/qss_a_00342 article EN cc-by Quantitative Science Studies 2024-12-17

Topic analysis aims to study topic evolution and trends in order to help researchers understand the process of knowledge creation. This paper develops a novel framework, which we use to demonstrate, forecast, and explain topic evolution from the perspective of the geometrical motion of embeddings generated by pretrained language models. Our dataset comprises approximately 15 million papers in the computer science field, with 7,000 “fields of study” to represent topics. First, we demonstrated that over 80% of the topics had undergone obvious...

10.1162/qss_a_00344 article EN cc-by Quantitative Science Studies 2024-12-17