- Topic Modeling
- Natural Language Processing Techniques
- Advanced Text Analysis Techniques
- Multimodal Machine Learning Applications
- Speech and dialogue systems
- Semantic Web and Ontologies
- Machine Learning in Healthcare
- Speech Recognition and Synthesis
- Sentiment Analysis and Opinion Mining
- Quantum Information and Cryptography
- Text and Document Classification Technologies
- Software Engineering Research
- Biomedical Text Mining and Ontologies
- Advanced Graph Neural Networks
- Text Readability and Simplification
- Quantum and electron transport phenomena
- Handwritten Text Recognition Techniques
- Radiomics and Machine Learning in Medical Imaging
- Computational and Text Analysis Methods
- Educational Technology and Pedagogy
- Advanced Computational Techniques and Applications
- Artificial Intelligence in Healthcare and Education
- Quantum optics and atomic interactions
- Explainable Artificial Intelligence (XAI)
- Phonetics and Phonology Research
Micron (United States)
2025
Institute of Technology of Cambodia
2025
Xiamen University
2021-2025
Peng Cheng Laboratory
2020-2024
Hohai University
2024
University of Hong Kong
2023
Hong Kong University of Science and Technology
2023
National Institute of Advanced Industrial Science and Technology
2023
Beijing Information Science & Technology University
2022
China Electronics Technology Group Corporation
2022
A re-examination of text categorization methods. Authors: Yiming Yang (School of Computer Science, Carnegie Mellon University, Pittsburgh, PA), Xin Liu. SIGIR '99: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, August 1999, pages 42–49. https://doi.org/10.1145/312624.312647. Published: 01 August 1999.
In this paper, we propose two generic text summarization methods that create summaries by ranking and extracting sentences from the original documents. The first method uses standard IR methods to rank sentence relevances, while the second method uses the latent semantic analysis technique to identify semantically important sentences, for summary creations. Both methods strive to select sentences that are highly ranked and different from each other. This is an attempt to create a summary with a wider coverage of the document's main content and less redundancy. Performance evaluations...
This paper introduces the Bank Question (BQ) corpus, a Chinese corpus for sentence semantic equivalence identification (SSEI). The BQ corpus contains 120,000 question pairs from 1-year online bank custom service logs. To efficiently process and annotate questions from such a large scale of logs, this paper proposes a clustering based annotation method to achieve questions with the same intent. First, the deduplicated questions with the same answer are clustered into stacks by the Word Mover’s Distance (WMD) based Affinity Propagation (AP) algorithm. Then,...
This paper aims to quantitatively evaluate the performance of ChatGPT, an interactive large language model, on inter-sentential relations such as temporal relations, causal relations, and discourse relations. Given ChatGPT's promising performance across various tasks, we proceed to carry out thorough evaluations on the whole test sets of 11 datasets, including PDTB2.0-based and dialogue-based discourse relations. To ensure the reliability of our findings, we employ three tailored prompt templates for each task, including a zero-shot template, a prompt engineering (PE) template, and in-context...
Despite recent progress in prediction and prevention, heart disease remains a leading cause of death. One preliminary step in heart disease prediction and prevention is risk factor identification. Many studies have been proposed to identify risk factors associated with heart disease; however, none have attempted to identify all risk factors. In 2014, the National Center of Informatics for Integrating Biology and the Bedside (i2b2) issued a clinical natural language processing (NLP) challenge that involved a track (track 2) for identifying heart disease risk factors in clinical texts over time. This track aimed...
We propose a novel model, called stroke sequence-dependent deep convolutional neural network (SSDCNN), which uses the stroke sequence information and eight-directional features of Chinese characters for online handwritten Chinese character recognition (OLHCCR). SSDCNN learns the representation of OLHCCs by incorporating the natural sequence information of the strokes. Furthermore, it naturally incorporates the eight-directional features. First, SSDCNN inputs the stroke sequence and transforms it into stacks of feature maps following the writing order of the strokes. Second, fixed-length, stroke sequence-dependent representations of OLHCCs are derived...
Multimodal Large Language Models (MLLMs) have showcased impressive skills in tasks related to visual understanding and reasoning. Yet, their widespread application faces obstacles due to the high computational demands during both the training and inference phases, restricting their use to a limited audience within the research and user communities. In this paper, we investigate the design aspects of Multimodal Small Language Models (MSLMs) and propose an efficient multimodal assistant named Mipha, which is designed to create synergy among various aspects:...
Stretchable artificial skins have garnered great interest for their potential applications in real-time human-machine interaction and equipment operation status monitoring. The local stiffer structure areas on the substrates for functional elements have been verified to improve the robustness of the skins, but it remains challenging to achieve robust sensing performance under mechanical deformation due to the large mismatch and the intricate fabrication process. Herein, we propose an easy strategy for fabricating a substrate with...
Traditional models of systems biology describe dynamic biological phenomena as solutions to ordinary differential equations, which, when parameters in them are set to correct values, faithfully mimic observations. Often parameter values are tweaked by hand until desired results are achieved, or computed from biochemical experiments carried out in vitro. Of interest in this article is the use of probabilistic modelling tools with which parameters and unobserved variables, modelled as hidden states, can be estimated from limited...
Automatic diagnosis has attracted increasing attention but remains challenging due to multi-step reasoning. Recent works usually address it by reinforcement learning methods. However, these methods show low efficiency and require task-specific reward functions. Considering the conversation between doctor and patient allows doctors to probe for symptoms and make diagnoses, the diagnosis process can be naturally seen as the generation of a sequence including symptoms and diagnoses. Inspired by this, we reformulate automatic diagnosis as a symptoms Sequence...
Complex scene character recognition is a challenging yet important task in machine learning, especially for languages with large character sets, such as Chinese, which is composed of hieroglyphics with large-scale categories and similar glyphs. Recently, state-of-the-art methods based on semantic segmentation have achieved great success in scene parsing and have been applied to scene text recognition. However, because of limitations in terms of memory and computation, they are only applied to the small category recognition tasks, such as tasks involving English alphabets and digits....
Implicit Discourse Relation Recognition (IDRR) is a sophisticated and challenging task to recognize the discourse relations between the arguments with the absence of discourse connectives. The sense labels for each discourse relation follow a hierarchical classification scheme in the annotation process (Prasad et al., 2008), forming a hierarchy structure. Most existing works do not well incorporate the hierarchy structure but focus on the syntax features and the prior knowledge of connectives in the manner of pure text classification. We argue that it is more...
Understanding users' intentions in e-commerce platforms requires commonsense knowledge. In this paper, we present FolkScope, an intention knowledge graph construction framework, to reveal the structure of humans' minds about purchasing items. As commonsense knowledge is usually ineffable and not expressed explicitly, it is challenging to perform information extraction. Thus, we propose a new approach that leverages the generation power of large language models (LLMs) and human-in-the-loop annotation to semi-automatically construct...
Commonsense knowledge acquisition and reasoning have long been a core artificial intelligence problem. However, in the past, there has been a lack of scalable methods to collect commonsense knowledge. In this paper, we propose to develop principles for collecting commonsense knowledge based on selectional preference. We generalize the definition of selectional preference from one-hop linguistic syntactic relations to higher-order relations over linguistic graphs. Unlike previous commonsense knowledge definitions (e.g., ConceptNet), selectional preference (SP) knowledge only relies on statistical distribution over linguistic graphs, which can be...
In the past year, there has been a growing trend in applying Large Language Models (LLMs) to the field of medicine, particularly with the advent of advanced language models such as ChatGPT developed by OpenAI. However, there is limited research on LLMs specifically addressing oncology-related queries. The primary aim of this research was to develop a specialized language model that demonstrates improved accuracy in providing advice related to oncology. We performed an extensive data collection of online question-answer interactions centered...
In the era of Web 2.0, people have become accustomed to expressing their attitudes and exchanging opinions on social media sites such as Twitter. It is critical for security and business related applications to make sense of the public opinions implied in users' texts. Stance detection aims to classify the stances users hold towards certain targets as FAVOR, AGAINST or NONE. In the literature, many efforts have been paid to neural network based stance detection to avoid hand-crafted features. As a widely used neural network structure, the convolutional neural network (CNN) can mine...
Xin Liu, Baosong Yang, Dayiheng Liu, Haibo Zhang, Weihua Luo, Min Zhang, Haiying Zhang, Jinsong Su. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021.