- Topic Modeling
- Natural Language Processing Techniques
- Artificial Intelligence in Healthcare and Education
- Multimodal Machine Learning Applications
- Advanced Graph Neural Networks
- Sentiment Analysis and Opinion Mining
- Text and Document Classification Technologies
- Machine Learning in Healthcare
- Robotic Path Planning Algorithms
- Retinal Imaging and Analysis
- Face and Expression Recognition
- Tensor decomposition and applications
- COVID-19 diagnosis using AI
- Biomedical Text Mining and Ontologies
- Advanced Text Analysis Techniques
- Modular Robots and Swarm Intelligence
- Image Retrieval and Classification Techniques
- Privacy-Preserving Technologies in Data
- Gene expression and cancer classification
- Advanced Manufacturing and Logistics Optimization
- Domain Adaptation and Few-Shot Learning
- Digital Imaging for Blood Diseases
- Mental Health via Writing
- Sparse and Compressive Sensing Techniques
- Proteoglycans and glycosaminoglycans research
South China University of Technology
2022-2025
Guangdong University of Technology
2020-2022
National Taiwan Normal University
2007
Text data augmentation is an effective strategy for overcoming the challenge of limited sample sizes in many natural language processing (NLP) tasks. This challenge is especially prominent in the few-shot learning scenario, where the data in the target domain is generally much scarcer and of lowered quality. A widely used strategy to mitigate such challenges is to perform data augmentation to better capture data invariance and increase the sample size. However, current text data augmentation methods either can't ensure the correct labeling of the generated data (lacking faithfulness) or can't ensure sufficient diversity in the generated data (lacking compactness),...
Large language models, such as ChatGPT, are capable of generating grammatically perfect and human-like text content, and a large number of ChatGPT-generated texts have appeared on the internet. However, medical texts, such as clinical notes and diagnoses, require rigorous validation, and erroneous medical content generated by ChatGPT could potentially lead to disinformation that poses significant harm to health care and the general public.
Background: Large language models such as ChatGPT are capable of generating grammatically perfect and human-like text content, and a large number of ChatGPT-generated texts have appeared on the Internet. However, medical texts such as clinical notes and diagnoses require rigorous validation, and erroneous medical content generated by ChatGPT could potentially lead to disinformation that poses significant harm to healthcare and the general public. Objective: This research is among the first studies on responsible and ethical AIGC (Artificial Intelligence...
Background: Large language models, such as ChatGPT, are capable of generating grammatically perfect and human-like text content, and a large number of ChatGPT-generated texts have appeared on the internet. However, medical texts, such as clinical notes and diagnoses, require rigorous validation, and erroneous medical content generated by ChatGPT could potentially lead to disinformation that poses significant harm to health care and the general public. Objective: This study is among the first...
The Segment Anything Model (SAM) has gained popularity as a versatile image segmentation method, thanks to its strong generalization capabilities across various domains. However, when applied to optic disc (OD) and optic cup (OC) segmentation tasks, SAM encounters challenges due to the complex structures, low contrast, and blurred boundaries typical of fundus images, leading to suboptimal performance. To overcome these challenges, we introduce a novel model, FunduSAM, which incorporates several Adapters into SAM to create a deep...
Alzheimer's disease (AD) is a common form of dementia that severely impacts patient health. As AD impairs the patient's language understanding and expression ability, the speech of AD patients can serve as an indicator of this disease. This study investigates various methods for detecting AD using patients' speech and transcripts data from the DementiaBank Pitt database. The proposed approach involves pre-trained language models and a Graph Neural Network (GNN) that constructs a graph from the speech transcript and extracts features using the GNN for AD detection. Data...
In this study, we leverage LLMs to enhance semantic analysis and develop similarity metrics for texts, addressing the limitations of traditional unsupervised NLP metrics like ROUGE and BLEU. We develop a framework where LLMs such as GPT-4 are employed for zero-shot text identification and label generation for radiology reports, and the labels are then used as measurements of text similarity. By testing the proposed framework on the MIMIC data, we find that the generated labels can significantly improve the semantic similarity assessment, with scores more closely aligned with clinical ground truth than...
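The contrast between word-overlap metrics and label-based similarity can be illustrated with a minimal sketch. It is not the authors' framework: the labels below are hand-written stand-ins for what an LLM might return, the label format (`finding:status`) is hypothetical, and plain Jaccard similarity is assumed for both comparisons.

```python
import re


def unigram_overlap(text_a: str, text_b: str) -> float:
    """Jaccard overlap of the lowercase word sets (a crude lexical baseline)."""
    words_a = set(re.findall(r"[a-z]+", text_a.lower()))
    words_b = set(re.findall(r"[a-z]+", text_b.lower()))
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 1.0


def label_similarity(labels_a: set, labels_b: set) -> float:
    """Jaccard similarity of two label sets."""
    union = labels_a | labels_b
    return len(labels_a & labels_b) / len(union) if union else 1.0


# Hypothetical report sentences with hand-written labels (assumed, not LLM output).
report_a = "No evidence of pneumothorax."
report_b = "Pneumothorax is absent."
report_c = "Evidence of pneumothorax is present."

labels_a = {"pneumothorax:absent"}
labels_b = {"pneumothorax:absent"}
labels_c = {"pneumothorax:present"}

# Same meaning, little shared wording: low lexical overlap, identical labels.
print(unigram_overlap(report_a, report_b), label_similarity(labels_a, labels_b))
# Opposite meaning, much shared wording: higher lexical overlap, disjoint labels.
print(unigram_overlap(report_a, report_c), label_similarity(labels_a, labels_c))
```

In this toy case the lexical score ranks the contradictory pair above the equivalent pair, while the label-based score does the opposite, which is the kind of mismatch the abstract attributes to metrics like ROUGE and BLEU.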
We present the Radiation Oncology NLP Database (ROND), the first dedicated Natural Language Processing (NLP) dataset for radiation oncology, an important medical specialty that has received limited attention from the NLP community in the past. With the advent of Artificial General Intelligence (AGI), there is an increasing need for specialized datasets and benchmarks to facilitate research and development. ROND is specifically designed to address this gap in the domain of radiation oncology, a field that offers many opportunities for NLP exploration. It encompasses...
Zero-shot relation triplet extraction (ZeroRTE) endeavors to extract relation triplets from a test set using a model trained on a training set whose relations are disjoint from those of the test set. Current ZeroRTE approaches primarily rely on two strategies: 1) combining pre-trained language models to generate additional training samples; 2) adding a large number of parameters that require training from scratch on top of the pre-trained language model. However, the former approach doesn't ensure the quality of the generated samples, and the latter often struggles to generalize to unseen relations in the test set, particularly when the training set is...
Intent detection and slot filling are recognized as two very important tasks in a spoken language understanding (SLU) system. In order to model these two tasks at the same time, many joint models based on deep neural networks have been proposed recently and have achieved excellent results. In addition, graph neural networks have made good achievements in the field of vision. Therefore, we combine these two advantages and propose a new joint model with a wheel-graph attention network (Wheel-GAT), which is able to model interrelated connections directly for single intent...
The knowledge graph (KG) is a highly needed basis to support the high-fidelity and high-interpretability modeling of various tasks in healthcare artificial intelligence. In this work, we focus on constructing an oncology knowledge graph that will be used in downstream cancer research and solution development. Modern supervised learning for knowledge graph construction requires a large amount of manually labeled data, which makes the construction process time-consuming and labor-intensive. Although there exist multiple named entity recognition and relation...
Since the outbreak of COVID-19, small and medium-sized enterprises have been greatly affected. In order to cope with the difficulty of capital turnover for enterprises, the government has successively introduced a series of financial policies to increase credit support and reduce financing costs. The rapid development of financial technology has also prompted further innovations in the operating models of banks and other financial platforms. However, financial platforms must consider practical issues such as their own costs and risk assessment while they help...
Transformer-based language models have achieved significant success in various domains. However, the data-intensive nature of the transformer architecture requires much labeled data, which is challenging in low-resource scenarios (i.e., few-shot learning (FSL)). The main challenge of FSL is the difficulty of training robust models on small amounts of samples, which frequently leads to overfitting. Here we present Mask-BERT, a simple and modular framework to help BERT-based architectures tackle FSL. The proposed approach fundamentally...
Due to the inherent high-dimensional characteristics of genomic data, traditional single metric/kernel-based clustering methods fail to accurately perform genomic data analysis. To address this issue, we propose a multi-kernel clustering method with tensor fusion on the Grassmann manifold (MKCTM). Specifically, multiple kernel functions are employed to map the data into different kernel spaces, and tensor representations are utilized to capture their high-order relationships. By introducing a low-rank constraint, we maximize the correlation among kernels while...
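The first step described above, mapping the same data through several kernels and collecting the results in a tensor, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the kernel functions, so Gaussian (RBF) kernels with three arbitrary bandwidths are assumed, and the function names `rbf_kernel` and `kernel_tensor` are hypothetical. The low-rank constraint and the Grassmann manifold fusion are not implemented here.

```python
import math


def rbf_kernel(points, gamma):
    """Return the n x n Gaussian (RBF) kernel matrix exp(-gamma * ||x - y||^2)."""
    def squared_distance(x, y):
        return sum((a - b) ** 2 for a, b in zip(x, y))

    return [[math.exp(-gamma * squared_distance(x, y)) for y in points]
            for x in points]


def kernel_tensor(points, gammas):
    """Stack one kernel matrix per bandwidth into a (num_kernels, n, n) nested list."""
    return [rbf_kernel(points, gamma) for gamma in gammas]


# Toy data: three 2-D samples; the bandwidths are assumed values for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
tensor = kernel_tensor(points, gammas=[0.1, 1.0, 10.0])

# Each frontal slice is a symmetric similarity matrix with ones on the diagonal.
for gamma, kernel in zip([0.1, 1.0, 10.0], tensor):
    print(gamma, [round(value, 4) for value in kernel[0]])
```

Each slice views the same samples at a different similarity scale; a multi-kernel method would then operate on the stacked tensor rather than on any single slice.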
Some properties and an algorithm of the motion planning problem of heterogeneous combinatorial point robots are presented. Heterogeneous combinatorial point robots can be combined and separated freely during moving. It is proven that the motion planning problem in a static discrete environment complies with the principle of optimality. Dynamic programming algorithms can be used to solve this problem. The superposition property time complexity <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$O\left( {\left| {qV} \right|^{2k}...