- Topic Modeling
- Natural Language Processing Techniques
- Sentiment Analysis and Opinion Mining
- Text Readability and Simplification
- Spam and Phishing Detection
- Advanced Text Analysis Techniques
- Biomedical Text Mining and Ontologies
- Web Data Mining and Analysis
- Text and Document Classification Technologies
- Internet Traffic Analysis and Secure E-voting
- EEG and Brain-Computer Interfaces
- Emotion and Mood Recognition
- Second Language Acquisition and Learning
- Functional Brain Connectivity Studies
- Mental Health via Writing
- Hate Speech and Cyberbullying Detection
- Caching and Content Delivery
- Network Security and Intrusion Detection
- Recommender Systems and Techniques
- Machine Learning in Healthcare
- Gaze Tracking and Assistive Technology
- Semantic Web and Ontologies
- ECG Monitoring and Analysis
- Advanced Malware Detection Techniques
- Information Retrieval and Search Behavior
National Yang Ming Chiao Tung University
2023-2024
Institute of Art
2024
National Central University
2014-2023
Pervasive Artificial Intelligence Research Labs
2021
Kaohsiung Medical University
2021
National Taiwan Normal University
2012-2018
National Taiwan University
2011-2015
Institute of Linguistics, Academia Sinica
2009
Academia Sinica
2008-2009
Yuan Ze University
2008
This paper introduces the SIGHAN 2015 Bake-off for Chinese Spelling Check, including task description, data preparation, performance metrics, and evaluation results. The competition reveals current state-of-the-art NLP techniques in dealing with spelling checking. All data sets with gold standards and the evaluation tool used in this bake-off are publicly available for future research.
This paper introduces a Chinese Spelling Check campaign organized for the SIGHAN 2014 bake-off, including task description, data preparation, performance metrics, and evaluation results based on essays written by learners of Chinese as a foreign language. The hope is that such evaluations can produce more advanced spelling check techniques.
An increasing amount of research has recently focused on dimensional sentiment analysis, which represents affective states as continuous numerical values on multiple dimensions, such as the valence-arousal (VA) space. Compared to the categorical approach, which uses distinct classes (e.g., positive and negative), the dimensional approach can provide more fine-grained (real-valued) analysis. However, resources with VA ratings are very rare, especially for the Chinese language. Therefore, this study aims to: (1) Build a resource called EmoBank,...
This paper introduces the NLP-TEA 2015 shared task for Chinese grammatical error diagnosis. We describe the task, data preparation, performance metrics, and evaluation results. The hope is that such an evaluation campaign may produce more advanced diagnosis techniques. All data sets with gold standards and evaluation tools are publicly available for research purposes.
Many deep-learning-based seizure detection algorithms have achieved good classification performance, usually outperforming traditional machine-learning-based algorithms. However, hand-engineered features increase computational complexity and potentially cause an ineffectiveness problem for seizure categories. Therefore, this paper proposes a novel end-to-end deep-learning model comprising an inception module and a residual module to analyze the original EEG signals at multiple scales and realize seizure detection without feature extraction. Experiments...
This study presents the Chinese Open Relation Extraction (CORE) system, which extracts entity-relation triples from free texts based on a series of NLP techniques, i.e., word segmentation, POS tagging, syntactic parsing, and extraction rules. We employ the proposed CORE techniques to extract more than 13 million entity-relations for an open-domain question answering application. To the best of our knowledge, CORE is the first Chinese Open IE system for knowledge acquisition.
Named Entity Recognition (NER) is a natural language processing task for recognizing named entities in a given sentence. Chinese NER is difficult due to the lack of delimiting spaces and of conventional features for determining entity boundaries and categories. This study proposes the ME-MGNN (Multiple Embeddings enhanced Multi-Graph Neural Networks) model for Chinese NER in the healthcare domain. We integrate multiple embeddings at different granularities, from the radical, character, and word levels, for an extended representation, and this is fed into...
Remote sensing for life detection, or non-contact monitoring of vital signals, is an important application of ultra-wideband (UWB) radar, such as health monitoring of a vehicle driver. When using UWB radar to detect human physiological dynamics, three kinds of movement features (body motion, breathing, and heartbeat) must be considered, and they are generally extracted from the echo pulses. Usually, the moving-body signal is much larger than the other two, which causes interference and interaction problems. Meanwhile, since causes...
In recent years, many studies have proposed seizure detection algorithms, but most of them require high computing resources and a large amount of memory, which makes them difficult to implement in wearable devices. This paper proposes an algorithm that uses a small number of features to reduce the memory requirements of the algorithm. During feature extraction, this paper uses an entropy estimation method based on bitwise operations instead of logarithmic operations to reduce the algorithm's demand for computing resources. The experimental results show that the time can be reduced by...
This study describes the model design of the NCUEE system for the MEDIQA challenge at the ACL-BioNLP 2019 workshop. We use BERT (Bidirectional Encoder Representations from Transformers) as the word embedding method to integrate the BiLSTM (Bidirectional Long Short-Term Memory) network with an attention mechanism for medical text inferences. A total of 42 teams participated in the natural language inference task at MEDIQA 2019. Our best accuracy score of 0.84 ranked in the top third among all submissions on the leaderboard.
This study explores the existing blacklists to discover suspected URLs that refer to on-the-fly phishing threats in real time. We propose a PhishTrack framework that includes redirection tracking and form tracking components to update the blacklists. It actively finds phishing URLs as early as possible. Experimental results show that our proactive update method is an effective and efficient approach for improving the coverage of the blacklists. In practice, our solution is complementary to existing anti-phishing techniques for providing secured web surfing.
Background: Support vector machines (SVMs) based on brain-wise functional connectivity (FC) have been widely adopted for single-subject prediction of patients with schizophrenia, but most of these studies had small sample sizes. This study aimed to evaluate the performance of SVMs on a large single-site dataset and to investigate the effects of demographic homogeneity and training sample size on classification accuracy. Methods: The resting functional Magnetic Resonance Imaging (fMRI) comprised 220 schizophrenia healthy controls....
This study describes the construction of the TOCFL (Test Of Chinese as a Foreign Language) learner corpus, including the collection and grammatical error annotation of 2,837 essays written by Chinese language learners originating from a total of 46 different mother-tongue languages. We propose hierarchical tagging sets to manually annotate grammatical errors, resulting in 33,835 inappropriate usages. Our built corpus has been provided for shared tasks on Chinese grammatical error diagnosis. These shared tasks demonstrate the usability of our annotation.
This paper presents the IALP 2016 shared task on Dimensional Sentiment Analysis for Chinese Words (DSAW), which seeks to identify a real-valued sentiment score of words in both the valence and arousal dimensions. Valence represents the degree of pleasant and unpleasant (or positive and negative) feelings, and arousal represents the degree of excitement and calm. Of the 22 teams registered for this two-dimensional sentiment analysis task, 16 submitted results. We expected that this evaluation campaign could produce more advanced dimensional sentiment analysis techniques, especially...
This study explores users' web browsing behaviors when confronting phishing situations for context-aware phishing detection. We extract discriminative features of each clicked URL, i.e., domain name, bag-of-words, generic Top-Level Domains, IP address, and port number, to develop a linear-chain CRF model for users' behavioral prediction. Large-scale experiments show that our method achieves promising performance in predicting phishing threats in users' next accesses. Error analysis indicates that our model results in a favorably low false positive rate....
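The per-URL features named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `url_features`, the `GENERIC_TLDS` set, the tokenization rule, and the default-port fallback are all assumptions made for illustration. The resulting per-click dictionaries are the kind of input a linear-chain CRF toolkit could consume as one sequence per browsing session.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Assumed, non-exhaustive set of generic TLDs (illustration only).
GENERIC_TLDS = {"com", "org", "net", "edu", "gov", "info", "biz"}
# Assumed fallback ports when the URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def url_features(url: str) -> dict:
    """Extract hypothetical per-URL features: domain, words, gTLD, IP flag, port."""
    parts = urlsplit(url)
    host = parts.hostname or ""

    # Flag hosts that are literal IP addresses rather than domain names.
    try:
        ipaddress.ip_address(host)
        is_ip = True
    except ValueError:
        is_ip = False

    # The last host label is treated as the TLD when the host is a name.
    labels = host.split(".") if host and not is_ip else []
    tld = labels[-1] if labels else ""

    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)

    # Bag-of-words: lowercase alphanumeric tokens from the path and query.
    text = (parts.path + "?" + parts.query).lower()
    words = [w for w in re.split(r"[^a-z0-9]+", text) if w]

    return {
        "domain": host,
        "words": words,
        "gtld": tld if tld in GENERIC_TLDS else "",
        "is_ip": is_ip,
        "port": port,
    }


# One browsing session becomes one feature sequence.
session = [
    "https://www.example.com/account/update",
    "http://192.168.0.1:8080/login/verify.php?id=7",
]
sequence = [url_features(u) for u in session]
```

In this sketch a host given as a raw IP address yields an empty `gtld` and `is_ip=True`, which keeps the two signals separate for a downstream sequence model.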
This study describes the model design of the NCUEE-NLP system for the SemEval-2023 NLI4CT task, which focuses on multi-evidence natural language inference for clinical trial data. We use the LinkBERT transformer in the biomedical domain (denoted as BioLinkBERT) as our main architecture. First, a set of sentences in clinical trial reports is extracted as evidence for premise-statement inference. The identified evidence is then used to determine the inference relation (i.e., entailment or contradiction). Finally, a soft voting ensemble mechanism is applied to enhance performance....
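Soft voting, as mentioned in this abstract, generally means averaging the class probabilities produced by several models and picking the class with the highest mean. The sketch below shows that generic technique only. The function name `soft_vote`, the equal weighting, and the probability values are illustrative assumptions, not details taken from the NCUEE-NLP system.

```python
from typing import List, Sequence, Tuple


def soft_vote(prob_rows: Sequence[Sequence[float]],
              labels: Sequence[str]) -> Tuple[str, List[float]]:
    """Average per-model class probabilities and return the top label.

    prob_rows holds one probability vector per model, aligned with labels.
    Equal model weights are an assumption of this sketch.
    """
    n_models = len(prob_rows)
    averaged = [
        sum(row[i] for row in prob_rows) / n_models
        for i in range(len(labels))
    ]
    best = max(range(len(labels)), key=averaged.__getitem__)
    return labels[best], averaged


labels = ["Entailment", "Contradiction"]
# Made-up outputs of three hypothetical models for one premise-statement pair.
model_probs = [
    [0.90, 0.10],
    [0.45, 0.55],
    [0.40, 0.60],
]
predicted, averaged = soft_vote(model_probs, labels)
```

With these made-up numbers, two of the three models individually favor "Contradiction", yet the averaged probabilities favor "Entailment", which shows how soft voting can differ from a simple majority vote.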
Steady-state visual evoked potential (SSVEP) has been used to implement brain-computer interfaces (BCIs) due to its advantages of high information transfer rate (ITR) and accuracy. In recent years, owing to the development of head-mounted devices (HMDs), the HMD has become a popular choice for SSVEP-based BCIs. However, an HMD with a fixed frame rate can only flash at subharmonic frequencies, which limits the available number of stimulation frequencies for SSVEP-based BCIs. In order to increase the number of commands in a BCI, we proposed a phase-approaching (PA) method to generate sequences...
This article explores users' browsing intents to predict the category of a user's next access during web surfing and applies the results to filter objectionable content, such as pornography, gambling, violence, and drugs. Users' access trails, in terms of sequences of click-through data, are employed to mine users' browsing behaviors. Contextual relationships among URL categories are learned by a hidden Markov model. The top-level domains (TLDs) extracted from the URLs themselves and the corresponding categories are caught by a TLD model. Given a URL to be predicted, its current context...