Lung‐Hao Lee

ORCID: 0000-0003-0472-7429
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Sentiment Analysis and Opinion Mining
  • Text Readability and Simplification
  • Spam and Phishing Detection
  • Advanced Text Analysis Techniques
  • Biomedical Text Mining and Ontologies
  • Web Data Mining and Analysis
  • Text and Document Classification Technologies
  • Internet Traffic Analysis and Secure E-voting
  • EEG and Brain-Computer Interfaces
  • Emotion and Mood Recognition
  • Second Language Acquisition and Learning
  • Functional Brain Connectivity Studies
  • Mental Health via Writing
  • Hate Speech and Cyberbullying Detection
  • Caching and Content Delivery
  • Network Security and Intrusion Detection
  • Recommender Systems and Techniques
  • Machine Learning in Healthcare
  • Gaze Tracking and Assistive Technology
  • Semantic Web and Ontologies
  • ECG Monitoring and Analysis
  • Advanced Malware Detection Techniques
  • Information Retrieval and Search Behavior

National Yang Ming Chiao Tung University
2023-2024

Institute of Art
2024

National Central University
2014-2023

Pervasive Artificial Intelligence Research Labs
2021

Kaohsiung Medical University
2021

National Taiwan Normal University
2012-2018

National Taiwan University
2011-2015

Institute of Linguistics, Academia Sinica
2009

Academia Sinica
2008-2009

Yuan Ze University
2008

This paper introduces the SIGHAN 2015 Bake-off for Chinese Spelling Check, including task description, data preparation, performance metrics, and evaluation results. The competition reveals the current state-of-the-art NLP techniques for Chinese spelling checking. All data sets with gold standards and the evaluation tool used in this bake-off are publicly available for future research.

10.18653/v1/w15-3106 article EN 2015-01-01

This paper introduces a Chinese Spelling Check campaign organized for the SIGHAN 2014 bake-off, including task description, data preparation, performance metrics, and evaluation results based on essays written by learners of Chinese as a foreign language. The hope is that such evaluations can produce more advanced spelling check techniques.

10.3115/v1/w14-6820 article EN cc-by 2014-01-01

An increasing amount of research has recently focused on dimensional sentiment analysis that represents affective states as continuous numerical values on multiple dimensions, such as valence-arousal (VA) space. Compared to the categorical approach that represents affective states as distinct classes (e.g., positive and negative), the dimensional approach can provide more fine-grained (real-valued) sentiment analysis. However, dimensional sentiment resources with valence-arousal ratings are very rare, especially for the Chinese language. Therefore, this study aims to: (1) Build a resource called EmoBank,...

10.1145/3489141 article EN ACM Transactions on Asian and Low-Resource Language Information Processing 2022-01-19

This paper introduces the NLP-TEA 2015 shared task for Chinese grammatical error diagnosis. We describe the task, data preparation, performance metrics, and evaluation results. The hope is that such an evaluation campaign may produce more advanced Chinese grammatical error diagnosis techniques. All data sets with gold standards and evaluation tools are publicly available for research purposes.

10.18653/v1/w15-4401 article EN cc-by 2015-01-01

Many deep-learning-based seizure detection algorithms have achieved good classification performance, usually outperforming traditional machine-learning-based algorithms. However, hand-engineered features increase computational complexity and potentially cause an ineffectiveness problem for the category. Therefore, this paper proposes a novel end-to-end deep-learning model comprising an inception module and a residual module to analyze multiple scales of the original EEG signals and realize seizure detection without feature extraction. Experiments...

10.1109/access.2023.3277634 article EN cc-by IEEE Access 2023-01-01

This study presents the Chinese Open Relation Extraction (CORE) system that is able to extract entity-relation triples from Chinese free texts based on a series of NLP techniques, i.e., word segmentation, POS tagging, syntactic parsing, and extraction rules. We employ the proposed CORE techniques to extract more than 13 million entity-relations for an open domain question answering application. To the best of our knowledge, CORE is the first Chinese open IE system for knowledge acquisition.

10.3115/v1/e14-4003 article EN 2014-01-01

Named Entity Recognition (NER) is a natural language processing task for recognizing named entities in a given sentence. Chinese NER is difficult due to the lack of delimited spaces and conventional features for determining named entity boundaries and categories. This study proposes the ME-MGNN (Multiple Embeddings enhanced Multi-Graph Neural Networks) model for Chinese NER in the healthcare domain. We integrate multiple embeddings at different granularities, from the radical and character to the word level, for an extended character representation, and this is fed into...

10.1109/jbhi.2020.3048700 article EN IEEE Journal of Biomedical and Health Informatics 2021-01-01

Remote sensing for life detection, or non-contact monitoring of vital signals, is an important application for Ultra-wideband (UWB) radar, such as health monitoring of a vehicle driver. When using UWB radar to detect the physiological dynamics of a human, three kinds of movement features (body motion, breathing, and heartbeat) must be considered and are generally extracted from echo pulses. Usually, the moving body signal is much larger than the other two, which will cause interference and interaction problems. Meanwhile, since causes...

10.1109/jsen.2020.2992687 article EN IEEE Sensors Journal 2020-05-05

In recent years, many studies have proposed seizure detection algorithms, but most of them require high computing resources and a large amount of memory, which makes them difficult to implement in wearable devices. This paper proposes a seizure detection algorithm that uses a small number of features to reduce the memory requirements of the algorithm. During feature extraction, this paper proposes an entropy estimation method that uses bitwise operations instead of logarithmic operations to reduce the algorithm's demand for computing resources. The experimental results show that the computation time can be reduced by...

10.1109/access.2023.3235913 article EN cc-by IEEE Access 2023-01-01

This study describes the model design of the NCUEE system for the MEDIQA challenge at the ACL-BioNLP 2019 workshop. We use BERT (Bidirectional Encoder Representations from Transformers) as the word embedding method to integrate the BiLSTM (Bidirectional Long Short-Term Memory) network with an attention mechanism for medical text inferences. A total of 42 teams participated in the natural language inference task at MEDIQA 2019. Our best accuracy score of 0.84 ranked in the top third among all submissions on the leaderboard.

10.18653/v1/w19-5058 article EN cc-by 2019-01-01

This study explores the existing blacklists to discover suspected URLs that refer to on-the-fly phishing threats in real time. We propose a PhishTrack framework that includes redirection tracking and form tracking components to update the phishing blacklists. It actively finds phishing URLs as early as possible. Experimental results show that our proactive phishing update method is an effective and efficient approach for improving the coverage of the blacklists. In practice, our solution is complementary to the existing anti-phishing techniques for providing secured web surfing.

10.1145/2660267.2662362 article EN Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security 2014-11-03

Background: Support vector machines (SVMs) based on brain-wise functional connectivity (FC) have been widely adopted for single-subject prediction of patients with schizophrenia, but most of these studies had small sample sizes. This study aimed to evaluate the performance of SVMs on a large single-site dataset and to investigate the effects of demographic homogeneity and training sample size on classification accuracy. Methods: The resting Magnetic Resonance Imaging (fMRI) comprised 220 schizophrenia healthy controls....

10.1192/j.eurpsy.2021.2248 article EN cc-by-nc-nd European Psychiatry 2021-12-23

This study describes the construction of the TOCFL (Test Of Chinese as a Foreign Language) learner corpus, including the collection and grammatical error annotation of 2,837 essays written by Chinese language learners originating from a total of 46 different mother-tongue languages. We propose hierarchical tagging sets to manually annotate grammatical errors, resulting in 33,835 inappropriate usages. Our built corpus has been provided for shared tasks on Chinese grammatical error diagnosis. These shared tasks demonstrate the usability of our learner corpus annotation.

10.1109/ialp.2016.7875980 article EN 2016-11-01

This paper presents the IALP 2016 shared task on Dimensional Sentiment Analysis for Chinese Words (DSAW), which seeks to identify a real-value sentiment score of Chinese words in both valence and arousal dimensions. Valence represents the degree of pleasant and unpleasant (or positive and negative) feelings, and arousal represents the degree of excitement and calm. Of the 22 teams registered for this shared task on two-dimensional sentiment analysis, 16 submitted results. We expected that this evaluation campaign could produce more advanced dimensional sentiment analysis techniques, especially...

10.1109/ialp.2016.7875957 article EN 2016-11-01

This study explores users' web browsing behaviors that confront phishing situations for context-aware phishing detection. We extract discriminative features of each clicked URL, i.e., domain name, bag-of-words, generic Top-Level Domains, IP address, and port number, to develop a linear chain CRF model for users' behavioral prediction. Large-scale experiments show that our method achieves promising performance for predicting the phishing threats of users' next accesses. Error analysis indicates that our model results in a favorably low false positive rate....

10.1145/2567948.2577320 article EN 2014-04-07

This study describes the model design of the NCUEE-NLP system for the SemEval-2023 NLI4CT task that focuses on multi-evidence natural language inference for clinical trial data. We use the LinkBERT transformer in the biomedical domain (denoted as BioLinkBERT) as our main system architecture. First, a set of sentences in clinical trial reports is extracted as evidence for premise-statement inference. This identified evidence is then used to determine the inference relation (i.e., entailment or contradiction). Finally, a soft voting ensemble mechanism is applied to enhance the system performance....

10.18653/v1/2023.semeval-1.107 article EN cc-by Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023) 2023-01-01

Steady-state visual evoked potential (SSVEP) has been used to implement brain-computer interfaces (BCIs) due to its advantages of high information transfer rate (ITR) and high accuracy. In recent years, owing to the development of head-mounted devices (HMDs), the HMD has become a popular platform for SSVEP-based BCIs. However, an HMD with a fixed frame rate can only flash at subharmonic frequencies of that frame rate, which limits the available number of stimulation frequencies for an SSVEP-based BCI. In order to increase the number of commands for the BCI, we proposed a phase-approaching (PA) method to generate sequences...

10.1109/tnsre.2021.3131779 article EN cc-by-nc-nd IEEE Transactions on Neural Systems and Rehabilitation Engineering 2021-01-01

This article explores users' browsing intents to predict the category of a user's next access during web surfing and applies the results to filter objectionable content, such as pornography, gambling, violence, and drugs. Users' access trails, in terms of category sequences in click‐through data, are employed to mine users' web browsing behaviors. Contextual relationships of URL categories are learned by the hidden Markov model. The top‐level domains (TLDs) extracted from the URLs themselves and the corresponding categories are caught by the TLD model. Given a URL to be predicted, its current context...

10.1002/asi.23217 article EN Journal of the Association for Information Science and Technology 2014-06-04