- Data Management and Algorithms
- Data Mining Algorithms and Applications
- Advanced Database Systems and Queries
- Machine Learning in Bioinformatics
- Artificial Intelligence in Healthcare
- Time Series Analysis and Forecasting
- Rough Sets and Fuzzy Logic
- Biomedical Text Mining and Ontologies
- Geographic Information Systems Studies
- Recommender Systems and Techniques
- Protein Structure and Dynamics
- Bioinformatics and Genomic Networks
- Topic Modeling
- Imbalanced Data Classification Techniques
- Gene Expression and Cancer Classification
- Web Data Mining and Analysis
- Genomics and Phylogenetic Studies
- ECG Monitoring and Analysis
- Traditional Chinese Medicine Studies
- Customer Churn and Segmentation
- Technology and Data Analysis
- Semantic Web and Ontologies
- Non-Invasive Vital Sign Monitoring
- Energy Load and Power Forecasting
- Advanced Text Analysis Techniques
Chungbuk National University
2016-2025
Ton Duc Thang University
2018-2025
Chiang Mai University
2020-2024
Chong Kun Dang Bio (South Korea)
2019-2023
Korea Institute of Science and Technology
2023
Korean Association Of Science and Technology Studies
2023
National University College
2022
ORCID
2019
SK Group (South Korea)
2009-2017
Seoul National University
2009-2015
The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators,...
Emotion detection and recognition from text is a recent essential research area in Natural Language Processing (NLP) which may reveal some valuable input to a variety of purposes. Nowadays, writings take many forms of social media posts, micro-blogs, news articles, customer reviews, etc., and the content of these short texts can be a useful resource for text mining to discover and uncover various aspects, including emotions. The previously presented models mainly adopted word embedding vectors that represent rich...
Recently, rapid improvements in technology and a decrease in sequencing costs have made RNA-Seq a widely used technique to quantify gene expression levels. Various normalization approaches have been proposed, owing to the importance of normalization in the analysis of RNA-Seq data. A comparison of recently proposed normalization methods is required to generate suitable guidelines for the selection of the most appropriate approach for future experiments. In this paper, we compared eight non-abundance (RC, UQ, Med, TMM, DESeq, Q, RPKM, and ERPKM) and two abundance estimation...
Machine learning and artificial intelligence have achieved human-level performance in many application domains, including image classification, speech recognition, and machine translation. However, in the financial domain, expert-based credit risk models have still been dominating. Establishing a meaningful benchmark and comparisons on machine-learning approaches and human expert-based models is a prerequisite in further introducing novel methods. Therefore, our main goal in this study is to establish a new benchmark using real consumer data and to provide machine-learning approaches that...
Novelty detection is a classification problem to identify abnormal patterns; therefore, it is an important task for applications such as fraud detection, fault diagnosis, and disease detection. However, when there is no label that indicates normal and abnormal data, it will need expensive domain and professional knowledge, so an unsupervised novelty detection approach will be used. On the other hand, nowadays, using novelty detection on high dimensional data is a big challenge, and previous research suggests approaches based on principal component analysis (PCA)...
Alzheimer's disease (AD) is an age‐related neurodegenerative disease. The most common pathological hallmarks are amyloid plaques and neurofibrillary tangles in the brain. In the brains of patients with AD, pathological tau is abnormally accumulated, causing neuronal loss, synaptic dysfunction, and cognitive decline. We found that a histone deacetylase 6 (HDAC6) inhibitor, CKD‐504, changed the tau interactome dramatically to degrade pathological tau not only in an AD animal model (ADLP APT) containing both amyloid plaques and neurofibrillary tangles but also in patient‐derived brain...
Defective die on a wafer map tend to cluster in distinguishable patterns, and such defect patterns can provide crucial information to identify equipment problems or process failures in semiconductor manufacturing. Therefore, it is important to accurately and efficiently classify the defect patterns. In this paper, we propose a novel clustering-based defect pattern detection and classification framework for the wafer bin map (WBM). The proposed framework has many advantages. Outlier detection and defect pattern extraction can be done at the same time; arbitrarily shaped defect patterns can be detected;...
Accurate exchange rate forecasting and its decision-making to buy or sell are critical issues in the Forex market. Short-term currency rate forecasting is a challenging task due to its inherent characteristics, which include high volatility, trend, noise, and market shocks. We propose a novel deep learning architecture consisting of an adaptive activation function selection mechanism to achieve higher predictive accuracy. The proposed architecture is composed of seven neural networks that have different activation functions as well as a softmax layer...
Smoking-induced noncommunicable diseases (SiNCDs) have become a significant threat to public health and a cause of death globally. In the last decade, numerous studies have been proposed using artificial intelligence techniques to predict the risk of developing SiNCDs. However, determining the most significant features and developing interpretable models are rather challenging in such systems. In this study, we propose an efficient extreme gradient boosting (XGBoost) based framework incorporated with the hybrid feature selection (HFS) method for...
DNA methylation patterns are associated with the development and prognosis of cancer. The aim of this study was to identify novel methylation markers for the prediction of patient outcomes using microarray analysis of DNA methylation and RNA expression in samples from long‐term follow‐up patients with nonmuscle invasive bladder cancer (NMIBC). A total of 187 human bladder specimens were used for array or pyrosequencing (PSQ) analyses: 6 normal controls (NC) and 181 NMIBC. Tumor‐specific hypermethylated genes were selected from a data set comprising 24 matched...
Motivation: Gene selection for cancer classification is one of the most important topics in the biomedical field. However, microarray data pose a severe challenge for computational techniques. We need dimension reduction techniques that identify a small set of genes to achieve better learning performance. From the perspective of machine learning, the selection of genes can be considered to be a feature selection problem that aims to find a small subset of features that has the most discriminative information for the target. Results: In this article, we proposed an Ensemble...
Named Entity Recognition (NER) in the healthcare domain involves identifying and categorizing diseases, drugs, and symptoms for biosurveillance, extracting their related properties and activities, and identifying adverse drug events appearing in texts. These tasks are important challenges in healthcare. Analyzing user messages in social media networks such as Twitter can provide opportunities to detect and manage public health events. Twitter provides a broad range of short messages that contain interesting information for information extraction. In this paper,...
Multivariate time series forecasting is critical in many applications, such as signal processing, finance, air quality forecasting, and pattern recognition. In particular, determining the most relevant variables and proper lag length from multivariate time series is challenging. This paper proposes an end-to-end recurrent neural network framework equipped with an adaptive input selection mechanism to improve the prediction performance for multivariate time series forecasting. The proposed model, named AIS-RNN, consists of two main components: the first...
This study proposes an efficient prediction method for coronary heart disease risk based on two deep neural networks trained on well-ordered training datasets. Most real datasets include an irregular subset with higher variance than most data, and predictive models do not learn well from these datasets. While most existing prediction models learned from the whole or randomly sampled training datasets, our suggested method draws up training datasets by separating regular and highly biased subsets to build accurate prediction models. We use a two-step approach to prepare the training dataset: (1)...
An Environment Observation and Forecasting System (EOFS) is an application for monitoring and providing forecasting about environmental phenomena. We design an air pollution monitoring system which involves a context model and a flexible data acquisition policy. The context model is used for understanding the status of air pollution at a remote place. It can provide an alarm and a safety guideline depending on the condition of the context model. It also supports a flexible sampling interval change for an effective tradeoff between sampling rates and battery lifetimes. This interval is changed depending on the pollution conditions derived from the context model to save the limited...
Customers of different contract types have different shapes in their daily load profiles, in the manner of their different characteristics. Therefore, maximally capturing local and global shape variability is essential in load profiling, which exhibits customers' behaviors. Existing approaches are focusing on the global property by considering all dimensions of the data set. However, the local shape is determined in a subspace most of the time. In this paper, we use subspace projection methods (subspace clustering and projected clustering) to capture such subspaces in load diagrams and to maximize the difference between...