- Topic Modeling
- Natural Language Processing Techniques
- Recommender Systems and Techniques
- Web Data Mining and Analysis
- Sentiment Analysis and Opinion Mining
- Text Readability and Simplification
- Complex Network Analysis Techniques
- Multimodal Machine Learning Applications
- Text and Document Classification Technologies
- Data Quality and Management
- Online Learning and Analytics
- Advanced Graph Neural Networks
- Semantic Web and Ontologies
- Data Mining Algorithms and Applications
- Biomedical Text Mining and Ontologies
- Expert Finding and Q&A Systems
- Domain Adaptation and Few-Shot Learning
- Hate Speech and Cyberbullying Detection
- Misinformation and Its Impacts
- Advanced Bandit Algorithms Research
- Gene Regulatory Network Analysis
- Peer-to-Peer Network Technologies
- Gene Expression and Cancer Classification
- Customer Churn and Segmentation
- Adversarial Robustness in Machine Learning
Saigon International University
2022-2023
Vietnam National University Ho Chi Minh City
2010-2022
University of Information Technology
2012-2022
Wyższa Szkoła Technologii Informatycznych w Warszawie
2012-2016
Ho Chi Minh City University of Technology
2014
Determining which job is suitable for a student or a person looking for work, based on descriptions such as knowledge and skills, is difficult, as is the way employers must find candidates who match what they require. In this paper, we focus on studying job prediction using different deep neural network models, including TextCNN, Bi-GRU-LSTM-CNN, and Bi-GRU-CNN, with various pre-trained word embeddings on an IT dataset. In addition, we propose a simple and effective ensemble model combining different models. Our experimental...
In recent years, hate speech detection has become one of the interesting fields in natural language processing or computational linguistics. In this paper, we present a description of our system to solve this problem at the VLSP shared task 2019: Hate Speech Detection on Social Networks, with a corpus that contains 20,345 human-labeled comments/posts for training and 5,086 for public testing. We implement a deep learning method based on the Bi-GRU-LSTM-CNN classifier for this task. Our result is a 70.576% F1-score, ranking 5th in performance on the public-test set.
Machine reading comprehension is a natural language understanding task where the computing system is required to read a text and then find the answer to a specific question posed by a human. Large-scale and high-quality corpora are necessary for evaluating machine reading comprehension models. Furthermore, machine reading comprehension (MRC) for the health sector has potential for practical applications; nevertheless, MRC research in this domain is currently scarce. This article presents UIT-ViNewsQA, a new corpus for the Vietnamese language to evaluate MRC models in the healthcare textual domain. The...
To learn about the state of the art for a research project, researchers must conduct a literature survey by searching for, collecting, and reading related scientific articles. Popular search systems, online digital libraries, and Web of Science (WoS) sources such as IEEE Explorer, ACM, SpringerLink, and Google Scholar typically return results or articles that are similar to the keywords in the user's query. Some digital libraries also include content-based recommenders that suggest papers similar to one the user likes, based on the contents of the paper,...
Successful research collaborations may facilitate major outcomes in science and their applications. Thus, identifying effective collaborators may be a key factor that affects success. However, it is very difficult to identify potential collaborators, particularly for young researchers who have less knowledge about other experts in the domain. This study introduces and defines the problem of collaborator recommendation for 'isolated' researchers who have no links with others in co-author networks. Existing approaches such as link-based and content-based...
One of the emerging research trends in natural language understanding is machine reading comprehension (MRC), which is the task of finding answers to human questions based on textual data. Existing Vietnamese datasets for MRC concentrate solely on answerable questions. However, in reality, questions can be unanswerable when the correct answer is not stated in the given textual data. To address this weakness, we provide the research community with a benchmark dataset named UIT-ViQuAD 2.0 for evaluating MRC and question answering systems for the Vietnamese language. We use UIT-ViQuAD 2.0 as the dataset for the challenge at the Eighth...
A job recommender is a system that automatically returns a ranked list of suitable, prospective jobs for employees. It plays a significant role in connecting employees and employers. In order to choose a suitable algorithm to build the system, a comparison study of popular recommendation methods is conducted and reported in this paper. The experimental data are crawled from vietnamworks.com, itviec.com, and careerlink.vn. A subset of size 7623 is extracted for running the experiment. In total, there are 59 users who have joint rating as...
Large-scale and high-quality corpora are necessary for evaluating machine reading comprehension models on a low-resource language like Vietnamese. Besides, machine reading comprehension (MRC) for the health domain offers great potential for practical applications; however, there is still very little MRC research in this domain. This paper presents ViNewsQA as a new corpus for the Vietnamese language to evaluate healthcare reading comprehension models. The corpus comprises 22,057 human-generated question-answer pairs. Crowd-workers create the questions and their answers based...
In this paper, we propose a method to automatically extract metadata (title, authors, affiliation, email, references, etc.) from science papers by combining the layout information of the papers with rules defined using the JAPE Grammar of GATE. After the metadata are extracted from digital documents, the user can interact with and correct them before they are exported to XML files. Developing a tool to extract metadata from digital documents is a very necessary and useful task for building collections, and for organizing and searching in digital libraries. The extraction method is tested on computer...
Job recommender systems are designed to suggest a ranked list of jobs that could be associated with an employee's interest. Most existing systems use only one approach to make recommendations for all employees, while a specific method is normally good enough only for a group of employees. Therefore, this study proposes an adaptive solution to make job recommendations for different groups of users. The proposed methods are based on employee clustering. Firstly, we group employees into clusters. Then, we select a suitable method for each user cluster based on empirical evaluation. The proposed methods include...
The outbreak of the COVID-19 virus caused a significant impact on the health of people all over the world. Therefore, it is essential to provide constant and accurate information about the disease to everyone. This paper describes our prediction system for WNUT-2020 Task 2: Identification of Informative COVID-19 English Tweets. The dataset for this task contains 10,000 tweets in English labeled by humans. An ensemble model built from three transformer deep learning models is used for the final prediction. The experimental result indicates that we...
Data are essential for the experiments of relevant scientific publication recommendation methods, but it is difficult to build ground truth data. A naturally promising solution is using the publications that are referenced by researchers in their own publications. Unfortunately, this approach has not been explored in the literature, so its applicability is still a gap in our knowledge. In this research, we systematically study this approach through theoretical and empirical analyses. In general, the results show that the approach is reasonable and has many advantages. However, the empirical analysis shows...
In this paper, we propose a framework to integrate bibliographical data of computer science publications from heterogeneous digital libraries. The framework consists of three key components: a publication collector, a parser, and a duplicate checker. In order to analyze the efficiency of our framework in integrating data sources, we conduct an experiment with different digital libraries: Microsoft Academic Search, CiteSeerX, and DBLP. At this time, the integrated dataset contains 5,320,539 publications and 1,723,148 authors with their metadata. Our framework increases the quantity of rows and columns...