- Topic Modeling
- Advanced Graph Neural Networks
- Natural Language Processing Techniques
- Recommender Systems and Techniques
- Complex Network Analysis Techniques
- Sentiment Analysis and Opinion Mining
- Semantic Web and Ontologies
- Domain Adaptation and Few-Shot Learning
- Handwritten Text Recognition Techniques
- Software Engineering Research
- Text and Document Classification Technologies
- Graph Theory and Algorithms
- Speech and dialogue systems
- Stock Market Forecasting Methods
- Software Reliability and Analysis Research
- Information and Cyber Security
- Advanced Malware Detection Techniques
- Brain Tumor Detection and Classification
- Speech Recognition and Synthesis
- Machine Learning in Healthcare
- Multimodal Machine Learning Applications
- Text Readability and Simplification
- Web Data Mining and Analysis
- Machine Learning in Bioinformatics
- Data Quality and Management
Vietnam National University Ho Chi Minh City
2013-2022
Monash University
2018-2021
Oracle (United States)
2021
Australian Regenerative Medicine Institute
2021
Deakin University
2017-2018
Saarland University
2014-2017
Vietnam National University, Hanoi
2009-2013
University of Engineering and Technology Lahore
2011-2013
Dai Quoc Nguyen, Tu Dinh Nguyen, Dat Quoc Nguyen, Dinh Phung. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). 2018.
Dai Quoc Nguyen, Thanh Vu, Tu Dinh Nguyen, Dat Quoc Nguyen, Dinh Phung. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.
Thanh Vu, Dat Quoc Nguyen, Dai Quoc Nguyen, Mark Dras, Mark Johnson. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations. 2018.
Identifying vulnerabilities in the source code is essential to protect software systems from cyber security attacks. It is, however, also a challenging step that requires specialized expertise in security and code representation. To this end, we aim to develop a general, practical, and programming language-independent model capable of running on various source codes and libraries without difficulty. Therefore, we consider vulnerability detection as an inductive text classification problem and propose ReGVD, a simple yet effective graph...
We introduce a transformer-based GNN model, named UGformer, to learn graph representations. In particular, we present two UGformer variants, wherein the first variant (publicized in September 2019) is to leverage the transformer on a set of sampled neighbors for each input node, while the second (publicized in May 2021) is to leverage the transformer on all input nodes. Experimental results demonstrate that the first UGformer variant achieves state-of-the-art accuracies on benchmark datasets for graph classification in both the inductive setting and the unsupervised transductive setting; and the second UGformer variant obtains state-of-the-art accuracies for inductive text...
This paper describes our robust, easy-to-use and language-independent toolkit, namely RDRPOSTagger, which employs an error-driven approach to automatically construct a Single Classification Ripple Down Rules tree of transformation rules for the POS tagging task. During the demonstration session, we will run the tagger on data sets in 15 different languages.
In this paper, we propose a new approach to construct a system of transformation rules for the Part-of-Speech (POS) tagging task. Our approach is based on an incremental knowledge acquisition method where rules are stored in an exception structure and new rules are only added to correct the errors of existing rules; thus allowing systematic control of the interaction between the rules. Experimental results on 13 languages show that our approach is fast in terms of training time and tagging speed. Furthermore, our approach obtains very competitive accuracy in comparison to state-of-the-art POS...
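The exception structure described in this abstract can be illustrated with a minimal sketch of a Single Classification Ripple Down Rules tree. This is not the RDRPOSTagger code: the `Rule` class, the `classify` function, the case fields (`word`, `prev_tag`) and the three toy rules are all hypothetical names chosen for illustration. Each node has an "except" child, tried when its condition fires, and an "else" child, tried when it does not; the conclusion of the last rule that fired is returned.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    """One node of a hypothetical SCRDR tree (illustrative, not the toolkit's API)."""
    condition: Callable[[dict], bool]
    conclusion: str
    except_rule: Optional["Rule"] = None  # tried when this rule fires
    else_rule: Optional["Rule"] = None    # tried when this rule does not fire


def classify(root: Rule, case: dict) -> Optional[str]:
    """Return the conclusion of the last rule whose condition held."""
    conclusion = None
    node = root
    while node is not None:
        if node.condition(case):
            conclusion = node.conclusion
            node = node.except_rule
        else:
            node = node.else_rule
    return conclusion


# Toy rules (assumptions): the default rule tags everything as NN.
adverb_rule = Rule(lambda c: c["word"].endswith("ly"), "RB")
verb_rule = Rule(
    lambda c: c["word"] == "run" and c["prev_tag"] == "TO",
    "VB",
    else_rule=adverb_rule,
)
default_rule = Rule(lambda c: True, "NN", except_rule=verb_rule)

print(classify(default_rule, {"word": "run", "prev_tag": "TO"}))      # VB
print(classify(default_rule, {"word": "quickly", "prev_tag": "VB"}))  # RB
print(classify(default_rule, {"word": "dog", "prev_tag": "DT"}))      # NN
```

In this layout a new rule is only ever attached as an exception (or as an else-sibling of an exception) below an existing rule, so existing rules are never modified, which is one way to read the "systematic control of the interaction between the rules" mentioned above.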
In this paper, we propose a novel embedding model, named ConvKB, for knowledge base completion. Our model ConvKB advances state-of-the-art models by employing a convolutional neural network, so that it can capture global relationships and transitional characteristics between entities and relations in knowledge bases. In ConvKB, each triple (head entity, relation, tail entity) is represented as a 3-column matrix where each column vector represents a triple element. This 3-column matrix is then fed to a convolution layer where multiple filters are operated on...
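The 3-column matrix and filter step can be sketched in a few lines of plain Python. The abstract is truncated before it specifies the filter shape, the activation, or how feature maps become a score, so the choices below (1x3 filters applied to each row, ReLU, and a dot product with a weight vector `w`) are assumptions made for illustration, and `convkb_like_score` is a hypothetical name rather than the authors' implementation.

```python
from typing import List, Sequence


def convkb_like_score(
    h: Sequence[float],
    r: Sequence[float],
    t: Sequence[float],
    filters: Sequence[Sequence[float]],
    w: Sequence[float],
) -> float:
    """Score a triple from its k x 3 embedding matrix (illustrative sketch)."""
    # Row i of the matrix holds the i-th dimension of head, relation and tail.
    rows = list(zip(h, r, t))
    features: List[float] = []
    for f in filters:  # each filter is assumed to be 1 x 3
        for row in rows:
            value = sum(weight * x for weight, x in zip(f, row))
            features.append(max(0.0, value))  # ReLU (assumption)
    # Concatenated feature maps are combined with a weight vector (assumption).
    return sum(wi * fi for wi, fi in zip(w, features))


h, r, t = [1.0, 2.0], [0.5, 0.5], [1.5, 2.5]

# A single filter [1, 1, -1] computes h + r - t on every row.
print(convkb_like_score(h, r, t, filters=[[1.0, 1.0, -1.0]], w=[1.0, 1.0]))  # 0.0

# A filter [1, 0, 0] just reads the head column.
print(convkb_like_score(h, r, t, filters=[[1.0, 0.0, 0.0]], w=[1.0, 1.0]))   # 3.0
```

With the filter `[1, 1, -1]` each row evaluates the translation-style quantity h + r - t, which gives some intuition for the "transitional characteristics" the abstract refers to.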
We propose a simple yet effective embedding model to learn quaternion embeddings for entities and relations in knowledge graphs. Our model aims to enhance correlations between head and tail entities given a relation within the Quaternion space with Hamilton product. The model achieves this goal by further associating each relation with two relation-aware rotations, which are used to rotate quaternion embeddings of the head and tail entities, respectively. Experimental results show that our proposed model produces state-of-the-art performances on well-known benchmark datasets for knowledge graph...
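For readers unfamiliar with the Hamilton product, the sketch below implements it for single quaternions written as tuples `(a, b, c, d)` meaning a + bi + cj + dk. It only illustrates the operation named in the abstract; how the model combines it with relation-aware rotations and a score function is not reproduced here, and the helper names `hamilton_product` and `normalize` are assumptions.

```python
import math
from typing import Tuple

Quaternion = Tuple[float, float, float, float]  # (a, b, c, d) = a + bi + cj + dk


def hamilton_product(q: Quaternion, p: Quaternion) -> Quaternion:
    """Non-commutative product of two quaternions."""
    a1, b1, c1, d1 = q
    a2, b2, c2, d2 = p
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def normalize(q: Quaternion) -> Quaternion:
    """Scale a non-zero quaternion to unit norm, so multiplying by it acts as a rotation."""
    norm = math.sqrt(sum(x * x for x in q))
    a, b, c, d = q
    return (a / norm, b / norm, c / norm, d / norm)


i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
print(hamilton_product(i, j))  # (0, 0, 0, 1), i.e. i * j = k
print(hamilton_product(j, i))  # (0, 0, 0, -1), i.e. j * i = -k
```

Multiplying an embedding by a unit quaternion (as produced by `normalize`) leaves its norm unchanged, which is why such a product is commonly described as a rotation.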
We introduce a novel embedding model, named NoGE, which aims to integrate co-occurrence among entities and relations into graph neural networks to improve knowledge graph completion (i.e., link prediction). Given a knowledge graph, NoGE constructs a single graph considering entities and relations as individual nodes. NoGE then computes weights for edges among nodes based on the co-occurrence of entities and relations. Next, NoGE proposes Dual Quaternion Graph Neural Networks (DualQGNN) and utilizes DualQGNN to update vector representations for entity and relation nodes. NoGE then adopts a score function to produce the triple...
We present a new feature type named rating-based feature and evaluate the contribution of this feature to the task of document-level sentiment analysis. We achieve state-of-the-art results on two publicly available standard polarity movie datasets: on the dataset consisting of 2000 reviews produced by Pang and Lee (2004) we obtain an accuracy of 91.6%, while it is 89.87% evaluated on the dataset of 50000 reviews created by Maas et al. (2011). We also get a performance of 93.24% on our own dataset consisting of 233600 movie reviews, and we aim to share this dataset for further research in the sentiment polarity analysis task.
Word embeddings are now a standard technique for inducing meaning representations for words. For getting good representations, it is important to take into account different senses of a word. In this paper, we propose a mixture model for learning multi-sense word embeddings. Our model generalizes the previous works in that it allows us to induce different weights for different senses of a word. The experimental results show that our model outperforms previous models on standard evaluation tasks.
We propose a novel approach to Vietnamese word segmentation. Our approach is based on the Single Classification Ripple Down Rules methodology (Compton and Jansen, 1990), where rules are stored in an exception structure and new rules are only added to correct segmentation errors given by existing rules. Experimental results on the benchmark Vietnamese treebank show that our approach outperforms previous state-of-the-art approaches JVnSegmenter, vnTokenizer, DongDu and UETsegmenter in terms of both accuracy and performance speed. Our code is open-source and available...
This paper describes our NIHRIO system for SemEval-2018 Task 3 “Irony detection in English tweets.” We propose to use a simple neural network architecture of Multilayer Perceptron with various types of input features including: lexical, syntactic, semantic and polarity features. Our system achieves very high performance in both subtasks of binary and multi-class irony detection in tweets. In particular, we rank at least fourth using the accuracy metric and sixth using the F1 metric. Our code is available at:...
Recent years have witnessed a new trend of building ontology-based question answering systems. These systems use semantic web information to produce more precise answers to users' queries. However, these systems are mostly designed for English. In this paper, we introduce an ontology-based question answering system named KbQAS which, to the best of our knowledge, is the first one made for Vietnamese. KbQAS employs our question analysis approach that systematically constructs a knowledge base of grammar rules to convert each input question into an intermediate representation element. KbQAS then...
Knowledge graph embedding methods often suffer from a limitation of memorizing valid triples to predict new ones for triple classification and search personalization problems. To this end, we introduce a novel embedding model, named R-MeN, that explores a relational memory network to encode potential dependencies in relationship triples. R-MeN considers each triple as a sequence of 3 input vectors that recurrently interact with a memory using a transformer self-attention mechanism. Thus R-MeN encodes new information from interactions between the...
Question answering systems aim to produce exact answers to users' questions instead of a list of related documents as used by current search engines. In this paper, we propose an ontology-based Vietnamese question answering system that allows users to express their questions in natural language. To the best of our knowledge, this is the first attempt to enable users to query an ontological knowledge base using Vietnamese natural language. Experiments with our system on an organizational ontology show promising results.
This paper presents an empirical comparison of two strategies for Vietnamese Part-of-Speech (POS) tagging from unsegmented text: (i) a pipeline strategy where we consider the output of a word segmenter as the input of a POS tagger, and (ii) a joint strategy where we predict a combined segmentation and POS tag for each syllable. We also make a comparison between state-of-the-art (SOTA) feature-based and neural network-based models. On the benchmark Vietnamese treebank (Nguyen et al., 2009), experimental results show that the pipeline strategy produces better scores of POS tagging from unsegmented text than the joint strategy, and the highest...
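One way to picture the joint strategy is as a per-syllable labelling scheme in which each label carries both a word-boundary marker and a POS tag. The sketch below assumes a B/I boundary prefix and underscore-joined syllables inside a word; neither convention is stated in the abstract, and the function names and the two-word example are hypothetical.

```python
from typing import List, Tuple


def to_syllable_labels(tagged_words: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Turn (word, POS) pairs into (syllable, combined label) pairs."""
    labels = []
    for word, pos in tagged_words:
        for index, syllable in enumerate(word.split("_")):
            prefix = "B-" if index == 0 else "I-"  # B starts a word, I continues it
            labels.append((syllable, prefix + pos))
    return labels


def from_syllable_labels(labels: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Recover (word, POS) pairs from per-syllable combined labels."""
    words: List[Tuple[List[str], str]] = []
    for syllable, label in labels:
        boundary, pos = label.split("-", 1)
        if boundary == "B" or not words:
            words.append(([syllable], pos))
        else:
            words[-1][0].append(syllable)
    return [("_".join(syllables), pos) for syllables, pos in words]


sentence = [("sinh_viên", "N"), ("học", "V")]
labels = to_syllable_labels(sentence)
print(labels)  # [('sinh', 'B-N'), ('viên', 'I-N'), ('học', 'B-V')]
print(from_syllable_labels(labels) == sentence)  # True
```

Under such an encoding a single sequence labeller predicts segmentation and tags at once, whereas the pipeline strategy runs a segmenter first and tags its output words.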
This paper presents an approach to the task of predicting an event description from a preceding sentence in a text. Our approach explores sequence-to-sequence learning using a bidirectional multi-layer recurrent neural network. Our approach substantially outperforms previous work in terms of the BLEU score on two datasets derived from WikiHow and DeScript respectively. Since the BLEU score is not easy to interpret as a measure of event prediction, we complement our study with a second evaluation that exploits the rich linguistic annotation of gold paraphrase sets of events.