Duy-Cat Can

ORCID: 0000-0002-6861-2893
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Text Analysis Techniques
  • Sentiment Analysis and Opinion Mining
  • Biomedical Text Mining and Ontologies
  • Text and Document Classification Technologies
  • Speech Recognition and Synthesis
  • Computational and Text Analysis Methods
  • Information Retrieval and Search Behavior
  • Advanced Malware Detection Techniques
  • Recommender Systems and Techniques
  • Imbalanced Data Classification Techniques
  • Flood Risk Assessment and Management
  • Machine Learning in Healthcare
  • Internet Traffic Analysis and Secure E-voting
  • Network Security and Intrusion Detection
  • Meteorological Phenomena and Simulations
  • Hydrological Forecasting Using AI
  • Multimodal Machine Learning Applications
  • Advanced Graph Neural Networks

VNU University of Science
2018-2025

University of Engineering and Technology Lahore
2023

Vietnam National University, Hanoi
2019-2023

Nanyang Technological University
2018-2019

Objective: Assessing Alzheimer's disease (AD) using high-dimensional radiology images is clinically important but challenging. Although Artificial Intelligence (AI) has advanced AD diagnosis, it remains unclear how to design AI models that embrace both predictability and explainability. Here, we propose VisTA, a multimodal language-vision model assisted by contrastive learning, to optimize prediction and evidence-based, interpretable explanations for clinical decision-making. Methods: We developed VisTA...

10.48550/arxiv.2502.01535 preprint EN arXiv (Cornell University) 2025-02-03

Duy-Cat Can, Hoang-Quynh Le, Quang-Thuy Ha, Nigel Collier. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1298 article EN 2019-01-01

Experimental performance on the task of relation classification has generally improved using deep neural network architectures. One major drawback of reported studies is that individual models have been evaluated on a very narrow range of datasets, raising questions about the adaptability of the architectures, while making comparisons between approaches difficult. In this work, we present a systematic large-scale analysis of architectures on six benchmark datasets with widely varying characteristics. We propose a novel...

10.18653/v1/d18-1250 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

There has been an increasing number of machine learning-based models successfully proposed and applied to automatic chemical-induced disease (CID) relation extraction. However, they usually require carefully handcrafted, rich feature sets that rely on expert knowledge and thus on expensive human labor, yet normally still cannot generalize to data well enough. In this paper, we propose a CID extraction model that learns features automatically through a Convolutional Neural Network (CNN)...

10.1109/kse.2017.8119474 article EN 2017-10-01

Automatic speech recognition systems currently deliver an unpunctuated sequence of words, which is hard for humans to peruse and degrades the performance of downstream natural language processing tasks. In this paper, we propose a hybrid approach for Sentence Unit Detection, in which we focus on adding the full stop [.] to unstructured text. Our model profits from the advantages of two dominant deep learning architectures: (i) the ability to learn long dependencies in both directions of a bidirectional Long Short-Term Memory; (ii) the...

10.1109/ialp.2018.8629178 article EN 2018-11-01

Duy-Cat Can, Quoc-An Nguyen, Quoc-Hung Duong, Minh-Quang Nguyen, Huy-Son Nguyen, Linh Nguyen Tran Ngoc, Quang-Thuy Ha, Mai-Vu Tran. Proceedings of the 20th Workshop on Biomedical Language Processing. 2021.

10.18653/v1/2021.bionlp-1.36 article EN cc-by 2021-01-01

Most previous relation extraction (RE) studies have focused on intra-sentence relations and ignored relations that span sentences, i.e. inter-sentence relations. Such relations connect entities at the document level rather than as relational facts in a single sentence. Extracting relations that are expressed across sentences leads to some challenges and requires different approaches from those usually applied in recent intra-sentence relation extraction. Despite recent results, there are still limitations to be overcome. We present a novel representation for a sequence of consecutive...

10.1186/s13326-022-00267-3 article EN cc-by Journal of Biomedical Semantics 2022-06-03

Climate change poses significant challenges for society, particularly in mitigating the impacts of extreme weather events. Accurate and timely forecasts of these phenomena are crucial for effective adaptation strategies to cope with disasters and minimize serious consequences. This paper presents a hybrid approach, FOREcaST, for enhanced forecasting, which leverages the power of deep neural networks and decision forests to improve prediction accuracy. Specifically, we proposed using a Deep Neural Decision Forest regression...

10.1109/kse59128.2023.10299427 article EN 2023-10-18

Extractive Multi-Document Summarization (EMDS) plays a pivotal role in distilling information from multiple sources, enabling efficient knowledge synthesis and document retrieval. However, achieving high-quality EMDS, particularly in languages with unique linguistic characteristics such as Vietnamese, remains a challenge. In this paper, we adapt the Contrastive Hierarchical Discourse Graph (CHDG), a novel approach designed to address these challenges. CHDG operates at multiple levels, including sentence,...

10.1109/ialp61005.2023.10337087 article EN 2023-11-18

This paper presents a comprehensive overview of the Comparative Opinion Mining from Vietnamese Product Reviews shared task (ComOM), held as part of the 10th International Workshop on Vietnamese Language and Speech Processing (VLSP 2023). The primary objective of this shared task is to advance the field of natural language processing by developing techniques that proficiently extract comparative opinions from Vietnamese product reviews. Participants are challenged to propose models that adeptly extract a comparative "quintuple" from a sentence, encompassing Subject, Object,...

10.48550/arxiv.2402.13613 preprint EN arXiv (Cornell University) 2024-02-21

This article illustrates a system developed to tackle aspect-based sentiment classification for Vietnamese e-commerce reviews. We employ supervised learning models based on Deep Learning and multiple classic classifiers, such as Random Forest, Decision Tree, and Support Vector Machine, to find the model that performs best on our dataset. Our method obtained a maximum Micro-Average and Macro-Average performance of 95%. Furthermore, we present how a manually annotated multi-aspect dataset in...

10.1109/kse53942.2021.9648637 article EN 2021-11-10

In natural language processing, text summarization is a difficult problem that always attracts attention from the research community, especially when working on biomedical data, which lacks supporting tools and techniques. In this scientific report, we propose a multi-document summarization model for responses in a question-answering system. Our model includes components that combine many advanced techniques as well as some improved methods proposed by the authors. We present the model as applied to two main approaches: an extractive...

10.1109/kse53942.2021.9648640 article EN 2021-11-10

Currently, research on models that effectively predict cross-selling products while utilizing multimodal data sources is limited, and similarly, models focusing on recommendation do not adequately address cross-selling. To address this gap, our study introduces the model HHMC: A Heterogeneous x Homogeneous Graph-based Network for Multimodal Cross-Selling Recommendation. This innovative approach leverages the diverse data of historical orders to recommend products. The architecture of HHMC is thoughtfully designed...

10.1109/kse59128.2023.10299431 article EN 2023-10-18

This work investigates the effectiveness of using word-based and sub-word-based embedding representations as input for a deep bidirectional Long Short-Term Memory network for Sentence Unit Detection (SUD) in Automatic Speech Recognition transcription. Our experimental results show that sub-word-based embedding can significantly improve SUD performance when limited text is used to train both models. The model gains up to 2.07% absolute improvement in F1-score compared to the best model trained with word-based embedding. When tested on domain-mismatch...

10.1109/ialp.2018.8629114 article EN 2018-11-01

This article introduces several methods for aspect classification using machine learning models on data collected from customers' reviews on two e-commerce sites, and mainly focuses on handling data imbalance to improve classifier performance. To this end, we describe the problem as a binary classification at the sentence level. Sentences are expressed as feature vectors using One-hot encoding combined with Chi-square statistics, and we use basic models such as Naive Bayes, SVM, Random Forest, and Linear Regression for training before classifying aspects...

10.1109/ialp54817.2021.9675163 article EN 2021-12-11

For information consumers, being able to obtain a short and accurate answer for a query is one of the most desirable features. This motivation, along with the rise of deep learning, has led to a boom in open-domain Question Answering (QA) research. While the problem of machine comprehension has seen multiple successes with the help of large training corpora and the emergence of the attention mechanism, the development of document retrieval in open-domain QA has lagged behind. In this work, we propose a novel encoding method for learning question-aware self-attentive...

10.1145/3310986.3310999 article EN 2019-01-25

The performance of automatic summarization systems has improved significantly with the development of supervised approaches. However, in the Vietnamese abstractive multi-document summarization task, available datasets are insufficient for training a model. With this motivation, we contribute a new gold standard dataset, named Abmusu. After collecting and clustering articles, we have built a hierarchical annotation process to generate summaries, with three roles: annotator, supervisor, and curator. As a result, the dataset...

10.1142/s2717554523500030 article EN International Journal of Asian Language Processing 2022-09-01

This article introduces methods for applying Deep Learning to identify aspects from written comments on Shopee e-commerce sites. The datasets used are two sets of Vietnamese consumers' comments about purchased products in two domains. Words and sentences are represented as vectors, or characteristic matrices, through language models such as one-hot, fastText, and PhoBERT. We then use a Convolutional Neural Network (CNN) and a Fully Connected network (Multilayer Perceptron, MLP) to learn which aspects are mentioned in the comments....

10.1109/kse53942.2021.9648690 article EN 2021-11-10

Clinical note generation from doctor-patient conversations is an essential task that helps to maintain the medical records of patients. The process of writing clinical notes is time-consuming for doctors, and redundant or inaccurate information in the notes may have adverse consequences. In this paper, we propose a novel approach for clinical note generation. Our contribution lies in proposing a semantic partitioning and clustering method for the extractive module of this task. We show that our semantic-based partition provided a way to extract relevant...

10.1109/kse59128.2023.10299512 article EN 2023-10-18