NFDI4DS | UHH-SEMS - Publication Details

Duy-Cat Can

ORCID: 0000-0002-6861-2893

About

Contact & Profiles

OpenAlex ID: A5086574827

Research Areas

Topic Modeling
Natural Language Processing Techniques
Advanced Text Analysis Techniques
Sentiment Analysis and Opinion Mining
Biomedical Text Mining and Ontologies
Text and Document Classification Technologies
Speech Recognition and Synthesis
Computational and Text Analysis Methods
Information Retrieval and Search Behavior
Advanced Malware Detection Techniques
Recommender Systems and Techniques
Imbalanced Data Classification Techniques
Flood Risk Assessment and Management
Machine Learning in Healthcare
Internet Traffic Analysis and Secure E-voting
Network Security and Intrusion Detection
Meteorological Phenomena and Simulations
Hydrological Forecasting Using AI
Multimodal Machine Learning Applications
Advanced Graph Neural Networks

VNU University of Science
2018-2025

University of Engineering and Technology Lahore
2023

Vietnam National University, Hanoi
2019-2023

Nanyang Technological University
2018-2019

VisTA: Vision-Text Alignment Model with Contrastive Learning using Multimodal Data for Evidence-Driven, Reliable, and Explainable Alzheimer's Disease Diagnosis

OPENALEX - Publications

Duy-Cat Can Linh D. Dang Quang-Huy Tang Dang Minh Ly Huang Ha and 3 more

Objective: Assessing Alzheimer's disease (AD) using high-dimensional radiology images is clinically important but challenging. Although Artificial Intelligence (AI) has advanced AD diagnosis, it remains unclear how to design AI models embracing predictability and explainability. Here, we propose VisTA, a multimodal language-vision model assisted by contrastive learning, to optimize disease prediction and evidence-based, interpretable explanations for clinical decision-making. Methods: We developed VisTA...

10.48550/arxiv.2502.01535 preprint EN arXiv (Cornell University) 2025-02-03

Vista: Vision-Text Alignment Model with Contrastive Learning Using Multimodal Data for Evidence-Driven, Reliable, and Explainable Alzheimer's Disease Diagnosis

Duy-Cat Can Linh D. Dang Quang-Huy Tang Dang Minh Ly Huang Ha and 3 more

10.2139/ssrn.5149003 preprint EN 2025-01-01

Towards Effective Comparative Opinion Mining: A Novel Vietnamese Product Review Corpus and Benchmark Approach

Duy-Cat Can Khanh-Vinh Nguyen Huu Phuong Hoang D.T. Vu Mai-Vu Tran and 1 more

10.1109/access.2025.3567845 article EN cc-by IEEE Access 2025-01-01

A Richer-but-Smarter Shortest Dependency Path with Attentive Augmentation for Relation Extraction

Duy-Cat Can Hoang-Quynh Le Quang-Thuy Ha Nigel Collier

Duy-Cat Can, Hoang-Quynh Le, Quang-Thuy Ha, Nigel Collier. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.

10.18653/v1/n19-1298 article EN 2019-01-01

Large-scale Exploration of Neural Relation Classification Architectures

Hoang-Quynh Le Duy-Cat Can Tien-Sinh Vu Thanh Hai Dang Mohammad Taher Pilehvar and 1 more

Experimental performance on the task of relation classification has generally improved using deep neural network architectures. One major drawback of reported studies is that individual models have been evaluated on a very narrow range of datasets, raising questions about the adaptability of the architectures, while making comparisons between approaches difficult. In this work, we present a systematic large-scale analysis of neural relation classification architectures on six benchmark datasets with widely varying characteristics. We propose a novel...

10.18653/v1/d18-1250 article EN cc-by Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018-01-01

Improving chemical-induced disease relation extraction with learned features based on convolutional neural network

Hoang-Quynh Le Duy-Cat Can Thanh Hai Dang Mai-Vu Tran Quang-Thuy Ha and 1 more

An increasing number of machine learning-based models have been successfully proposed and applied for automatic chemical-induced disease (CID) relation extraction. They, however, usually require carefully handcrafted rich feature sets, which rely on expert knowledge and thus on expensive human labor, yet normally still cannot generalize to data well enough. In this paper, we propose a CID relation extraction model that learns features automatically through a Convolutional Neural Network (CNN)...

10.1109/kse.2017.8119474 article EN 2017-10-01

A Hybrid Deep Learning Architecture for Sentence Unit Detection

Duy-Cat Can Thi-Nga Ho Eng Siong Chng

Automatic speech recognition systems currently deliver an unpunctuated sequence of words, which is hard for humans to peruse and degrades the performance of downstream natural language processing tasks. In this paper, we propose a hybrid approach for Sentence Unit Detection, which focuses on adding full stops [.] to unstructured text. Our model profits from the advantages of two dominant deep learning architectures: (i) the ability to learn long dependencies in both directions of a bidirectional Long Short-Term Memory; (ii) the...

10.1109/ialp.2018.8629178 article EN 2018-11-01

UETrice at MEDIQA 2021: A Prosper-thy-neighbour Extractive Multi-document Summarization Model

Duy-Cat Can Quoc-An Nguyen Quoc-Hung Duong Minh‐Quang Nguyen Huy-Son Nguyen and 3 more

Duy-Cat Can, Quoc-An Nguyen, Quoc-Hung Duong, Minh-Quang Nguyen, Huy-Son Nguyen, Linh Nguyen Tran Ngoc, Quang-Thuy Ha, Mai-Vu Tran. Proceedings of the 20th Workshop on Biomedical Language Processing. 2021.

10.18653/v1/2021.bionlp-1.36 article EN cc-by 2021-01-01

Exploiting document graphs for inter sentence relation extraction

Hoang-Quynh Le Duy-Cat Can Nigel Collier

Most previous relation extraction (RE) studies have focused on intra-sentence relations and ignored relations that span sentences, i.e. inter-sentence relations. Such relations connect entities at the document level rather than as relational facts in a single sentence. Extracting facts that are expressed across sentences leads to some challenges and requires different approaches than those usually applied in recent intra-sentence relation extraction. Despite recent results, there are still limitations to be overcome. We present a novel representation for a sequence of consecutive...

10.1186/s13326-022-00267-3 article EN cc-by Journal of Biomedical Semantics 2022-06-03

FOREcaST: Improving Extreme Weather Forecasts with Deep Neural Decision Forest for Climate Change Adaptation

Khanh-Vinh Nguyen Quoc-An Nguyen Hoang-Quynh Le Duy-Cat Can

Climate change poses significant challenges for society, particularly in mitigating the impacts of extreme weather events. Accurate and timely forecasts of such phenomena are crucial for effective adaptation strategies to cope with disasters and minimize serious consequences. This paper presents a hybrid approach, FOREcaST, for enhanced forecasting, which leverages the power of deep neural networks and decision forests to improve prediction accuracy. Specifically, we proposed using Deep Neural Decision Forest regression...

10.1109/kse59128.2023.10299427 article EN 2023-10-18

Contrastive Hierarchical Discourse Graph for Vietnamese Extractive Multi-Document Summarization

Tu-Phuong Mai Quoc-An Nguyen Duy-Cat Can Hoang-Quynh Le

Extractive Multi-Document Summarization (EMDS) plays a pivotal role in distilling information from multiple sources, enabling efficient knowledge synthesis and document retrieval. However, achieving high-quality EMDS, particularly in languages with unique linguistic characteristics such as Vietnamese, remains a challenge. In this paper, we adapt the Contrastive Hierarchical Discourse Graph (CHDG), a novel approach designed to address these challenges. CHDG operates at multiple levels, including sentence,...

10.1109/ialp61005.2023.10337087 article EN 2023-11-18

Overview of the VLSP 2023 -- ComOM Shared Task: A Data Challenge for Comparative Opinion Mining from Vietnamese Product Reviews

Hoang-Quynh Le Duy-Cat Can Khanh-Vinh Nguyen Mai-Vu Tran

This paper presents a comprehensive overview of the Comparative Opinion Mining from Vietnamese Product Reviews shared task (ComOM), held as part of the 10th International Workshop on Vietnamese Language and Speech Processing (VLSP 2023). The primary objective of this shared task is to advance the field of natural language processing by developing techniques that proficiently extract comparative opinions from Vietnamese product reviews. Participants are challenged to propose models that adeptly extract a comparative "quintuple" from a comparative sentence, encompassing Subject, Object,...

10.48550/arxiv.2402.13613 preprint EN arXiv (Cornell University) 2024-02-21

Aspect-Based Sentiment Analysis Using Mini-Window Locating Attention for Vietnamese E-commerce Reviews

Binh Le-Minh Thi-Phuong Le Khanh-Hung Tran Khanh-Huyen Bui Hoang-Quynh Le and 3 more

This article illustrates a system developed to tackle aspect-based sentiment classification for Vietnamese e-commerce reviews. We employ supervised learning models based on deep learning and multiple classic classifiers such as Random Forest, Decision Tree, Support Vector Machine, etc. to sort out the model that performs best with our dataset. Our method obtained a maximum Micro-Average and Macro-Average performance of 95%. Furthermore, we present how a manually-annotated multi-aspect dataset in...

10.1109/kse53942.2021.9648637 article EN 2021-11-10

A Hybrid Multi-answer Summarization Model for the Biomedical Question-Answering System

Quoc-An Nguyen Quoc-Hung Duong Minh‐Quang Nguyen Huy-Son Nguyen Hoang-Quynh Le and 3 more

In natural language processing, text summarization is a difficult problem that always attracts attention from the research community, especially when working on biomedical text data, which lacks supporting tools and techniques. In this scientific report, we propose a multi-document summarization model for the responses in a biomedical question answering system. Our model includes components that combine many advanced techniques as well as some improved methods proposed by the authors. We present the techniques applied to two main approaches: an extractive...

10.1109/kse53942.2021.9648640 article EN 2021-11-10

HHMC: A Heterogeneous x Homogeneous Graph-Based Network for Multimodal Cross-Selling Recommendation

Huy-Son Nguyen Tuan-Nghia Bui Long-Hai Nguyen Duy-Cat Can Cam-Van Thi Nguyen and 2 more

Currently, research models that effectively predict cross-selling products while utilizing multimodal data sources are limited, and similarly, models focusing on multimodal recommendation do not adequately address cross-selling. To address this gap, our study introduces the model HHMC: A Heterogeneous x Homogeneous Graph-based Network for Multimodal Cross-Selling Recommendation. This innovative approach leverages the diverse data of historical orders to recommend cross-selling products. The architecture of HHMC is thoughtfully designed...

10.1109/kse59128.2023.10299431 article EN 2023-10-18

An Investigation of Word Embeddings with Deep Bidirectional LSTM for Sentence Unit Detection in Automatic Speech Transcription

Thi-Nga Ho Duy-Cat Can Eng Siong Chng

This work investigates the effectiveness of using word-based and sub-word-based embedding representations as input for a deep bidirectional Long Short-Term Memory network for Sentence Unit Detection (SUD) in Automatic Speech Recognition transcription. Our experimental results show that sub-word-based embedding can significantly improve SUD performance when limited text is used to train both models. The model gains up to 2.07% absolute improvement in F1-score compared to the best model trained with word-based embedding. When tested on domain-mismatch...

10.1109/ialp.2018.8629114 article EN 2018-11-01

THANOS: The Aspect Classification Model for Imbalanced Vietnamese E-commerce Review Data

Minh-Son Cong Tuan-Minh Dam Anh-My Phuong Thi-Quyen Nguyen Duy-Cat Can and 3 more

This article introduces several methods for aspect classification using machine learning models on data collected from customers' reviews on two e-commerce sites and mainly focuses on handling data imbalance to improve classifier performance. To this end, we describe the problem as a binary classification problem at the sentence level. Sentences are expressed as feature vectors using one-hot encoding combined with Chi-square statistics, and we use basic machine learning models such as Naive Bayes, SVM, Random Forest, and Linear Regression for training before classifying aspects...

10.1109/ialp54817.2021.9675163 article EN 2021-12-11

QASA

Trang Minh Nguyen Van-Lien Tran Duy-Cat Can Quang-Thuy Ha Ly Vu and 1 more

For information consumers, being able to obtain a short and accurate answer for a query is one of the most desirable features. This motivation, along with the rise of deep learning, has led to a boom in open-domain Question Answering (QA) research. While the problem of machine comprehension has received multiple successes with the help of large training corpora and the emergence of the attention mechanism, the development of document retrieval in open-domain QA has lagged behind. In this work, we propose a novel encoding method for learning question-aware self-attentive...

10.1145/3310986.3310999 article EN 2019-01-25

VLSP 2022 Abmusu Task Dataset: A Resource for Vietnamese Abstractive Multi-Document Summarization

Quoc-An Nguyen Duy-Cat Can Hoang-Quynh Le Mai-Vu Tran

The performance of automatic summarization systems has improved significantly with the development of supervised approaches. However, in the Vietnamese abstractive multi-document summarization task, the available datasets are insufficient for training a model. With this motivation, we contribute a new gold standard dataset, named Abmusu. After collecting and clustering articles, we have built a hierarchical annotation process to generate summaries, with three roles: annotator, supervisor, and curator. As a result, the dataset...

10.1142/s2717554523500030 article EN International Journal of Asian Language Processing 2022-09-01

Attention-Based Deep Learning Model for Aspect Classification on Vietnamese E-commerce Data

Tu N. Nguyen Trong-Dat Nguyen Duy-Cat Can Mai-Vu Tran Ha Manh and 1 more

This article introduces methods for applying Deep Learning to identify aspects from written comments on the Shopee e-commerce site. The datasets used are two sets of Vietnamese consumers' comments about purchased products in two domains. Words and sentences are represented as vectors or feature matrices through language models such as one-hot, fastText, and PhoBERT. We then use a Convolutional Neural Network (CNN) and a fully connected network (multilayer perceptron, MLP) to learn which aspects are mentioned in the comments...

10.1109/kse53942.2021.9648690 article EN 2021-11-10

Enhancing Clinical Note Generation from Doctor-Patient Conversations through Semantic Partition-Oriented Summarization

Binh-Nguyen Nguyen Hoang-Quynh Le Duy-Cat Can

Clinical note generation from doctor-patient conversations is an essential task that helps to maintain the medical records of patients. The process of writing clinical notes is a time-consuming task for doctors, and redundant or inaccurate information in clinical notes may have adverse consequences. In this paper, we propose a novel approach for clinical note generation. Our contribution lies in proposing a semantic partitioning and clustering method for the extractive module of the task. We show that our semantic-based partition provided a way to extract relevant...

10.1109/kse59128.2023.10299512 article EN 2023-10-18
