Luan Thanh Nguyen

ORCID: 0000-0003-4882-8336
Research Areas
  • Hate Speech and Cyberbullying Detection
  • Topic Modeling
  • Sentiment Analysis and Opinion Mining
  • Natural Language Processing Techniques
  • Text and Document Classification Technologies
  • Spam and Phishing Detection
  • Software Engineering Research
  • Names, Identity, and Discrimination Research
  • RNA Research and Splicing
  • Engineering Applied Research
  • Facial Rejuvenation and Surgery Techniques
  • Innovations in Concrete and Construction Materials
  • Metallurgy and Cultural Artifacts
  • Advanced biosensing and bioanalysis techniques
  • Authorship Attribution and Profiling
  • Photovoltaic Systems and Sustainability
  • Advanced Machining and Optimization Techniques
  • Biopolymer Synthesis and Applications
  • Expert finding and Q&A systems
  • Aluminum Alloys Composites Properties
  • Lipid metabolism and biosynthesis
  • Additive Manufacturing and 3D Printing Technologies
  • Polymer Science and PVC
  • IoT and GPS-based Vehicle Safety Systems
  • Garlic and Onion Studies

Japan Advanced Institute of Science and Technology
2023-2025

University of Maryland, Baltimore County
2022-2024

Touro University California
2022

University of Information Technology
2022

Vietnam National University Ho Chi Minh City
2021

Ho Chi Minh City Medicine and Pharmacy University
2018

Institut des Matériaux Jean Rouxel
2015

Centre National de la Recherche Scientifique
2015

Délégation Paris 5
2010

Université Paris Cité
2010

Speech-to-speech translation (S2ST) has emerged as a practical solution for overcoming linguistic barriers, enabling direct translation between spoken languages without relying on intermediate text representations. However, existing S2ST systems face significant challenges, including the requirement for extensive parallel speech data and the limitations of known written languages. This paper proposes ZeST, a novel zero-resourced approach to speech-to-speech translation that addresses the challenges of processing unknown, unpaired,...

10.1109/access.2025.3527012 article EN cc-by IEEE Access 2025-01-01

One of the emerging research trends in natural language understanding is machine reading comprehension (MRC), which is the task of finding answers to human questions based on textual data. Existing Vietnamese datasets for MRC concentrate solely on answerable questions. However, in reality, questions can be unanswerable when the correct answer is not stated in the given text. To address this weakness, we provide the community with a benchmark dataset named UIT-ViQuAD 2.0 for evaluating MRC and question answering systems for the Vietnamese language. We use it as the challenge dataset at the Eighth...

10.25073/2588-1086/vnucsce.340 article EN VNU Journal of Science Computer Science and Communication Engineering 2022-12-16

The synthesis of poly(L-glutamic acid) (PG) was investigated. Reduction of poly(benzyl-L-glutamate) by the palladium/charcoal catalyst proved to be an effective method for obtaining pure polyglutamic acid, notably one exhibiting an α-helix secondary structure. The structure of this synthetic polypeptide was assessed by infrared spectroscopy, gel permeation chromatography, proton nuclear magnetic resonance, temperature-modulated differential scanning calorimetry, and wide-angle powder X-ray diffraction methods....

10.1590/1980-5373-mr-2020-0321 article EN cc-by Materials Research 2021-01-01

The increase in toxic comments in online spaces is causing tremendous effects on other vulnerable users. For this reason, considerable efforts are being made to deal with this, and SemEval-2021 Task 5: Toxic Spans Detection is one of those. This task asks competitors to extract the spans that have toxicity from the given texts, and we have done several analyses to understand its structure before doing experiments. We solve this task by two approaches, Named Entity Recognition with spaCy's library and Question-Answering with RoBERTa combining...

10.18653/v1/2021.semeval-1.125 article EN cc-by Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021) 2021-01-01

Recent advancements in hate speech detection (HSD) in Vietnamese have made significant progress, primarily attributed to the emergence of transformer-based pre-trained language models, particularly those built on the BERT architecture. However, the necessity for specialized fine-tuned models has resulted in complexity and fragmentation in developing a multitasking HSD system. Moreover, most current methodologies focus on fine-tuning general models trained on formal textual datasets like Wikipedia, which may not...

10.48550/arxiv.2405.14141 preprint EN arXiv (Cornell University) 2024-05-22

Human liver-type phosphofructokinase 1 (PFKL) has been shown to regulate glucose flux as a scaffolder arranging glycolytic and gluconeogenic enzymes into a multienzyme metabolic condensate, the glucosome. However, it has remained elusive how the phase separation of PFKL is governed and initiates glucosome formation in living cells, thus hampering efforts to understand the mechanism and its functional contribution in human cells. In this work, we developed a stochastic model in silico using the principle of Langevin dynamics to investigate...

10.1038/s41598-024-69534-w article EN cc-by-nc-nd Scientific Reports 2024-08-16

10.18653/v1/2024.findings-acl.355 article EN Findings of the Association for Computational Linguistics: ACL 2024 2024-01-01

Vietnamese, a low-resource language, is typically categorized into three primary dialect groups that belong to Northern, Central, and Southern Vietnam. However, each province within these regions exhibits its own distinct pronunciation variations. Despite the existence of various speech recognition datasets, none of them has provided a fine-grained classification of the 63 dialects specific to the individual provinces of Vietnam. To address this gap, we introduce the Vietnamese Multi-Dialect (ViMD) dataset, a novel...

10.48550/arxiv.2410.03458 preprint EN arXiv (Cornell University) 2024-10-04

10.18653/v1/2024.emnlp-main.426 article Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024-01-01

Question answering (QA) systems have gained explosive attention in recent years. However, QA tasks in Vietnamese do not have many datasets. Notably, there is almost no dataset in the medical domain. Therefore, we built a Vietnamese Healthcare Question Answering dataset (ViHealthQA), including 10,015 question-answer passage pairs for this task, in which questions from health-interested users were asked on prestigious health websites and answers came from highly qualified experts. This paper proposes a two-stage QA system based on Sentence-BERT...

10.48550/arxiv.2206.09600 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Text classification is a typical natural language processing or computational linguistics task with various interesting applications. As the number of users on social media platforms increases, data acceleration promotes emerging studies on Social Media Text Classification (SMTC) and text mining on these valuable resources. In contrast to English, Vietnamese, one of the low-resource languages, is still not concentrated on and exploited thoroughly. Inspired by the success of GLUE, we introduce the Social Media Text Classification Evaluation (SMTCE) benchmark, as...

10.48550/arxiv.2209.10482 preprint EN other-oa arXiv (Cornell University) 2022-01-01

A high power level with even-order suppression over the broadband amplifier is presented. In this study, multi-octave bandwidth is achieved by the use of a distributed architecture. Besides, the presence of a balun provides the even-order suppression properties. The design is fabricated using a 450 nm Gallium Nitride (GaN) process with a bare-die size of 4.5 mm × 3 mm. EM simulated results in the 2-6 GHz band show that the PA exhibits a minimum linear gain of 10.8 dB and a saturated output power of 44.6 dBm (28 W). The 2nd harmonic rejection is at least -33 dBc.

10.1109/isee51682.2021.9418689 article EN 2021-04-15