Junyoung Son

ORCID: 0000-0002-4142-6927
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Advanced Text Analysis Techniques
  • Machine Learning in Materials Science
  • Intellectual Property and Patents
  • Speech and dialogue systems
  • Technology and Data Analysis
  • Explainable Artificial Intelligence (XAI)
  • Data Quality and Management

Korea University
2021-2023

Tokenization is a significant primary step for the training of a Pre-trained Language Model (PLM), which alleviates the challenging Out-of-Vocabulary problem in the area of Natural Language Processing. As tokenization strategies can change linguistic understanding, it is essential to consider the composition of input features based on the characteristics of the language for model performance. This study answers the question "Which strategy enhances the Korean Named Entity Recognition (NER) task model?" focusing on tokenization, significantly...

10.1109/access.2021.3126882 article EN cc-by IEEE Access 2021-01-01

Patents provide inventors exclusive rights to their inventions by protecting intellectual property rights. However, analyzing patent documents generally requires knowledge of various fields, considerable human labor, and expertise. Recent studies that alleviate this problem in patent analysis deal only with the claims and abstract parts, neglecting the descriptions that contain essential technical cores. Moreover, few studies use a deep learning approach to handle the entire process, including preprocessing, summarization,...

10.1109/access.2022.3176877 article EN cc-by IEEE Access 2022-01-01

Dialogue relation extraction identifies semantic relations between entity pairs in dialogues. This research explores a methodology harnessing the potential of prompt-based fine-tuning paired with a trigger-generation approach. Capitalizing on the intrinsic knowledge of pre-trained language models, this strategy employs triggers that underline the entities decisively. In particular, diverging from the conventional extractive methods seen in earlier research, our study leans towards a generative manner for trigger...

10.3390/app132212414 article EN cc-by Applied Sciences 2023-11-16

Despite the striking advances in recent language generation performance, model-generated responses have suffered from the chronic problem of hallucinations that are either untrue or unfaithful to a given source. Especially in the task of knowledge grounded conversation, the models are required to generate informative responses, but hallucinated utterances lead to miscommunication. In particular, entity-level hallucination that causes critical misinformation and undesirable conversation is one of the major concerns. To address this...

10.48550/arxiv.2406.10809 preprint EN arXiv (Cornell University) 2024-06-16

The cross-document relation extraction (CodRED) task aims to infer the relation between two entities mentioned in different documents within a reasoning path. Previous studies have concentrated on merely capturing implicit relations between the entities. However, humans usually utilize explicit information chains such as hyperlinks or additional searches to find the relation. Inspired by this, we propose Path wIth expLOraTion (PILOT) that provides an enhanced path by exploring clue documents. PILOT finds the bridging entities which directly guide...

10.18653/v1/2023.findings-emnlp.450 article EN cc-by 2023-01-01

Yoonna Jang, Suhyune Son, Jeongwoo Lee, Junyoung Son, Yuna Hur, Jungwoo Lim, Hyeonseok Moon, Kisu Yang, Heuiseok Lim. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.

10.18653/v1/2023.emnlp-main.295 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01