Jaekyeom Kim

ORCID: 0000-0003-4538-8398
Research Areas
  • Natural Language Processing Techniques
  • Reinforcement Learning in Robotics
  • Adversarial Robustness in Machine Learning
  • Domain Adaptation and Few-Shot Learning
  • Topic Modeling
  • Advanced Neural Network Applications
  • Semantic Web and Ontologies
  • Model Reduction and Neural Networks
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Memory and Neural Computing
  • Service-Oriented Architecture and Web Services
  • Music and Audio Processing
  • Business Process Modeling and Analysis
  • Industrial Vision Systems and Defect Detection
  • Machine Learning and Data Classification
  • Robot Manipulation and Learning
  • Color Science and Applications
  • Evolutionary Algorithms and Applications
  • Artificial Intelligence in Games
  • Human Pose and Action Recognition
  • Model-Driven Software Engineering Techniques
  • Geophysical Methods and Applications
  • Multi-Agent Systems and Negotiation
  • Surface Roughness and Optical Measurements
  • Neural Networks and Applications

Seoul National University
2019-2021

Reinforcement learning algorithms struggle when the reward signal is very sparse. In these cases, naive random exploration methods essentially rely on a random walk to stumble onto a rewarding state. Recent works utilize intrinsic motivation to guide the exploration via generative models, predictive forward models, or discriminative modeling of novelty. We propose EMI, which is an exploration method that constructs embedding representations of states and actions that do not rely on generative decoding of the full observation, but extract predictive signals that can be used for exploration based on prediction...

10.48550/arxiv.1810.01176 preprint EN other-oa arXiv (Cornell University) 2018-01-01
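The embedding-space exploration bonus described in the abstract above can be sketched roughly as follows. This is not the paper's algorithm; the embedding maps, the linear forward model, and all shapes are illustrative stand-ins for learned networks, and the bonus is simply the forward-prediction error in the embedding space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions and random stand-ins for learned embedding networks.
d_state, d_action, d_embed = 8, 2, 4
phi = rng.normal(size=(d_state, d_embed))   # state embedding
psi = rng.normal(size=(d_action, d_embed))  # action embedding
M = rng.normal(size=(d_embed, d_embed))     # linear forward model in embedding space

def intrinsic_reward(s, a, s_next):
    """Prediction error of the linear embedding dynamics, used as an exploration bonus."""
    pred = s @ phi + (a @ psi) @ M          # predicted next-state embedding
    return np.linalg.norm(s_next @ phi - pred)

s = rng.normal(size=d_state)
a = rng.normal(size=d_action)
s_next = rng.normal(size=d_state)
bonus = intrinsic_reward(s, a, s_next)      # non-negative exploration bonus
```

States whose transitions are poorly predicted in the compact embedding space receive a larger bonus, steering exploration toward them without ever decoding full observations.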

10.18653/v1/2024.findings-acl.924 article EN Findings of the Association for Computational Linguistics: ACL 2024 2024-01-01

We propose a novel information bottleneck (IB) method named Drop-Bottleneck, which discretely drops features that are irrelevant to the target variable. Drop-Bottleneck not only enjoys a simple and tractable compression objective but also additionally provides a deterministic compressed representation of the input variable, which is useful for inference tasks that require consistent representations. Moreover, it can jointly learn a feature extractor and select features considering each feature dimension's relevance to the target task, which is unattainable...

10.48550/arxiv.2103.12300 preprint EN other-oa arXiv (Cornell University) 2021-01-01
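The discrete feature dropping described above can be sketched minimally as follows. This is an illustration, not the paper's implementation: the per-dimension drop probabilities `p` would be learned jointly with the feature extractor, and only the stochastic/deterministic dropping pattern is shown.

```python
import numpy as np

# Toy per-dimension drop probabilities p_i (learned in the actual method;
# fixed here for illustration).
p = np.array([0.9, 0.1, 0.8, 0.05])

def stochastic_compress(x, rng):
    """Training time: each feature dimension is independently dropped with prob p_i."""
    keep = rng.random(p.shape) >= p
    return x * keep

def deterministic_compress(x):
    """Inference time: keep exactly the dimensions whose keep probability (1 - p_i)
    exceeds 0.5, giving the same representation on every run."""
    return x * (p < 0.5)

x = np.array([1.0, 2.0, 3.0, 4.0])
z = deterministic_compress(x)  # dimensions 0 and 2 are dropped -> [0., 2., 0., 4.]
```

The deterministic variant is what makes the compressed representation consistent across inference calls, unlike sampling-based IB compressors.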

We study the problem of unsupervised skill discovery, whose goal is to learn a set of diverse and useful skills with no external reward. There have been a number of skill discovery methods based on maximizing the mutual information (MI) between skills and states. However, we point out that their MI objectives usually prefer static skills to dynamic ones, which may hinder the application for downstream tasks. To address this issue, we propose Lipschitz-constrained Skill Discovery (LSD), which encourages the agent to discover more diverse, dynamic,...

10.48550/arxiv.2202.00914 preprint EN other-oa arXiv (Cornell University) 2022-01-01
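The Lipschitz-constrained objective sketched in the abstract can be illustrated as follows. This is a toy sketch, not the paper's training code: `phi` is a random linear stand-in for the learned state representation, and the spectral cap is a crude way to enforce the 1-Lipschitz constraint.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the learned representation phi; constrained to be 1-Lipschitz
# (a linear map is 1-Lipschitz iff its spectral norm is at most 1).
W = rng.normal(size=(6, 3))
W /= max(np.linalg.norm(W, 2), 1.0)  # cap the spectral norm at 1

def phi(s):
    return s @ W

def lsd_reward(s, s_next, z):
    """Reward movement of the representation along the skill vector z.
    With phi Lipschitz-bounded, large rewards require genuinely large
    state changes, favoring dynamic over static skills."""
    return (phi(s_next) - phi(s)) @ z

s = rng.normal(size=6)
s_next = rng.normal(size=6)
z = np.array([1.0, 0.0, 0.0])  # a sampled skill vector
r = lsd_reward(s, s_next, z)
```

A skill that leaves the state unchanged earns zero reward under this objective, which is exactly how the Lipschitz constraint penalizes static behavior.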

In reinforcement learning, continuous time is often discretized by a time scale $\delta$, to which the resulting performance is known to be highly sensitive. In this work, we seek to find a $\delta$-invariant algorithm for policy gradient (PG) methods, which performs well regardless of the value of $\delta$. We first identify the underlying reasons that cause PG methods to fail as $\delta \to 0$, proving that the variance of the PG estimator can diverge to infinity in stochastic environments under a certain assumption of stochasticity. While durative...

10.48550/arxiv.2111.03941 preprint EN other-oa arXiv (Cornell University) 2021-01-01
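One standard route to $\delta$-invariance, which the abstract alludes to with "durative" actions, is to hold each selected action for a fixed amount of physical time rather than a fixed number of simulation steps. The sketch below is illustrative only (it is not the paper's algorithm, and the environment and all parameters are made up); it shows that the return of such a rollout is nearly unchanged as the simulation step $\delta$ shrinks.

```python
def rollout(policy, env_step, s0, delta, horizon_sec=1.0, decision_sec=0.1):
    """Repeat each action for decision_sec of physical time,
    i.e. decision_sec / delta simulation steps."""
    repeat = max(1, round(decision_sec / delta))
    steps = round(horizon_sec / delta)
    s, total_reward, t = s0, 0.0, 0
    while t < steps:
        a = policy(s)
        for _ in range(min(repeat, steps - t)):
            s, r = env_step(s, a, delta)
            total_reward += r
            t += 1
    return s, total_reward

# Toy deterministic integrator: ds = a * delta, per-step reward = -|s| * delta.
env_step = lambda s, a, delta: (s + a * delta, -abs(s) * delta)
policy = lambda s: -1.0 if s > 0 else 1.0

_, r_coarse = rollout(policy, env_step, 1.0, delta=0.01)
_, r_fine = rollout(policy, env_step, 1.0, delta=0.001)
# r_coarse and r_fine agree to within the O(delta) discretization error.
```

Because the number of decisions per second stays constant, refining $\delta$ only refines the integration of the same trajectory instead of multiplying the number of noisy policy decisions.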

The primary limitation of large language models (LLMs) is their restricted understanding of the world. This poses significant difficulties for LLM-based agents, particularly in domains where pre-trained LLMs lack sufficient knowledge. In this paper, we introduce a novel framework, called AutoGuide, that bridges the knowledge gap by leveraging implicit knowledge in offline experiences. Specifically, AutoGuide effectively extracts the knowledge embedded in offline data by deriving a set of state-aware guidelines. Importantly, each guideline...

10.48550/arxiv.2403.08978 preprint EN arXiv (Cornell University) 2024-03-13

Self-correction has emerged as a promising solution to boost the reasoning performance of large language models (LLMs), where LLMs refine their solutions using self-generated critiques that pinpoint the errors. This work explores whether smaller-size (<= 13B) language models (LMs) have the ability of self-correction on reasoning tasks with minimal inputs from stronger LMs. We propose a novel pipeline that prompts smaller LMs to collect data that supports the training of self-refinement abilities. First, we leverage correct solutions to guide the model in critiquing...

10.48550/arxiv.2404.17140 preprint EN arXiv (Cornell University) 2024-04-25

In this paper, we introduce Auto-Intent, a method to adapt a pre-trained large language model (LLM) as an agent for a target domain without direct fine-tuning, where we empirically focus on web navigation tasks. Our approach first discovers the underlying intents from demonstrations unsupervisedly, in a highly compact form (up to three words). With the extracted intents, we train our intent predictor to predict the next intent given the agent's past observations and actions. In particular, we propose a self-exploration approach where the top-k probable...

10.48550/arxiv.2410.22552 preprint EN arXiv (Cornell University) 2024-10-29

Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making, but often struggle with complex, long-horizon planning tasks. Recent techniques have sought to structure LLM outputs using control flow and other code-adjacent techniques to improve performance. These include variables (to track important information) and functions (to divide complex tasks into smaller re-usable sub-tasks). However, purely code-based approaches can be error-prone and insufficient for...

10.48550/arxiv.2411.13826 preprint EN arXiv (Cornell University) 2024-11-20

Having the ability to acquire inherent skills from environments without any external rewards or supervision like humans is an important problem. We propose a novel unsupervised skill discovery method named Information Bottleneck Option Learning (IBOL). On top of a linearization of environments that promotes more various and distant state transitions, IBOL enables the discovery of diverse skills. It provides the abstraction of the skills learned with the information bottleneck framework for the options, with improved stability and encouraged disentanglement....

10.48550/arxiv.2106.14305 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Many display manufacturers have studied the RGBW pixel structure, which adds a white sub-pixel to the RGB LCD, and recently revealed UHD TVs based on the novel LCD. The RGBW LCD has 50% higher luminance and 25% lower primary-color luminance compared with the RGB LCD. In this paper, the image quality of the RGBW LCD was dealt with. Before evaluating it, TV broadcast videos and IEC-62087 videos were analyzed to select test clips. In order to analyze the reference colors, videos from broadcast content in Korea were firstly collected. As a result of the analysis, the image quality is expected to improve because most colors are distributed around...

10.1117/12.2077071 article EN Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE 2015-02-08