Jaekyeom Kim

ORCID: 0000-0003-4538-8398
Research Areas
  • Natural Language Processing Techniques
  • Reinforcement Learning in Robotics
  • Adversarial Robustness in Machine Learning
  • Domain Adaptation and Few-Shot Learning
  • Topic Modeling
  • Advanced Neural Network Applications
  • Semantic Web and Ontologies
  • Model Reduction and Neural Networks
  • Generative Adversarial Networks and Image Synthesis
  • Advanced Memory and Neural Computing
  • Service-Oriented Architecture and Web Services
  • Music and Audio Processing
  • Business Process Modeling and Analysis
  • Industrial Vision Systems and Defect Detection
  • Machine Learning and Data Classification
  • Robot Manipulation and Learning
  • Color Science and Applications
  • Evolutionary Algorithms and Applications
  • Artificial Intelligence in Games
  • Human Pose and Action Recognition
  • Model-Driven Software Engineering Techniques
  • Geophysical Methods and Applications
  • Multi-Agent Systems and Negotiation
  • Surface Roughness and Optical Measurements
  • Neural Networks and Applications

Seoul National University
2019-2021

Reinforcement learning algorithms struggle when the reward signal is very sparse. In these cases, naive random exploration methods essentially rely on a random walk to stumble onto a rewarding state. Recent works utilize intrinsic motivation to guide the exploration via generative models, predictive forward models, or discriminative modeling of novelty. We propose EMI, which is an exploration method that constructs embedding representations of states and actions that do not rely on generative decoding of the full observation, but extract predictive signals that can be used for exploration based on prediction...

10.48550/arxiv.1810.01176 preprint EN other-oa arXiv (Cornell University) 2018-01-01
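The embedding-space exploration bonus described in the abstract above can be sketched roughly as follows. This is not the paper's algorithm; the embedding maps, the linear forward model, and all shapes are illustrative stand-ins for learned networks, and the bonus is simply the forward-prediction error in the embedding space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions and random stand-ins for learned embedding networks.
d_state, d_action, d_embed = 8, 2, 4
phi = rng.normal(size=(d_state, d_embed))   # state embedding
psi = rng.normal(size=(d_action, d_embed))  # action embedding
M = rng.normal(size=(d_embed, d_embed))     # linear forward model in embedding space

def intrinsic_reward(s, a, s_next):
    """Prediction error of the linear embedding dynamics, used as an exploration bonus."""
    pred = s @ phi + (a @ psi) @ M          # predicted next-state embedding
    return np.linalg.norm(s_next @ phi - pred)

s = rng.normal(size=d_state)
a = rng.normal(size=d_action)
s_next = rng.normal(size=d_state)
bonus = intrinsic_reward(s, a, s_next)      # non-negative exploration bonus
```

States whose transitions are poorly predicted in the compact embedding space receive a larger bonus, steering exploration toward them without ever decoding full observations.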

10.18653/v1/2024.findings-acl.924 article EN Findings of the Association for Computational Linguistics: ACL 2024 2024-01-01

We propose a novel information bottleneck (IB) method named Drop-Bottleneck, which discretely drops features that are irrelevant to the target variable. Drop-Bottleneck not only enjoys a simple and tractable compression objective but also additionally provides a deterministic compressed representation of the input variable, which is useful for inference tasks that require consistent representations. Moreover, it can jointly learn a feature extractor and select features considering each feature dimension's relevance to the target task, which is unattainable...

10.48550/arxiv.2103.12300 preprint EN other-oa arXiv (Cornell University) 2021-01-01
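The discrete feature dropping described above can be sketched minimally as follows. This is an illustration, not the paper's implementation: the per-dimension drop probabilities `p` would be learned jointly with the feature extractor, and only the stochastic/deterministic dropping pattern is shown.

```python
import numpy as np

# Toy per-dimension drop probabilities p_i (learned in the actual method;
# fixed here for illustration).
p = np.array([0.9, 0.1, 0.8, 0.05])

def stochastic_compress(x, rng):
    """Training time: each feature dimension is independently dropped with prob p_i."""
    keep = rng.random(p.shape) >= p
    return x * keep

def deterministic_compress(x):
    """Inference time: keep exactly the dimensions whose keep probability (1 - p_i)
    exceeds 0.5, giving the same representation on every run."""
    return x * (p < 0.5)

x = np.array([1.0, 2.0, 3.0, 4.0])
z = deterministic_compress(x)  # dimensions 0 and 2 are dropped -> [0., 2., 0., 4.]
```

The deterministic variant is what makes the compressed representation consistent across inference calls, unlike sampling-based IB compressors.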

We study the problem of unsupervised skill discovery, whose goal is to learn a set of diverse and useful skills with no external reward. There have been a number of skill discovery methods based on maximizing the mutual information (MI) between skills and states. However, we point out that their MI objectives usually prefer static skills to dynamic ones, which may hinder the application for downstream tasks. To address this issue, we propose Lipschitz-constrained Skill Discovery (LSD), which encourages the agent to discover more diverse, dynamic,...

10.48550/arxiv.2202.00914 preprint EN other-oa arXiv (Cornell University) 2022-01-01
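The Lipschitz-constrained objective sketched in the abstract can be illustrated as follows. This is a toy sketch, not the paper's training code: `phi` is a random linear stand-in for the learned state representation, and the spectral cap is a crude way to enforce the 1-Lipschitz constraint.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the learned representation phi; constrained to be 1-Lipschitz
# (a linear map is 1-Lipschitz iff its spectral norm is at most 1).
W = rng.normal(size=(6, 3))
W /= max(np.linalg.norm(W, 2), 1.0)  # cap the spectral norm at 1

def phi(s):
    return s @ W

def lsd_reward(s, s_next, z):
    """Reward movement of the representation along the skill vector z.
    With phi Lipschitz-bounded, large rewards require genuinely large
    state changes, favoring dynamic over static skills."""
    return (phi(s_next) - phi(s)) @ z

s = rng.normal(size=6)
s_next = rng.normal(size=6)
z = np.array([1.0, 0.0, 0.0])  # a sampled skill vector
r = lsd_reward(s, s_next, z)
```

A skill that leaves the state unchanged earns zero reward under this objective, which is exactly how the Lipschitz constraint penalizes static behavior.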

In reinforcement learning, continuous time is often discretized by a time scale $\delta$, to which the resulting performance is known to be highly sensitive. In this work, we seek to find a $\delta$-invariant algorithm for policy gradient (PG) methods, which performs well regardless of the value of $\delta$. We first identify the underlying reasons that cause PG methods to fail as $\delta \to 0$, proving that the variance of the PG estimator can diverge to infinity in stochastic environments under a certain assumption of stochasticity. While durative...

10.48550/arxiv.2111.03941 preprint EN other-oa arXiv (Cornell University) 2021-01-01
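One standard route to $\delta$-invariance, which the abstract alludes to with "durative" actions, is to hold each selected action for a fixed amount of physical time rather than a fixed number of simulation steps. The sketch below is illustrative only (it is not the paper's algorithm, and the environment and all parameters are made up); it shows that the return of such a rollout is nearly unchanged as the simulation step $\delta$ shrinks.

```python
def rollout(policy, env_step, s0, delta, horizon_sec=1.0, decision_sec=0.1):
    """Repeat each action for decision_sec of physical time,
    i.e. decision_sec / delta simulation steps."""
    repeat = max(1, round(decision_sec / delta))
    steps = round(horizon_sec / delta)
    s, total_reward, t = s0, 0.0, 0
    while t < steps:
        a = policy(s)
        for _ in range(min(repeat, steps - t)):
            s, r = env_step(s, a, delta)
            total_reward += r
            t += 1
    return s, total_reward

# Toy deterministic integrator: ds = a * delta, per-step reward = -|s| * delta.
env_step = lambda s, a, delta: (s + a * delta, -abs(s) * delta)
policy = lambda s: -1.0 if s > 0 else 1.0

_, r_coarse = rollout(policy, env_step, 1.0, delta=0.01)
_, r_fine = rollout(policy, env_step, 1.0, delta=0.001)
# r_coarse and r_fine agree to within the O(delta) discretization error.
```

Because the number of decisions per second stays constant, refining $\delta$ only refines the integration of the same trajectory instead of multiplying the number of noisy policy decisions.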

The primary limitation of large language models (LLMs) is their restricted understanding of the world. This poses significant difficulties for LLM-based agents, particularly in domains where pre-trained LLMs lack sufficient knowledge. In this paper, we introduce a novel framework, called AutoGuide, that bridges the knowledge gap by leveraging implicit knowledge in offline experiences. Specifically, AutoGuide effectively extracts the knowledge embedded in offline data by deriving a set of state-aware guidelines. Importantly, each guideline...

10.48550/arxiv.2403.08978 preprint EN arXiv (Cornell University) 2024-03-13

Self-correction has emerged as a promising solution to boost the reasoning performance of large language models (LLMs), where LLMs refine their solutions using self-generated critiques that pinpoint the errors. This work explores whether smaller-size (<= 13B) language models (LMs) have the ability of self-correction on reasoning tasks with minimal inputs from stronger LMs. We propose a novel pipeline that prompts smaller LMs to collect data that supports the training of self-refinement abilities. First, we leverage correct solutions to guide the model in critiquing...

10.48550/arxiv.2404.17140 preprint EN arXiv (Cornell University) 2024-04-25

In this paper, we introduce Auto-Intent, a method to adapt a pre-trained large language model (LLM) as an agent for a target domain without direct fine-tuning, where we empirically focus on web navigation tasks. Our approach first discovers the underlying intents from demonstrations unsupervisedly, in a highly compact form (up to three words). With the extracted intents, we train our intent predictor to predict the next intent given the agent's past observations and actions. In particular, we propose a self-exploration approach where the top-k probable...

10.48550/arxiv.2410.22552 preprint EN arXiv (Cornell University) 2024-10-29

Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making, but often struggle with complex, long-horizon planning tasks. Recent techniques have sought to structure LLM outputs using control flow and other code-adjacent techniques to improve performance. These include variables (to track important information) and functions (to divide complex tasks into smaller re-usable sub-tasks). However, purely code-based approaches can be error-prone and insufficient for...

10.48550/arxiv.2411.13826 preprint EN arXiv (Cornell University) 2024-11-20

Having the ability to acquire inherent skills from environments without any external rewards or supervision like humans is an important problem. We propose a novel unsupervised skill discovery method named Information Bottleneck Option Learning (IBOL). On top of a linearization of environments that promotes more various and distant state transitions, IBOL enables the discovery of diverse skills. It provides the abstraction of the skills learned with the information bottleneck framework for the options, with improved stability and encouraged disentanglement....

10.48550/arxiv.2106.14305 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Many display manufacturers have studied the RGBW pixel structure, which adds a white sub-pixel to the RGB LCD, and recently revealed UHD TVs based on the novel LCD. The RGBW LCD has 50% higher luminance and 25% lower primary-color luminance compared with the RGB LCD. In this paper, the image quality of the RGBW LCD was dealt with. Before evaluating it, TV broadcast videos and IEC-62087 videos were analyzed to select test clips. In order to analyze the reference colors, videos from broadcast content in Korea were firstly collected. As a result of the analysis, the image quality is expected to improve because most colors are distributed around...

10.1117/12.2077071 article EN Proceedings of SPIE, the International Society for Optical Engineering/Proceedings of SPIE 2015-02-08