Moontae Lee

ORCID: 0000-0001-5542-3463
Research Areas
  • Topic Modeling
  • Natural Language Processing Techniques
  • Domain Adaptation and Few-Shot Learning
  • Advanced Graph Neural Networks
  • Text and Document Classification Technologies
  • Multimodal Machine Learning Applications
  • Speech and Dialogue Systems
  • Expert Finding and Q&A Systems
  • Complex Network Analysis Techniques
  • VLSI and FPGA Design Techniques
  • VLSI and Analog Circuit Testing
  • Explainable Artificial Intelligence (XAI)
  • Interconnection Networks and Systems
  • AI-based Problem Solving and Planning
  • Persona Design and Applications
  • Recommender Systems and Techniques
  • Advanced Data Processing Techniques
  • Machine Learning and Data Classification
  • Human Pose and Action Recognition
  • Advanced Text Analysis Techniques
  • Image and Video Quality Assessment
  • Advanced Manufacturing and Logistics Optimization
  • Information Systems Theories and Implementation
  • Optimization and Packing Problems
  • Mental Health Research Topics

University of Illinois Chicago
2019-2024

Decision Sciences (United States)
2022-2024

Tokyo Institute of Technology
2023

Administration for Community Living
2023

IT University of Copenhagen
2023

American Jewish Committee
2023

University of Michigan
2023

RIKEN Center for Advanced Intelligence Project
2023

Mongolia International University
2023

University of Illinois Urbana-Champaign
2022

The anchor words algorithm performs provably efficient topic model inference by finding an approximate convex hull in a high-dimensional word co-occurrence space. However, the existing greedy algorithm often selects poor anchor words, reducing topic quality and interpretability. Rather than finding an approximate convex hull in a high-dimensional space, we propose to find an exact convex hull in a visualizable 2- or 3-dimensional space. Such low-dimensional embeddings both improve topics and clearly show users why the algorithm selects certain words.

10.3115/v1/d14-1138 article EN cc-by 2014-01-01
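The geometric idea in the abstract above, replacing a greedy search in high dimensions with an exact convex hull in a visualizable space, can be sketched with a standard 2-D hull routine. This is a minimal illustration, not the paper's method: the 2-D points below are hypothetical stand-ins for projected word co-occurrence rows, and the projection step itself is omitted.

```python
def convex_hull_2d(points):
    """Andrew's monotone chain: exact convex hull of 2-D points,
    returned in counterclockwise order. In this sketch, hull
    vertices play the role of anchor-word candidates."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# Hypothetical 2-D word embeddings: four extreme words, one interior word.
embeddings = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
anchors = convex_hull_2d(embeddings)  # the interior word is excluded
```

Because the hull is exact in 2-D, a user can see at a glance which words are selected as anchors: they are exactly the extreme points of the embedding.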

Question answering tasks have shown remarkable progress with distributed vector representation. In this paper, we investigate the recently proposed Facebook bAbI tasks, which consist of twenty different categories of questions that require complex reasoning. Because previous works on the bAbI tasks are all end-to-end models, errors could come from either an imperfect understanding of semantics or a failure in certain steps of reasoning. For clearer analysis, we propose two vector space models inspired by Tensor Product Representation (TPR) to perform...

10.48550/arxiv.1511.06426 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Mixed reality (MR) devices provide real-time environments for physical-digital interactions across many domains. Owing to the unprecedented COVID-19 pandemic, MR technologies have supported new use cases in the health care industry, enabling social distancing practices to minimize the risk of contact and transmission. Despite their novelty and increasing popularity, public evaluations are sparse and often rely on feedback from users, developers, researchers, and potential buyers. The purpose of this study is an aspect-based...

10.2196/36850 article EN cc-by JMIR Serious Games 2022-06-12

In this paper we present the initial development of a general theory for mapping inference in predicate logic to computation over Tensor Product Representations (TPRs; Smolensky (1990), Smolensky & Legendre (2006)). After a brief synopsis of TPRs (Section 0), we begin with particular examples from the 'bAbI' question-answering task of Weston et al. (2015) (Section 1). We then present a simplification of the analysis that suffices for bAbI (Section 2). Finally, we lay out the general treatment (Section 3). We also show that the analysis of Section 2 derives the methods described in Lee et al. (2016); this shows how...

10.48550/arxiv.1601.02745 preprint EN other-oa arXiv (Cornell University) 2016-01-01
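The binding and unbinding mechanics of TPRs can be illustrated with a minimal numpy sketch. The role and filler names below are hypothetical examples, not taken from the paper; the key property shown is that with orthonormal role vectors, unbinding recovers each filler exactly.

```python
import numpy as np

# Minimal TPR sketch: bind filler vectors to role vectors via outer
# products and sum the bindings into a single matrix T.
rng = np.random.default_rng(1)
d = 8
fillers = {"john": rng.standard_normal(d), "milk": rng.standard_normal(d)}

# Orthonormal role vectors (columns of Q from a QR decomposition).
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
roles = {"agent": Q[:, 0], "patient": Q[:, 1]}

# Binding: T = sum_i filler_i (outer product) role_i
T = (np.outer(fillers["john"], roles["agent"])
     + np.outer(fillers["milk"], roles["patient"]))

# Unbinding: multiplying T by a role vector recovers that role's
# filler exactly, because the role vectors are orthonormal.
recovered = T @ roles["agent"]
assert np.allclose(recovered, fillers["john"])
```

With merely linearly independent (not orthonormal) roles, unbinding instead uses the dual basis of the role vectors; the orthonormal case keeps the sketch short.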

Open-domain question answering (QA) systems are often built with retrieval modules. However, retrieving passages from a given source is known to suffer from insufficient knowledge coverage. Alternatively, prompting large language models (LLMs) to generate contextual passages based on their parametric knowledge has been shown to improve QA performance. Yet, LLMs tend to "hallucinate" content that conflicts with the retrieved knowledge. Based on the intuition that answers supported by both sources are more likely to be correct, we propose COMBO,...

10.18653/v1/2023.emnlp-main.286 article EN cc-by Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023-01-01

Many online communities present user-contributed responses such as reviews of products and answers to questions. User-provided helpfulness votes can highlight the most useful responses, but voting is a social process that can gain momentum based on the popularity and polarity of existing votes. We propose the Chinese Voting Process (CVP), which models the evolution of self-reinforcing votes dependent on position and presentation biases. We evaluate this model on Amazon product reviews and more than 80 StackExchange forums, measuring the intrinsic...

10.48550/arxiv.1610.09428 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Self-correction has emerged as a promising solution to boost the reasoning performance of large language models (LLMs), where LLMs refine their solutions using self-generated critiques that pinpoint errors. This work explores whether smaller-size (<= 13B) language models (LMs) have the ability of self-correction on reasoning tasks with minimal inputs from stronger LMs. We propose a novel pipeline that prompts smaller LMs to collect data that supports the training of self-refinement abilities. First, we leverage correct solutions to guide the model in critiquing...

10.48550/arxiv.2404.17140 preprint EN arXiv (Cornell University) 2024-04-25

Interactions with billion-scale large language models typically yield long-form responses due to their extensive parametric capacities, along with retrieval-augmented features. While detailed responses provide an insightful viewpoint on a specific subject, they frequently generate redundant and less engaging content that does not meet user interests. In this work, we focus on the role of query outlining (i.e., a selected sequence of queries) in scenarios where users request a specific range of information, namely coverage-conditioned...

10.48550/arxiv.2407.01158 preprint EN arXiv (Cornell University) 2024-07-01

We introduce the EXAONE 3.0 instruction-tuned language model, the first open model in the family of Large Language Models (LLMs) developed by LG AI Research. Among different model sizes, we publicly release the 7.8B model to promote open research and innovations. Through extensive evaluations across a wide range of public and in-house benchmarks, EXAONE 3.0 demonstrates highly competitive real-world performance with instruction-following capability against other state-of-the-art open models of similar size. Our comparative analysis shows that...

10.48550/arxiv.2408.03541 preprint EN arXiv (Cornell University) 2024-08-07

Spectral inference provides fast algorithms and provable optimality for latent topic analysis. But on real data these algorithms require additional ad-hoc heuristics, and even then they often produce unusable results. We explain this poor performance by casting the problem of topic inference in the framework of Joint Stochastic Matrix Factorization (JSMF), showing that previous methods violate the theoretical conditions necessary for a good solution to exist. We propose a novel rectification method that learns high quality topics and their interactions even on...

10.48550/arxiv.1611.00175 preprint EN other-oa arXiv (Cornell University) 2016-01-01
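One plausible shape for the rectification idea above is an alternating-projection loop that pushes a noisy co-occurrence matrix toward the conditions a good factorization needs: symmetric, low-rank positive semidefinite, nonnegative, with entries summing to one (joint-stochastic). The projections and schedule below are an illustrative sketch, not the paper's exact recipe.

```python
import numpy as np

def rectify(C, rank, iters=50):
    """Alternating-projection sketch (assumed form, not the paper's
    exact algorithm): alternate projections onto symmetric PSD
    matrices of bounded rank, nonnegative matrices, and matrices
    whose entries sum to 1."""
    C = (C + C.T) / 2
    for _ in range(iters):
        # Project onto PSD matrices of rank <= `rank`:
        # keep only the top `rank` eigenvalues, clipped at zero.
        w, V = np.linalg.eigh(C)      # eigenvalues in ascending order
        w[:-rank] = 0.0
        w = np.clip(w, 0.0, None)
        C = (V * w) @ V.T
        # Project onto nonnegative matrices.
        C = np.clip(C, 0.0, None)
        # Project onto joint-stochastic matrices (entries sum to 1).
        s = C.sum()
        if s > 0:
            C /= s
    return C

rng = np.random.default_rng(2)
noisy = rng.random((6, 6)) - 0.2      # hypothetical noisy co-occurrence
R = rectify(noisy, rank=2)
```

After rectification, R satisfies the stated conditions up to numerical tolerance, which is the kind of structural repair the abstract credits for restoring usable spectral results.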

Moontae Lee, Sungjun Cho, David Bindel, David Mimno. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.

10.18653/v1/d19-1504 article EN cc-by 2019-01-01

We study few-shot reranking for multi-hop QA (MQA) with open-domain questions. To alleviate the need for a large number of labeled question-document pairs for retriever training, we propose PromptRank, which relies on language model prompting for multi-hop path reranking. PromptRank first constructs an instruction-based prompt that includes a candidate document path, and then computes the relevance score between a given question and the path based on the conditional likelihood of the question given the path prompt according to a language model. PromptRank yields strong retrieval performance on HotpotQA with only...

10.18653/v1/2023.acl-long.885 article EN cc-by 2023-01-01
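The scoring scheme described above (rank each candidate path by the likelihood a language model assigns to the question, conditioned on an instruction-plus-path prompt) can be sketched end to end. Everything here is a toy stand-in: `toy_loglik` replaces a real LM's conditional log-likelihood with an add-one-smoothed unigram model, and the instruction string is hypothetical.

```python
import math
from collections import Counter

def toy_loglik(question, context):
    """Hypothetical stand-in for an LM's log P(question | context):
    an add-one-smoothed unigram model estimated from the context."""
    counts = Counter(context.lower().split())
    total, vocab = sum(counts.values()), len(counts) + 1
    return sum(math.log((counts[w] + 1) / (total + vocab))
               for w in question.lower().split())

def prompt_rank(question, candidate_paths,
                instruction="Read the document and answer:"):
    # Build an instruction-based prompt per candidate path, score it
    # by the (toy) likelihood of the question, rank by that score.
    return sorted(candidate_paths,
                  key=lambda doc: toy_loglik(question, f"{instruction} {doc}"),
                  reverse=True)

question = "topic modeling anchor words"
paths = ["topic modeling with anchor words for text analysis",
         "a recipe for chocolate cake with sugar"]
ranked = prompt_rank(question, paths)
```

Swapping `toy_loglik` for a real LM's token-level conditional log-likelihood recovers the zero-training flavor of the approach: no labeled question-document pairs are needed to score candidates.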

Novelty detection is a fundamental task of machine learning which aims to detect abnormal ($\textit{i.e.}$ out-of-distribution (OOD)) samples. Since diffusion models have recently emerged as the de facto standard generative framework with surprising generation results, novelty detection via diffusion models has also gained much attention. Recent methods have mainly utilized the reconstruction property of in-distribution samples. However, they often suffer from detecting OOD samples that share similar background information with the in-distribution data. Based on...

10.48550/arxiv.2312.02615 preprint EN cc-by arXiv (Cornell University) 2023-01-01

In the context of multi-step reasoning, e.g., with chain-of-thought, language models (LMs) can easily assign a high likelihood to incorrect steps. As a result, decoding strategies that optimize for solution likelihood often yield incorrect solutions. To address this issue, we propose Guiding chain-of-thought ReAsoning with a CorrectnEss Discriminator (GRACE), a stepwise decoding approach that steers the decoding process towards producing correct reasoning steps. GRACE employs a discriminator trained with a contrastive loss over correct and incorrect steps, which is used during decoding to score...

10.18653/v1/2023.findings-emnlp.1022 article EN cc-by 2023-01-01
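The stepwise control loop described above can be made concrete with a small schematic: at each step, candidate next steps are scored by LM likelihood plus a weighted discriminator score, and the best candidate is appended. The scoring functions below are toy placeholders, not the paper's trained models, and the weighting scheme is an assumed simplification.

```python
def grace_decode(question, propose_steps, lm_score, disc_score,
                 max_steps=8, beta=1.0):
    """Sketch of discriminator-guided stepwise decoding: rank
    candidate next steps by LM likelihood plus beta-weighted
    discriminator correctness, append the best, repeat."""
    solution = []
    for _ in range(max_steps):
        candidates = propose_steps(question, solution)
        if not candidates:
            break
        best = max(candidates,
                   key=lambda step: lm_score(step, solution)
                                    + beta * disc_score(question, solution, step))
        solution.append(best)
    return solution

# Toy stand-ins: three steps, discriminator prefers "good" steps.
def propose_steps(q, sol):
    i = len(sol)
    return [f"good step {i}", f"bad step {i}"] if i < 3 else []

lm_score = lambda step, sol: 0.0
disc_score = lambda q, sol, step: 1.0 if step.startswith("good") else 0.0

trace = grace_decode("toy question", propose_steps, lm_score, disc_score)
```

The point of the sketch is the division of labor: the LM proposes and scores fluency, while the separately trained discriminator supplies the correctness signal the raw likelihood lacks.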

The anchor words algorithm performs provably efficient topic model inference by finding an approximate convex hull in a high-dimensional word co-occurrence space. However, the existing greedy algorithm often selects poor anchor words, reducing topic quality and interpretability. Rather than finding an approximate convex hull in a high-dimensional space, we propose to find an exact convex hull in a visualizable 2- or 3-dimensional space. Such low-dimensional embeddings both improve topics and clearly show users why the algorithm selects certain words.

10.48550/arxiv.1711.06826 preprint EN other-oa arXiv (Cornell University) 2017-01-01