Jaywon Koo

ORCID: 0000-0002-5539-5244
Research Areas
  • Interactive and Immersive Displays
  • Computer Graphics and Visualization Techniques
  • Teaching and Learning Programming
  • Topic Modeling
  • Genomics and Rare Diseases
  • Natural Language Processing Techniques
  • Simulation and Modeling Applications
  • Genetics, Bioinformatics, and Biomedical Research
  • Data Quality and Management
  • Video Surveillance and Tracking Methods
  • Speech and Audio Processing
  • Semantic Web and Ontologies
  • Biomedical and Engineering Education
  • Multimodal Machine Learning Applications
  • Gait Recognition and Analysis
  • Video Analysis and Summarization
  • Anomaly Detection Techniques and Applications

Columbia University
2024

Ewha Womans University
2020

In this work we investigate the optimal selection and fusion of features across multiple modalities and combine these in a neural network to improve emotion detection. We compare different fusion methods and examine the impact of multi-loss training within the multi-modality fusion network, identifying useful findings relating to subnet performance. Our best model achieves state-of-the-art performance for three datasets (CMU-MOSI, CMU-MOSEI and CH-SIMS), and outperforms the other methods in most metrics. We have found that training on multimodal features improves single...

10.48550/arxiv.2308.00264 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Events describe happenings in our world that are of importance. Naturally, understanding events mentioned in multimedia content and how they are related forms an important way of comprehending our world. Existing literature can infer if events across textual and visual (video) domains are identical (via grounding) and thus, on the same semantic level. However, grounding fails to capture the intricate cross-event relations that exist due to the same events being referred to on many semantic levels. For example, the abstract event of "war" manifests at a lower semantic level through...

10.1609/aaai.v38i16.29718 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

Visual Programming has emerged as an alternative to end-to-end black-box visual reasoning models. This type of methods leverage Large Language Models (LLMs) to decompose a problem and generate the source code for an executable computer program. This strategy has the advantage of offering an interpretable reasoning path and does not require finetuning a model with task-specific data. We propose PropTest, a general strategy that improves visual programming by further using an LLM to generate code that tests for visual properties in an initial round of proposed solutions. Particularly, our...

10.48550/arxiv.2403.16921 preprint EN arXiv (Cornell University) 2024-03-25

This paper describes a community effort to improve earlier versions of the full-text corpus of Genomics & Informatics by semi-automatically detecting and correcting PDF-to-text conversion errors and optical character recognition errors during the first hackathon of the Genomics & Informatics Annotation Hackathon (GIAH) event. Extracting text from multi-column biomedical documents such as Genomics & Informatics is known to be notoriously difficult. The hackathon was piloted as part of a coding competition of the ELTEC College of Engineering at Ewha Womans University in order to enable...

10.5808/gi.2020.18.3.e33 article EN Genomics & Informatics 2020-09-29

Events describe happenings in our world that are of importance. Naturally, understanding events mentioned in multimedia content and how they are related forms an important way of comprehending our world. Existing literature can infer if events across textual and visual (video) domains are identical (via grounding) and thus, on the same semantic level. However, grounding fails to capture the intricate cross-event relations that exist due to the same events being referred to on many semantic levels. For example, in Figure 1, the abstract event of "war" manifests at a lower...

10.48550/arxiv.2206.07207 preprint EN cc-by-nc-sa arXiv (Cornell University) 2022-01-01