- Topic Modeling
- Natural Language Processing Techniques
- Advanced Adaptive Filtering Techniques
- Blind Source Separation Techniques
- Video Analysis and Summarization
- Speech and dialogue systems
- Neural Networks and Applications
- Face recognition and analysis
- Tactile and Sensory Interactions
- Music and Audio Processing
- Advanced Vision and Imaging
- Human Motion and Animation
- Multimodal Machine Learning Applications
- Speech and Audio Processing
- Spectroscopy Techniques in Biomedical and Chemical Research
- Biosensors and Analytical Detection
- Interactive and Immersive Displays
- Video Surveillance and Tracking Methods
- Image and Signal Denoising Methods
- Control Systems and Identification
- Multimedia Communication and Technology
- Domain Adaptation and Few-Shot Learning
- Video Coding and Compression Technologies
- Text and Document Classification Technologies
- Human Pose and Action Recognition
Kongju National University
2024
Kyung Hee University
2021-2024
Chungbuk National University
2024
Samsung (South Korea)
2010-2023
Seoul National University
2023
Allen Institute for Artificial Intelligence
2023
National University
2023
University of Washington
2023
Inje University Busan Paik Hospital
2022
Samsung (United States)
2021
The current evaluation protocol of long-tailed visual recognition trains the classification model on the long-tailed source label distribution and evaluates its performance on a uniform target label distribution. Such a protocol has questionable practicality since the target may also be long-tailed. Therefore, we formulate long-tailed visual recognition as a label shift problem where the target and source label distributions are different. One of the significant hurdles in dealing with the label shift problem is the entanglement between the source label distribution and the model prediction. In this paper, we focus on disentangling the source label distribution from the model prediction. We first introduce a simple but overlooked...
Accurate facial landmarks are essential prerequisites for many tasks related to human faces. In this paper, an accurate facial landmark detector is proposed based on cascaded transformers. We formulate facial landmark detection as a coordinate regression task such that the model can be trained end-to-end. With self-attention in transformers, our model can inherently exploit the structured relationships between landmarks, which would benefit landmark detection under challenging conditions such as large pose and occlusion. During cascaded refinement, our model is able to extract...
Measuring, recording, and analyzing the spectral information of materials as their unique fingerprint using a ubiquitous smartphone has long been desired by scientists and consumers. We demonstrate this as drug classification by chemical components with a smartphone Raman spectrometer. The spectrometer is based on a CMOS image sensor with a periodic array of band-pass filters, capturing a 2D Raman spectral intensity map, newly defined as a spectral barcode in this work. Here we show that 11 major drugs are classified with high accuracy, 99.0%, with the aid of a convolutional neural...
On account of growing demands for personalization, the need for a so-called few-shot TTS system that clones speakers with only a few data is emerging. To address this issue, we propose Attentron, a few-shot TTS model that clones voices of speakers unseen during training. It introduces two special encoders, each serving different purposes. A fine-grained encoder extracts variable-length style information via an attention mechanism, and a coarse-grained encoder greatly stabilizes the speech synthesis, circumventing unintelligible gibberish even...
With the continual expansion of face datasets, feature-based distillation prevails for large-scale face recognition. In this work, we attempt to remove identity supervision in student training, to spare the GPU memory from saving massive class centers. However, this naive removal leads to an inferior distillation result. We carefully inspect the performance degradation from the perspective of intrinsic dimension, and argue that the gap in intrinsic dimension, namely the intrinsic gap, is intimately connected to the infamous capacity gap problem. By constraining the teacher's search space with...
Neural networks are often prone to bias toward spurious correlations inherent in a dataset, thus failing to generalize to unbiased test criteria. A key challenge in resolving the issue is the significant lack of bias-conflicting training data (i.e., samples without spurious correlations). In this paper, we propose a novel data augmentation approach termed Bias-Adversarial augmentation (BiasAdv) that supplements bias-conflicting samples with adversarial images. Our key idea is that an adversarial attack on a biased model that makes decisions based on spurious correlations may generate synthetic bias-conflicting samples, which...
Despite the remarkable performance of deep models on image recognition tasks, they are known to be susceptible to common corruptions such as blur, noise, and low resolution. Data augmentation is a conventional way to build a robust model by considering these common corruptions during training. However, a naive data augmentation scheme may result in a model that is not specialized for particular corruptions, as the model tends to learn the averaged distribution among corruptions. To mitigate the issue, we propose a new paradigm of training deep image recognition networks that produce clean-like...
Scene text recognition (STR) has attracted much attention over the years because of its wide application. Most methods train the STR model in a fully supervised manner, which requires large amounts of labeled data. Although synthetic data contributes a lot to STR, it suffers from the real-to-synthetic domain gap that restricts model performance. In this work, we aim to boost STR models by leveraging both synthetic data and the numerous real unlabeled images, exempting human annotation cost thoroughly. A robust consistency regularization based...
While interacting with mobile devices, users may press against touch screens and also exert tangential force to the display in a sliding manner. We seek to guide UI design based on the tangential force applied by a user to the surface of a hand-held device. A prototype of an interface using tangential force input was implemented utilizing a force-sensitive layer and an elastic layer, and was used for a user experiment. We investigated user controllability to reach and maintain target force levels and considered the effects of hand pose and direction of force input. Our results imply no significant difference in performance when...
Silicon nanowires (SiNWs) are emerging as versatile components in the fabrication of sensors for implantable medical devices because of their exceptional electrical, optical, and mechanical properties. This paper presents a novel top-down fabrication method for vertically stacked SiNWs, eliminating the need for wet oxidation, etching, and nanolithography. The integration of these SiNWs into body channel communication (BCC) circuits was also explored. The fabricated SiNWs were confirmed to be capable of forming arrays with multiple layers...
Seungju Han, Beomsu Kim, Jin Yong Yoo, Seokjun Seo, Sangbum Kim, Enkhbayar Erdenee, Buru Chang. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022.
In retinal prosthetic systems based on multi-channel microelectrodes, to effectively stimulate retinal neurons, the electrode-electrolyte interface impedance of a microelectrode should be minimized to drive a sufficiently large current at a given supply voltage. This paper presents the fabrication of a nanostructured microelectrode array with simplified fabrication and its characteristic evaluation using a biphasic current stimulator. Nanostructured microelectrodes with base diameters of 25 μm, 50 μm, and 75 μm are fabricated, and the maximum allowable current injection limits are measured to verify the estimated injection limit. Also,...
Visual information is central to conversation: body gestures and physical behaviour, for example, contribute to meaning that transcends words alone. To date, however, most neural conversational models are limited to just text. We introduce Champagne, a generative model of conversations that can account for visual contexts. To train Champagne, we collect and release YTD-18M, a large-scale corpus of 18M video-based dialogues. YTD-18M is constructed from web videos: crucial to our data collection pipeline is a pretrained language model that converts...
The goal of multi-label learning is to predict multiple labels for each single instance. This is a challenging problem because of limited training data, a long-tail label distribution, and complicated label correlations. Generally, more training samples and label correlation knowledge would benefit the learning performance. However, it is difficult to obtain large-scale well-labeled datasets, and building such a label correlation map requires sophisticated semantic knowledge. To this end, we propose an end-to-end Generative Correlation Discovery Network (GCDN) method...
Deep learning algorithms require large amounts of labeled data for effective performance, but the presence of noisy labels often significantly degrades their performance. Although recent studies on designing an objective function that is robust to label noise, known as the robust loss method, have shown promising results for learning with noisy labels, they suffer from the issue of underfitting not only noisy samples but also clean ones, leading to suboptimal model performance. To address this issue, we propose a novel learning framework that selectively suppresses noisy samples while...