Minseung Kim

ORCID: 0000-0002-2270-9382
Research Areas
  • Speech and Audio Processing
  • Advanced Adaptive Filtering Techniques
  • Hearing Loss and Rehabilitation
  • Speech Recognition and Synthesis
  • Robotic Mechanisms and Dynamics
  • Image Processing Techniques and Applications
  • Advanced Data Compression Techniques
  • Prosthetics and Rehabilitation Robotics
  • Advanced Vision and Imaging
  • Robotic Locomotion and Control
  • Advanced Neural Network Applications
  • Autonomous Vehicle Technology and Safety
  • Direction-of-Arrival Estimation Techniques
  • Advanced Image Processing Techniques

University of Ulsan
2023-2024

Gwangju Institute of Science and Technology
2022-2024

Illinois Institute of Technology
2007

We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment to responsible AI. The model is evaluated across various benchmarks, including comprehensive reasoning, knowledge, commonsense, factuality, coding,...

10.48550/arxiv.2404.01954 preprint EN arXiv (Cornell University) 2024-04-02

Speech enhancement based on statistical models has been studied for several decades. Recently, speech enhancement adopting a speech power spectral density (PSD) uncertainty model has been proposed. This approach distinguishes the true speech PSD from its estimate and considers both as random variables. It incorporates the prior distribution of the speech PSD into the speech spectra estimators to derive uncertainty-aware counterparts of the conventional clean speech estimators, which results in performance improvement. However, speech PSD uncertainty has not yet been adopted in parameter estimations such...

10.1109/taslp.2022.3180676 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2022-01-01

A multichannel speech enhancement system usually consists of spatial filters such as adaptive beamformers followed by postfilters, which suppress the remaining noise. Accurate estimation of the power spectral density (PSD) of the residual noise is crucial for successful noise reduction in postfilters. In this paper, we propose a postfilter utilizing the proposed a posteriori speech presence probability (SPP) and PSD estimators, which are based on both coherence and statistical models. We model the coherence-based SPP as a simple function...

10.3390/s24123979 article EN cc-by Sensors 2024-06-19

The interchannel phase difference (IPD) may be one of the most widely used spatial cues in multichannel speech processing, and has been used in beamformers and post filters for speech enhancement. The coherence, which is also used as a feature for speech enhancement, can provide information on the reliability of the IPD estimation and the speech presence probability (SPP). In this paper, we propose dual microphone speech enhancement adopting a posteriori...

10.1109/taslp.2022.3202121 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2022-01-01

Online multi-microphone speech enhancement aims to extract target speech from multiple noisy inputs by exploiting the spatial information as well as the spectro-temporal characteristics with low latency. Acoustic parameters such as the acoustic transfer function and the speech and noise spatial covariance matrices (SCMs) should be estimated in a causal manner to enable online estimation of the clean speech spectra. In this paper, we propose an improved estimator for the speech SCM, which can be parameterized with the speech power spectral density (PSD) and relative transfer function (RTF)....

10.3390/s23010111 article EN cc-by Sensors 2022-12-22

The Conformer has shown impressive performance for speech enhancement by exploiting the local and global contextual information, although it requires high computational complexity and many parameters. Recently, multi-layer perceptron (MLP)-based models such as MLP-mixer and gMLP have demonstrated comparable performance with much less computational complexity in the computer vision area. These models showed that all-MLP architectures may perform as well as more advanced structures, but the nature of the MLP limits the application of these models to input with a...

10.1109/access.2022.3221440 article EN cc-by-nc-nd IEEE Access 2022-01-01

When creating a deep learning model for estimating the depth of images, constructing a training dataset using stereo images presents a significant challenge. Therefore, monocular depth estimation provides numerous benefits in terms of dataset acquisition. Monodepth2 is one of the prominent techniques for monocular depth estimation. By employing a self-supervised approach, Monodepth2 eliminates the need for ground truth, making dataset acquisition much easier. Nonetheless, a challenge faced by Monodepth2 is the issue of blurred boundaries in the output depth maps. To address this concern, this paper proposes...

10.1109/iwis58789.2023.10284651 article EN 2023-08-09

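This paper proposes the IP-INLMS algorithm for acoustic echo cancellation. The proposed algorithm combines the concept of the IP-NLMS algorithm, which speeds up filter convergence, with the INLMS algorithm, which adjusts the step size so that it operates robustly even when a near-end signal other than the echo of the far-end signal is present. In echo cancellation experiments, the proposed method showed improved performance over the existing algorithms in terms of both misalignment and convergence speed.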

10.7840/kics.2020.45.2.444 article KO The Journal of Korean Institute of Communications and Information Sciences 2020-02-25