Jong Won Shin

ORCID: 0000-0002-8910-0264
Research Areas
  • Speech and Audio Processing
  • Advanced Adaptive Filtering Techniques
  • Speech Recognition and Synthesis
  • Music and Audio Processing
  • Blind Source Separation Techniques
  • Hearing Loss and Rehabilitation
  • Metal complexes synthesis and properties
  • Metal-Organic Frameworks: Synthesis and Applications
  • Emotion and Mood Recognition
  • Molecular Sensors and Ion Detection
  • Crystal structures of chemical compounds
  • Crystallography and molecular interactions
  • Advanced Data Compression Techniques
  • Magnetism in coordination complexes
  • Image and Signal Denoising Methods
  • Metal-Catalyzed Oxygenation Mechanisms
  • Acoustic Wave Phenomena Research
  • Indoor and Outdoor Localization Technologies
  • Neural dynamics and brain function
  • Natural Language Processing Techniques
  • Sentiment Analysis and Opinion Mining
  • Luminescence and Fluorescent Materials
  • Advanced Nanomaterials in Catalysis
  • Gaussian Processes and Bayesian Inference
  • Diverse Topics in Contemporary Research

Gwangju Institute of Science and Technology
2015-2025

Kookmin University
2025

Korea Institute of Science & Technology Information
2016-2020

Kyung Hee University
2017

Kyungpook National University
2009-2015

Seoul National University
2004-2009

Seoul National University of Science and Technology
2008

University of California, Santa Barbara
2006

Seoul Media Institute of Technology
2005

The feasibility of a high-speed pattern recognition system using a 1k-bit cross-point synaptic RRAM array and a CMOS-based neuron chip has been experimentally demonstrated. The learning capability of a neuromorphic system comprising RRAM synapses and CMOS neurons has been confirmed experimentally for the first time.

10.1109/iedm.2012.6479016 article EN International Electron Devices Meeting 2012-12-01

A synthetic approach to highly efficient thermally activated delayed fluorescence (TADF) is proposed that uses ortho donor (D)–acceptor (A) compounds (PXZoB, DPAoB, and CzoB), wherein the acceptor is based on triarylboron and the donor is phenoxazine (PXZ), diphenylamine (DPA), or carbazole (Cz). Combined with the D–A connectivity, the bulky nature of the acceptor endows the dyads with inherent steric "locking" for a twisted arrangement, leading to a small energy difference between the singlet and triplet excited states (ΔEST) and thus exhibiting very efficient TADF...

10.1021/acsami.7b05615 article EN ACS Applied Materials & Interfaces 2017-06-27

The performance of most of the classical sound source localization algorithms degrades seriously in the presence of background noise or reverberation. Recently, deep neural networks (DNNs) have successfully been applied to sound source localization, which mainly aim to classify the direction-of-arrival (DoA) into one of the candidate sectors. In this paper, we propose a DNN-based phase difference enhancement for DoA estimation, which turned out to be better than the direct estimation of DoAs from the input interchannel phase differences (IPDs)....

10.1109/taslp.2019.2919378 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2019-05-28

In this letter, we propose a new statistical model, the two-sided generalized gamma distribution (GΓD), for an efficient parametric characterization of speech spectra. The GΓD forms a class of distributions, including the Gaussian, Laplacian, and Gamma probability density functions (pdfs) as special cases. We also propose a computationally inexpensive online maximum likelihood (ML) parameter estimation algorithm for the GΓD. Likelihoods, coefficients of variation (CVs), and Kolmogorov-Smirnov (KS) tests...

10.1109/lsp.2004.840869 article EN IEEE Signal Processing Letters 2005-02-22
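
As a rough illustration of the two-sided generalized gamma family named in the abstract above, the sketch below evaluates one common parameterization of the density, f(x) = γ β^η / (2 Γ(η)) · |x|^(ηγ−1) · exp(−β |x|^γ). The parameterization, the function name `two_sided_gen_gamma_pdf`, and the parameter names are assumptions made for illustration; this listing does not give the formula. Under this assumed form, the Gaussian, Laplacian, and Gamma pdfs correspond to (γ, η) = (2, 1/2), (1, 1), and (1, 1/2).

```python
import math


def two_sided_gen_gamma_pdf(x, gamma_, eta, beta):
    """Density of a two-sided generalized gamma distribution.

    Assumed parameterization (not taken from this listing):
        f(x) = gamma_ * beta**eta / (2 * Gamma(eta))
               * |x|**(eta * gamma_ - 1) * exp(-beta * |x|**gamma_)
    with gamma_ > 0, eta > 0, beta > 0.
    """
    # The density diverges at the origin when eta * gamma_ < 1.
    if x == 0.0 and eta * gamma_ < 1.0:
        return math.inf
    norm = gamma_ * beta ** eta / (2.0 * math.gamma(eta))
    ax = abs(x)
    return norm * ax ** (eta * gamma_ - 1.0) * math.exp(-beta * ax ** gamma_)


if __name__ == "__main__":
    sigma = 1.3  # hypothetical standard deviation for the Gaussian case
    beta = 1.0 / (2.0 * sigma ** 2)
    # Gaussian special case: gamma_ = 2, eta = 1/2
    print(two_sided_gen_gamma_pdf(0.7, 2.0, 0.5, beta))
    # Laplacian special case: gamma_ = 1, eta = 1
    print(two_sided_gen_gamma_pdf(0.7, 1.0, 1.0, 2.0))
```

Keeping the shape parameters explicit makes it easy to compare the special cases against a fitted intermediate shape on the same data.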

This letter presents a speech enhancement technique combining statistical models and non-negative matrix factorization (NMF) with on-line update of noise bases. The model-based methods have been known to be less effective with non-stationary noises, while the template-based techniques can deal with them quite well. However, the template-based techniques usually rely on a priori information. To overcome the shortcomings of both approaches, we propose a novel method that combines the statistical model-based scheme with the NMF-based gain function. For better performance in...

10.1109/lsp.2014.2362556 article EN IEEE Signal Processing Letters 2014-10-09

In this paper, we propose a novel emotion recognition method to reflect affect salient information using acoustic and lexical features. The acoustic features are extracted from the speech signal by applying statistical functionals of emotionally high-level features derived from a Deep Neural Network (DNN). These features are early fused with two types of lexical features extracted from the text transcription of the speech signal, which are a distributed representation and affective lexicon-based dimensions. The fused features are fed to another DNN for utterance-level classification. Experimental results on...

10.1109/icassp.2019.8683077 article EN ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019-04-16

10.1109/icassp49660.2025.10888274 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

In this letter, we propose a novel approach to voice activity detection (VAD) based on the modified maximum a posteriori (MAP) criterion conditioned on the voice activity decision made in the previous frame. To exploit the inter-frame correlation of voice activity, the probability of voice presence conditioned on both the observed spectrum and the decision in the previous frame is employed instead of the conventional strategy that depends only on the current observation. The proposed conditional MAP criterion incorporating the temporal correlations leads to two separate thresholds for the likelihood ratio test (LRT)...

10.1109/lsp.2008.917027 article EN IEEE Signal Processing Letters 2008-01-01
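
The abstract above states that the conditional MAP criterion leads to two separate thresholds for the likelihood ratio test. A minimal sketch of that idea follows, assuming the threshold applied to each frame is simply selected by the decision made in the previous frame. The function name, the threshold values, and the initial state are hypothetical, and the computation of the likelihood ratios themselves is not shown.

```python
def conditional_map_vad(likelihood_ratios,
                        thr_after_noise=2.0,
                        thr_after_speech=0.5,
                        initial_speech=False):
    """Frame-wise VAD with a threshold that depends on the previous decision.

    likelihood_ratios: per-frame likelihood ratios (speech vs. noise).
    thr_after_noise:   hypothetical threshold used when the previous frame
                       was declared noise.
    thr_after_speech:  hypothetical threshold used when the previous frame
                       was declared speech.
    Returns a list of booleans (True = speech present).
    """
    decisions = []
    prev_is_speech = initial_speech
    for lr in likelihood_ratios:
        # Pick the threshold conditioned on the previous frame's decision.
        threshold = thr_after_speech if prev_is_speech else thr_after_noise
        prev_is_speech = lr > threshold
        decisions.append(prev_is_speech)
    return decisions


if __name__ == "__main__":
    print(conditional_map_vad([1.0, 3.0, 1.0, 0.4, 1.0]))
```

With the lower threshold after a speech frame, the same likelihood ratio of 1.0 is rejected after noise but accepted after speech, which is how the two-threshold rule exploits inter-frame correlation.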

From an investigation of a statistical model-based voice activity detection (VAD), it is discovered that a simple heuristic way like a geometric mean has been adopted for the decision rule based on the likelihood ratio (LR) test. For successful VAD operation, the authors first review the behaviour mechanism of the support vector machine (SVM) and then propose a novel technique, which employs the decision function of the SVM using the LRs, while the conventional techniques perform VAD by comparing the LRs with a given threshold value. The proposed SVM-based...

10.1049/iet-spr.2008.0128 article EN IET Signal Processing 2009-04-28

Within a single speech emotion corpus, deep neural networks have shown decent performance in emotion recognition. However, the performance of emotion recognition based on data-driven learning methods degrades significantly for the cross-corpus scenario. To relieve this issue without any labeled samples from the target domain, we propose few-shot learning and unsupervised domain adaptation, which is trained to learn the class (emotion) similarity from the source domain and is adapted to the target domain. In addition, we utilize multiple corpora in training to enhance the robustness to unseen...

10.1109/lsp.2021.3086395 article EN IEEE Signal Processing Letters 2021-01-01

There is a surge in interest in self-supervised learning approaches for end-to-end speech encoding in recent years as they have achieved great success. Especially, WavLM showed state-of-the-art performance on various speech processing tasks. To better understand the efficacy of self-supervised learning models for speech enhancement, in this work, we design and conduct a series of experiments with three resource conditions by combining two high-quality speech enhancement systems. Also, we propose a regression-based training objective and noise-mixing data...

10.1109/slt54892.2023.10023356 article EN 2022 IEEE Spoken Language Technology Workshop (SLT) 2023-01-09

The reaction of N-(2-pyridylmethyl)iminodiethanol (H2pmide) and Fe(NO3)3·9H2O in MeOH led to the formation of a dimeric iron(III) complex, [(Hpmide)Fe(NO3)]2(NO3)2·2CH3OH (1). Its anion-exchanged form, [(pmide)Fe(N3)]2 (2), was prepared by the reaction of 1 and NaN3 in MeOH, during which the Hpmide ligand of 1 was also deprotonated. These compounds were investigated by single crystal X-ray diffraction and magnetochemistry. In complex 1, one iron(III) ion was bonded with a mono-deprotonated Hpmide ligand and a nitrate ion. The two iron(III) ions within the dinuclear unit were connected...

10.1039/c3dt53376j article EN Dalton Transactions 2014-01-01

The cobalt(II) complex incorporating a π-conjugated substituent, [Co(Naph-C2-terpy)2](BF4)2 (1; Naph-C2-terpy = 4'-(2-naphthoxy(ethoxy))-2,2':6',2''-terpyridine), exhibits an abrupt spin transition (ST) behavior (cooperative factor C = 0.91) while its solvated product, 1·2MeOH, shows a gradual spin crossover (SCO) behavior (C = 0.49). Single crystal X-ray structural analyses demonstrated that the octahedral coordination core [CoN6] in 1 exhibits a larger distortion in both the high-spin and low-spin states than 1·2MeOH or another...

10.1039/c8dt02367k article EN Dalton Transactions 2018-01-01

In this letter, we present results of distribution tests that indicate that, for many natural images, the statistics of the discrete cosine transform (DCT) coefficients are best approximated by a generalized gamma function (GΓF), which includes the conventional Gaussian, Laplacian, and gamma probability density functions. The major parameter of the GΓF is estimated according to the maximum likelihood (ML) principle. Experimental results on a number of χ² tests indicate that the GΓF can be used effectively for modeling the DCT coefficients compared...

10.1109/lsp.2005.843763 article EN IEEE Signal Processing Letters 2005-03-21

In this letter, we propose a spectro-temporal filtering algorithm for multichannel speech enhancement in the short-time Fourier transform (STFT) domain. Compared with the traditional multiplicative filtering technique, the proposed method takes account of the interdependencies between components in adjacent frames and frequency bins. For spectro-temporal filtering, the speech and noise power spectral density (PSD) matrices are estimated based on an extended formulation utilizing temporal correlations, and a parametric noise reduction filter based on these PSD matrices is...

10.1109/lsp.2014.2302897 article EN IEEE Signal Processing Letters 2014-02-04

Speech enhancement based on statistical models has been studied for several decades. Recently, speech enhancement adopting a power spectral density (PSD) uncertainty model has been proposed. This approach distinguishes the true PSD from its estimate and considers both as random variables. It incorporates the prior distribution of the spectra estimators to derive the uncertainty-aware counterpart of the conventional clean speech estimators, which results in performance improvement. However, the PSD uncertainty model has not yet been adopted for parameter estimations such...

10.1109/taslp.2022.3180676 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2022-01-01

Time-domain approaches have shown the potential to improve the performance of speaker verification, but it is still predominant to utilize hand-crafted features such as the mel filterbank energies. Although these features are based on speech perception models and have exhibited impressive performances, the fixed frame size does not allow good temporal and spectral resolutions at the same time, and there is information loss when taking the magnitude spectrum and during the frequency rescaling. In this paper, we propose to incorporate multi-resolution...

10.1109/icassp49357.2023.10096839 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

A series of Zn(II), Pd(II) and Cd(II) complexes, [(L)nMX2]m (L = L‐a–L‐c; M = Zn, Pd; X = Cl; M = Cd; X = Br; n, m = 1 or 2), containing 4‐methoxy‐N‐(pyridin‐2‐ylmethylene)aniline (L‐a), 4‐methoxy‐N‐(pyridin‐2‐ylmethyl)aniline (L‐b) and 4‐methoxy‐N‐methyl‐N‐(pyridin‐2‐ylmethyl)aniline (L‐c) have been synthesized and characterized. The X‐ray crystal structures of the Pd(II) complexes [L PdCl L‐c) revealed distorted square planar geometries obtained via coordinative interaction of the nitrogen atoms of the pyridine and amine moieties and two chloro ligands. The geometry around the Zn center in [(L‐a)ZnCl...

10.1002/aoc.4797 article EN Applied Organometallic Chemistry 2019-02-05

We propose a voice activity detection (VAD) algorithm based on the generalized gamma distribution (GΓD). The distributions of noise spectra and noisy speech spectra, including speech-inactive intervals, are modeled by a set of GΓDs and applied to the likelihood ratio test (LRT) for VAD. The parameters of the GΓD are estimated through an on-line maximum likelihood (ML) estimation procedure where the global speech absence probability (GSAP) is incorporated under a forgetting scheme. Experimental results show that...

10.1109/icassp.2005.1415230 article EN 2006-10-11

A novel approach to a voice activity detector (VAD) in noisy environments is presented. The generalised Gaussian distribution (GGD) is employed as a parametric model for speech, which enables tuning to the actual data. According to the experimental results, it was discovered that the proposed GGD is more effective for the VAD algorithm compared to the conventional Laplacian model.

10.1049/el:20047090 article EN Electronics Letters 2004-11-25