NFDI4DS | UHH-SEMS - Publication Details

Hang J. Kim

ORCID: 0000-0003-2029-2293

About

Contact & Profiles

OpenAlex author ID: A5080201608

Research Areas

Statistical Methods and Inference
Bayesian Methods and Mixture Models
Privacy-Preserving Technologies in Data
Statistical Methods and Bayesian Inference
Genetic Associations and Epidemiology
Advanced Statistical Methods and Models
Gene Regulatory Network Analysis
Gene expression and cancer classification
Bioinformatics and Genomic Networks
Single-cell and spatial transcriptomics
Systemic Lupus Erythematosus Research
Data-Driven Disease Surveillance
Privacy, Security, and Data Protection
Light effects on plants
Markov Chains and Monte Carlo Methods
Gaussian Processes and Bayesian Inference
Cryptography and Data Security
Tryptophan and brain disorders
Bacterial Genetics and Biotechnology
Greenhouse Technology and Climate Control
Healthcare Policy and Management
Recommender Systems and Techniques
Interferon and immune responses
Advanced Causal Inference Techniques
Mobile Crowdsensing and Crowdsourcing

University of Cincinnati
2016-2025

University of California, San Francisco
2018

Duke University
2014-2015

National Institute of Statistical Sciences
2015

FRQ-CK1 interaction determines the period of circadian rhythms in Neurospora

OPENALEX - Publications

Xiao Liu, Ahai Chen, Angélica Caicedo-Casso, Guofei Cui, Mingjian Du, and 5 more

Circadian clock mechanisms have been extensively investigated, but the main rate-limiting step that determines circadian period remains unclear. Formation of a stable complex between clock proteins and CK1 is a conserved feature in eukaryotic circadian mechanisms. Here we show that the FRQ-CK1 interaction, not FRQ stability, correlates with circadian period in Neurospora clock mutants. Mutations that specifically affect the FRQ-CK1 interaction lead to severe alterations in circadian period. The FRQ-CK1 interaction has two roles in the circadian negative feedback loop. First, it determines the FRQ phosphorylation profile,...

10.1038/s41467-019-12239-w article EN cc-by Nature Communications 2019-09-25

Multiple Imputation of Missing or Faulty Values Under Linear Constraints

Hang J. Kim, Jerome P. Reiter, Quanli Wang, Lawrence H. Cox, Alan F. Karr

Many statistical agencies, survey organizations, and research centers collect data that suffer from item nonresponse and erroneous or inconsistent values. These data may be required to satisfy linear constraints, for example, bounds on individual variables and inequalities for ratios or sums of variables. Often these constraints are designed to identify faulty values, which are then blanked and imputed. The data may also exhibit complex distributional features, including nonlinear relationships and highly nonnormal distributions. We...

10.1080/07350015.2014.885435 article EN Journal of Business and Economic Statistics 2014-02-20

Bayesian Model Calibration and Sensitivity Analysis for Oscillating Biological Experiments

Youngdeok Hwang, Hang J. Kim, Won Chang, Christian I. Hong, Steven N. MacEachern

10.1080/00401706.2024.2444310 article EN Technometrics 2025-02-03

Simultaneous Edit-Imputation for Continuous Microdata

Hang J. Kim, Lawrence H. Cox, Alan F. Karr, Jerome P. Reiter, Quanli Wang

Hang J. Kim, Lawrence H. Cox, Alan F. Karr, Jerome P. Reiter & Quanli Wang. Hang J. Kim is Assistant Professor, Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH 45221, and formerly Postdoctoral Associate, Department of Statistical Science, Duke University, Durham, NC 27708, and National Institute of Statistical Sciences, Research Triangle Park, NC 27709 (E-mail: hang.kim0@uc.edu). Lawrence H. Cox is Director for Official Statistics, National Institute of Statistical Sciences, Research Triangle Park, NC 27709 (E-mail: cox@niss.org). Alan F. Karr is Director, Center of Excellence for Complex Data Analysis, RTI International, Research Triangle Park, NC 27709 (E-mail: karr@rti.org)....

10.1080/01621459.2015.1040881 article EN Journal of the American Statistical Association 2015-06-24

Statistical Disclosure Limitation in the Presence of Edit Rules

Hang J. Kim, Alan F. Karr, Jerome P. Reiter

We compare two general strategies for performing statistical disclosure limitation (SDL) for continuous microdata subject to edit rules. In the first, existing SDL methods are applied, and any constraint-violating values they produce are replaced using a constraint-preserving imputation procedure. In the second, the SDL methods are modified to prevent them from generating violations. We present a simulation study, based on data from the Colombian Annual Manufacturing Survey, that evaluates the performance of the two strategies as applied to several SDL methods....

10.1515/jos-2015-0006 article EN Journal of Official Statistics 2015-03-01

A Bayesian Multivariate Mixture Model for High Throughput Spatial Transcriptomics

Carter Allen, Yuzhou Chang, Brian Neelon, Won Chang, Hang J. Kim, and 3 more

High throughput spatial transcriptomics (HST) is a rapidly emerging class of experimental technologies that allow for profiling gene expression in tissue samples at or near single-cell resolution while retaining the spatial location of each sequencing unit within the tissue sample. Through analyzing HST data, we seek to identify sub-populations of cells within a tissue sample that may inform biological phenomena. Existing computational methods either ignore spatial heterogeneity in gene expression profiles, fail to account for important statistical features such as...

10.1111/biom.13727 article EN cc-by-nc-nd Biometrics 2022-07-27

Bayesian Causal Inference for Observational Studies with Missingness in Covariates and Outcomes

Huaiyu Zang, Hang J. Kim, Bin Huang, Rhonda D. Szczesniak

Missing data are a pervasive issue in observational studies using electronic health records or patient registries. It presents unique challenges for statistical inference, especially causal inference. Inappropriately handling missing data in causal inference could potentially bias causal estimation. Besides missing data problems, observational data structures typically have mixed-type variables (continuous and categorical covariates) whose joint distribution is often too complex to be modeled by simple parametric models. The existence...

10.1111/biom.13918 article EN cc-by-nc Biometrics 2023-08-08

graph-GPA: A graphical model for prioritizing GWAS results and investigating pleiotropic architecture

Dongjun Chung, Hang J. Kim, Hongyu Zhao

Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, which have provided clinical and medical benefits to patients with novel biomarkers and therapeutic targets. However, identification of risk variants associated with complex diseases remains challenging, as they are often affected by many genetic variants with small or moderate effects. There has been accumulating evidence suggesting that different complex traits share a common risk basis, namely pleiotropy. Recently, several...

10.1371/journal.pcbi.1005388 article EN cc-by PLoS Computational Biology 2017-02-17

The Generalized Multiset Sampler

Hang J. Kim, Steven N. MacEachern

The multiset sampler, an MCMC algorithm recently proposed by Leman and coauthors, is an easy-to-implement algorithm which is especially well-suited to drawing samples from a multimodal distribution. We generalize the algorithm by redefining the multiset sampler with an explicit link between the target distribution and the sampling distribution. The generalized formulation replaces the multiset with a K-tuple, which allows us to use the algorithm on unbounded parameter spaces, improves estimation, and sets up further extensions to adaptive MCMC techniques. Theoretical properties of the algorithm are provided and guidance is given on its...

10.1080/10618600.2014.962701 article EN Journal of Computational and Graphical Statistics 2014-10-02

Simultaneous edit-imputation and disclosure limitation for business establishment data

Hang J. Kim, Jerome P. Reiter, Alan F. Karr

Business establishment microdata typically are required to satisfy agency-specified edit rules, such as balance equations and linear inequalities. Inevitably some establishments' reported data violate the edit rules. Statistical agencies correct faulty values using a process known as edit-imputation. Business establishment data also must be heavily redacted before being shared with the public; indeed, confidentiality concerns lead many agencies not to share establishment microdata as unrestricted access files. When microdata must be heavily redacted, one approach is to create synthetic data, as done...

10.1080/02664763.2016.1267123 article EN Journal of Applied Statistics 2016-12-15

Inferring delays in partially observed gene regulation processes

Hyukpyo Hong, Mark Jayson Cortez, Yu‐Yu Cheng, Hang J. Kim, Boseung Choi, and 2 more

Cell function is regulated by gene regulatory networks (GRNs) defined by protein-mediated interaction between constituent genes. Despite advances in experimental techniques, we can still measure only a fraction of the processes that govern GRN dynamics. To infer the properties of GRNs using partial observation, unobserved sequential processes can be replaced with distributed time delays, yielding non-Markovian models. Inference methods based on the resulting non-Markovian model suffer from the curse of dimensionality.

10.1093/bioinformatics/btad670 article EN cc-by Bioinformatics 2023-11-01

Bayesian pollution source identification via an inverse physics model

Youngdeok Hwang, Hang J. Kim, Won Chang, Kyongmin Yeo, Yongku Kim

The behavior of air pollution is governed by complex dynamics in which the air quality of a site is affected by the pollutants transported from neighboring locations via physical processes. To estimate the sources of observed pollution, it is crucial to take the atmospheric conditions into account. Traditional approaches to building empirical models use observations, but do not extensively incorporate physical knowledge. Failure to exploit such knowledge can be critically limiting, particularly in situations where near-real-time estimation...

10.1016/j.csda.2018.12.003 article EN cc-by-nc-nd Computational Statistics & Data Analysis 2018-12-27

Synthetic Microdata for Establishment Surveys Under Informative Sampling

Hang J. Kim, Jörg Drechsler, Katherine Jenny Thompson

Many agencies are investigating whether releasing synthetic microdata could be a viable dissemination strategy for highly sensitive data, such as business data, for which disclosure avoidance regulations otherwise prohibit the release of public use microdata. However, existing methods assume that the original data either cover the entire population or comprise a simple random sample, which limits the application of these methods in the context of survey data with unequal survey weights. This paper discusses synthetic data generation under informative...

10.1111/rssa.12622 article EN Journal of the Royal Statistical Society Series A (Statistics in Society) 2020-11-10

ShinyGPA: An interactive visualization toolkit for investigating pleiotropic architecture using GWAS datasets

Emma Kortemeier, Paula S. Ramos, Kelly J. Hunt, Hang J. Kim, Gary Hardiman, and 1 more

In spite of accumulating evidence suggesting that different complex traits share a common risk basis, namely pleiotropy, effective investigation of pleiotropic architecture still remains challenging. In order to address this challenge, we developed ShinyGPA, an interactive and dynamic visualization toolkit to investigate pleiotropic structure. ShinyGPA requires only the summary statistics from genome-wide association studies (GWAS), which reduces the burden on researchers using this tool. ShinyGPA allows users to effectively investigate genetic...

10.1371/journal.pone.0190949 article EN cc-by PLoS ONE 2018-01-08

Estimating heterogeneous treatment effects for latent subgroups in observational studies

Hang J. Kim, Bo Lü, Edward Nehus, Mi‐Ok Kim

Individuals may vary in their responses to treatment, and identification of subgroups differentially affected by a treatment is an important issue in medical research. The risk of misleading subgroup analyses has become well known, and some exploratory analyses can be helpful in clarifying how covariates potentially interact with the treatment. Motivated by a real data study of pediatric kidney transplant, we consider a semiparametric Bayesian latent model and examine its utility for an exploratory subgroup effect analysis using secondary data....

10.1002/sim.7970 article EN Statistics in Medicine 2018-09-19

Improving SNP prioritization and pleiotropic architecture estimation by incorporating prior knowledge using graph-GPA

Hang J. Kim, Zhenning Yu, Andrew Lawson, Hongyu Zhao, Dongjun Chung

Integration of genetic studies for multiple phenotypes is a powerful approach to improving the identification of genetic variants associated with complex traits. Although it has been shown that leveraging the shared genetic basis among phenotypes, namely pleiotropy, can increase statistical power to identify risk variants, it remains challenging to effectively integrate genome-wide association study (GWAS) datasets for a large number of phenotypes. We previously developed graph-GPA, a Bayesian hierarchical model that integrates multiple GWAS datasets to boost...

10.1093/bioinformatics/bty061 article EN Bioinformatics 2018-02-06

A Bayesian Multivariate Mixture Model for Spatial Transcriptomics Data

Carter Allen, Yuzhou Chang, Brian Neelon, Won Chang, Hang J. Kim, and 3 more

High throughput spatial transcriptomics (HST) is a rapidly emerging class of experimental technologies that allow for profiling gene expression in tissue samples at or near single-cell resolution while retaining the spatial location of each sequencing unit within the tissue sample. Through analyzing HST data, we seek to identify sub-populations within a tissue sample that reflect distinct cell types or states. Existing methods either ignore spatial heterogeneity in gene expression profiles, fail to account for important statistical features such as skewness,...

10.1101/2021.06.23.449615 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2021-06-24

graph-GPA 2.0: improving multi-disease genetic analysis with integration of functional annotation data

Qiaolan Deng, Arkobrato Gupta, Hyeongseon Jeon, Jin Hyun Nam, Ayse Selen Yilmaz, and 5 more

Genome-wide association studies (GWAS) have successfully identified a large number of genetic variants associated with traits and diseases. However, it still remains challenging to fully understand the functional mechanisms underlying many of the associated variants. This is especially the case when we are interested in variants shared across multiple phenotypes. To address this challenge, we propose graph-GPA 2.0 (GGPA 2.0), a statistical framework to integrate GWAS datasets for multiple phenotypes and incorporate functional annotations within a unified...

10.3389/fgene.2023.1079198 article EN cc-by Frontiers in Genetics 2023-07-12

GPA-Tree: statistical approach for functional-annotation-tree-guided prioritization of GWAS results

Aastha Khatiwada, Bethany J. Wolf, Ayse Selen Yilmaz, Paula S. Ramos, Maciej Pietrzak, and 4 more

In spite of the great success of genome-wide association studies (GWAS), multiple challenges still remain. First, complex traits are often associated with many single nucleotide polymorphisms (SNPs), each with small or moderate effect sizes. Second, our understanding of the functional mechanisms through which genetic variants are associated with complex traits is still limited. To address these challenges, we propose GPA-Tree, which simultaneously implements association mapping and identifies key combinations of functional annotations related to risk-associated SNPs by...

10.1093/bioinformatics/btab802 article EN Bioinformatics 2021-11-23

Inferring delays in partially observed gene regulatory networks

Hyukpyo Hong, Mark Jayson Cortez, Yu‐Yu Cheng, Hang J. Kim, Boseung Choi, and 2 more

Motivation: Cell function is regulated by gene regulatory networks (GRNs) defined by protein-mediated interaction between constituent genes. Despite advances in experimental techniques, we can still measure only a fraction of the processes that govern GRN dynamics. To infer the properties of GRNs using partial observation, unobserved sequential processes can be replaced with distributed time delays, yielding non-Markovian models. Inference methods based on the resulting non-Markovian model suffer from the curse of dimensionality....

10.1101/2022.11.27.518074 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2022-11-27

Disseminating massive frequency tables by masking aggregated cell frequencies

Min-Jeong Park, Hang J. Kim, Sunghoon Kwon

10.1007/s42952-023-00248-x article EN Journal of the Korean Statistical Society 2024-01-30

Correction: Disseminating massive frequency tables by masking aggregated cell frequencies

Min-Jeong Park, Hang J. Kim, Sunghoon Kwon

10.1007/s42952-024-00267-2 article EN Journal of the Korean Statistical Society 2024-04-03

Statistical disclosure control for public microdata: present and future

Min-Jeong Park, Hang J. Kim

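Korean abstract (translated): The expanding in-depth use of data for academic research and policy making also raises concerns about the disclosure of individual information. As a result, many papers on statistical disclosure control (privacy protection) have been published over the past twenty or so years. This paper reviews that research and introduces it to statisticians and institutions in Korea. It covers not only traditional masking techniques such as microaggregation and noise addition, but also privacy protection in online analytic systems, disclosure control through differential privacy, and alternatives based on synthetic data. For each topic, it introduces the methodology and discusses applications, advantages, and disadvantages. We hope this paper will help statisticians who are working on practical problems. English abstract: The increasing demand from researchers and policy makers for microdata has also increased related privacy and security concerns. During the past two...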

10.5351/kjas.2016.29.6.1041 article EN Korean Journal of Applied Statistics 2016-10-31