Hang J. Kim

ORCID: 0000-0003-2029-2293
Research Areas
  • Statistical Methods and Inference
  • Bayesian Methods and Mixture Models
  • Privacy-Preserving Technologies in Data
  • Statistical Methods and Bayesian Inference
  • Genetic Associations and Epidemiology
  • Advanced Statistical Methods and Models
  • Gene Regulatory Network Analysis
  • Gene expression and cancer classification
  • Bioinformatics and Genomic Networks
  • Single-cell and spatial transcriptomics
  • Systemic Lupus Erythematosus Research
  • Data-Driven Disease Surveillance
  • Privacy, Security, and Data Protection
  • Light effects on plants
  • Markov Chains and Monte Carlo Methods
  • Gaussian Processes and Bayesian Inference
  • Cryptography and Data Security
  • Tryptophan and brain disorders
  • Bacterial Genetics and Biotechnology
  • Greenhouse Technology and Climate Control
  • Healthcare Policy and Management
  • Recommender Systems and Techniques
  • Interferon and immune responses
  • Advanced Causal Inference Techniques
  • Mobile Crowdsensing and Crowdsourcing

University of Cincinnati
2016-2025

University of California, San Francisco
2018

Duke University
2014-2015

National Institute of Statistical Sciences
2015

Abstract Circadian clock mechanisms have been extensively investigated but the main rate-limiting step that determines circadian period remains unclear. Formation of a stable complex between clock proteins and CK1 is a conserved feature in eukaryotic circadian mechanisms. Here we show that the FRQ-CK1 interaction, but not FRQ stability, correlates with circadian period in Neurospora circadian clock mutants. Mutations that specifically affect the FRQ-CK1 interaction lead to severe alterations in circadian period. The FRQ-CK1 interaction has two roles in the circadian negative feedback loop. First, it determines the FRQ phosphorylation profile,...

10.1038/s41467-019-12239-w article EN cc-by Nature Communications 2019-09-25

Many statistical agencies, survey organizations, and research centers collect data that suffer from item nonresponse and erroneous or inconsistent values. These data may be required to satisfy linear constraints, for example, bounds on individual variables and inequalities for ratios or sums of variables. Often these constraints are designed to identify faulty values, which then are blanked and imputed. The data also may exhibit complex distributional features, including nonlinear relationships and highly nonnormal distributions. We...

10.1080/07350015.2014.885435 article EN Journal of Business and Economic Statistics 2014-02-20

Hang J. Kim, Lawrence H. Cox, Alan F. Karr, Jerome P. Reiter & Quanli Wang. Hang J. Kim is Assistant Professor, Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH 45221, and formerly Postdoctoral Associate, Department of Statistical Science, Duke University, Durham, NC 27708, and National Institute of Statistical Sciences, Research Triangle Park, NC 27709 (E-mail: hang.kim0@uc.edu). Lawrence H. Cox is Director for Official Statistics, National Institute of Statistical Sciences, Research Triangle Park, NC 27709 (E-mail: cox@niss.org). Alan F. Karr is Director, Center of Excellence for Complex Data Analysis, RTI International, Research Triangle Park, NC 27709 (E-mail: karr@rti.org)....

10.1080/01621459.2015.1040881 article EN Journal of the American Statistical Association 2015-06-24

Abstract We compare two general strategies for performing statistical disclosure limitation (SDL) for continuous microdata subject to edit rules. In the first, existing SDL methods are applied, and any constraint-violating values they produce are replaced using a constraint-preserving imputation procedure. In the second, the SDL methods are modified to prevent them from generating violations. We present a simulation study, based on data from the Colombian Annual Manufacturing Survey, that evaluates the performance of the two strategies as applied to several SDL methods....

10.1515/jos-2015-0006 article EN Journal of Official Statistics 2015-03-01

High throughput spatial transcriptomics (HST) is a rapidly emerging class of experimental technologies that allow for profiling gene expression in tissue samples at or near single-cell resolution while retaining the spatial location of each sequencing unit within the tissue sample. Through analyzing HST data, we seek to identify sub-populations of cells within a tissue sample that may inform biological phenomena. Existing computational methods either ignore the spatial heterogeneity in gene expression profiles, fail to account for important statistical features such as...

10.1111/biom.13727 article EN cc-by-nc-nd Biometrics 2022-07-27

Abstract Missing data are a pervasive issue in observational studies using electronic health records or patient registries. It presents unique challenges for statistical inference, especially causal inference. Inappropriately handling missing data in causal inference could potentially bias causal estimation. Besides missing data problems, the data structures typically have mixed-type variables - continuous and categorical covariates whose joint distribution is often too complex to be modeled by simple parametric models. The existence...

10.1111/biom.13918 article EN cc-by-nc Biometrics 2023-08-08

Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, which have provided clinical and medical benefits to patients with novel biomarkers and therapeutic targets. However, identification of risk variants associated with complex diseases remains challenging as they are often affected by many genetic variants with small or moderate effects. There has been accumulating evidence suggesting that different complex traits share a common risk basis, namely pleiotropy. Recently, several...

10.1371/journal.pcbi.1005388 article EN cc-by PLoS Computational Biology 2017-02-17

The multiset sampler, an MCMC algorithm recently proposed by Leman and coauthors, is an easy-to-implement algorithm which is especially well-suited to drawing samples from a multimodal distribution. We generalize the algorithm by redefining the multiset sampler with an explicit link between the target distribution and the sampling distribution. The generalized formulation replaces the multiset with a K-tuple, which allows us to use the algorithm on unbounded parameter spaces, improves estimation, and sets up further extensions to adaptive MCMC techniques. Theoretical properties of the algorithm are provided and guidance is given on its...

10.1080/10618600.2014.962701 article EN Journal of Computational and Graphical Statistics 2014-10-02

Business establishment microdata typically are required to satisfy agency-specified edit rules, such as balance equations and linear inequalities. Inevitably some establishments' reported data violate the edit rules. Statistical agencies correct faulty values using a process known as edit-imputation. Business establishment data also must be heavily redacted before being shared with the public; indeed, confidentiality concerns lead many agencies not to share establishment microdata as unrestricted access files. When microdata must be heavily redacted, one approach is to create synthetic data, as done...

10.1080/02664763.2016.1267123 article EN Journal of Applied Statistics 2016-12-15

Cell function is regulated by gene regulatory networks (GRNs) defined by protein-mediated interaction between constituent genes. Despite advances in experimental techniques, we can still measure only a fraction of the processes that govern GRN dynamics. To infer the properties of GRNs using partial observation, unobserved sequential processes can be replaced with distributed time delays, yielding non-Markovian models. Inference methods based on the resulting non-Markovian model suffer from the curse of dimensionality.

10.1093/bioinformatics/btad670 article EN cc-by Bioinformatics 2023-11-01

The behavior of air pollution is governed by complex dynamics in which the air quality of a site is affected by the pollutants transported from neighboring locations via physical processes. To estimate the sources of observed pollution, it is crucial to take the atmospheric conditions into account. Traditional approaches to building empirical models use observations, but do not extensively incorporate physical knowledge. Failure to exploit such knowledge can be critically limiting, particularly in situations where near-real-time estimation...

10.1016/j.csda.2018.12.003 article EN cc-by-nc-nd Computational Statistics & Data Analysis 2018-12-27

Abstract Many agencies are investigating whether releasing synthetic microdata could be a viable dissemination strategy for highly sensitive data, such as business data, for which disclosure avoidance regulations otherwise prohibit the release of public use microdata. However, existing methods assume that the original data either cover the entire population or comprise a simple random sample, which limits the application of these methods in the context of survey data with unequal survey weights. This paper discusses synthetic data generation under informative...

10.1111/rssa.12622 article EN Journal of the Royal Statistical Society Series A (Statistics in Society) 2020-11-10

In spite of accumulating evidence suggesting that different complex traits share a common risk basis, namely pleiotropy, effective investigation of pleiotropic architecture still remains challenging. In order to address this challenge, we developed ShinyGPA, an interactive and dynamic visualization toolkit to investigate pleiotropic structure. ShinyGPA requires only the summary statistics from genome-wide association studies (GWAS), which reduces the burden on researchers using this tool. ShinyGPA allows users to effectively investigate genetic...

10.1371/journal.pone.0190949 article EN cc-by PLoS ONE 2018-01-08

Individuals may vary in their responses to treatment, and identification of subgroups differentially affected by a treatment is an important issue in medical research. The risk of misleading subgroup analyses has become well known, and some exploratory analyses can be helpful in clarifying how covariates potentially interact with the treatment. Motivated by a real data study of pediatric kidney transplant, we consider a semiparametric Bayesian latent model and examine its utility for an exploratory subgroup effect analysis using secondary data....

10.1002/sim.7970 article EN Statistics in Medicine 2018-09-19

Integration of genetic studies for multiple phenotypes is a powerful approach to improving the identification of genetic variants associated with complex traits. Although it has been shown that leveraging the shared genetic basis among phenotypes, namely pleiotropy, can increase statistical power to identify risk variants, it remains challenging to effectively integrate genome-wide association study (GWAS) datasets for a large number of phenotypes. We previously developed graph-GPA, a Bayesian hierarchical model that integrates multiple GWAS datasets to boost...

10.1093/bioinformatics/bty061 article EN Bioinformatics 2018-02-06

Abstract High throughput spatial transcriptomics (HST) is a rapidly emerging class of experimental technologies that allow for profiling gene expression in tissue samples at or near single-cell resolution while retaining the spatial location of each sequencing unit within the tissue sample. Through analyzing HST data, we seek to identify sub-populations within a tissue sample that reflect distinct cell types or states. Existing methods either ignore spatial heterogeneity in gene expression profiles, fail to account for important statistical features such as skewness,...

10.1101/2021.06.23.449615 preprint EN cc-by-nc bioRxiv (Cold Spring Harbor Laboratory) 2021-06-24

Genome-wide association studies (GWAS) have successfully identified a large number of genetic variants associated with traits and diseases. However, it still remains challenging to fully understand the functional mechanisms underlying many of the associated variants. This is especially the case when we are interested in variants shared across multiple phenotypes. To address this challenge, we propose graph-GPA 2.0 (GGPA 2.0), a statistical framework to integrate GWAS datasets for multiple phenotypes and incorporate functional annotations within a unified...

10.3389/fgene.2023.1079198 article EN cc-by Frontiers in Genetics 2023-07-12

In spite of the great success of genome-wide association studies (GWAS), multiple challenges still remain. First, complex traits are often associated with many single nucleotide polymorphisms (SNPs), each with small or moderate effect sizes. Second, our understanding of the functional mechanisms through which genetic variants are associated with complex traits is still limited. To address these challenges, we propose GPA-Tree, which simultaneously implements association mapping and identifies key combinations of functional annotations related to risk-associated SNPs by...

10.1093/bioinformatics/btab802 article EN Bioinformatics 2021-11-23

Abstract Motivation Cell function is regulated by gene regulatory networks (GRNs) defined by protein-mediated interaction between constituent genes. Despite advances in experimental techniques, we can still measure only a fraction of the processes that govern GRN dynamics. To infer the properties of GRNs using partial observation, unobserved sequential processes can be replaced with distributed time delays, yielding non-Markovian models. Inference methods based on the resulting non-Markovian model suffer from the curse of dimensionality....

10.1101/2022.11.27.518074 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2022-11-27

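The expanding in-depth use of data for academic research and policy making also raises concerns about the disclosure of individual information. As a result, many papers have been published in the field of statistical disclosure control (privacy protection) over the past twenty or so years. This paper summarizes that research and introduces it to statisticians and institutions in Korea. The main topics cover not only traditional masking techniques such as local aggregation and noise addition, but also privacy protection in online analysis systems, disclosure control through differential privacy, and alternatives that use synthetic data. For each topic, the methodology is introduced and application cases and advantages and disadvantages are discussed. We hope this paper will help statisticians who are working on practical problems. The increasing demand from researchers and policy makers for microdata has also increased related privacy and security concerns. During the past two...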

10.5351/kjas.2016.29.6.1041 article EN Korean Journal of Applied Statistics 2016-10-31