- Statistical Methods and Inference
- Bayesian Methods and Mixture Models
- Statistical Methods and Bayesian Inference
- Gene expression and cancer classification
- Genomics and Chromatin Dynamics
- Epigenetics and DNA Methylation
- Genetic Associations and Epidemiology
- Statistical Distribution Estimation and Applications
- Genomic variations and chromosomal abnormalities
- Colorectal Cancer Screening and Detection
- Algorithms and Data Compression
- RNA modifications and cancer
- Advanced Surface Polishing Techniques
- Statistical Methods in Clinical Trials
- Global Cancer Incidence and Screening
- Advancements in Photolithography Techniques
- Advanced Statistical Methods and Models
- Meteorological Phenomena and Simulations
- Image Processing Techniques and Applications
- Software Reliability and Analysis Research
- Robotic Path Planning Algorithms
- Ecology and Conservation Studies
- Cardiac and Coronary Surgery Techniques
- Climate variability and models
- Smart Grid and Power Systems
Gwangju Institute of Science and Technology
2024
Northern Illinois University
2012-2023
Augusta University
2010-2013
Augusta University Health
2012-2013
Brigham and Women's Hospital
2007
Texas A&M University
2007
Medical University of South Carolina
2007
Samsung (South Korea)
2005
Abstract Background: PHF21A has been associated with intellectual disability and craniofacial anomalies based on its deletion in the Potocki-Shaffer syndrome region at 11p11.2 and its disruption in three patients with balanced translocations. In addition, de novo truncating mutations were reported recently. Here, we analyze genomic data from seven unrelated individuals and provide detailed clinical descriptions, further expanding the phenotype of PHF21A haploinsufficiency. Methods: Diagnostic trio whole exome sequencing,...
We propose Bayesian parametric and semiparametric partially linear regression methods to analyze outcome-dependent follow-up data when the random time of a follow-up measurement of an individual depends on the history of both the observed longitudinal outcomes and the previous measurement times. We begin with an investigation of the simplifying assumptions of Lipsitz, Fitzmaurice, Ibrahim, Gelber, and Lipshultz, and present a new model for analyzing such data by allowing subject-specific correlations for the longitudinal response and by introducing a latent variable to accommodate the association...
Epigenetic changes, especially DNA methylation at CpG loci, have important implications in cancer and other complex diseases. With the development of next-generation sequencing (NGS), it is feasible to generate data to interrogate the difference in methylation status for genome-wide loci using a case-control design. However, a proper and efficient statistical test is lacking. There are several challenges. First, unlike experiments using microarrays, where there is one measure for an individual at a particular site, here we have the counts of allele...
Summary We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption about the covariate processes in the GLM may be unrealistic, and if this happens it can cast doubt on the inference about the observed effects. Allowing the functions to be unknown, we propose to apply Bayesian methods, including cubic smoothing splines or P-splines, for the possible nonlinearity and use an additive complex...
Abstract The sea surface temperature (SST) is an important factor of the earth climate system. A deep understanding of SST is essential for monitoring and prediction. In general, SST follows a nonlinear pattern in both time and location and can be modeled by a dynamic system which changes with location. In this article, we propose a radial basis function network-based model which is able to catch the nonlinearity of the data, and we use a dynamically weighted particle filter to estimate the parameters of the model. We analyze the SST observed in the Caribbean Islands area...
Background/Objectives: For efficient PhotoVoltaic (PV) power generation, computing and information technologies are increasingly used in irradiance forecasting and correction. Methods/Statistical Analysis: Today the majority of PV modules are used for grid-connected power generation, so solar generation forecasting that predicts the available output ahead is essential for integrating PV resources into electricity grids. This paper proposes a short-term forecasting system that employs Neural Network (NN) models to forecast PV power. Results: The proposed system uses...
Differential methylation of regulatory elements is critical in epigenetic research and can be statistically tested. We developed a new statistical test, the generalized integrated functional test (GIFT), that tests for regional differences in methylation based on the percent methylation at each CpG site within a genomic region. The GIFT uses subject-specific profiles estimated with smoothing methods, specifically wavelet smoothing, and calculates an ANOVA-like test to compare the average profiles of groups. In this way, possibly...
Motivation: Researchers in genomics are increasingly interested in epigenetic factors such as DNA methylation, because they play an important role in regulating gene expression without changes in the sequence of DNA. Abnormal methylation is associated with many human diseases. Results: We propose two different approaches to test for differentially methylated regions (DMRs) associated with complex traits, while accounting for correlations among CpG sites in the DMRs. The first approach is a nonparametric method using a kernel distance statistic...
Researchers in genomics are increasingly interested in epigenetic factors such as DNA methylation, because they play an important role in regulating gene expression without changes in the DNA sequence. There have been significant advances in developing statistical methods to detect differentially methylated regions (DMRs) associated with binary disease status. Most of these methods are being developed for detecting differential methylation rates between cases and controls. We consider multiple severity levels of disease,...
Abstract Disclosure: N. Kim: None. D. Ryu: C. Oh: Obesity is defined as the accumulation of excess fat in the body. Although obesity is the main cause of metabolic disorders, not all obese people have such disorders. Among the many factors identified, the ratio of visceral (vWAT) to subcutaneous white adipose tissue (sWAT), representing body fat distribution, has been elucidated to be closely associated with health. Body fat distribution has been proposed as a possible explanation for this discrepancy. To investigate its regulation, we performed...
We consider a Bayesian functional data analysis for observations measured as extremely long sequences. When the sequence is split into several small windows with manageable lengths, the windows may not be independent, especially when they are neighboring each other. We propose to utilize smoothing splines to estimate individual patterns within each window and to establish transition models for the parameters involved in each window to address the dependence structure between windows. The difference of groups of individuals at each window can be evaluated by Bayes...
Abstract Study question: Can we develop predictive, practical biomarkers for premature ovarian insufficiency (POI) from normal ovaries and human blood, tested and validated by a machine-learning (ML) algorithm? Summary answer: Applying random forest and XGBoost algorithms, we pinpointed 60 significant genes from tissues; transcriptome analysis identified RPM1 as the most significant biomarker. What is known already: Accelerated aging has been suggested as one of the possible underlying mechanisms of POI; previous literature focused...
Regression procedures are not only hindered by large p and small n, but can also suffer in cases when outliers are present or the data generating mechanisms are heavy tailed. Since penalized estimates like the least absolute shrinkage and selection operator (LASSO) are equipped to deal with large p and small n by encouraging sparsity, we combine a LASSO type penalty with the absolute deviation loss function, instead of the standard least squares loss, to handle the presence of outliers and heavy tails. The model is cast in a Bayesian setting and a Gibbs sampler is derived to efficiently sample from...
The role of genome-wide patterns of methylation in human disease has drawn increasing attention in recent years, because the methylome has the potential for large effects on disease etiology. Most analyses have utilized the percent of signal that is methylated, known as the β-value, or the logistic transformation of β, named the M-value, as the summary measures. However, in general, these measures do not follow a Normal distribution and lead to statistical tests sensitive to outlying samples. In this paper, we proposed the N-value, a type...
In this article we propose a multiple-inflation Poisson regression to model count response data containing excessive frequencies at more than one non-negative integer value. To handle multiple excessive count responses, we generalize the zero-inflated Poisson regression by replacing its binary regression with the multinomial regression, while Su et al. [Statist. Sinica 23 (2013) 1071–1090] proposed a multiple-inflation Poisson model for consecutive count responses with excessive frequencies. We give several properties of our model, and do statistical inference under the fully Bayesian framework....
In this paper we utilize a survival analysis methodology incorporating Bayesian additive regression trees to account for nonlinear and additive covariate effects. We compare the performance of Bayesian additive regression trees, Cox proportional hazards and random forests models for censored survival data, using simulation studies and a survival analysis for breast cancer with the U.S. SEER database for the year 2005. In the simulation studies, we compare the three models across varying sample sizes and censoring rates on the basis of bias and prediction accuracy. In the survival analysis for breast cancer, we retrospectively analyze a subset of 1500 patients having invasive ductal...
Abstract Systems of components have a structure that plays an important role in determining how the reliability of the individual components relates to the reliability of the system. The system reliability can be computed from the component reliabilities using results from basic probability theory in the simplest case, with all components assumed to act independently of one another. However, with dependence, such calculations are much more involved. When data have been collected on both the system and each component of the system, it is difficult to model any possible dependence between the components. Established methods use...
Summary Infectious diseases that can be spread directly or indirectly from one person to another are caused by pathogenic microorganisms such as bacteria, viruses, parasites, or fungi. Infectious diseases remain one of the greatest threats to human health, and the analysis of infectious disease data is among the most important applications of statistics. In this article, we develop Bayesian methodology using a parametric bivariate accelerated lifetime model to study the dependency between the colonization and infection times for Acinetobacter baumannii...
Motivation: Methods are needed to test pre-defined genomic regions such as promoters for differential methylation in genome-wide association studies, where the number of samples is limited and the data have large amounts of measurement error. Results: We developed a new statistical test, the generalized integrated functional test (GIFT), which tests for regional differences in methylation based on the relationship between percent methylation and the location of the CpG sites within a region. In this method, subject-specific profiles are first estimated, and the average...