Xiaotong Shen

ORCID: 0000-0003-1300-1451
Research Areas
  • Statistical Methods and Inference
  • Gene expression and cancer classification
  • Bioinformatics and Genomic Networks
  • Bayesian Methods and Mixture Models
  • Face and Expression Recognition
  • Genetic Associations and Epidemiology
  • Advanced biosensing and bioanalysis techniques
  • Advanced Statistical Methods and Models
  • Robotics and Sensor-Based Localization
  • Statistical Methods and Bayesian Inference
  • Bayesian Modeling and Causal Inference
  • Robotic Path Planning Algorithms
  • Autonomous Vehicle Technology and Safety
  • Genetic and phenotypic traits in livestock
  • Sparse and Compressive Sensing Techniques
  • Machine Learning and Data Classification
  • Control Systems and Identification
  • Neural Networks and Applications
  • Genetic Mapping and Diversity in Plants and Animals
  • Data Mining Algorithms and Applications
  • RNA Interference and Gene Delivery
  • Fault Detection and Control Systems
  • Machine Learning in Bioinformatics
  • Recommender Systems and Techniques
  • Machine Learning and ELM

University of Minnesota
2016-2025

Twin Cities Orthopedics
2005-2025

Beijing Normal University
2017-2025

Hebei North University
2025

Shandong First Medical University
2022-2024

Shihezi University
2024

University of Minnesota System
2005-2023

Changchun University of Science and Technology
2023

Sun Yat-sen University
2022-2023

Hebei Medical University
2023

Autonomous vehicles are expected to play a key role in the future of urban transportation systems, as they offer potential for additional safety, increased productivity, greater accessibility, better road efficiency, and positive impact on the environment. Research in autonomous systems has seen dramatic advances in recent years, due to increases in available computing power and the reduced cost of sensing technologies, resulting in a maturing technological readiness level of fully autonomous vehicles. The objective of this paper is to provide...

10.3390/machines5010006 article EN cc-by Machines 2017-02-17

In high-dimensional data analysis, feature selection becomes one means for dimension reduction, which proceeds with parameter estimation. Concerning accuracy of selection and estimation, we study nonconvex constrained and regularized likelihoods in the presence of nuisance parameters. Theoretically, we show that the L(0)-likelihood and its computational surrogate are optimal in that they achieve selection consistency and sharp estimation under a necessary condition required for any method to be consistent. It permits up to exponentially many candidate features....

10.1080/01621459.2011.645783 article EN Journal of the American Statistical Association 2012-03-01

In this paper, we present a multivehicle cooperative driving system architecture using cooperative perception, along with experimental validation. For this goal, we first propose a multimodal perception system that provides see-through, lifted-seat, satellite, and all-around views to drivers. Using the extended range information from the system, we then realize cooperative driving by see-through forward collision warning, overtaking/lane-changing assistance, and automated hidden obstacle avoidance. We demonstrate the capabilities and features of our system through real-world...

10.1109/tits.2014.2337316 article EN IEEE Transactions on Intelligent Transportation Systems 2014-07-28

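In this paper, we develop a general theory for the convergence rate of sieve estimates, maximum likelihood estimates (MLE's) and related estimates obtained by optimizing certain empirical criteria in general parameter spaces. In many cases, especially when the parameter space is infinite dimensional, maximization over the whole parameter space is undesirable. In such cases, one has to perform maximization over an approximating space (sieve) of the original parameter space and allow the size of the approximating space to grow as the sample size increases. This method is called the method of sieves. In the case of maximum likelihood estimation, an MLE based on a sieve is called a sieve MLE. We found that the convergence rate of a sieve estimate is governed by (a)...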

10.1214/aos/1176325486 article EN The Annals of Statistics 1994-06-01

Many non/semi-parametric time series estimates may be regarded as different forms of sieve extremum estimates. For stationary β-mixing observations, we obtain convergence rates of sieve extremum estimates and root-n asymptotic normality of plug-in sieve extremum estimates of smooth functionals. As applications to time series models, we give convergence rates for nonparametric ARX(p,q) regression via neural networks, splines, and wavelets, and root-n asymptotic normality for partial linear additive AR(p) models and monotone transformation AR(1) models.

10.2307/2998559 article EN Econometrica 1998-03-01

We compute the rate at which the posterior distribution concentrates around the true parameter value. The spaces we work in are quite general and include infinite dimensional cases. The rates are driven by two quantities: the size of the space, as measured by bracketing entropy, and the degree to which the prior concentrates in a small ball around the true parameter. We consider examples.

10.1214/aos/1009210686 article EN The Annals of Statistics 2001-06-01

We develop a general theory which provides a unified treatment for the asymptotic normality and efficiency of the maximum likelihood estimates (MLE's) in parametric, semiparametric and nonparametric models. We find that the asymptotic behavior of substitution estimates for estimating smooth functionals is essentially governed by two indices: the degree of smoothness of the functional and the local size of the underlying parameter space. We show that when the local size of the parameter space is not very large, the standard (nonsieve), sieve and penalized MLE's are asymptotically efficient in the Fisher sense, under...

10.1214/aos/1030741085 article EN The Annals of Statistics 1997-12-01

In this paper, we study the local asymptotic behavior of regression splines. In particular, explicit expressions for the asymptotic pointwise bias and variance of regression splines are obtained. In addition, asymptotic normality is established, leading to the construction of approximate confidence intervals and confidence bands for the regression function.

10.1214/aos/1024691356 article EN The Annals of Statistics 1998-10-01

Let $Y_1,\ldots, Y_n$ be independent and identically distributed with density $p_0$ and let $\mathscr{F}$ be a space of densities. We show that the supremum of the likelihood ratios $\prod^n_{i=1} p(Y_i)/p_0(Y_i)$, where the supremum is over $p \in \mathscr{F}$ with $\|p^{1/2} - p^{1/2}_0\|_2 \geq \varepsilon$, is exponentially small with probability exponentially close to 1. The exponent is proportional to $n\varepsilon^2$. The only condition required for this to hold is that $\varepsilon$ exceeds a value determined by the bracketing Hellinger entropy of $\mathscr{F}$. A...

10.1214/aos/1176324524 article EN The Annals of Statistics 1995-04-01
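
Read in symbols, the bound described in the abstract above has roughly the following shape. This is only a restatement under assumptions: the constants $c_1, c_2, C > 0$ are placeholders that do not appear in the abstract, and the exact entropy condition on $\varepsilon$ is not reproduced here.

```latex
\[
P\left(
  \sup_{\substack{p \in \mathscr{F} \\ \|p^{1/2} - p_0^{1/2}\|_2 \geq \varepsilon}}
  \prod_{i=1}^{n} \frac{p(Y_i)}{p_0(Y_i)}
  \;\geq\; \exp\!\left(-c_1 n \varepsilon^2\right)
\right)
\;\leq\; C \exp\!\left(-c_2 n \varepsilon^2\right).
\]
```

Both the size of the likelihood ratio and the probability of the exceptional event decay at a rate whose exponent is proportional to $n\varepsilon^2$, which matches the two "exponentially small" statements in the abstract.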

Most model selection procedures use a fixed penalty penalizing an increase in the size of a model. These nonadaptive selection procedures perform well only in one type of situation. For instance, the Bayesian information criterion (BIC) with a large penalty performs well for "small" models and poorly for "large" models, and Akaike's information criterion (AIC) does just the opposite. This article proposes an adaptive model selection procedure that uses a data-adaptive complexity penalty based on a concept of generalized degrees of freedom. The proposed procedure, combining the benefit of a class of nonadaptive procedures,...

10.1198/016214502753479356 article EN Journal of the American Statistical Association 2002-03-01
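
The contrast between the two fixed penalties mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration of standard AIC and BIC only, not of the adaptive procedure the article proposes; the candidate models, their log-likelihoods, and the sample size are made-up numbers.

```python
import math

def aic(log_likelihood: float, k: int) -> float:
    """AIC: fixed penalty of 2 per parameter."""
    return -2.0 * log_likelihood + 2.0 * k

def bic(log_likelihood: float, k: int, n: int) -> float:
    """BIC: fixed penalty of log(n) per parameter."""
    return -2.0 * log_likelihood + math.log(n) * k

# Hypothetical candidates: (name, maximized log-likelihood, number of parameters).
candidates = [("small", -520.0, 3), ("large", -510.0, 10)]
n = 200  # hypothetical sample size

# Pick the candidate with the smallest criterion value under each rule.
best_aic = min(candidates, key=lambda c: aic(c[1], c[2]))[0]
best_bic = min(candidates, key=lambda c: bic(c[1], c[2], n))[0]

print("AIC selects:", best_aic)
print("BIC selects:", best_bic)
```

With these numbers the lighter AIC penalty favors the larger model while the heavier BIC penalty favors the smaller one, which is the kind of disagreement a data-adaptive penalty is meant to resolve.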

The concept of large margins has been recognized as an important principle in analyzing learning methodologies, including boosting, neural networks, and support vector machines (SVMs). However, this concept alone is not adequate for learning in nonseparable cases. We propose a learning methodology, called ψ-learning, that is derived from a direct consideration of generalization errors. We provide a theory for ψ-learning and show that it essentially attains the optimal rates of convergence in two learning examples. Finally, results from simulation studies and breast...

10.1198/016214503000000639 article EN Journal of the American Statistical Association 2003-09-01

This article focuses on conducting global testing for association between a binary trait and a set of rare variants (RVs), although its application can be much broader to other types of traits, common variants (CVs), and gene set or pathway analysis. We show that many of the existing tests have deteriorating performance in the presence of many nonassociated RVs: their power can dramatically drop as the proportion of nonassociated RVs in the group to be tested increases. We propose a class of so-called sum of powered score (SPU) tests, each of which is based on the score vector from...

10.1534/genetics.114.165035 article EN Genetics 2014-05-16
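
The idea of a sum of powered scores can be sketched as follows. Because the abstract is cut off before it says which model the score vector comes from, this sketch assumes, purely for illustration, an intercept-only null model with no covariates, so that the score for variant j is U_j = Σ_i (y_i − ȳ) x_ij. The function names, the toy data, and the permutation scheme are all hypothetical and are not taken from the article.

```python
import random

def score_vector(y, X):
    """U_j = sum_i (y_i - mean(y)) * X[i][j] for each variant j.
    Assumes an intercept-only null model (an assumption of this sketch)."""
    y_bar = sum(y) / len(y)
    n_variants = len(X[0])
    return [sum((y[i] - y_bar) * X[i][j] for i in range(len(y)))
            for j in range(n_variants)]

def spu(y, X, gamma):
    """Sum of powered scores: sum_j U_j ** gamma, for an integer gamma >= 1."""
    return sum(u ** gamma for u in score_vector(y, X))

def spu_permutation_pvalue(y, X, gamma, n_perm=999, seed=0):
    """Permutation p-value: shuffle the trait labels and recompute the statistic."""
    rng = random.Random(seed)
    observed = abs(spu(y, X, gamma))
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(spu(y_perm, X, gamma)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Toy data: 4 subjects (rows), 2 variants (columns); y is the binary trait.
y = [1, 1, 0, 0]
X = [[1, 0], [1, 0], [0, 1], [0, 0]]

print(spu(y, X, 1), spu(y, X, 2))
print(spu_permutation_pvalue(y, X, 2, n_perm=99))
```

With γ = 1 the statistic is a plain sum of scores, so scores of opposite sign cancel; with γ = 2 it is a sum of squared scores, so they do not. Larger even powers put more weight on the variants with the largest scores.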

Extracting grouping structure or identifying homogeneous subgroups of predictors in regression is crucial for high-dimensional data analysis. A low-dimensional structure, in particular grouping, when captured in a regression model, makes it possible to enhance predictive performance and facilitate the model's interpretability. Grouping pursuit extracts homogeneous subgroups of predictors most responsible for outcomes of a response. This is the case in gene network analysis, where grouping reveals gene functionalities with regard to progression of a disease. To address challenges in grouping pursuit, we introduce a novel...

10.1198/jasa.2010.tm09380 article EN Journal of the American Statistical Association 2010-06-01

High-dimensional regression/classification continues to be an important and challenging problem, especially when features are highly correlated. Feature selection, combined with additional structure information on the features, has been considered promising in promoting regression/classification performance. Graph-guided fused lasso (GFlasso) has recently been proposed to facilitate feature selection and graph structure exploitation when features exhibit certain graph structures. However, the formulation in GFlasso relies on pairwise sample correlations to perform feature grouping, which...

10.1145/2339530.2339675 article EN 2012-08-12

RNA detection has become one of the most robust parts in molecular biology, medical diagnostics and drug discovery. Conventional RNA detection methods involve an extra reverse transcription step, which limits their further application for rapid RNA detection. We herein report a novel finding that Bst and Klenow DNA polymerases possess innate reverse transcriptase activities, so that the reverse transcription step and the next amplification reaction can be combined in isothermal RNA detection methods. We have demonstrated that these polymerases could be successfully used to reverse transcribe RNA within 125-nt length by real...

10.1021/jacs.5b08144 article EN Journal of the American Chemical Society 2015-10-16

In precision medicine, the ultimate goal is to recommend the most effective treatment to an individual patient based on patient-specific molecular and clinical profiles, possibly high-dimensional. To advance cancer treatment, large-scale screenings of cancer cell lines against chemical compounds have been performed to help better understand the relationship between genomic features and drug response; existing machine learning approaches use exclusively supervised learning, including penalized regression and recommender...

10.1002/sim.9491 article EN cc-by-nc-nd Statistics in Medicine 2022-06-18

In binary classification, margin-based techniques usually deliver high performance. As a result, a multicategory problem is often treated as a sequence of binary classifications. In the absence of a dominating class, this treatment may be suboptimal and may yield poor performance, such as for support vector machines (SVMs). We propose a novel multicategory generalization of ψ-learning that treats all classes simultaneously. The new generalization eliminates this potential problem while at the same time retaining the desirable properties of its binary counterpart....

10.1198/016214505000000781 article EN Journal of the American Statistical Association 2006-05-16

We consider penalized linear regression, especially for “large p, small n” problems, for which the relationships among predictors are described a priori by a network. A class of motivating examples includes modeling a phenotype through gene expression profiles while accounting for coordinated functioning of genes in the form of biological pathways or networks. To incorporate the prior knowledge of the similar effect sizes of neighboring predictors in a network, we propose a grouped penalty based on the Lγ-norm that smoothes the regression...

10.1111/j.1541-0420.2009.01296.x article EN Biometrics 2009-07-23

Binary support vector machines (SVMs) have been proven to deliver high performance. In multiclass classification, however, issues remain with respect to variable selection. One challenging issue is classification and variable selection in the presence of variables in the magnitude of thousands, greatly exceeding the size of the training sample. This often occurs in genomics classification. To meet the challenge, this article proposes a novel multiclass support vector machine, which performs classification and variable selection simultaneously through an L1-norm penalized sparse representation....

10.1198/016214506000001383 article EN Journal of the American Statistical Association 2007-05-16

The importance of a network-based approach to identifying biological markers for diagnostic classification and prognostic assessment in the context of microarray data has been increasingly recognized. To our knowledge, there have been few, if any, statistical tools that explicitly incorporate the prior information of gene networks into classifier building. The main idea of this paper is to take full advantage of the biological observation that neighboring genes in a network tend to function together in biological processes and to embed this information into a formal statistical framework. We propose...

10.1186/1471-2105-10-s1-s21 article EN cc-by BMC Bioinformatics 2009-01-01

Clustering is one of the most useful tools for high-dimensional analysis, e.g., for microarray data. It becomes challenging in the presence of a large number of noise variables, which may mask underlying clustering structures. Therefore, noise removal through variable selection is necessary. One effective way is regularization for simultaneous parameter estimation and variable selection in model-based clustering. However, existing methods focus on regularizing the mean parameters representing the centers of clusters, ignoring dependencies among...

10.1214/09-ejs487 article EN cc-by Electronic Journal of Statistics 2009-01-01

In this paper, we consider the problem of estimating multiple graphical models simultaneously using the fused lasso penalty, which encourages adjacent graphs to share similar structures. A motivating example is the analysis of brain networks of Alzheimer's disease using neuroimaging data. Specifically, we may wish to estimate a brain network for the normal controls (NC), a brain network for the patients with mild cognitive impairment (MCI), and a brain network for Alzheimer's patients (AD). We expect the two brain networks for NC and MCI to share common structures but not to be identical to each other; similarly for the two brain networks for MCI and AD. The proposed...

10.1137/130936397 article EN SIAM Journal on Optimization 2015-01-01