Xingye Qiao

ORCID: 0000-0003-0937-9822
Research Areas
  • Statistical Methods and Inference
  • Face and Expression Recognition
  • Machine Learning and Data Classification
  • Advanced Statistical Methods and Models
  • Gene expression and cancer classification
  • Imbalanced Data Classification Techniques
  • Advanced Causal Inference Techniques
  • Machine Learning and Algorithms
  • Spectroscopy and Chemometric Analyses
  • Bayesian Methods and Mixture Models
  • Domain Adaptation and Few-Shot Learning
  • Reinforcement Learning in Robotics
  • Neural Networks and Applications
  • AI in cancer detection
  • Neural dynamics and brain function
  • Advanced Algorithms and Applications
  • Fault Detection and Control Systems
  • Data Stream Mining Techniques
  • Computational Drug Discovery Methods
  • Machine Learning in Healthcare
  • Bayesian Modeling and Causal Inference
  • Functional Brain Connectivity Studies
  • Distributed Sensor Networks and Detection Algorithms
  • Blind Source Separation Techniques
  • Advanced Statistical Process Monitoring

Binghamton University
2015-2025

Dalian Ocean University
2024

Michigan State University
2013-2017

Purdue University West Lafayette
2013

North Carolina State University
2010

University of North Carolina at Chapel Hill
2008

While Distance Weighted Discrimination (DWD) is an appealing approach to classification in high dimensions, it was designed for balanced datasets. In the case of unequal costs, biased sampling, or unbalanced data, there are major improvements available, using appropriately weighted versions of DWD (wDWD). A contribution of this paper is the development of optimal weighting schemes for various nonstandard problems. In addition, we discuss several alternative criteria and propose an adaptive scheme (awDWD) and demonstrate...

10.1198/jasa.2010.tm08487 article EN Journal of the American Statistical Association 2010-03-01

In multicategory classification, standard techniques typically treat all classes equally. This treatment can be problematic when the dataset is unbalanced in the sense that certain classes have very small class proportions compared to others. The minority classes may be ignored or discounted during the classification process due to their small proportions. This can be a serious problem if those minority classes are important. In this article, we study the problem of unbalanced classification and propose new criteria to measure classification accuracy. Moreover, we propose three different weighted learning procedures, two...

10.1111/j.1541-0420.2008.01017.x article EN Biometrics 2008-03-24

Over recent decades, there has been a significant increase in postsecondary STEM education among autistic individuals. Using data from the High School Longitudinal Study of 2009, this study examined the STEM pathways of autistic students, emphasizing key determinants like proximal context, self-efficacy, and outcome expectations within the framework of social cognitive theory. The results revealed that despite a lower college attendance rate, autistic students displayed a pronounced inclination for STEM majors, particularly fields...

10.1177/00144029241312777 article EN Exceptional Children 2025-01-24

In an era where diverse and complex data are increasingly accessible, the precise prediction of individual treatment effects (ITE) becomes crucial across fields such as healthcare, economics, and public policy. Current state-of-the-art approaches, while providing valid intervals through Conformal Quantile Regression (CQR) and related techniques, often yield overly conservative intervals. In this work, we introduce a conformal inference approach to ITE using the conditional density of the outcome given the covariates....

10.48550/arxiv.2501.14933 preprint EN arXiv (Cornell University) 2025-01-24

In an era where diverse and complex data are increasingly accessible, the precise prediction of individual treatment effects (ITE) becomes crucial across fields such as healthcare, economics, and social policy. Current state-of-the-art approaches, while providing valid intervals through Conformal Quantile Regression (CQR) and related techniques, often yield overly conservative intervals. In this work, we introduce a conformal inference approach to ITE using the conditional density of the outcome given the covariates....

10.1609/aaai.v39i20.35397 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11
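
For orientation, the interval-calibration idea behind conformal methods can be sketched with plain split conformal prediction on absolute residuals. This is a generic illustration only: it is neither CQR nor the density-based ITE approach of the abstract, and the function name and arguments are hypothetical.

```python
import math

def split_conformal_interval(cal_predictions, cal_outcomes, new_prediction, alpha=0.1):
    """Generic split conformal interval from held-out calibration residuals.

    Assumption: predictions come from any model fitted on data that is
    separate from the calibration set.
    """
    # Nonconformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(y - p) for y, p in zip(cal_outcomes, cal_predictions))
    n = len(scores)
    # Rank of the finite-sample corrected (1 - alpha) quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: interval is unbounded.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (new_prediction - q, new_prediction + q)
```

The half-width is the same for every new point, which is one reason such intervals can be conservative where the outcome is less variable.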

PLS initially creates uncorrelated latent variables which are linear combinations of the original input vectors Xi, where the weights used to determine the combinations are proportional to the covariance. Secondly, a least squares regression is then performed on the subset of extracted latent variables that lead to a lower and biased variance of the transformed data. This process leads to a lower-variance estimate of the regression coefficients when compared to the Ordinary Least Squares approach. Classical Principal Component Analysis (PCA) and kernel ridge regression (KRR) techniques are well known...

10.1016/j.procs.2011.08.051 article EN Procedia Computer Science 2011-01-01
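
The two PLS steps described in this abstract can be sketched as follows. This is a minimal NIPALS-style PLS1 for a single response, written with NumPy as an illustration; it is not the paper's implementation, and the function name `pls1` is hypothetical.

```python
import numpy as np

def pls1(X, y, n_components):
    """Minimal PLS1 sketch: extract latent scores, then regress y on them."""
    Xk = np.asarray(X, dtype=float)
    Xk = Xk - Xk.mean(axis=0)          # center the inputs
    yc = np.asarray(y, dtype=float)
    yc = yc - yc.mean()                # center the response
    scores, weights = [], []
    for _ in range(n_components):
        w = Xk.T @ yc                  # weights proportional to covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                     # latent variable: linear combination of inputs
        p = Xk.T @ t / (t @ t)         # loading used to deflate X
        Xk = Xk - np.outer(t, p)       # deflation keeps later scores uncorrelated
        scores.append(t)
        weights.append(w)
    T = np.column_stack(scores)
    W = np.column_stack(weights)
    # Step two: ordinary least squares of y on the extracted latent variables.
    q = np.linalg.lstsq(T, yc, rcond=None)[0]
    return T, W, q
```

Using fewer components than input variables is what trades some bias for lower variance relative to ordinary least squares on the original inputs.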

Classification is an important topic in statistics and machine learning with great potential in many real applications. In this paper, we investigate two popular large margin classification methods, Support Vector Machine (SVM) and Distance Weighted Discrimination (DWD), under two contexts: the high-dimensional, low-sample size data and the imbalanced data. A unified family of machines, the FLexible Assortment MachinE (FLAME), is proposed, within which DWD and SVM are special cases. The FLAME helps to identify...

10.48550/arxiv.1310.3004 preprint EN other-oa arXiv (Cornell University) 2013-01-01

A novel linear classification method that possesses the merits of both the Support Vector Machine (SVM) and the Distance-weighted Discrimination (DWD) is proposed in this article. The method can be viewed as a hybrid of SVM and DWD that finds the direction by minimizing mainly the DWD loss, and determines the intercept term in the SVM manner. We show that our method inherits the merit of DWD, and hence, overcomes the data-piling and overfitting issue of SVM. On the other hand, the new method is not subject to the imbalanced data issue, which was the main advantage of SVM over DWD. It uses an unusual loss that combines the Hinge loss (of...

10.4310/sii.2015.v8.n3.a7 article EN Statistics and Its Interface 2015-01-01

The stability of statistical analysis is an important indicator for reproducibility, which is one main principle of the scientific method. It entails that similar conclusions can be reached based on independent samples from the same underlying population. In this article, we introduce a general measure of classification instability (CIS) to quantify the sampling variability of the prediction made by a classification method. Interestingly, the asymptotic CIS of any weighted nearest neighbor classifier turns out to be proportional to the Euclidean norm of its...

10.1080/01621459.2015.1089772 article EN Journal of the American Statistical Association 2015-10-02

The primary objectives of this paper are: 1.) to apply Statistical Learning Theory (SLT), specifically Partial Least Squares (PLS) and Kernelized PLS (K-PLS), to the universal "feature-rich/case-poor" (also known as "large p small n", or "high-dimension, low-sample size") microarray problem by eliminating those features (or probes) that do not contribute to the "best" chromosome bio-markers for lung cancer, and 2.) to quantitatively measure and verify (by an independent means) the efficacy of this process. A secondary...

10.1186/1752-0509-5-s3-s13 article EN BMC Systems Biology 2011-01-01

Load forecasting at distribution networks is more challenging than load forecasting at transmission networks because its pattern is stochastic and unpredictable. To plan sufficient resources and estimate DER hosting capacity, it is invaluable for a network planner to get the probabilistic distribution of the daily peak-load under a feeder over the long term. In this paper, we model the distribution functions using power law distributions, which are tested by an improved Kolmogorov-Smirnov test enhanced by a Monte Carlo simulation approach. In addition, uncertainty modeling...

10.1109/pesgm.2017.8274629 article EN 2017-07-01

In the ultrahigh dimensional setting, independence screening has been both theoretically and empirically proved to be a useful variable selection framework with low computation cost. In this work, we propose a two-step framework by using marginal information in a different perspective from independence screening. In particular, we retain significant variables rather than screening out irrelevant ones. The new method is shown to be model selection consistent in the linear regression model. To improve the finite sample performance, we then introduce a three-step version...

10.5705/ss.202015.0413 article EN Statistica Sinica 2017-07-31

In many real applications of statistical learning, a decision made from misclassification can be too costly to afford; in this case, a reject option, which defers the decision until further investigation is conducted, is often preferred. In recent years, there has been much development for binary classification with a reject option. Yet, little progress has been made for the multicategory case. In this article, we propose margin-based multicategory classification methods with a reject option. In addition, and more importantly, we introduce a new and unique refine option for the multicategory problem, where the class of an observation...

10.1080/01621459.2017.1282372 article EN Journal of the American Statistical Association 2017-01-20
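
As a point of reference for the reject option, the classical plug-in rule (often attributed to Chow) predicts the most probable class only when its estimated probability is at least one minus the rejection cost, assuming 0-1 misclassification loss. The sketch below illustrates that textbook rule, not the margin-based methods of the article; the function name and the use of `None` for "reject" are assumptions.

```python
def classify_with_reject(probs, reject_cost):
    """Plug-in reject rule under 0-1 loss with a fixed cost for rejecting.

    probs: estimated class probabilities for one observation.
    reject_cost: cost of deferring the decision (between 0 and 1).
    Returns the index of the predicted class, or None to reject.
    """
    best = max(range(len(probs)), key=lambda k: probs[k])
    # Rejecting costs reject_cost; predicting `best` has expected
    # loss 1 - probs[best], so reject when that loss is larger.
    if probs[best] < 1.0 - reject_cost:
        return None
    return best
```

A smaller rejection cost makes deferral cheaper, so more observations are left for further investigation.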

Classification is an important tool with many useful applications. Fisher's linear discriminant analysis (LDA) is a traditional model‐based classification method which makes use of the Gaussian distributional information. However, in the high‐dimensional, low‐sample‐size setting, LDA cannot be directly deployed because the sample covariance is not invertible. While there are modern methods designed for high‐dimensional data, they may not fully use the Gaussian distributional information as LDA does. Hence in some situations, it is still desirable to...

10.1002/sam.11367 article EN publisher-specific-oa Statistical Analysis and Data Mining The ASA Data Science Journal 2017-12-06

Breast cancer screening refers to the screening of asymptomatic, generally healthy women for breast cancer, to identify those who should receive a follow-up check. Early screening can detect non-invasive ductal carcinoma in situ (called "pre cancer"), which almost never forms a lump and is not detectable, except by mammography. This paper will describe the design and preliminary evaluation of this PNN/GRNN ensemble pre-screener, in the context of a possible pre-screening protocol, which may, if required, include other data. The results...

10.1016/j.procs.2012.09.101 article EN Procedia Computer Science 2012-01-01

Circadian cues in children (sunlight, exercise, diet patterns) may be associated with health outcomes. The primary objective was to assess associations of daily cortisol fluctuations (morning, night) with cardiovascular health outcomes. A secondary objective was to determine if 1-year longitudinal changes in circadian cortisol levels are associated with health outcomes. The Cardiovascular Health Intervention Program (CHIP) was a cross-sectional and longitudinal study of cardiovascular risk profiles in public elementary school children in Southern Maine. Participants were 689 students in 4th grade (baseline; age =...

10.1016/j.psyneuen.2021.105252 article EN cc-by-nc-nd Psychoneuroendocrinology 2021-05-11

10.17615/k54y-bf03 article EN Carolina Digital Repository (University of North Carolina at Chapel Hill) 2010-01-01

The individualized treatment recommendation (ITR) is an important analytic framework for precision medicine. The goal of ITR is to assign the best treatments to patients based on their individual characteristics. From the machine learning perspective, the solution to the ITR problem can be formulated as a weighted classification problem to maximize the mean benefit from the recommended treatments given patients' characteristics. Several ITR methods have been proposed in both the binary setting and the multicategory setting. In practice, one may prefer a more flexible recommendation that...

10.48550/arxiv.2004.02772 preprint EN other-oa arXiv (Cornell University) 2020-01-01

We address the challenge of effective exploration while maintaining good performance in policy gradient methods. As a solution, we propose diverse exploration (DE) via conjugate policies. DE learns and deploys a set of conjugate policies which can be conveniently generated as a byproduct of conjugate gradient descent. We provide both theoretical and empirical results showing the effectiveness of DE at achieving exploration, improving policy performance, and the advantage of DE over exploration by random policy perturbations.

10.1609/aaai.v33i01.33013404 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Set classification problems arise when classification tasks are based on sets of observations as opposed to individual observations. In set classification, a classification rule is trained with N sets of observations, where each set is labeled with class information, and the prediction of a class label is performed also with a set of observations. Data sets for set classification appear, for example, in diagnostics of disease based on multiple cell nucleus images from a single tissue. Relevant statistical models for set classification are introduced, which motivate a set classification framework based on context-free feature extraction. By understanding a set of observations as an empirical...

10.1111/biom.12164 article EN Biometrics 2014-03-03

Blacklegged ticks, Ixodes scapularis Say (Acari: Ixodidae), are the primary vectors of Lyme disease in the U.S.A. In this study, adult ticks were observed on public trails exhibiting increasing levels of terrain complexity with a potential host nearby. The goal of this study was to (a) examine the extent to which ticks may actively search (vs. sit‐and‐wait) for a nearby host, (b) determine whether or not ticks could locate the position of a host in natural conditions and (c) determine the role of terrain complexity in the distances travelled in a short period of time (30 min). Results...

10.1111/mve.12440 article EN Medical and Veterinary Entomology 2020-03-30

Ordinal data are often seen in real applications. Regular multicategory classification methods are not designed for this type of data and a more proper treatment is needed. We consider a framework of ordinal classification which pools the results from binary classifiers together. An inherent difficulty of this framework is that the class prediction can be ambiguous due to boundary crossing. To fix this issue, we propose a noncrossing ordinal classification method which materializes the framework by imposing noncrossing constraints. An asymptotic study of the proposed method is conducted. We show with simulated examples that the proposed method can improve...

10.4310/sii.2017.v10.n2.a3 article EN Statistics and Its Interface 2016-10-31
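
The pooling step and the ambiguity caused by boundary crossing can be illustrated with a small sketch. It assumes K-1 binary classifiers, where the k-th one returns a positive value when it judges the class to be above k, and it pools them by counting positive votes. That counting rule is one common choice and is an assumption here, as is the function name; the article's own pooling rule and noncrossing constraints are not reproduced.

```python
def pool_ordinal(decisions):
    """Pool K-1 binary decisions into one ordinal label in 1..K.

    decisions[k] > 0 means the (k+1)-th binary classifier says "class > k+1".
    Returns (label, crossing), where crossing flags inconsistent votes.
    """
    signs = [d > 0 for d in decisions]
    # Counting rule: each positive vote moves the label up by one.
    label = 1 + sum(signs)
    # Crossing: a classifier says "not above k" while a later one
    # says "above k+1", so the votes are not monotone in k.
    crossing = any((not a) and b for a, b in zip(signs, signs[1:]))
    return label, crossing
```

For example, `pool_ordinal([0.5, -0.1, 0.4])` flags a crossing because the second classifier votes "not above 2" while the third votes "above 3", so the pooled label is ambiguous.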