Xuelei Sherry Ni

ORCID: 0000-0003-0634-0025
Research Areas
  • Financial Distress and Bankruptcy Prediction
  • Imbalanced Data Classification Techniques
  • FinTech, Crowdfunding, Digital Finance
  • Statistical Methods and Inference
  • Neural Networks and Applications
  • Face and Expression Recognition
  • Machine Learning and ELM
  • Advanced Statistical Methods and Models
  • Sparse and Compressive Sensing Techniques
  • Bayesian Methods and Mixture Models
  • Fault Detection and Control Systems
  • Advanced biosensing and bioanalysis techniques
  • Microfinance and Financial Inclusion
  • Data Mining Algorithms and Applications
  • RNA Interference and Gene Delivery
  • Evaluation Methods in Various Fields
  • Stock Market Forecasting Methods
  • Nuclear Physics and Applications
  • Time Series Analysis and Forecasting
  • Private Equity and Venture Capital
  • Domain Adaptation and Few-Shot Learning
  • Metaheuristic Optimization Algorithms Research
  • Industrial Technology and Control Systems
  • Banking stability, regulation, efficiency
  • Advanced Decision-Making Techniques

Kennesaw State University
2011-2025

SAIC Motor (China)
2010

Georgia Institute of Technology
2007

This paper aims to explore models based on the extreme gradient boosting (XGBoost) approach for business risk classification. Feature selection (FS) algorithms and hyper-parameter optimizations are simultaneously considered during model training. The five most commonly used FS methods, including weight by Gini, Chi-square, hierarchical variable clustering, correlation, and information, are applied to alleviate the effect of redundant features. Two optimization approaches, random search (RS) and Bayesian...

10.5121/ijdms.2019.11101 article EN International Journal of Database Management Systems 2019-02-28

In the peer-to-peer (P2P) lending market, current studies focus on two categories of approaches to evaluate loans, thus providing investment suggestions to investors: credit scoring (i.e., predicting risk) and profit scoring (i.e., predicting profitability). However, relying on a single approach may bias the loan evaluation conclusion. In this paper, we propose a bivariate model based on the integration of these approaches. We first formulate the task as a multi-target problem, in which loan_status (i.e., default or not default) is used as the discrete outcome for...

10.3390/risks13020033 article EN cc-by Risks 2025-02-12

Recent results in homotopy and solution paths demonstrate that certain well-designed greedy algorithms, with a range of values of the algorithmic parameter, can provide solution paths to a sequence of convex optimization problems. On the other hand, in regression, many existing criteria for subset selection (including Cp, AIC, BIC, MDL, RIC, etc.) involve optimizing an objective function that contains a counting measure. The two problems are formulated as (P1) and (P0) in the present paper. The latter is generally combinatoric and has been proven to be...

10.1214/009053606000001334 article EN The Annals of Statistics 2007-04-01

In this paper the authors explore the implications and pedagogical potential of game jams. While jams have been noted for enhancing teamwork, communication, project management skill-sets, in educational benefits Game Jams. Through an analysis of the academic results of participants at a jam site as compared to a control group of non-jam participants, the authors conclude that participation appears to have a positive impact on students' performance in many computing classes. The analysis includes a detailed comparison of over 70,000 student measures...

10.1145/3241815.3241862 article EN 2018-09-14

Cell-penetrating peptides (CPPs) are capable of transporting molecules to which they are tethered across cellular membranes. Unsurprisingly, CPPs have attracted attention for their potential drug delivery applications, but several technical hurdles remain to be overcome. Chief among them is the so-called ‘endosomal escape problem,’ i.e. the propensity of CPP-cargo to be endocytosed and entrapped in endosomes rather than reaching the cytosol. Previously, a CPP fused to calmodulin that bound binding site-containing cargos...

10.1371/journal.pone.0254468 article EN cc-by PLoS ONE 2021-09-02

In the peer-to-peer (P2P) lending platform, investors hope to maximize their return while minimizing risk through a comprehensive understanding of the P2P market. A low and stable average default rate across all borrowers denotes a healthy market and provides more confidence in a promising investment. Therefore, having a powerful model to describe the trend is crucial. Different from previous studies that focus on modeling at the individual level, in this paper, we are the first to comprehensively explore the monthly aggregative level...

10.1145/3374135.3385287 preprint EN 2020-04-02

In the peer-to-peer (P2P) lending market, lenders lend money to borrowers through a virtual platform and earn possible profit generated by the interest rate. From the perspective of lenders, they want to maximize profit while minimizing risk. Therefore, many studies have used machine learning algorithms to help identify the "best" loans for making investments. These studies have mainly focused on two categories to guide lenders' investments: one aims at minimizing the risk of investment (i.e., the credit scoring perspective) and the other at maximizing profit (i.e., the profit scoring perspective)....

10.1145/3374135.3385272 preprint EN 2020-04-02

We propose a two-stage hybrid approach with neural networks as the new feature construction algorithms for bankcard response classifications. The model uses a very simple neural network structure as the tool in the first stage, then the newly created features are used as the additional input variables in logistic regression in the second stage. The model is compared with the traditional one-stage model in credit customer classification. It is observed that the proposed model outperforms the one-stage model in terms of accuracy, area under ROC curve, and KS...

10.5121/ijdkp.2018.8601 article EN International Journal of Data Mining & Knowledge Management Process 2018-11-30
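
As a rough illustration of the KS statistic named in the abstract above, the sketch below computes it as the largest gap between the empirical score distributions of responders and non-responders. This is the common credit-scoring definition, assumed here rather than taken from the paper; the function name `ks_statistic` and the sample scores are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(scores, labels):
    """Largest absolute gap between the empirical CDFs of the scores
    of the positive class (label 1) and the negative class (label 0)."""
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")

    best = 0.0
    # The gap can only change at an observed score, so check each one.
    for threshold in sorted(set(scores)):
        cdf_pos = bisect_right(pos, threshold) / len(pos)
        cdf_neg = bisect_right(neg, threshold) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Hypothetical model scores: higher means "more likely to respond".
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]
example_ks = ks_statistic(example_scores, example_labels)
```

A value near 1 means the scores separate the two classes almost perfectly; a value near 0 means the two score distributions are nearly indistinguishable.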

This paper aims to explore models based on the extreme gradient boosting (XGBoost) approach for business risk classification. Feature selection (FS) algorithms and hyper-parameter optimizations are simultaneously considered during model training. The five most commonly used FS methods, including weight by Gini, Chi-square, hierarchical variable clustering, correlation, and information, are applied to alleviate the effect of redundant features. Two optimization approaches, random search (RS) and Bayesian...

10.48550/arxiv.1901.08433 preprint EN other-oa arXiv (Cornell University) 2019-01-01

We aim at developing and improving the imbalanced business risk modeling via jointly using proper evaluation criteria, resampling, cross-validation, classifier regularization, and ensembling techniques. Area Under the Receiver Operating Characteristic Curve (AUC of ROC) is used for model comparison based on 10-fold cross validation. Two undersampling strategies including random undersampling (RUS) and cluster centroid undersampling (CCUS), as well as two oversampling methods including random oversampling (ROS) and Synthetic Minority Oversampling Technique (SMOTE),...

10.5121/ijmit.2019.11101 article EN International Journal of Managing Information Technology 2019-02-28

We aim at developing and improving the imbalanced business risk modeling via jointly using proper evaluation criteria, resampling, cross-validation, classifier regularization, and ensembling techniques. Area Under the Receiver Operating Characteristic Curve (AUC of ROC) is used for model comparison based on 10-fold cross validation. Two undersampling strategies including random undersampling (RUS) and cluster centroid undersampling (CCUS), as well as two oversampling methods including random oversampling (ROS) and Synthetic Minority Oversampling Technique (SMOTE),...

10.48550/arxiv.1903.05535 preprint EN other-oa arXiv (Cornell University) 2019-01-01

In bankruptcy prediction, the proportion of events is very low, which is often oversampled to eliminate this bias. In this paper, we study the influence of the event rate on the discrimination abilities of bankruptcy prediction models. First, the statistical association and significance of public records and firmographics indicators with bankruptcy were explored. Then the event rate was oversampled from 0.12% to 10%, 20%, 30%, 40%, and 50%, respectively. Seven models were developed, including Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, Support Vector Machine,...

10.5121/ijdms.2018.10101 article EN International Journal of Database Management Systems 2018-02-28
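
The oversampling step described in the abstract above can be sketched as follows. This is a minimal illustration that assumes plain random oversampling with replacement, which the truncated abstract does not confirm; the function name `oversample_to_rate` and the synthetic label vector are hypothetical.

```python
import random


def oversample_to_rate(labels, target_rate, seed=0):
    """Return row indices of a resampled dataset whose event rate is
    approximately target_rate. Every original row is kept once; event
    rows (label 1) are then duplicated at random, with replacement."""
    if not 0 < target_rate < 1:
        raise ValueError("target_rate must be strictly between 0 and 1")
    events = [i for i, y in enumerate(labels) if y == 1]
    non_events = [i for i, y in enumerate(labels) if y == 0]
    if not events:
        raise ValueError("no events to oversample")

    # Solve n_events / (n_events + n_non_events) = target_rate for n_events.
    needed = round(target_rate * len(non_events) / (1 - target_rate))
    rng = random.Random(seed)
    # If the data already meets the target, no duplicates are added.
    extra = [rng.choice(events) for _ in range(max(0, needed - len(events)))]
    return non_events + events + extra


# Synthetic labels with a 0.12% event rate (12 events in 10,000 rows).
labels = [1] * 12 + [0] * 9988
for rate in (0.10, 0.20, 0.30, 0.40, 0.50):
    idx = oversample_to_rate(labels, rate)
    achieved = sum(labels[i] for i in idx) / len(idx)
    print(f"target {rate:.0%}: {len(idx)} rows, event rate {achieved:.2%}")
```

Resampling of this kind is normally applied to the training split only, so that evaluation still reflects the original event rate.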

Data mining techniques have numerous applications in bankcard response modeling. Logistic regression has been used as the standard modeling tool in the financial industry because of its almost always desirable performance and interpretability. In this paper, we propose a hybrid model, which integrates decision tree based chi-square automatic interaction detection (CHAID) into logistic regression. In the first stage, CHAID analysis is used to detect possibly potential variable interactions. Then in the second stage,...

10.1109/icsai.2018.8599369 article EN 2018-11-01

We aim at developing and improving the imbalanced business risk modeling via jointly using proper evaluation criteria, resampling, cross-validation, classifier regularization, and ensembling techniques. Area Under the Receiver Operating Characteristic Curve (AUC of ROC) is used for model comparison based on 10-fold cross validation. Two undersampling strategies including random undersampling (RUS) and cluster centroid undersampling (CCUS), as well as two oversampling methods including random oversampling (ROS) and Synthetic Minority Oversampling Technique (SMOTE),...

10.2139/ssrn.3415356 article EN SSRN Electronic Journal 2019-01-01

10.5281/zenodo.2583550 article EN cc-by Zenodo (CERN European Organization for Nuclear Research) 2019-03-05

Many business operations and strategies rely on bankruptcy prediction. In this paper, we aim to study the impacts of public records and firmographics to predict bankruptcy in a 12-month-ahead period, using different classification models and adding value to traditionally used financial ratios. Univariate analysis shows the statistical association and significance of these indicators with bankruptcy. Further, seven machine learning methods were developed, including Logistic Regression, Decision Tree, Random Forest, Gradient Boosting,...

10.5121/csit.2018.80309 article EN 2018-02-17

We propose a two-stage hybrid approach with neural networks as the new feature construction algorithms for bankcard response classifications. The model uses a very simple neural network structure as the tool in the first stage, then the newly created features are used as the additional input variables in logistic regression in the second stage. The model is compared with the traditional one-stage model in credit customer classification. It is observed that the proposed model outperforms the one-stage model in terms of accuracy, area under ROC curve, and KS statistic. By creating technique,...

10.48550/arxiv.1812.02546 preprint EN other-oa arXiv (Cornell University) 2018-01-01

The objective of this study is to develop a good risk model for classifying business delinquency by simultaneously exploring several machine learning based methods including regularization, hyper-parameter optimization, and ensembling algorithms. The rationale under the analyses is firstly to obtain base binary classifiers (including Logistic Regression (LR), K-Nearest Neighbors (KNN), Decision Tree (DT), and Artificial Neural Networks (ANN)) via regularization and appropriate settings of hyper-parameters. Then...

10.1145/3299815.3314478 preprint EN 2019-04-18