Alexander Y. Shestopaloff

ORCID: 0000-0003-2228-5708
Research Areas
  • Complex Network Analysis Techniques
  • Complex Systems and Time Series Analysis
  • Markov Chains and Monte Carlo Methods
  • Target Tracking and Data Fusion in Sensor Networks
  • Stock Market Forecasting Methods
  • Advanced Clustering Algorithms Research
  • Bayesian Methods and Mixture Models
  • Machine Learning in Healthcare
  • Blockchain Technology Applications and Security
  • Autopsy Techniques and Outcomes
  • Bayesian Modeling and Causal Inference
  • Time Series Analysis and Forecasting
  • Data-Driven Disease Surveillance
  • Statistical Methods and Bayesian Inference
  • Financial Risk and Volatility Modeling
  • Sparse and Compressive Sensing Techniques
  • Topological and Geometric Data Analysis
  • Bioinformatics and Genomic Networks
  • Graph theory and applications
  • Migration, Health and Trauma
  • Neural Networks and Applications
  • Advanced Text Analysis Techniques
  • Advanced Graph Neural Networks
  • Fault Detection and Control Systems
  • Statistical Methods and Inference

Queen Mary University of London
2020-2025

Memorial University of Newfoundland
2025

The Alan Turing Institute
2018-2023

University of Oxford
2023

University of Toronto
2013-2017

Reduced-rank (RR) regression may be interpreted as a dimensionality reduction technique able to reveal complex relationships among the data parsimoniously. However, RR regression models typically overlook any potential group structure among the responses by assuming a low-rank structure on the coefficient matrix. To address this limitation, a Bayesian Partial RR (BPRR) regression is exploited, where the response vector and the coefficient matrix are partitioned into low- and full-rank sub-groups. As opposed to the literature, which assumes known rank, a novel strategy...

10.1080/10618600.2024.2446357 article EN cc-by-nc-nd Journal of Computational and Graphical Statistics 2025-01-03

We tackle the challenges of modeling high-dimensional data sets, particularly those with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships. Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression. Our approach introduces PCA and Soft PCA layers, which can be embedded at any stage of a neural network architecture, allowing the model to alternate between factor modeling and non-linear transformations. This flexibility makes our...

10.48550/arxiv.2502.11310 preprint EN arXiv (Cornell University) 2025-02-16

Verbal autopsies (VA) are increasingly used in low- and middle-income countries where most causes of death (COD) occur at home without medical attention, and home deaths differ substantially from hospital deaths. Hence, there is no plausible "standard" against which VAs for home deaths may be validated. Previous studies have shown contradictory performance of automated methods compared to physician-based classification of CODs. We sought to compare the performance of the classic naive Bayes classifier (NBC) versus existing classifiers,...

10.1186/s12916-015-0521-2 article EN cc-by BMC Medicine 2015-11-25

Not all graphs are clusterable. Not all graphs have a clustered structure and can be meaningfully summarized through vertex clustering. Clusterable graphs are characterized by pockets of densely connected vertices that are only sparsely connected to the remaining graph. In this article, we re-introduce a very simple and intuitive, yet highly informative, statistical hypothesis test for graph clusterability that is based on neighborhood samples. The goal of the test is to determine if a graph meets the necessary structural conditions of clusters. Our clusterable...

10.1007/s41060-023-00389-6 article EN cc-by International Journal of Data Science and Analytics 2023-04-16

We introduce graph clustering quality measures based on comparisons of global, intra- and inter-cluster densities, an accompanying statistical significance test and a step-by-step routine for clustering quality assessment. Our work is centred on the idea that well-clustered graphs will display a mean intra-cluster density higher than the global density. We do not rely on any generative model for the null graph. Our measures are shown to meet the axioms of a good clustering quality function. They have an intuitive graph-theoretic interpretation, a formal interpretation, and can...

10.1093/comnet/cnaa012 article EN Journal of Complex Networks 2020-03-17

In multivariate time series systems, key insights can be obtained by discovering lead-lag relationships inherent in the data, which refer to the dependence between two time series shifted in time relative to one another, and which can be leveraged for the purposes of control, forecasting or clustering. We develop a clustering-driven methodology for the robust detection of lead-lag relationships in lagged multi-factor models. Within our framework, the envisioned pipeline takes as input a set of time series, and creates an enlarged universe of extracted subsequence time series from each input time series, using a sliding...

10.2139/ssrn.4445975 article EN SSRN Electronic Journal 2023-01-01

inference problem that has no straightforward solution. We take a Bayesian approach to the inference of unknown parameters of a non-linear state model; this, in turn, requires the availability of efficient Markov Chain Monte Carlo (MCMC) sampling methods for the latent (hidden) variables and model parameters. Using the ensemble technique of Neal (2010) and the embedded HMM technique of Neal (2003), we introduce a new Markov Chain Monte Carlo method for non-linear state space models. The key idea is to perform parameter updates conditional on an enormously large ensemble of latent sequences, as opposed to a single sequence,...

10.14288/1.0043899 preprint EN arXiv (Cornell University) 2013-05-02

We propose a new scheme for selecting pool states for the embedded Hidden Markov Model (HMM) Markov Chain Monte Carlo (MCMC) method. This new scheme allows the embedded HMM method to be used for efficient sampling in state space models where the state can be high-dimensional. Previously, embedded HMM methods were only applicable to low-dimensional state-space models. We demonstrate that using our proposed pool state selection scheme, an embedded HMM sampler can have similar performance to a well-tuned sampler that uses a combination of Particle Gibbs with Backward Sampling (PGBS) and Metropolis updates. The...

10.1214/17-ba1077 article EN Bayesian Analysis 2017-10-23

We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream. The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior precision matrix, which gives a cost per step that is linear in the number of model parameters. In contrast to methods based on stochastic variational inference, our method is fully deterministic, and does not require step-size tuning. We show...

10.48550/arxiv.2305.19535 preprint EN cc-by arXiv (Cornell University) 2023-01-01

In the light of micro-scale inefficiencies due to the highly fragmented bitcoin trading landscape, we use a granular data set comprising orderbook and trades data from the most liquid bitcoin markets, to understand the price formation process at sub-1-second time scales. To this end, we construct a set of features that encapsulate relevant microstructural information over short lookback windows. These features are subsequently leveraged, first to generate a leader–lagger network that quantifies how markets impact one another, and then to train linear...

10.1080/1350486x.2022.2080083 article EN cc-by-nc-nd Applied Mathematical Finance 2021-09-03

Measuring graph clustering quality remains an open problem. To address it, we introduce quality measures based on comparisons of intra- and inter-cluster densities, an accompanying statistical test of the significance of their differences and a step-by-step routine for clustering quality assessment. Our null hypothesis does not rely on any generative model for the graph, unlike modularity, which uses the configuration model as a null model. Our measures are shown to meet the axioms of a good clustering quality function, unlike the very commonly used modularity measure. They also have an intuitive graph-theoretic...

10.48550/arxiv.1906.02366 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Background: Verbal autopsies (VA) are increasingly used in low- and middle-income countries where most causes of death (COD) occur at home without medical attention, and home deaths differ substantially from hospital deaths. Hence, there is no plausible “standard” against which VAs for home deaths may be validated. Previous studies have shown contradictory performance of automated methods compared to physician-based classification of CODs. We sought to compare the performance of the classic naive Bayes classifier (NBC) versus existing...

10.32920/14639652.v1 preprint EN cc-by 2021-05-21

We present the findings of a large-scale live trading experiment involving the placement of millions of market orders sent at high frequency on two cryptocurrency exchanges, Bybit and Binance. We analyze the execution outcomes of these orders in comparison to the expected outcome based on the most recent snapshot of the Limit Order Book (LOB) at the time of order submission for two modes: one using market orders and the second using marketable limit orders aiming for the best price. Discrepancies between the expected and actual outcomes are due to intermittent LOB updates during the time span resulting from delays at the exchange,...

10.2139/ssrn.4677989 article EN SSRN Electronic Journal 2024-01-01

We derive a novel, provably robust, and closed-form Bayesian update rule for online filtering in state-space models in the presence of outliers and misspecified measurement models. Our method combines generalised Bayesian inference with filtering methods such as the extended and ensemble Kalman filter. We use the former to show robustness and the latter to ensure computational efficiency in the case of nonlinear models. Our method matches or outperforms other robust filtering methods (such as those based on variational Bayes) at a much lower computational cost. We show this empirically on a range of filtering problems with outlier...

10.48550/arxiv.2405.05646 preprint EN arXiv (Cornell University) 2024-05-09

Reduced-rank (RR) regression may be interpreted as a dimensionality reduction technique able to reveal complex relationships among the data parsimoniously. However, RR regression models typically overlook any potential group structure among the responses by assuming a low-rank structure on the coefficient matrix. To address this limitation, a Bayesian Partial RR (BPRR) regression is exploited, where the response vector and the coefficient matrix are partitioned into low- and full-rank sub-groups. As opposed to the literature, which assumes known rank, a novel strategy...

10.48550/arxiv.2406.17444 preprint EN arXiv (Cornell University) 2024-06-25

We demonstrate the use of Conditional Variational Encoder (CVAE) to improve the forecasts of daily stock volume time series in both short and long term forecasting tasks, with the use of advanced information of input variables such as rebalancing dates. CVAE generates non-linear time series as out-of-sample forecasts, which have better accuracy and a closer fit of correlation to the actual data, compared to traditional linear models. These generative forecasts can also be used for scenario generation, which aids interpretation. We further discuss correlations...

10.48550/arxiv.2406.19414 preprint EN arXiv (Cornell University) 2024-06-19

We employ a Bayesian modelling technique for high dimensional cointegration estimation to construct low volatility portfolios from a large number of stocks. The proposed Bayesian framework effectively identifies sparse and important cointegration relationships amongst large baskets of stocks across various asset spaces, resulting in portfolios with reduced volatility. Such cointegration relationships persist well over the out-of-sample testing time, providing practical benefits in portfolio construction and optimization. Further studies on drawdown minimization also...

10.48550/arxiv.2407.10175 preprint EN arXiv (Cornell University) 2024-07-14