- Advanced Graph Neural Networks
- Statistical Methods and Inference
- Neural Networks and Applications
- Domain Adaptation and Few-Shot Learning
- Face and Expression Recognition
- Advanced Statistical Methods and Models
- Complex Network Analysis Techniques
- Model Reduction and Neural Networks
- Bayesian Methods and Mixture Models
- Machine Learning and ELM
- Sparse and Compressive Sensing Techniques
- Tensor decomposition and applications
- Stellar, planetary, and galactic studies
- Bioinformatics and Genomic Networks
- Machine Learning and Algorithms
- Topic Modeling
- Gaussian Processes and Bayesian Inference
- Image Retrieval and Classification Techniques
- Astronomy and Astrophysical Research
- Stochastic Gradient Optimization Techniques
- Forecasting Techniques and Applications
- Advanced Causal Inference Techniques
- Magnetic confinement fusion research
- Neural Networks and Reservoir Computing
- Machine Learning in Materials Science
The Institute of Statistical Mathematics
2022-2025
RIKEN Center for Advanced Intelligence Project
2018-2024
Kyoto University
2018-2020
Osaka University
2016
We propose a systematic approach for decomposing numerical turbulence fields with both low and high degrees of freedom, extending beyond the conventional division into zonal flow and turbulence. Specifically, we utilize Fourier expansion to decompose the turbulence into several substructures where the phase of the kinetic energy density aligns positively or negatively with the zonal flow's poloidal velocity, enabling the separation of the turbulence expected to be absorbed into the zonal flow. The proposed methods were successfully applied to simulation datasets generated,...
R-process enhanced stars with [Eu/Fe] ≥ +0.7 (so-called r-II stars) are believed to have formed in an extremely neutron-rich environment in which a rare astrophysical event (e.g., a neutron-star merger) occurred. This scenario is supported by the existence of an ultra-faint dwarf galaxy, Reticulum II, where most of the stars are highly enhanced in r-process elements. In this scenario, some small fraction of dwarf galaxies around the Milky Way were r-enhanced. When each r-enhanced dwarf galaxy accreted to the Milky Way, it deposited many r-II stars in the Galactic halo with similar...
A simple framework, Probabilistic Multi-view Graph Embedding (PMvGE), is proposed for multi-view feature learning with many-to-many associations so that it generalizes various existing multi-view methods. PMvGE is a probabilistic model for predicting new associations via graph embedding of the nodes of data vectors with links of their associations. Multi-view data vectors with many-to-many associations are transformed by neural networks to feature vectors in a shared space, and the probability of a new association between two data vectors is modeled by the inner product of their feature vectors. While existing multi-view feature learning techniques can treat only either many-to-many associations or non-linear...
We propose weighted inner product similarity (WIPS) for neural network-based graph embedding. In addition to the parameters of the neural networks, we optimize the weights of the inner product by allowing positive and negative values. Despite its simplicity, WIPS can approximate arbitrary general similarities including positive definite, conditionally positive definite, and indefinite kernels. WIPS is free from similarity model selection, since it can learn any similarity models such as cosine similarity, negative Poincaré distance and negative Wasserstein distance. Our experiments show that the proposed method...
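As a minimal sketch of the weighted inner product described above, the following computes the sum over k of λ_k u_k v_k for two already-embedded feature vectors. The function name `wips`, the example vectors, and the fixed weights are assumptions for illustration only; in the abstract the weights are optimized together with the neural networks, which this sketch does not attempt.

```python
def wips(u, v, weights):
    """Weighted inner product: sum_k weights[k] * u[k] * v[k].

    The weights may be positive or negative (illustrative helper,
    not the authors' implementation).
    """
    if not (len(u) == len(v) == len(weights)):
        raise ValueError("u, v, and weights must have the same length")
    return sum(w * a * b for w, a, b in zip(weights, u, v))


# Hypothetical feature vectors, standing in for neural network outputs.
u = [1.0, 2.0, 3.0]
v = [0.5, -1.0, 2.0]

# All weights equal to one recovers the ordinary inner product.
ordinary = wips(u, v, [1.0, 1.0, 1.0])

# A negative weight makes the form indefinite: the "self-similarity"
# of u with itself can then be negative.
mixed = wips(u, v, [1.0, 1.0, -1.0])
self_similarity = wips(u, u, [1.0, 1.0, -1.0])
```

With every weight fixed to one the function reduces to the ordinary inner product, which can only represent positive definite similarities; allowing negative weights is what lets the form become indefinite.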
A large number of images are available on online photo-sharing services along with rich meta-data, including tags, groups, and locations. For associating two domains of different modalities, e.g. images and tags, Canonical Correlation Analysis (CCA) and its extended methods are used widely. We employ a more flexible graph embedding method called Cross-Domain Matching Correlation Analysis (CDMCA), which can deal with many-to-many relationships between any number of domains, for associating images, tags, and groups. Experiments on Tag-to-Image and Image-to-Tag retrieval tasks...
We propose $β$-graph embedding for robustly learning feature vectors from data vectors and noisy link weights. A newly introduced empirical moment $β$-score reduces the influence of contamination and measures the difference between the underlying correct expected weights of links and the specified generative model. The proposed method is computationally tractable; we employ a minibatch-based efficient stochastic algorithm and prove that this algorithm locally minimizes the empirical moment $β$-score. We conduct numerical experiments on synthetic and real-world datasets.
This study proposes an interpretable neural network-based non-proportional odds model (N$^3$POM) for ordinal regression. N$^3$POM is different from conventional approaches to ordinal regression with non-proportional models in several ways: (1) N$^3$POM is defined for both continuous and discrete responses, whereas standard methods typically treat the ordered continuous variables as if they are discrete, (2) instead of estimating response-dependent finite-dimensional coefficients of linear models from discrete responses as is done in conventional approaches, we train a non-linear neural network...
This study delves into the domain of dynamical systems, specifically the forecasting of dynamical time series defined through an evolution function. Traditional approaches in this area predict the future behavior of dynamical systems by inferring the evolution function. However, these methods may confront obstacles due to the presence of missing variables, which are usually attributed to challenges in measurement and a partial understanding of the system of interest. To overcome this obstacle, we introduce the autoregressive with slack time series (ARS) model, that simultaneously estimates...
This paper presents an integrated perspective on robustness in regression. Specifically, we examine the relationship between traditional outlier-resistant robust estimation and robust optimization, which focuses on parameter estimation resistant to imaginary dataset-perturbations. While both are commonly regarded as robust methods, these concepts demonstrate a bias-variance trade-off, indicating that they follow roughly converse strategies.
We study a minimax risk of estimating inverse functions on a plane, while keeping the estimator itself invertible. Learning invertibility from data and exploiting an invertible estimator are used in many domains, such as statistics, econometrics, and machine learning. Although the consistency and universality of invertible estimators have been well investigated, analysis of the efficiency of these methods is still under development. In this study, we study a minimax risk for estimating invertible bi-Lipschitz functions on a square in a 2-dimensional plane. We first introduce two types of L2-risks to evaluate...
This study proposes an interpretable neural network-based nonproportional odds model (N3POM) for ordinal regression. N3POM is different from conventional approaches to ordinal regression with nonproportional models in several ways: (a) N3POM is defined for both continuous and discrete responses, whereas standard methods typically treat the continuous variables as if they were discrete, (b) instead of estimating response-dependent finite-dimensional coefficients of linear models from discrete responses as is done in conventional approaches, we train a nonlinear neural network to serve...
We propose shifted inner-product similarity (SIPS), which is a novel yet very simple extension of the ordinary inner-product similarity (IPS) for neural-network based graph embedding (GE). In contrast to IPS, which is limited to approximating positive-definite (PD) similarities, SIPS goes beyond the limitation by introducing bias terms in IPS; we theoretically prove that SIPS is capable of approximating not only PD but also conditionally PD (CPD) similarities with many examples such as cosine similarity, negative Poincaré distance and negative Wasserstein distance....
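To make the idea of adding bias terms to an inner product concrete, the sketch below assumes a shifted form of inner product plus one scalar bias per node. The function names `dot` and `sips`, the hand-picked bias values, and the choice of the negative half squared Euclidean distance as the target similarity are all assumptions for illustration; they are not taken from the abstract, where the feature vectors and bias terms would be learned by neural networks.

```python
def dot(a, b):
    """Ordinary inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sips(fx, fy, ux, uy):
    """Assumed shifted inner product: <fx, fy> + ux + uy,
    where ux and uy are scalar bias terms attached to each node."""
    return dot(fx, fy) + ux + uy


# Hypothetical feature vectors.
x = [1.0, 2.0]
y = [3.0, 0.0]

# Hand-picked biases u(z) = -||z||^2 / 2. With these, the shifted
# inner product equals -||x - y||^2 / 2, a similarity that a plain
# inner product of the same features does not give.
ux = -0.5 * dot(x, x)
uy = -0.5 * dot(y, y)
shifted = sips(x, y, ux, uy)

neg_half_sq_dist = -0.5 * sum((a - b) ** 2 for a, b in zip(x, y))
```

The identity used here is the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2<x, y>, so the bias terms absorb the squared norms.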
$R$-process enhanced stars with [Eu/Fe]$\geq+0.7$ (so-called $r$-II stars) are believed to have formed in an extremely neutron-rich environment in which a rare astrophysical event (e.g., a neutron star merger) occurred. This scenario is supported by the existence of an ultra-faint dwarf galaxy, Reticulum II, where most of the stars are highly enhanced in $r$-process elements. In this scenario, some small fraction of dwarf galaxies around the Milky Way were $r$-enhanced. When each $r$-enhanced dwarf galaxy accreted to the Milky Way, it deposited many $r$-II stars in the Galactic...
We propose $\textit{weighted inner product similarity}$ (WIPS) for neural network-based graph embedding. In addition to the parameters of the neural networks, we optimize the weights of the inner product by allowing positive and negative values. Despite its simplicity, WIPS can approximate arbitrary general similarities including positive definite, conditionally positive definite, and indefinite kernels. WIPS is free from similarity model selection, since it can learn any similarity models such as cosine similarity, negative Poincaré distance and negative Wasserstein distance. Our experiments show...
This paper discusses the estimation of the generalization gap, the difference between generalization performance and training performance, for overparameterized models including neural networks. We first show that a functional variance, a key concept in defining a widely-applicable information criterion, characterizes the generalization gap even in overparameterized settings where a conventional theory cannot be applied. As the computational cost of the functional variance is expensive for the overparameterized models, we propose an efficient approximation of the functional variance, the Langevin approximation of the functional variance (Langevin FV). This method leverages...
Density power divergence (DPD) is designed to robustly estimate the underlying distribution of observations, in the presence of outliers. However, DPD involves an integral of the power of the parametric density models to be estimated; the explicit form of the integral term can be derived only for specific densities, such as normal and exponential densities. While we may perform a numerical integration for each iteration of the optimization algorithms, the computational complexity has hindered the practical application of DPD-based estimation to more general parametric densities. To address...
While highly expressive parametric models including deep neural networks have an advantage in modeling complicated concepts, training such highly non-linear models is known to yield a high risk of notorious overfitting. To address this issue, this study considers a $(k,q)$th order variation regularization ($(k,q)$-VR), which is defined as the $q$th-powered integral of the absolute $k$th order derivative of the parametric models to be trained; penalizing the $(k,q)$-VR is expected to yield a smoother function, which is expected to avoid overfitting. Particularly, $(k,q)$-VR encompasses the conventional (general-order) total...
Ćwik and Mielniczuk (1989) introduced a univariate kernel density ratio estimator, which directly estimates the density ratio without estimating the two densities of interest. This study presents its straightforward multivariate adaptation.