- Neural Networks and Applications
- Gaussian Processes and Bayesian Inference
- Advanced Neural Network Applications
- Machine Learning in Materials Science
- Model Reduction and Neural Networks
- Advanced Graph Neural Networks
- Machine Learning and Extreme Learning Machines (ELM)
- Parallel Computing and Optimization Techniques
- Stochastic Gradient Optimization Techniques
- Mathematical Dynamics and Fractals
- Random Matrices and Applications
- Advanced Image Fusion Techniques
- Statistical Mechanics and Entropy
- Quantum Many-Body Systems
- Graph Theory and Algorithms
- Protein Structure and Dynamics
- Adversarial Robustness in Machine Learning
- Geometric Analysis and Curvature Flows
- Theoretical and Experimental Particle Physics
- Topological and Geometric Data Analysis
- Optical Systems and Laser Technology
- Advanced Mathematical Theories
- Sparse and Compressive Sensing Techniques
- Astrophysics and Cosmic Phenomena
- Computational Physics and Python Applications
University of Science and Technology of China
2021-2022
Academy of Mathematics and Systems Science
2022
Chinese Academy of Sciences
2022
Northwestern University
2020-2021
Abstract: Deep learning methods have been increasingly adopted to study jets in particle physics. Since symmetry-preserving behavior has been shown to be an important factor for improving the performance of deep learning in many applications, Lorentz group equivariance, a fundamental spacetime symmetry of elementary particles, has recently been incorporated into deep learning models for jet tagging. However, the design is computationally costly due to the analytic construction of high-order tensors. In this article, we introduce LorentzNet, a new symmetry-preserving deep learning model for jet tagging. The...
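As context for the Lorentz-equivariance claim above, here is a minimal Python sketch of the invariant primitive such models are built on: the Minkowski inner product of four-vectors, whose value is unchanged under any Lorentz transformation. The function name and layout are illustrative only, not taken from the LorentzNet codebase.

```python
import numpy as np

def minkowski_dot(p, q):
    """Minkowski inner product with metric signature (+, -, -, -).

    p, q: arrays of shape (..., 4) holding four-vectors (E, px, py, pz).
    The result is invariant under Lorentz transformations, which is why
    it is a safe building block for a Lorentz-equivariant network.
    """
    # Flip the sign of the spatial components before the dot product.
    metric = np.array([1.0, -1.0, -1.0, -1.0])
    return (p * metric * q).sum(axis=-1)

# Example: the invariant mass squared of one particle, m^2 = <p, p>.
p = np.array([5.0, 1.0, 2.0, 3.0])   # (E, px, py, pz)
print(minkowski_dot(p, p))           # 25 - 1 - 4 - 9 = 11
```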
The prevailing thinking is that orthogonal weights are crucial to enforcing dynamical isometry and speeding up training. The increase in learning speed that results from orthogonal initialization in linear networks has been well-proven. However, while the same is believed to also hold for nonlinear networks when the dynamical isometry condition is satisfied, the training dynamics behind this contention have not been thoroughly explored. In this work, we study the dynamics of ultra-wide networks across a range of architectures, including Fully Connected Networks (FCNs) and Convolutional Neural...
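For concreteness, a minimal sketch of the orthogonal initialization discussed above, using the common QR-of-a-Gaussian recipe; the function name and the sign-correction detail are a standard convention, not code from the paper.

```python
import numpy as np

def orthogonal_init(fan_out, fan_in, gain=1.0, seed=None):
    """Draw a weight matrix with orthonormal rows or columns.

    QR decomposition of a Gaussian matrix (with a sign correction on the
    diagonal of R) yields a draw from the Haar measure on the orthogonal
    group, so every singular value of the result equals `gain` -- the
    property dynamical isometry asks for at initialization.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((max(fan_out, fan_in), min(fan_out, fan_in)))
    q, r = np.linalg.qr(a)
    q *= np.sign(np.diag(r))   # fix column signs so the draw is Haar-uniform
    if fan_out < fan_in:
        q = q.T
    return gain * q

W = orthogonal_init(256, 256)
print(np.allclose(W @ W.T, np.eye(256)))  # True: all singular values are 1
```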
Graph convolutional networks (GCNs) and their variants have achieved great success in dealing with graph-structured data. Nevertheless, it is well known that deep GCNs suffer from the over-smoothing problem, where node representations tend to become indistinguishable as more layers are stacked up. The theoretical research to date on deep GCNs has focused primarily on expressive power rather than trainability, i.e., an optimization perspective. Compared with expressivity, trainability attempts to address a more fundamental...
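The over-smoothing effect described above can be reproduced in a few lines: repeatedly applying the normalized adjacency operator (the linear part of a GCN layer) to random node features collapses them toward a common vector. This toy demonstration is ours, not the paper's experiment.

```python
import numpy as np

# Toy graph: 4 nodes on a path. Powers of the symmetrically normalized
# adjacency with self-loops, S = D^{-1/2} (A + I) D^{-1/2}, drive all
# node representations together: the over-smoothing phenomenon.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A_hat = A + np.eye(4)                  # add self-loops
d = A_hat.sum(axis=1)
S = A_hat / np.sqrt(np.outer(d, d))    # elementwise D^{-1/2} (A+I) D^{-1/2}

H = np.random.default_rng(0).standard_normal((4, 3))  # random node features
for depth in (1, 5, 50):
    H_k = np.linalg.matrix_power(S, depth) @ H
    # The spread across nodes shrinks toward 0 as depth grows.
    print(depth, np.ptp(H_k, axis=0).max())
```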
By conceiving physical systems as 3D many-body point clouds, geometric graph neural networks (GNNs), such as SE(3)/E(3) equivariant GNNs, have showcased promising performance. In particular, their effective message-passing mechanisms make them adept at modeling molecules and crystalline materials. However, current GNNs only offer a mean-field approximation of the many-body system, encapsulated within two-body message passing, and thus fall short of capturing intricate relationships within these graphs. To address...
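A minimal sketch of what "two-body message passing" means here: each message depends on a pair of nodes only through their relative distance (making the update E(3)-invariant), and pairs are aggregated independently, with no triplet or higher-order terms. The radial weighting and names below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def two_body_messages(pos, h, cutoff=5.0):
    """One round of distance-based (two-body) message passing.

    pos: (N, 3) atom coordinates; h: (N, F) node features.
    Messages depend on a pair (i, j) only through |r_i - r_j|, so the
    update is invariant under rotations, reflections, and translations.
    Summing over pairs independently is the mean-field-style
    approximation the abstract refers to.
    """
    diff = pos[:, None, :] - pos[None, :, :]          # (N, N, 3)
    dist = np.linalg.norm(diff, axis=-1)              # (N, N)
    w = np.exp(-dist) * (dist < cutoff) * (dist > 0)  # simple radial weight
    return h + w @ h                                  # aggregate neighbors

pos = np.random.default_rng(1).uniform(0.0, 3.0, size=(6, 3))
h = np.random.default_rng(2).standard_normal((6, 4))
print(two_body_messages(pos, h).shape)  # (6, 4)
```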
In recent years, mean field theory has been applied to the study of neural networks and has achieved a great deal of success. The theory has been applied to various network structures, including CNNs, RNNs, residual networks, and batch normalization. Inevitably, recent work has also covered the use of dropout. The theory shows the existence of depth scales that limit the maximum depth of signal propagation and gradient backpropagation. However, the gradient backpropagation result is derived under an independence assumption: the weights used during the feed-forward pass are drawn independently from the ones used in backpropagation. This is not...
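To make the signal-propagation idea concrete, here is an illustrative Monte Carlo version of the standard mean-field variance (length) recursion for a wide tanh network with dropout, where dropout rescales the effective weight variance by the inverse keep-rate. This is a textbook-style sketch under the usual mean-field assumptions, not the paper's exact derivation.

```python
import numpy as np

def variance_map(q, sigma_w=2.0, sigma_b=0.0, keep_rate=0.9, n=100_000):
    """One step of the mean-field length recursion with dropout.

    For a wide random network, the pre-activation variance evolves as
        q_{l+1} = (sigma_w^2 / keep_rate) * E[phi(sqrt(q_l) z)^2] + sigma_b^2,
    with z ~ N(0, 1); dividing by the keep rate accounts for the usual
    1/keep_rate dropout rescaling. Here phi = tanh and the Gaussian
    expectation is estimated by Monte Carlo.
    """
    z = np.random.default_rng(0).standard_normal(n)
    phi2 = np.tanh(np.sqrt(q) * z) ** 2
    return (sigma_w**2 / keep_rate) * phi2.mean() + sigma_b**2

# Iterating the map traces how the signal's variance settles with depth.
q = 1.0
for _ in range(20):
    q = variance_map(q)
print(q)  # approximate fixed-point variance q*
```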
Most theoretical studies explaining the regularization effect in deep learning have focused only on gradient descent with a sufficiently small learning rate, or even gradient flow (an infinitesimal learning rate). Such research, however, has neglected the reasonably large learning rates used in most practical applications. In this work, we characterize the implicit bias of deep linear networks for binary classification using the logistic loss in the large learning rate regime, inspired by the seminal work of Lewkowycz et al. [26] in a regression setting with the squared loss. They found a regime where the stepsize...
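As a toy illustration of implicit bias under the logistic loss, in the simplest linear, separable setting rather than the paper's deep-network analysis: gradient descent sends the weight norm to infinity while its direction stabilizes. The dataset and stepsize below are invented for the demo.

```python
import numpy as np

# Two separable Gaussian blobs with labels +1 / -1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2.0, 2.0], 0.3, (20, 2)),
               rng.normal([-2.0, -2.0], 0.3, (20, 2))])
y = np.r_[np.ones(20), -np.ones(20)]

w = np.zeros(2)
lr = 1.0  # a deliberately non-infinitesimal stepsize
for _ in range(2000):
    margins = y * (X @ w)
    # Gradient of mean log(1 + exp(-y * x.w)): -y * x * sigmoid(-margin).
    grad = -(y[:, None] * X * (1.0 / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= lr * grad

# ||w|| keeps growing (the loss has no finite minimizer on separable
# data), but the *direction* converges; that direction is the bias.
print(w / np.linalg.norm(w))
```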
We construct a continuous family of exchangeable pairs by perturbing the random variable through diffusion processes on a manifold in order to apply Stein's method in certain geometric settings. We compare our perturbation with other approaches to building exchangeable pairs and show that our scheme cooperates harmoniously with the infinitesimal version of Stein's method. More precisely, the pairs satisfy the key condition in general. Based on these pairs, we are able to extend the approximate normality of eigenfunctions of the Laplacian on a compact manifold to the Witten Laplacian, which is of the form $\Delta_w = \Delta - \ldots$
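For reference, the "key condition" alluded to above is, in its classical exchangeable-pair form, the linear regression property below; the paper works with an infinitesimal variant obtained by letting the perturbation time tend to zero, so this standard statement is included only as context.

```latex
% Classical Stein exchangeable-pair setup: (W, W') equal in law under swap,
% plus the approximate linear regression ("key") condition.
\[
  (W, W') \stackrel{d}{=} (W', W), \qquad
  \mathbb{E}\bigl[\, W' - W \mid W \,\bigr] = -\lambda W + R,
\]
% for some \lambda > 0, where the remainder R must be negligible for the
% resulting normal approximation bound to be useful.
```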