- Gaussian Processes and Bayesian Inference
- Advanced Memory and Neural Computing
- CCD and CMOS Imaging Sensors
- Machine Learning and Algorithms
- Neural Dynamics and Brain Function
- Machine Learning and Data Classification
- Domain Adaptation and Few-Shot Learning
- Transportation Planning and Optimization
- Transportation and Mobility Innovations
- Statistical Methods and Inference
- Energy, Environment, and Transportation Policies
- Neuroscience and Neural Engineering
- Neural Networks and Applications
- Stochastic Gradient Optimization Techniques
- Time Series Analysis and Forecasting
- Generative Adversarial Networks and Image Synthesis
- Ferroelectric and Negative Capacitance Devices
- Anomaly Detection Techniques and Applications
- Photoreceptor and Optogenetics Research
- Model Reduction and Neural Networks
- Vehicle Emissions and Performance
- Bayesian Modeling and Causal Inference
- Quantum Computing Algorithms and Architecture
- Control Systems and Identification
- Electric Vehicles and Infrastructure
Zhejiang University
2024
National University of Defense Technology
2018-2021
University of Toronto
2017-2021
Deutsche Gesellschaft für Internationale Zusammenarbeit
2020
Vector Institute
2019
Tsinghua University
2016-2017
China University of Petroleum, Beijing
2015
The low-resolution analog-to-digital converter (ADC) is a promising solution to significantly reduce the power consumption of radio frequency circuits in massive multiple-input multiple-output (MIMO) systems. In this letter, we investigate the uplink spectral efficiency (SE) of massive MIMO systems with low-resolution ADCs over Rician fading channels, where both perfect and imperfect channel state information are considered. By modeling the quantization noise as additive noise, we derive tractable approximations...
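The "quantization noise as additive noise" step can be sketched with the additive quantization noise model (AQNM). This is an illustrative sketch, not the letter's derivation: the function name `aqnm_quantize` and the closed-form distortion factor `rho` for a b-bit ADC are assumptions (the `rho` formula is one common approximation for fine quantization).

```python
import numpy as np

def aqnm_quantize(y, bits, rng):
    """Additive quantization noise model: y_q = alpha*y + n_q.

    alpha = 1 - rho, where rho is the distortion factor of a b-bit
    uniform ADC (approximation assumed here, valid for moderate/large b).
    The quantization noise n_q is modeled as circularly symmetric
    Gaussian with variance alpha*(1 - alpha)*E[|y|^2].
    """
    rho = (np.pi * np.sqrt(3) / 2) * 2.0 ** (-2 * bits)
    alpha = 1.0 - rho
    power = np.mean(np.abs(y) ** 2)
    nq_std = np.sqrt(alpha * (1.0 - alpha) * power / 2.0)
    n_q = nq_std * (rng.standard_normal(y.shape)
                    + 1j * rng.standard_normal(y.shape))
    return alpha * y + n_q

rng = np.random.default_rng(0)
y = (rng.standard_normal(1000) + 1j * rng.standard_normal(1000)) / np.sqrt(2)
y_q = aqnm_quantize(y, bits=8, rng=rng)  # distortion is tiny at 8 bits
```

Under this model the SE analysis proceeds as for an unquantized system with an extra colored-noise term, which is what makes the approximations tractable.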
Variational Bayesian neural networks (BNNs) perform variational inference over weights, but it is difficult to specify meaningful priors and approximate posteriors in a high-dimensional weight space. We introduce functional variational BNNs (fBNNs), which maximize an Evidence Lower Bound (ELBO) defined directly on stochastic processes, i.e. distributions over functions. We prove that the KL divergence between stochastic processes equals the supremum of marginal KL divergences over all finite sets of inputs. Based on this, we present a practical training...
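The finite-set characterization of the functional KL can be illustrated numerically: for two zero-mean Gaussian processes (here, RBF priors with different lengthscales, an assumed toy setup rather than the paper's experiment), the KL between marginals at nested input sets is nondecreasing as points are added, consistent with the supremum result.

```python
import numpy as np

def rbf(x, ls):
    # RBF kernel matrix on 1-D inputs
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gauss_kl(K0, K1, jitter=1e-6):
    # KL( N(0, K0) || N(0, K1) ) between zero-mean multivariate Gaussians
    k = K0.shape[0]
    K0 = K0 + jitter * np.eye(k)
    K1 = K1 + jitter * np.eye(k)
    _, ld0 = np.linalg.slogdet(K0)
    _, ld1 = np.linalg.slogdet(K1)
    return 0.5 * (np.trace(np.linalg.inv(K1) @ K0) - k + ld1 - ld0)

x = np.linspace(-2, 2, 30)
# KL between the two GP marginals at the first n measurement points
kls = [gauss_kl(rbf(x[:n], 1.0), rbf(x[:n], 0.5)) for n in range(1, 31)]
```

Because marginalizing to a subset of inputs can only decrease KL, the sequence `kls` is monotone, and its supremum over all finite sets recovers the functional KL.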
Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation. Unfortunately, there is a tradeoff between cheap but simple variational families (e.g. fully factorized) and expensive but complicated inference procedures. We show that natural gradient ascent with adaptive weight noise implicitly fits a variational posterior to maximize the evidence lower bound (ELBO). This insight allows us to train full-covariance, fully factorized, and matrix-variate Gaussian variational posteriors using noisy...
In this paper we introduce ZhuSuan, a Python probabilistic programming library for Bayesian deep learning, which conjoins the complementary advantages of Bayesian methods and deep learning. ZhuSuan is built upon TensorFlow. Unlike existing deep learning libraries, which are mainly designed for deterministic neural networks and supervised tasks, ZhuSuan is featured for its deep root into Bayesian inference, thus supporting various kinds of probabilistic models, including both traditional hierarchical Bayesian models and recent deep generative models. We use running examples to illustrate...
We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and their outputs. These models perform competitively with open-access models on a wide range of evaluation benchmarks, and were sized to fit on a single DGX H100 with 8 GPUs when deployed in FP8 precision. We believe the community can benefit from these models in various research studies...
Recent progress in variational inference has paid much attention to the flexibility of variational posteriors. One promising direction is to use implicit distributions, i.e., distributions without tractable densities, as the variational posterior. However, existing methods on implicit posteriors still face challenges of noisy estimation and computational infeasibility when applied to models with high-dimensional latent variables. In this paper, we present a new approach named Kernel Implicit Variational Inference that addresses these...
Recently there has been increasing interest in learning and inference with implicit distributions (i.e., distributions without tractable densities). To this end, we develop a gradient estimator for implicit distributions based on Stein's identity and a spectral decomposition of kernel operators, where the eigenfunctions are approximated by the Nyström method. Unlike previous works that only provide estimates at the sample points, our approach directly estimates the gradient function, thus allowing a simple and principled out-of-sample extension. We provide theoretical results...
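The Nyström building block (approximating kernel-operator eigenfunctions from samples, with a closed-form out-of-sample extension) can be sketched as follows. This is a minimal sketch of the Nyström step only, not the full gradient estimator; the function names and the RBF kernel choice are assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, ls=1.0):
    # RBF kernel matrix between two point sets
    d2 = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def nystrom_eigenfunctions(X, ls=1.0, J=5):
    """Approximate the top-J eigenfunctions of the kernel operator
    from M samples X, returning a callable out-of-sample extension."""
    M = X.shape[0]
    K = rbf_kernel(X, X, ls)
    evals, evecs = np.linalg.eigh(K)            # ascending order
    evals = evals[::-1][:J]                     # top-J eigenvalues
    evecs = evecs[:, ::-1][:, :J]               # matching eigenvectors

    def psi(X_new):
        # Nystrom extension: psi_j(x) = sqrt(M)/lam_j * sum_m k(x, x_m) u_{mj}
        return np.sqrt(M) * rbf_kernel(X_new, X, ls) @ evecs / evals

    return psi

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
psi = nystrom_eigenfunctions(X, ls=1.0, J=5)
```

Because `psi` is a genuine function of its input, gradients of quantities built from it can be evaluated at arbitrary points, which is what enables the out-of-sample extension the abstract refers to.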
The generalization properties of Gaussian processes depend heavily on the choice of kernel, and this choice remains a dark art. We present the Neural Kernel Network (NKN), a flexible family of kernels represented by a neural network. The NKN architecture is based on the composition rules for kernels, so that each unit of the network corresponds to a valid kernel. It can compactly approximate compositional kernel structures such as those used by the Automatic Statistician (Lloyd et al., 2014), but because it is differentiable, it is end-to-end...
Multiply-accumulate calculations using a memristor crossbar array are an important method to realize neuromorphic computing. However, the fabrication technology is still immature, and it is difficult to fabricate large-scale arrays with high yield, which restricts the development of memristor-based computing technology. Therefore, cascading small-scale arrays to achieve the computational ability of large-scale arrays is of great significance for promoting the application of memristor-based computing. To address this issue, we present a cascaded...
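The cascading idea can be sketched digitally: partition a large conductance matrix into small tiles, run each tile as an independent crossbar MAC, and sum the partial currents. This is an illustrative block-partitioning sketch under the assumption that array dimensions divide evenly by the tile size, not the paper's circuit design.

```python
import numpy as np

def crossbar_mac(G, v):
    # One analog crossbar: column currents I = G^T V
    # (Ohm's law per cell, Kirchhoff's current law per column).
    return G.T @ v

def cascaded_mac(G, v, tile=4):
    """Emulate a large crossbar with small tiles.

    Rows and columns are split into tile-sized blocks; each block is
    computed on its own small array, and partial column currents are
    accumulated. Assumes dimensions are multiples of `tile`.
    """
    n, m = G.shape
    out = np.zeros(m)
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            out[j:j + tile] += crossbar_mac(G[i:i + tile, j:j + tile],
                                            v[i:i + tile])
    return out
```

By linearity of the matrix-vector product, the tiled result is exactly the result of a single large array, which is why small high-yield arrays can stand in for a large one.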
This paper proposes Full-Parallel Convolutional Neural Networks (FP-CNN) for specific target recognition, which utilize analog memristive array circuits to carry out vector-matrix multiplication and generate multiple output feature maps in a single processing cycle. Compared with the ReLU and Tanh functions, we innovatively adopt the absolute-value activation function to reduce the network scale dramatically, achieving a 99% recognition accuracy rate with only three layers. Furthermore, we propose a performance...
With the rapid development of the VLSI industry, research on intelligent applications is moving towards IoT edge computing. However, the power consumption and area cost of deep neural networks usually exceed the hardware limitations of edge devices. In this paper, we propose a low-power network architecture to address this problem. We simplify the current popular convolutional structure, utilize a memristor crossbar to store weights and execute the convolution operation in parallel, and present spiking neural networks. At the same time, the proposed...
As one of the most promising methods in next-generation neuromorphic systems, memristor-based spiking neural networks (SNNs) show great advantages in terms of power efficiency, integration density, and biological plausibility. However, because of the nondifferentiability of discrete spikes, it is difficult to train SNNs with gradient-descent error backpropagation online. In this article, we propose an improved training algorithm for the multilayer memristive SNN (MSNN), supporting...
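The nondifferentiability problem the abstract names can be made concrete with a surrogate gradient, a common workaround in SNN training (this illustrates the generic technique, not the MSNN algorithm proposed in the article; the sigmoid surrogate and sharpness `beta` are assumptions).

```python
import numpy as np

def spike(v, thresh=1.0):
    # Forward pass: Heaviside step on membrane potential.
    # Its true derivative is zero almost everywhere, so plain
    # backpropagation through it learns nothing.
    return (v >= thresh).astype(float)

def surrogate_grad(v, thresh=1.0, beta=5.0):
    # Backward pass: replace the Heaviside derivative with the
    # derivative of a steep sigmoid centered at the threshold.
    s = 1.0 / (1.0 + np.exp(-beta * (v - thresh)))
    return beta * s * (1.0 - s)
```

During training, the forward pass keeps the discrete spike while the backward pass uses `surrogate_grad`, which is largest near the firing threshold where weight changes actually affect spiking.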
This paper proposes a method that maps the weights of a neural network with quaternary synapses into memristive devices with only four memristance levels. We show that this method is capable of operating with negligible loss in classification accuracy when the memristors utilized can store at least four unique values. Compared with other state-of-the-art methods, the presented method achieves 98.65% accuracy with under 0.60M parameters. Systematic error analysis shows the network can still reach over 95% accuracy under conditions of limited memristor crossbar array yield and 100 µV op-amp...
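The core mapping, snapping each float weight to one of four device-representable levels, can be sketched as nearest-level quantization. The uniform symmetric grid below is an assumption for illustration; the paper's actual quaternary mapping may differ.

```python
import numpy as np

def quantize_quaternary(w, levels=None):
    """Map float weights to the nearest of four memristance-backed levels.

    Returns (quantized weights, level indices). The index array is what
    would be programmed into the four-level memristive cells.
    """
    if levels is None:
        s = np.max(np.abs(w))
        levels = np.array([-s, -s / 3, s / 3, s])  # assumed uniform grid
    idx = np.argmin(np.abs(w[..., None] - levels), axis=-1)
    return levels[idx], idx
```

After mapping, inference runs entirely on the four stored values, so classification accuracy can be checked against the full-precision network to quantify the quantization loss.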
The vector neural network (VNN) is one of the most important methods to process interval data. However, the VNN, which contains a great number of multiply-accumulate (MAC) operations, often adopts a pure numerical calculation method, and is thus difficult to miniaturize for embedded applications. In this paper, we propose a memristor-based vector-type backpropagation (MVTBP) architecture that utilizes memristive arrays to accelerate MAC operations. Owing to the unique brain-like synaptic characteristics of memristive devices, e.g., ...
The developments of Rademacher complexity and PAC-Bayesian theory have been largely independent. One exception is the PAC-Bayes theorem of Kakade, Sridharan, and Tewari (2008), which is established via Rademacher complexity by viewing Gibbs classifiers as linear operators. The goal of this paper is to extend this bridge between Rademacher complexity and state-of-the-art PAC-Bayesian theory. We first demonstrate that one can match the fast rate of Catoni's bounds (Catoni, 2007) using shifted Rademacher processes (Wegkamp, 2003; Lecué and Mitchell, 2012; Zhivotovskiy and Hanneke, 2018). We then derive a...