Shengyang Sun

ORCID: 0000-0003-3286-0585
Research Areas
  • Gaussian Processes and Bayesian Inference
  • Advanced Memory and Neural Computing
  • CCD and CMOS Imaging Sensors
  • Machine Learning and Algorithms
  • Neural dynamics and brain function
  • Machine Learning and Data Classification
  • Domain Adaptation and Few-Shot Learning
  • Transportation Planning and Optimization
  • Transportation and Mobility Innovations
  • Statistical Methods and Inference
  • Energy, Environment, and Transportation Policies
  • Neuroscience and Neural Engineering
  • Neural Networks and Applications
  • Stochastic Gradient Optimization Techniques
  • Time Series Analysis and Forecasting
  • Generative Adversarial Networks and Image Synthesis
  • Ferroelectric and Negative Capacitance Devices
  • Anomaly Detection Techniques and Applications
  • Photoreceptor and optogenetics research
  • Model Reduction and Neural Networks
  • Vehicle emissions and performance
  • Bayesian Modeling and Causal Inference
  • Quantum Computing Algorithms and Architecture
  • Control Systems and Identification
  • Electric Vehicles and Infrastructure

Zhejiang University
2024

National University of Defense Technology
2018-2021

University of Toronto
2017-2021

Deutsche Gesellschaft für Internationale Zusammenarbeit
2020

Vector Institute
2019

Tsinghua University
2016-2017

China University of Petroleum, Beijing
2015

The low-resolution analog-to-digital converter (ADC) is a promising solution to significantly reduce the power consumption of radio frequency circuits in massive multiple-input multiple-output (MIMO) systems. In this letter, we investigate the uplink spectral efficiency (SE) of massive MIMO systems with low-resolution ADCs over Rician fading channels, where both perfect and imperfect channel state information are considered. By modeling the quantization noise as an additive noise, we derive tractable exact approximations...

10.1109/lcomm.2016.2535132 article EN IEEE Communications Letters 2016-02-26
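
The "quantization noise as an additive noise" treatment in the abstract above is commonly formalized via the additive quantization noise model (AQNM); the following is a sketch of its standard form, with notation chosen here rather than taken from the letter:

```latex
% AQNM: quantizer output = scaled input + uncorrelated quantization noise
y_q = \alpha\, y + n_q, \qquad \alpha = 1 - \rho,
```

where $\rho$ is the inverse signal-to-quantization-noise ratio determined by the ADC resolution, and $n_q$ is quantization noise uncorrelated with $y$, with covariance $R_{n_q} = \alpha(1-\alpha)\,\mathrm{diag}(R_y)$. This linearization is what makes exact SE approximations tractable.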

Variational Bayesian neural networks (BNNs) perform variational inference over weights, but it is difficult to specify meaningful priors and approximate posteriors in a high-dimensional weight space. We introduce functional variational BNNs (fBNNs), which maximize an Evidence Lower BOund (ELBO) defined directly on stochastic processes, i.e. distributions over functions. We prove that the KL divergence between stochastic processes equals the supremum of marginal KL divergences over all finite sets of inputs. Based on this, we present a practical training...

10.48550/arxiv.1903.05779 preprint EN other-oa arXiv (Cornell University) 2019-01-01
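
The KL result stated in the abstract above can be written out explicitly; this is a transcription of the claim, with notation chosen here:

```latex
\mathrm{KL}\big(P \,\|\, Q\big)
  \;=\; \sup_{n \in \mathbb{N},\; \mathbf{X} \in \mathcal{X}^n}
        \mathrm{KL}\big(P_{\mathbf{X}} \,\|\, Q_{\mathbf{X}}\big),
```

where $P_{\mathbf{X}}$ and $Q_{\mathbf{X}}$ denote the finite-dimensional marginals of the two stochastic processes at the inputs $\mathbf{X}$. Because the supremum runs over finite input sets, the functional ELBO can be estimated by sampling finite "measurement sets" of inputs.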

Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation. Unfortunately, there is a tradeoff between cheap but simple variational families (e.g. fully factorized) or expensive and complicated inference procedures. We show that natural gradient ascent with adaptive weight noise implicitly fits a variational posterior to maximize the evidence lower bound (ELBO). This insight allows us to train full-covariance, fully factorized, or matrix-variate Gaussian variational posteriors using noisy...

10.48550/arxiv.1712.02390 preprint EN other-oa arXiv (Cornell University) 2017-01-01

In this paper we introduce ZhuSuan, a Python probabilistic programming library for Bayesian deep learning, which conjoins the complementary advantages of Bayesian methods and deep learning. ZhuSuan is built upon TensorFlow. Unlike existing deep learning libraries, which are mainly designed for deterministic neural networks and supervised tasks, ZhuSuan is featured for its deep root into Bayesian inference, thus supporting various kinds of probabilistic models, including both traditional hierarchical Bayesian models and recent deep generative models. We use running examples to illustrate...

10.48550/arxiv.1709.05870 preprint EN other-oa arXiv (Cornell University) 2017-01-01

We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and their outputs. These models perform competitively to open access models on a wide range of evaluation benchmarks, and were sized to fit on a single DGX H100 with 8 GPUs when deployed in FP8 precision. We believe the community can benefit from these models in various research studies...

10.48550/arxiv.2406.11704 preprint EN arXiv (Cornell University) 2024-06-17

Recent progress in variational inference has paid much attention to the flexibility of variational posteriors. One promising direction is to use implicit distributions, i.e., distributions without tractable densities, as the variational posterior. However, existing methods on implicit posteriors still face challenges of noisy estimation and computational infeasibility when applied to models with high-dimensional latent variables. In this paper, we present a new approach named Kernel Implicit Variational Inference that addresses these...

10.48550/arxiv.1705.10119 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Recently there have been increasing interests in learning and inference with implicit distributions (i.e., distributions without tractable densities). To this end, we develop a gradient estimator for implicit distributions based on Stein's identity and the spectral decomposition of kernel operators, where the eigenfunctions are approximated by the Nyström method. Unlike previous works that only provide estimates at the sample points, our approach directly estimates the gradient function, thus allowing a simple and principled out-of-sample extension. We provide theoretical results...

10.48550/arxiv.1806.02925 preprint EN other-oa arXiv (Cornell University) 2018-01-01
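
The estimator described above can be sketched numerically. The following is a minimal NumPy illustration of the spectral/Nyström construction (RBF kernel, median-heuristic bandwidth, top-J eigenpairs); it is an assumption-laden sketch, not the authors' reference implementation:

```python
import numpy as np

def rbf_kernel(x, y, sigma):
    # x: (n, d), y: (m, d) -> Gram matrix (n, m)
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def ssge(samples, query, n_eigen=6, sigma=None):
    """Spectral Stein-style estimate of the score grad log q(x) at `query`,
    from samples of q. Kernel choice and bandwidth are illustrative."""
    M, d = samples.shape
    if sigma is None:
        # median heuristic for the kernel bandwidth
        dists = np.sqrt(((samples[:, None] - samples[None, :]) ** 2).sum(-1))
        sigma = np.median(dists[dists > 0])
    K = rbf_kernel(samples, samples, sigma)
    eigvals, eigvecs = np.linalg.eigh(K)                   # ascending order
    eigvals = eigvals[::-1][:n_eigen]
    eigvecs = eigvecs[:, ::-1][:, :n_eigen]
    # Nystrom out-of-sample eigenfunctions:
    #   psi_j(x) = sqrt(M) / lambda_j * sum_m k(x, x_m) U_{mj}
    Kq = rbf_kernel(query, samples, sigma)                 # (nq, M)
    psi = np.sqrt(M) * (Kq @ eigvecs) / eigvals            # (nq, J)
    # beta_j = -(1/M) sum_m grad_x psi_j(x)|_{x=x_m}  (via Stein's identity)
    diff = samples[:, None, :] - samples[None, :, :]       # (M, M, d)
    gradK = -K[:, :, None] * diff / sigma ** 2             # grad of k wrt 1st arg
    grad_psi = np.sqrt(M) * np.einsum('nmd,mj->njd', gradK, eigvecs) \
               / eigvals[None, :, None]
    beta = -grad_psi.mean(0)                               # (J, d)
    return psi @ beta                                      # (nq, d)
```

On samples from a standard 1-D Gaussian, the estimate should approximate the true score, which is $-x$, including at query points outside the sample set.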

The generalization properties of Gaussian processes depend heavily on the choice of kernel, and this choice remains a dark art. We present the Neural Kernel Network (NKN), a flexible family of kernels represented by a neural network. The NKN architecture is based on the composition rules for kernels, so that each unit of the network corresponds to a valid kernel. It can compactly approximate compositional kernel structures such as those used by the Automatic Statistician (Lloyd et al., 2014), but because it is differentiable, it can be trained end-to-end...

10.48550/arxiv.1806.04326 preprint EN other-oa arXiv (Cornell University) 2018-01-01
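
The composition rules the abstract refers to are the classical closure properties of positive-semidefinite kernels: a nonnegative combination of valid kernels is valid, and so is their product. A small numerical check of one "NKN-style" unit (names and kernel choices here are illustrative, not the NKN code):

```python
import numpy as np

def rbf(x, y, ls=1.0):
    # squared-exponential kernel on 1-D inputs
    return np.exp(-((x[:, None] - y[None, :]) ** 2) / (2 * ls ** 2))

def linear(x, y):
    # linear (dot-product) kernel on 1-D inputs
    return x[:, None] * y[None, :]

def composed(x, y):
    # one unit: a nonnegative sum of kernels feeding a product of kernels
    return (0.7 * rbf(x, y) + 0.3 * linear(x, y)) * rbf(x, y, ls=2.0)

x = np.linspace(-2, 2, 25)
K = composed(x, x)
# jitter absorbs floating-point round-off in the eigensolve
eigs = np.linalg.eigvalsh(K + 1e-10 * np.eye(len(x)))
```

Because every unit applies only PSD-preserving operations, the Gram matrix of the composed kernel stays positive semidefinite, so the whole network is itself a valid kernel.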

10.1109/icme57554.2024.10688202 article EN 2024 IEEE International Conference on Multimedia and Expo (ICME) 2024-07-15

10.1016/s1570-6672(08)60060-4 article EN Journal of Transportation Systems Engineering and Information Technology 2009-04-01

Multiply-accumulate calculation using a memristor crossbar array is an important method to realize neuromorphic computing. However, the fabrication technology is still immature, and it is difficult to fabricate large-scale arrays with high yield, which restricts the development of memristor-based neuromorphic computing technology. Therefore, cascading small-scale arrays to achieve the computational ability of large-scale arrays is of great significance for promoting this application. To address this issue, we present a cascaded...

10.1109/access.2019.2915787 article EN cc-by-nc-nd IEEE Access 2019-01-01
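
The cascading idea above can be sketched in software: partition a large weight matrix into small tiles, run each tile's multiply-accumulate on its own (simulated) crossbar, and accumulate the partial sums. Tile size and partitioning below are illustrative assumptions, not the paper's design:

```python
import numpy as np

def crossbar_mac(G, v):
    # one small crossbar tile: output currents = conductances @ input voltages
    return G @ v

def cascaded_mac(W, x, tile=4):
    """Emulate a large multiply-accumulate with small crossbar tiles."""
    m, n = W.shape
    out = np.zeros(m)
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            # each block runs on its own tile; partial sums are accumulated
            out[i:i + tile] += crossbar_mac(W[i:i + tile, j:j + tile],
                                            x[j:j + tile])
    return out
```

Up to floating-point summation order, the tiled result matches the full matrix-vector product, which is why cascaded small arrays can stand in for one large array.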

This paper proposes Full-Parallel Convolutional Neural Networks (FP-CNN) for specific target recognition, which utilize analog memristive array circuits to carry out vector-matrix multiplication and generate multiple output feature maps in a single processing cycle. Compared with the ReLU and Tanh functions, we innovatively adopt the absolute activation function to reduce the network scale dramatically, and can achieve a 99% recognition accuracy rate with only three layers. Furthermore, we propose a performance...

10.1587/elex.16.20181034 article EN IEICE Electronics Express 2019-01-01

With the rapid development of the VLSI industry, research on intelligent applications moves towards IoT and edge computing, while the power consumption and area cost of deep neural networks usually exceed the hardware limitations of edge devices. In this paper, we propose a low-power network architecture to address this problem. We simplify the currently popular convolutional network structure, utilize a memristor crossbar to store weights and execute convolution operations in parallel, and present spiking networks. At the same time, the proposed...

10.1109/ijcnn.2018.8489441 article EN 2018 International Joint Conference on Neural Networks (IJCNN) 2018-07-01

As one of the most promising methods for next-generation neuromorphic systems, memristor-based spiking neural networks (SNNs) show great advantages in terms of power efficiency, integration density, and biological plausibility. However, because of the nondifferentiability of discrete spikes, it is difficult to train SNNs with gradient-descent error backpropagation online. In this article, we propose an improved training algorithm for a multilayer memristive SNN (MSNN)...

10.1109/tcds.2021.3049487 article EN IEEE Transactions on Cognitive and Developmental Systems 2021-01-07
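
The nondifferentiability noted above is often handled with a surrogate gradient: the forward pass keeps the discrete spike, while the backward pass replaces the Heaviside derivative with a smooth stand-in. A generic sketch of that technique (the article's exact algorithm is not reproduced; the rectangular surrogate and its width are assumptions):

```python
import numpy as np

def spike(v, threshold=1.0):
    # forward pass: nondifferentiable Heaviside step at the firing threshold
    return (v >= threshold).astype(float)

def surrogate_grad(v, threshold=1.0, width=0.5):
    # backward pass: rectangular window standing in for the Heaviside derivative
    return (np.abs(v - threshold) < width).astype(float) / (2 * width)
```

With such a surrogate, the error signal can propagate through spiking layers even though the true derivative of the spike is zero almost everywhere.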

This paper proposes a method that renders the weights of a neural network with quaternary synapses to map onto memristive devices with only four memristance levels. We show this method is capable of operating with negligible loss in classification accuracy when the memristors utilized can store at least four unique values. Compared with other state-of-the-art methods, the presented method can achieve 98.65% accuracy under 0.60M parameters. Systematic error analysis shows it can still reach over 95% accuracy under certain conditions of memristor crossbar array yield, 100 µV op-amp...

10.1587/elex.16.20190004 article EN IEICE Electronics Express 2019-01-01
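
The core mapping idea, snapping real-valued weights onto four storable levels, can be sketched as a nearest-level quantizer. The level values below are illustrative assumptions; the paper's exact mapping is not reproduced:

```python
import numpy as np

def quantize_quaternary(w, levels=(-1.0, -0.33, 0.33, 1.0)):
    """Map each real-valued weight to the nearest of four memristance levels."""
    levels = np.asarray(levels)
    idx = np.abs(w[..., None] - levels).argmin(-1)   # nearest-level index
    return levels[idx]
```

After quantization every weight takes one of exactly four values, so each synapse can be stored in a single four-level memristive device.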

The vector neural network (VNN) is one of the most important methods to process interval data. However, the VNN, which contains a great number of multiply-accumulate (MAC) operations, often adopts a pure numerical calculation method, and is thus difficult to miniaturize for embedded applications. In this paper, we propose a memristor-based vector-type backpropagation (MVTBP) architecture that utilizes memristive arrays to accelerate the MAC operations. Owing to the unique brain-like synaptic characteristics of memristive devices, e.g., ...

10.1088/1674-1056/ab65b5 article EN Chinese Physics B 2019-12-27
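
The interval-data MAC that a VNN performs can be sketched with basic interval arithmetic: for each weight, which endpoint of the input interval contributes to the lower or upper output bound depends on the weight's sign. This is a generic sketch of that arithmetic, not the MVTBP circuit:

```python
import numpy as np

def interval_mac(W, lo, hi):
    """Multiply-accumulate on interval-valued inputs [lo, hi].

    Positive weights pair lo with the lower output bound and hi with the
    upper; negative weights swap the endpoints.
    """
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    out_lo = Wp @ lo + Wn @ hi
    out_hi = Wp @ hi + Wn @ lo
    return out_lo, out_hi

# Usage: a 2x2 weight matrix applied to the intervals [0, 1] and [-1, 1]
W = np.array([[1.0, -2.0], [0.5, 0.5]])
lo, hi = np.array([0.0, -1.0]), np.array([1.0, 1.0])
```

For this example, the first output is $x_0 - 2x_1 \in [-2, 3]$ and the second is $0.5x_0 + 0.5x_1 \in [-0.5, 1]$, which the function reproduces.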

The developments of Rademacher complexity and PAC-Bayesian theory have been largely independent. One exception is the PAC-Bayes theorem of Kakade, Sridharan, and Tewari (2008), which is established via Rademacher complexity by viewing Gibbs classifiers as linear operators. The goal of this paper is to extend this bridge between Rademacher complexity and state-of-the-art PAC-Bayesian theory. We first demonstrate that one can match the fast rate of Catoni's PAC-Bayes bounds (Catoni, 2007) using shifted Rademacher processes (Wegkamp, 2003; Lecué and Mitchell, 2012; Zhivotovskiy and Hanneke, 2018). We then derive a...

10.48550/arxiv.1908.07585 preprint EN other-oa arXiv (Cornell University) 2019-01-01