- Neural dynamics and brain function
- Advanced Memory and Neural Computing
- Domain Adaptation and Few-Shot Learning
- Neural Networks and Applications
- Neuroscience and Neural Engineering
- Human Pose and Action Recognition
- Multimodal Machine Learning Applications
- EEG and Brain-Computer Interfaces
- Stochastic Gradient Optimization Techniques
- Model Reduction and Neural Networks
- Advanced Neural Network Applications
- Functional Brain Connectivity Studies
- Neural Networks and Reservoir Computing
- Stochastic dynamics and bifurcation
- Machine Learning and ELM
- Ferroelectric and Negative Capacitance Devices
- Generative Adversarial Networks and Image Synthesis
- Sparse and Compressive Sensing Techniques
- Visual perception and processing mechanisms
- Machine Learning and Algorithms
- Neuroscience and Neuropharmacology Research
- Topic Modeling
- Intelligent Tutoring Systems and Adaptive Learning
- Reinforcement Learning in Robotics
- Explainable Artificial Intelligence (XAI)
ETH Zurich
2019-2024
University of Bern
2015-2024
SIB Swiss Institute of Bioinformatics
2019-2024
University of Zurich
2019-2022
École Polytechnique Fédérale de Lausanne
2021
Instituto Superior Técnico
2020
Instituto Superior de Gestão
2020
Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento
2012-2015
University of Lisbon
2010-2015
Artificial neural networks suffer from catastrophic forgetting when they are sequentially trained on multiple tasks. To overcome this problem, we present a novel approach based on task-conditioned hypernetworks, i.e., networks that generate the weights of a target model based on task identity. Continual learning (CL) is less difficult for this class of models thanks to a simple key feature: instead of recalling the input-output relations of all previously seen data, task-conditioned hypernetworks only require rehearsing task-specific weight...
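To make the mechanism concrete, here is a minimal PyTorch sketch of a task-conditioned hypernetwork (the class names, layer sizes, and target architecture are illustrative assumptions, not the paper's implementation): a learned task embedding is mapped to the full weight vector of a small target network, and continual learning then amounts to keeping the hypernetwork's outputs for old task embeddings close to stored values.

```python
# Minimal sketch of a task-conditioned hypernetwork (illustrative, not the
# paper's implementation): a learned task embedding is mapped to the full
# weight vector of a small target MLP.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperNet(nn.Module):
    def __init__(self, n_tasks, emb_dim=8, target_shapes=((20, 4), (2, 20))):
        super().__init__()
        self.target_shapes = target_shapes
        n_out = sum(r * c for r, c in target_shapes)
        self.task_emb = nn.Embedding(n_tasks, emb_dim)   # one embedding per task
        self.body = nn.Sequential(nn.Linear(emb_dim, 64), nn.ReLU(),
                                  nn.Linear(64, n_out))

    def forward(self, task_id):
        # Generate the target network's weights from the task identity.
        flat = self.body(self.task_emb(task_id))
        weights, i = [], 0
        for r, c in self.target_shapes:
            weights.append(flat[i:i + r * c].view(r, c))
            i += r * c
        return weights

def target_forward(x, weights):
    # Tiny target MLP whose parameters are produced by the hypernetwork.
    w1, w2 = weights
    return F.linear(torch.relu(F.linear(x, w1)), w2)

hnet = HyperNet(n_tasks=5)
x = torch.randn(16, 4)
y = target_forward(x, hnet(torch.tensor(0)))  # weights generated for task 0
```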
Deep learning has seen remarkable developments over the last years, many of them inspired by neuroscience. However, the main mechanism behind these advances - error backpropagation - appears to be at odds with neurobiology. Here, we introduce a multilayer neuronal network model with simplified dendritic compartments in which error-driven synaptic plasticity adapts the network towards a global desired output. In contrast to previous work, our model does not require separate phases and learning is driven by local prediction errors...
At present, the mechanisms of in-context learning in Transformers are not well understood and remain mostly an intuition. In this paper, we suggest that training Transformers on auto-regressive objectives is closely related to gradient-based meta-learning formulations. We start by providing a simple weight construction that shows the equivalence of the data transformations induced by 1) a single linear self-attention layer and 2) gradient-descent (GD) on a regression loss. Motivated by that construction, we show empirically that when...
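As a quick numerical illustration of that equivalence (a sketch under simplifying assumptions; the token layout and projection matrices below are hand-constructed for this toy case, not the paper's exact construction), one gradient-descent step on an in-context linear regression loss can be matched by a single softmax-free self-attention layer:

```python
# Numerical check (illustrative): one GD step on an in-context linear
# regression loss matches a single linear (softmax-free) self-attention
# layer with hand-constructed projection matrices.
import numpy as np

rng = np.random.default_rng(0)
d, n, eta = 3, 8, 0.1
X = rng.normal(size=(n, d))              # in-context inputs x_i
w_true = rng.normal(size=d)
y = X @ w_true                           # in-context targets y_i
x_q = rng.normal(size=d)                 # query input

# 1) One GD step on L(w) = 0.5 * sum_i (w.x_i - y_i)^2, starting from w = 0.
w = np.zeros(d)
w = w - eta * (X.T @ (X @ w - y))        # = eta * sum_i y_i x_i
pred_gd = w @ x_q

# 2) Linear self-attention with constructed weights.
# Tokens e_i = [x_i, y_i]; the query token carries [x_q, 0].
E = np.concatenate([X, y[:, None]], axis=1)          # shape (n, d+1)
e_q = np.concatenate([x_q, [0.0]])
W_K = W_Q = np.eye(d + 1)[:d]            # keys/queries read out the x part
W_V = eta * np.eye(d + 1)[d:]            # values read out eta * y
attn_out = sum((W_V @ e) * ((W_K @ e) @ (W_Q @ e_q)) for e in E)
pred_attn = attn_out[0]

assert np.allclose(pred_gd, pred_attn)   # both equal eta * sum_i y_i (x_i . x_q)
```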
One of the most fundamental laws of physics is the principle of least action. Motivated by its predictive power, we introduce a neuronal least-action principle for cortical processing of sensory streams to produce appropriate behavioural outputs in real time. The principle postulates that the voltage dynamics of cortical pyramidal neurons prospectively minimizes the local somato-dendritic mismatch error within individual neurons. For output neurons, this implies minimizing an instantaneous behavioural error. For deep network neurons, it implies the prospective firing...
Animal behaviour depends on learning to associate sensory stimuli with the desired motor command. Understanding how the brain orchestrates the necessary synaptic modifications across different areas has remained a longstanding puzzle. Here, we introduce a multi-area neuronal network model in which synaptic plasticity continuously adapts the network towards a global desired output. In this model, learning is driven by a local dendritic prediction error that arises from a failure to predict the top-down input given the bottom-up activities. Such errors occur at...
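Stated generically (an illustrative formulation consistent with the abstract, not the paper's exact equations or notation), the local dendritic prediction error and the plasticity it drives can be written as:

```latex
% Illustrative sketch: the dendritic error compares top-down input with a
% prediction formed from bottom-up activity, and plasticity scales with this
% mismatch times presynaptic activity.
e_i = I_i^{\mathrm{top\text{-}down}} - \hat{I}_i\big(r^{\mathrm{bottom\text{-}up}}\big),
\qquad
\Delta w_{ij} \propto e_i \, r_j^{\mathrm{pre}}
```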
The success of deep learning, a brain-inspired form of AI, has sparked interest in understanding how the brain could similarly learn across multiple layers of neurons. However, the majority of biologically-plausible learning algorithms have not yet reached the performance of backpropagation (BP), nor are they built on strong theoretical foundations. Here, we analyze target propagation (TP), a popular but not fully understood alternative to BP, from the standpoint of mathematical optimization. Our theory shows that TP is...
Learning a sequence of tasks without access to i.i.d. observations is a widely studied form of continual learning (CL) that remains challenging. In principle, Bayesian learning directly applies to this setting, since recursive and one-off Bayesian updates yield the same result. In practice, however, recursive updating often leads to poor trade-off solutions across tasks because approximate inference is necessary for most models of interest. Here, we describe an alternative approach where task-conditioned parameter distributions are continually...
Sensory association cortices receive diverse inputs, with their role in representing and integrating multi-sensory content remaining unclear. Here we examined the neuronal correlates of an auditory-tactile stimulus sequence in the posterior parietal cortex (PPC) using 2-photon calcium imaging in awake mice. We find that subpopulations of layer 2/3 PPC neurons reliably represent texture-touch events, in addition to auditory cues that presage the incoming tactile stimulus. Notably, altering the flow of sensory events through...
It is believed that energy efficiency is an important constraint in brain evolution. As synaptic transmission dominates energy consumption, energy can be saved by ensuring that only a few synapses are active. It is therefore likely that the formation of sparse codes and sparse connectivity are fundamental objectives of synaptic plasticity. In this work we study how sparse connectivity can result from a learning rule for excitatory synapses. Information is maximised when potentiation and depression are balanced according to the mean presynaptic activity level, and the resulting fraction of zero-weight...
Finding neural network weights that generalize well from small datasets is difficult. A promising approach is to learn a weight initialization such that a small number of weight changes results in low generalization error. We show that this form of meta-learning can be improved by letting the learning algorithm decide which weights to change, i.e., by learning where to learn. We find that patterned sparsity emerges from this process, with the pattern of sparsity varying on a problem-by-problem basis. This selective sparsity results in better generalization and less interference in a range of few-shot and continual learning problems...
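A minimal sketch of the "learning where to learn" idea follows (illustrative, not the paper's algorithm; the toy task, step sizes, and straight-through mask are assumptions): per-weight scores are meta-learned alongside the initialization, and a binary mask derived from them gates which weights move during fast adaptation.

```python
# Sketch: meta-learn an initialization plus per-weight scores; a binary mask
# from the scores decides which weights change in the inner (adaptation) loop.
import torch

torch.manual_seed(0)
dim = 10
w_init = torch.randn(dim, requires_grad=True)    # meta-learned initialization
scores = torch.ones(dim, requires_grad=True)     # meta-learned per-weight scores
meta_opt = torch.optim.SGD([w_init, scores], lr=0.01)

def task_loss(w, target):
    return ((w - target) ** 2).mean()            # toy per-task objective

for step in range(100):
    target = torch.randn(dim)                    # sample a toy "task"
    # Hard mask with a straight-through estimator so scores still get gradients.
    hard = (torch.sigmoid(scores) > 0.5).float()
    mask = hard + torch.sigmoid(scores) - torch.sigmoid(scores).detach()
    # Inner loop: one masked gradient step from the shared initialization.
    inner_grad = torch.autograd.grad(task_loss(w_init, target), w_init,
                                     create_graph=True)[0]
    w_adapted = w_init - 0.5 * mask * inner_grad
    # Outer loop: update initialization and scores through the adaptation step.
    meta_opt.zero_grad()
    task_loss(w_adapted, target).backward()
    meta_opt.step()
```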
Being able to model uncertainty is a vital property for any intelligent agent. In an environment in which the domain of input stimuli is fully controlled, neglecting uncertainty may work, but this usually does not hold true in a real-world scenario. This highlights the necessity for learning algorithms that robustly detect noisy and out-of-distribution examples. Here we propose a novel approach for uncertainty estimation based on adversarially trained hypernetworks. We define a weight posterior that uniformly allows all realizations of a neural network...
We consider deep multi-layered generative models such as Boltzmann machines or Hopfield nets in which computation (which implements inference) is both recurrent and stochastic, but where the recurrence is not meant to model sequential structure, only to perform computation. We find conditions under which a simple feedforward computation is a very good initialization for inference, after the input units are clamped to observed values. It means that after this initialization, the network is close to a fixed point of the dynamics, where the energy gradient is 0. The main...
A fundamental function of cortical circuits is the integration of information from different sources to form a reliable basis for behavior. While animals behave as if they optimally integrate information according to Bayesian probability theory, the implementation of the required computations in the biological substrate remains unclear. We propose a novel, Bayesian view on the dynamics of conductance-based neurons and synapses which suggests that they are naturally equipped to perform such integration. In our approach, apical dendrites represent prior...
Transformers have become the dominant model in deep learning, but the reason for their superior performance is poorly understood. Here, we hypothesize that the strong performance of Transformers stems from an architectural bias towards mesa-optimization, a learned process running within the forward pass consisting of the following two steps: (i) the construction of an internal learning objective, and (ii) its corresponding solution found through optimization. To test this hypothesis, we reverse-engineer a series of autoregressive Transformers trained on simple...
Recent developments in few-shot learning have shown that during fast adaption, gradient-based meta-learners mostly rely on embedding features of powerful pretrained networks. This leads us to research ways to effectively adapt features and utilize the meta-learner's full potential. Here, we demonstrate the effectiveness of hypernetworks in this context. We propose a soft row-sharing hypernetwork architecture and show that training the hypernetwork with a variant of MAML is tightly linked to meta-learning a curvature matrix used to condition gradients...
This review examines gradient-based techniques to solve bilevel optimization problems. Bilevel optimization extends the loss minimization framework underlying statistical learning to systems that are implicitly defined through a quantity they minimize. This characterization can be applied to neural networks, optimizers, algorithmic solvers, and even physical systems, and allows for greater modeling flexibility compared to the usual explicit definition of such systems. We focus on solving learning problems of this kind through gradient descent, leveraging...
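As a concrete instance of the gradient-based approach (a sketch with assumed toy data and hyperparameters, using unrolled differentiation rather than any particular method from the review): the inner problem fits weights under an L2 penalty, and the outer problem tunes the penalty strength by differentiating the validation loss through the unrolled inner updates.

```python
# Sketch of bilevel optimization via unrolled differentiation: the inner loop
# fits w on training data with an L2 penalty lam; the outer loop adjusts lam
# by backpropagating the validation loss through the inner trajectory.
import torch

torch.manual_seed(0)
X_tr, y_tr = torch.randn(50, 5), torch.randn(50)
X_val, y_val = torch.randn(50, 5), torch.randn(50)
log_lam = torch.zeros((), requires_grad=True)      # outer variable (log L2 strength)
outer_opt = torch.optim.Adam([log_lam], lr=0.05)

def inner_loss(w, lam):
    return ((X_tr @ w - y_tr) ** 2).mean() + lam * (w ** 2).sum()

for outer_step in range(20):
    lam = log_lam.exp()
    w = torch.zeros(5, requires_grad=True)
    # Inner loop: gradient steps kept on the graph (create_graph=True) so the
    # outer gradient can flow through the whole trajectory.
    for _ in range(30):
        g = torch.autograd.grad(inner_loss(w, lam), w, create_graph=True)[0]
        w = w - 0.1 * g
    val_loss = ((X_val @ w - y_val) ** 2).mean()
    outer_opt.zero_grad()
    val_loss.backward()                            # hypergradient w.r.t. log_lam
    outer_opt.step()
```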
Equilibrium systems are a powerful way to express neural computations. As special cases, they include models of great current interest in both neuroscience and machine learning, such as deep neural networks, equilibrium recurrent models, or meta-learning. Here, we present a new principle for learning such systems with a temporally- and spatially-local rule. Our principle casts learning as a least-control problem, where we first introduce an optimal controller to lead the system towards a solution state, and then define learning as reducing the amount of control needed to reach...
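In generic form (a sketch consistent with the abstract, not necessarily the paper's exact formulation; f, y, u, and theta are illustrative symbols), the least-control problem can be stated as:

```latex
% Sketch: the free dynamics f(y, theta) are augmented with a control u that steers
% the equilibrium y* to a task-solving state; learning then minimizes control effort.
\min_{\theta} \tfrac{1}{2}\lVert u^{*}\rVert^{2}
\quad \text{s.t.} \quad
f(y^{*}, \theta) + u^{*} = 0
\quad \text{and} \quad y^{*}\ \text{solves the task (its output matches the target)}
```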