- Functional Brain Connectivity Studies
- Decision-Making and Behavioral Economics
- Topic Modeling
- Advanced MRI Techniques and Applications
- Machine Learning in Healthcare
- Economic and Environmental Valuation
- Domain Adaptation and Few-Shot Learning
- RNA and Protein Synthesis Mechanisms
- Behavioral Health and Interventions
- Neural and Behavioral Psychology Studies
- Explainable Artificial Intelligence (XAI)
- Genomics and Phylogenetic Studies
- Olfactory and Sensory Function Studies
- Advanced Neuroimaging Techniques and Applications
- Machine Learning in Bioinformatics
- Epilepsy Research and Treatment
- Neural Networks and Applications
- Face Recognition and Perception
- Neural Dynamics and Brain Function
- Architecture and Computational Design
- Infrared Thermography in Medicine
- Machine Learning in Materials Science
- Parallel Computing and Optimization Techniques
- Cognitive Science and Mapping
- Adversarial Robustness in Machine Learning
Stanford University
2005-2024
Technische Universität Berlin
2018-2023
Max Planck Institute for Human Development
2020-2023
Freie Universität Berlin
2017-2022
Einstein Center for Neurosciences Berlin
2017-2021
Max Planck Institute for Human Cognitive and Brain Sciences
2019
WZB Berlin Social Science Center
2019
Berlin School of Economics and Law
2019
Athinoula A. Martinos Center for Biomedical Imaging
2017
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, ...
The genome is a sequence that completely encodes the DNA, RNA, and proteins that orchestrate the function of a whole organism. Advances in machine learning combined with massive datasets of whole genomes could enable a biological foundation model that accelerates the mechanistic understanding and generative design of complex molecular interactions. We report Evo, a genomic foundation model that enables prediction and generation tasks from the molecular to the genome scale. Using an architecture based on advances in deep signal processing, we scale Evo to 7 billion parameters with a context...
The genome is a sequence that encodes the DNA, RNA, and proteins that orchestrate an organism’s function. We present Evo, a long-context genomic foundation model with a frontier architecture trained on millions of prokaryotic and phage genomes, and report scaling laws on DNA that complement observations in language and vision. Evo generalizes across DNA, RNA, and proteins, enabling zero-shot function prediction that is competitive with domain-specific language models, as well as the generation of functional CRISPR-Cas and transposon systems, representing the first examples...
The application of deep learning (DL) models to neuroimaging data poses several challenges, due to the high dimensionality, low sample size, and complex temporo-spatial dependency structure of these data. Even further, DL models often act as black boxes, impeding insight into the association of cognitive state and brain activity. To approach these challenges, we introduce the DeepLight framework, which utilizes long short-term memory (LSTM) based DL models to analyze whole-brain functional Magnetic Resonance Imaging (fMRI) data in order to decode a cognitive state (e.g., seeing...
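The core idea of treating a whole-brain volume as a sequence fed to an LSTM can be sketched with a toy forward pass. This is a hypothetical minimal NumPy cell, not the authors' implementation; all shapes (30 slices, 64 features per slice, 16 hidden units) are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; the four gates are stacked as [input, forget, cell, output]."""
    z = W @ x + U @ h + b
    H = h.size
    i, f = sigmoid(z[:H]), sigmoid(z[H:2 * H])
    g, o = np.tanh(z[2 * H:3 * H]), sigmoid(z[3 * H:])
    c = f * c + i * g            # update cell state
    return np.tanh(c) * o, c     # new hidden state, new cell state

rng = np.random.default_rng(0)
n_slices, slice_dim, hidden = 30, 64, 16   # e.g. 30 axial slices per fMRI volume
W = rng.normal(scale=0.1, size=(4 * hidden, slice_dim))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

h = c = np.zeros(hidden)
for x in rng.normal(size=(n_slices, slice_dim)):   # one volume, slice by slice
    h, c = lstm_step(x, h, c, W, U, b)
# `h` now summarizes the volume; a linear read-out over h would decode the state.
```

A classifier head on top of the final hidden state (trained end to end) would then map brain activity to a predicted cognitive state.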
Genomic (DNA) sequences encode an enormous amount of information for gene regulation and protein synthesis. Similar to natural language models, researchers have proposed foundation models in genomics to learn generalizable features from unlabeled genome data that can then be fine-tuned for downstream tasks such as identifying regulatory elements. Due to the quadratic scaling of attention, previous Transformer-based genomic models have used 512 to 4k tokens as context (<0.001% of the human genome), significantly limiting the modeling...
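The context limit follows directly from attention's quadratic cost. A rough back-of-the-envelope comparison (constants are illustrative; projections and heads are ignored) shows both the compute blow-up and how little of a ~3.1-gigabase human genome a 4k window covers:

```python
def attention_cost(n: int, d: int) -> int:
    """Rough cost of self-attention: an n x n score matrix times head dimension d."""
    return n * n * d

def genome_coverage(context_tokens: int, genome_bases: int = 3_100_000_000) -> float:
    """Fraction of a genome one context window covers (single-base tokenization assumed)."""
    return context_tokens / genome_bases

# 16x more context costs 256x more attention compute:
print(attention_cost(4_096, 64) // attention_cost(256, 64))   # -> 256
# A 4k-token window sees about 0.00013% of the human genome:
print(f"{genome_coverage(4_096):.6%}")
```

This is why sub-quadratic operators (as in the long-convolution and state-space lines of work below) are the enabling ingredient for genome-scale context.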
State space models (SSMs) have demonstrated state-of-the-art sequence modeling performance in some modalities, but underperform attention in language modeling. Moreover, despite scaling nearly linearly in sequence length instead of quadratically, SSMs are still slower than Transformers due to poor hardware utilization. In this paper, we make progress on understanding the expressivity gap between SSMs and attention in language modeling, and on reducing the hardware barrier between them. First, we use synthetic language modeling tasks to understand this gap. We find that existing SSMs struggle...
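One such synthetic probe is associative recall: the model sees key-value pairs and must emit the value paired with a query key. A minimal task generator (token ranges and sizes are illustrative, not the paper's exact setup) might look like:

```python
import random

def make_associative_recall(n_pairs: int = 8, seed: int = 0):
    """Build a key-value sequence followed by a query key; the target is the
    value that followed that key earlier in the sequence."""
    rng = random.Random(seed)
    keys = rng.sample(range(100, 200), n_pairs)      # distinct key tokens
    values = [rng.randrange(0, 100) for _ in keys]   # value tokens (disjoint range)
    sequence = [tok for kv in zip(keys, values) for tok in kv]
    query = rng.choice(keys)
    target = values[keys.index(query)]
    return sequence + [query], target

seq, target = make_associative_recall()
# A model solves the task if, given seq, it predicts `target` as the next token.
```

Tasks like this isolate the "look up an earlier token" capability that attention handles trivially but vanilla SSMs struggle with.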
How do we choose when confronted with many alternatives? There is surprisingly little decision modelling work with large choice sets, despite their prevalence in everyday life. Even further, there is an apparent disconnect between research on small choice sets, supporting a process of gaze-driven evidence accumulation, and research on larger choice sets, arguing for models of optimal choice, satisficing, and hybrids of the two. Here, we bridge this divide by developing and comparing different versions of these models in a many-alternative value-based choice experiment with 9, 16, ...
Self-supervised learning techniques are celebrating immense success in natural language processing (NLP) by enabling models to learn from broad language data at unprecedented scales. Here, we aim to leverage the success of these techniques for mental state decoding, where researchers identify specific mental states (e.g., the experience of anger or joy) from brain activity. To this end, we devise a set of novel self-supervised learning frameworks for neuroimaging data, inspired by prominent learning frameworks in NLP. At their core, these frameworks learn the dynamics of brain activity by modeling sequences of activity akin to how sequences of text are modeled...
Choices are influenced by gaze allocation during deliberation, so that fixating an alternative for longer leads to an increased probability of choosing it. Gaze-dependent evidence accumulation provides a parsimonious account of choices, response times, and gaze behaviour in many simple decision scenarios. Here, we test whether this framework can also predict more complex context-dependent patterns of choice in a three-alternative risky choice task, where choices and eye movements were subject to attraction and compromise...
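Gaze-dependent evidence accumulation can be illustrated with a minimal simulation: items race to a bound, and at each step the value signal of every item that is not currently fixated is discounted. This is an illustrative sketch, not a fitted model from any of these papers; the discount parameter `theta`, the bound, and the noise level are all made-up values:

```python
import numpy as np

def simulate_choices(values, gaze_shares, theta=0.3, bound=100.0,
                     noise=1.0, n_trials=500, seed=1):
    """Race of noisy accumulators. Each step, one item is fixated (sampled
    from gaze_shares); the other items' value input is discounted by theta."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, float)
    n = len(values)
    wins = np.zeros(n, int)
    for _ in range(n_trials):
        evidence = np.zeros(n)
        while evidence.max() < bound:
            fix = rng.choice(n, p=gaze_shares)
            drift = np.where(np.arange(n) == fix, values, theta * values)
            evidence += drift + rng.normal(0.0, noise, n)
        wins[evidence.argmax()] += 1
    return wins / n_trials

# Two equally valued items, but item 0 receives 70% of gaze:
p = simulate_choices([1.0, 1.0], [0.7, 0.3])
# p[0] > p[1]: the longer-fixated item is chosen more often.
```

Even with identical values, the gaze asymmetry alone produces the empirically observed choice bias toward the longer-fixated alternative.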
Deep learning (DL) models find increasing application in mental state decoding, where researchers seek to understand the mapping between mental states (e.g., experiencing anger or joy) and brain activity by identifying those spatial and temporal features of brain activity that allow for accurately identifying (i.e., decoding) these states. Once a DL model has been trained to decode a set of mental states, neuroimaging researchers often make use of methods from explainable artificial intelligence research to understand the model's learned mappings between mental states and brain activity. Here, we...
Machine learning models are increasingly being scaled in both sequence length and model dimension to reach longer contexts and better performance. However, existing architectures such as Transformers scale quadratically along both of these axes. We ask: are there performant architectures that can scale sub-quadratically along sequence length and model dimension? We introduce Monarch Mixer (M2), a new architecture that uses the same sub-quadratic primitive along both sequence length and model dimension: Monarch matrices, a simple class of expressive structured matrices that captures many linear transforms and achieves...
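The sub-quadratic primitive can be sketched for the square case n = m²: a Monarch multiply interleaves two block-diagonal matrices with a fixed reshape-transpose permutation, costing O(n^1.5) multiply-adds instead of O(n²) for a dense matrix. This is a simplified illustration of the general block-diagonal/permute/block-diagonal recipe, not the library's implementation; block contents below are placeholders:

```python
import numpy as np

def monarch_multiply(x, left_blocks, right_blocks):
    """Apply a Monarch-form operator to a length-m*m vector x.
    left_blocks, right_blocks: (m, m, m) arrays holding m blocks of size m x m.
    Parameter count is 2*m^3 = 2*n^1.5 versus n^2 for a dense matrix."""
    m = left_blocks.shape[0]
    X = x.reshape(m, m)
    X = np.einsum('kij,kj->ki', left_blocks, X)    # first block-diagonal factor
    X = X.T                                        # fixed permutation (transpose)
    X = np.einsum('kij,kj->ki', right_blocks, X)   # second block-diagonal factor
    return X.T.reshape(-1)                         # undo the permutation

m = 4                                  # n = 16
I = np.stack([np.eye(m)] * m)          # identity blocks for a sanity check
x = np.arange(m * m, dtype=float)
assert np.allclose(monarch_multiply(x, I, I), x)   # identity blocks -> identity map
# Parameters: 2 * m**3 = 128 vs n**2 = 256 for a dense 16 x 16 matrix.
```

The same structure generalizes the FFT and other fast transforms, which is what makes it expressive enough to replace both the attention (sequence) and MLP (dimension) mixers.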
State space models (SSMs) have high performance on long sequence modeling tasks, but require sophisticated initialization techniques and specialized implementations for high quality and runtime performance. We study whether a simple alternative can match SSMs in performance and efficiency: directly learning long convolutions over the sequence. We find that a key requirement to achieving high performance is keeping the convolution kernels smooth. Simple interventions--such as squashing the kernel weights--result in smooth kernels and recover SSM performance on a range of tasks including the Long Range Arena, ...
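The two ingredients can be sketched in a few lines: an FFT-based long convolution, plus a soft-threshold "squash" applied to the kernel weights. The threshold value is illustrative and the exact regularization operator in the paper may differ; this is a minimal sketch of the idea:

```python
import numpy as np

def squash(kernel, lam=0.05):
    """Soft-threshold the kernel weights toward zero; zeroing small noisy taps
    smooths the kernel and its frequency response."""
    return np.sign(kernel) * np.maximum(np.abs(kernel) - lam, 0.0)

def long_conv(u, kernel):
    """Causal convolution of signal u with a same-length kernel via FFT:
    O(n log n) instead of O(n^2) for the direct sum."""
    n = len(u)
    fft_len = 2 * n                      # zero-pad to avoid circular wrap-around
    y = np.fft.irfft(np.fft.rfft(u, fft_len) * np.fft.rfft(kernel, fft_len), fft_len)
    return y[:n]

rng = np.random.default_rng(0)
u = rng.normal(size=64)
k = squash(rng.normal(scale=0.2, size=64))
y = long_conv(u, k)
assert np.allclose(y, np.convolve(u, k)[:64])   # matches direct causal convolution
```

Learning `kernel` directly (with the squash applied before each forward pass) is the "simple alternative" to an SSM: same O(n log n) sequence mixing, no state-space parameterization.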
Recent empirical findings have indicated that gaze allocation plays a crucial role in simple decision behaviour. Many of these findings point towards an influence of gaze onto the speed of evidence accumulation in an accumulation-to-bound process (resulting in generally higher choice probabilities for items that have been looked at longer). Further, researchers have shown that the strength of the association between gaze and choice behaviour is highly variable between individuals, encouraging future work to study this association on the individual level. However, few models exist...
How do we make simple consumer choices (e.g., deciding between an apple, an orange, and a banana)? Recent empirical evidence suggests a close link between choice behavior and eye movements at the group level, with generally higher choice probabilities for items that were looked at longer during the decision process. However, it is unclear how variable this effect is across individuals. Here, we investigate this question in a multialternative forced-choice experiment using a novel computational model that can be easily applied to...
The development of deep learning architectures is a resource-demanding process, due to a vast design space, long prototyping times, and the high compute costs associated with at-scale model training and evaluation. We set out to simplify this process by grounding it in an end-to-end mechanistic architecture design (MAD) pipeline, encompassing small-scale capability unit tests that are predictive of scaling laws. Through a suite of synthetic token manipulation tasks such as compression and recall, designed to probe specific capabilities, we...
Iterative improvement of model architectures is fundamental to deep learning: Transformers first enabled scaling, and recent advances in model hybridization have pushed the quality-efficiency frontier. However, optimizing architectures remains challenging and expensive. Current automated or manual approaches fall short, largely due to limited progress in the design of search spaces and the simplicity of the resulting patterns and heuristics. In this work, we propose a new approach for the synthesis of tailored architectures (STAR). Our approach combines a novel search space based on...
The analysis of large data sets can help to gain knowledge about specific organs or diseases, just as big data analysis does in many non-medical areas. This article aims to gain information from 3D volumes, i.e., the visual content of lung CT scans of a number of patients. In the case of the described data set, only little annotation is available: the patients were all part of an ongoing screening program, and besides age and gender no patient findings or information was available for this work. This scenario can happen regularly, as image data are produced and become available in increasingly large quantities...