- Topic Modeling
- Natural Language Processing Techniques
- Domain Adaptation and Few-Shot Learning
- Speech Recognition and Synthesis
- Multimodal Machine Learning Applications
- Remote-Sensing Image Classification
- Reinforcement Learning in Robotics
- Ethics and Social Impacts of AI
- Adversarial Robustness in Machine Learning
- Advanced Bandit Algorithms Research
- Artificial Intelligence in Healthcare and Education
- Generative Adversarial Networks and Image Synthesis
- Music and Audio Processing
- Machine Learning in Healthcare
- Explainable Artificial Intelligence (XAI)
- Simulation Techniques and Applications
- Text Readability and Simplification
- Stellar, Planetary, and Galactic Studies
- Data Stream Mining Techniques
- Socioeconomic Development in MENA
- Speech and Audio Processing
- Advanced Vision and Imaging
- Hearing Loss and Rehabilitation
- Experimental Behavioral Economics Studies
- Fullerene Chemistry and Applications
- Stanford University (2019-2025)
- Chinese University of Hong Kong (2022)
- University of Michigan (2022)
- University of Minnesota (2022)
- Columbia University (2022)
- University of California, Santa Cruz (2022)
- Bauhaus-Universität Weimar (2022)
- Leipzig University (2022)
- University of Minnesota System (2022)
- North Carolina State University (2022)
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) to their technical principles (e.g., model architectures, training procedures, data, systems,...
On October 14th, 2020, researchers from OpenAI, the Stanford Institute for Human-Centered Artificial Intelligence, and other universities convened to discuss open research questions surrounding GPT-3, the largest publicly disclosed dense language model at the time. The meeting took place under Chatham House Rules. Discussants came from a variety of backgrounds including computer science, linguistics, philosophy, political science, communications, cyber policy, and more. Broadly, the discussion centered around two main...
Large language models (LLMs) may not equitably represent diverse global perspectives on societal issues. In this paper, we develop a quantitative framework to evaluate whose opinions model-generated responses are more similar to. We first build a dataset, GlobalOpinionQA, comprised of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across different countries. Next, we define a metric that quantifies the similarity between LLM-generated survey responses and human responses, conditioned...
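The similarity metric described above can be sketched with a standard distributional distance. A minimal illustration, assuming the metric is one minus the Jensen-Shannon divergence between the model's answer distribution and a country's human answer distribution (the function names and example distributions here are hypothetical):

```python
import math

def jensen_shannon_divergence(p, q):
    """JS divergence between two discrete distributions (base 2, so it lies in [0, 1])."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def opinion_similarity(model_dist, human_dist):
    """Similarity = 1 - JSD: 1.0 for identical answer distributions, lower for divergent ones."""
    return 1.0 - jensen_shannon_divergence(model_dist, human_dist)

# Model and (hypothetical) per-country human answer distributions over 3 answer options
model = [0.6, 0.3, 0.1]
country_a = [0.55, 0.35, 0.10]   # close to the model's distribution
country_b = [0.05, 0.15, 0.80]   # far from it
print(opinion_similarity(model, country_a) > opinion_similarity(model, country_b))  # True
```

Conditioning on country then reduces to computing this score once per country and comparing.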
There is growing consensus that language model (LM) developers should not be the sole deciders of LM behavior, creating a need for methods that enable the broader public to collectively shape the behavior of systems that affect them. To address this need, we present Collective Constitutional AI (CCAI): a multi-stage process for sourcing and integrating public input into LMs—from identifying a target population and eliciting principles to training and evaluating a model. We demonstrate the real-world practicality of this approach by creating what is, to our knowledge, the first...
How does language model pretraining help transfer learning? We consider a simple ablation technique for determining the impact of each pretrained layer on transfer task performance. This method, partial reinitialization, involves replacing different layers of a pretrained model with random weights, then finetuning the entire model and observing the change in performance. This technique reveals that in BERT, layers with high probing performance on downstream GLUE tasks are neither necessary nor sufficient for high accuracy on those tasks. Furthermore, the benefit of using pretrained parameters for a layer varies...
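The partial-reinitialization ablation can be sketched in a few lines. This is a toy, dependency-free version assuming a "model" is just a list of per-layer weight matrices; in the real study each ablated model would then be finetuned and its downstream performance recorded:

```python
import random

def partial_reinitialization(layers, reinit_from, seed=0):
    """Keep pretrained weights for layers [0, reinit_from); replace the remaining
    layers with random weights of the same shape (the partial-reinitialization ablation)."""
    rng = random.Random(seed)
    ablated = []
    for i, w in enumerate(layers):
        if i < reinit_from:
            ablated.append([row[:] for row in w])  # keep the pretrained layer
        else:
            ablated.append([[rng.gauss(0.0, 0.02) for _ in row] for row in w])
    return ablated

# Hypothetical 4-layer "pretrained model": one small 3x3 weight matrix per layer
pretrained = [[[float(i + 1)] * 3 for _ in range(3)] for i in range(4)]

# Ablation sweep: reinitialize everything from layer k upward, for each k
for k in range(len(pretrained) + 1):
    ablated = partial_reinitialization(pretrained, k)
    kept = sum(a == p for a, p in zip(ablated, pretrained))
    print(f"reinit from layer {k}: {kept} pretrained layers kept")
```

Comparing finetuned performance across values of `k` isolates how much each pretrained layer contributes.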
Drones are becoming ubiquitous and offer support to people in various tasks, such as photography, and increasingly in interactive social contexts. We introduce drone.io, a projected body-centric graphical user interface for human-drone interaction. Using two simple gestures, users can interact with a drone in a natural manner. drone.io is the first interface embedded on a drone to provide both input and output capabilities. This paper describes the design process of drone.io. We present a proof of concept, a drone-based implementation, as well...
When trying to gain better visibility into a machine learning model in order to understand and mitigate the associated risks, a potentially valuable source of evidence is: which training examples most contribute to a given behavior? Influence functions aim to answer a counterfactual: how would the model's parameters (and hence its outputs) change if a given sequence were added to the training set? While influence functions have produced insights for small models, they are difficult to scale to large language models (LLMs) due to the difficulty of computing an...
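The counterfactual described above has a closed form for small convex models, which makes the idea concrete before any LLM-scale approximation enters. A minimal sketch on ridge regression, where the Hessian is tiny and exact (the setup and variable names are illustrative, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)
lam, n = 0.1, 20

# Closed-form parameters and exact Hessian of the regularized objective
H = X.T @ X / n + lam * np.eye(3)
theta = np.linalg.solve(H, X.T @ y / n)

def grad_loss(x, t):
    """Gradient of the per-example squared loss at theta."""
    return (x @ theta - t) * x

# Influence of upweighting training example i on the loss at a test point:
#   I(i, test) = -grad_test . H^{-1} . grad_i
g_test = grad_loss(X[0], y[0])
influences = [-g_test @ np.linalg.solve(H, grad_loss(X[i], y[i])) for i in range(n)]

# Upweighting a copy of the test point can only reduce its own loss (H is positive definite)
print(influences[0] <= 0.0)  # True
```

For LLMs the hard part is exactly the `H^{-1}` term, which is why scalable approximations are needed.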
While maximizing expected return is the goal in most reinforcement learning approaches, risk-sensitive objectives such as conditional value at risk (CVaR) are more suitable for many high-stakes applications. However, relatively little is known about how to explore to quickly learn policies with good CVaR. In this paper, we present the first algorithm for sample-efficient learning of CVaR-optimal policies in Markov decision processes, based on the optimism in the face of uncertainty principle. This method relies on a novel optimistic version...
Medical data poses a daunting challenge for AI algorithms: it exists in many different modalities, experiences frequent distribution shifts, and suffers from a scarcity of examples and labels. Recent advances, including transformers and self-supervised learning, promise a more universal approach that can be applied flexibly across these diverse conditions. To measure and drive progress in this direction, we present BenchMD: a benchmark that tests how well unified, modality-agnostic methods, including architectures and training...
Many recent methods for unsupervised representation learning train models to be invariant to different "views," or distorted versions of an input. However, designing these views requires considerable trial and error by human experts, hindering widespread adoption across domains and modalities. To address this, we propose viewmaker networks: generative models that learn to produce useful views from a given input. Viewmakers are stochastic bounded adversaries: they generate views by generating and then adding an $\ell_p$-bounded perturbation to the...
Language exhibits structure at different scales, ranging from subwords to words, sentences, paragraphs, and documents. To what extent do deep models capture information at these scales, and can we force them to better capture structure across this hierarchy? We approach this question by focusing on individual neurons, analyzing the behavior of their activations at different timescales. We show that signal processing provides a natural framework for separating structure across scales, enabling us to 1) disentangle scale-specific information in existing embeddings and 2) train models to learn more...
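The signal-processing framing can be sketched directly: treat a neuron's activation sequence as a signal and split it into slow and fast components with a DFT band filter. A toy example with synthetic "topic-scale" and "token-scale" sinusoids (the cutoff and signals are illustrative, not the paper's exact filters):

```python
import numpy as np

t = np.arange(256)
slow = np.sin(2 * np.pi * t / 128.0)      # low-frequency "topic-scale" signal (2 cycles)
fast = 0.3 * np.sin(2 * np.pi * t / 4.0)  # high-frequency "token-scale" signal (64 cycles)
activation = slow + fast                  # the neuron's observed activation sequence

spectrum = np.fft.rfft(activation)
cutoff = 8  # frequency bins below this are treated as the slow component

low = spectrum.copy();  low[cutoff:] = 0
high = spectrum.copy(); high[:cutoff] = 0

slow_hat = np.fft.irfft(low, n=len(t))
fast_hat = np.fft.irfft(high, n=len(t))

# The band filter cleanly separates the two scales in this synthetic case
print(np.allclose(slow_hat, slow, atol=1e-8), np.allclose(fast_hat, fast, atol=1e-8))
```

Real activations mix scales less cleanly, but the same band-filtering operation defines the scale-specific components.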
Methods for designing organic materials with desired properties have high potential impact across fields such as medicine, renewable energy, petrochemical engineering, and agriculture. However, using generative modeling to design substances is difficult because candidate compounds must satisfy multiple constraints, including synthetic accessibility and other metrics that are intuitive to domain experts but challenging to quantify. We propose C5T5, a novel self-supervised pretraining method that enables...
Models can fail in unpredictable ways during deployment due to task ambiguity, when multiple behaviors are consistent with the provided training data. An example is an object classifier trained on red squares and blue circles: when encountering blue squares, the intended behavior is undefined. We investigate whether pretrained models are better active learners, capable of disambiguating between the possible tasks a user may be trying to specify. Intriguingly, we find that better active learning is an emergent property of the pretraining process:...
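The red-squares/blue-circles example can be written out in a few lines: two hypotheses fit the training data perfectly yet disagree on the held-out input, which is exactly the ambiguity an active learner would need to query about (the classifier functions here are illustrative):

```python
# Two hypotheses consistent with the training data (red squares -> 1, blue circles -> 0)
def classify_by_color(color, shape):
    return 1 if color == "red" else 0

def classify_by_shape(color, shape):
    return 1 if shape == "square" else 0

train = [("red", "square", 1), ("blue", "circle", 0)]
# Both hypotheses achieve perfect training accuracy...
assert all(classify_by_color(c, s) == y == classify_by_shape(c, s) for c, s, y in train)

# ...but on the unseen "blue square" they disagree: the task is ambiguous
print(classify_by_color("blue", "square"), classify_by_shape("blue", "square"))  # 0 1
```

An effective active learner would request a label for exactly this kind of hypothesis-separating example.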
Ignacio Cases, Clemens Rosenbaum, Matthew Riemer, Atticus Geiger, Tim Klinger, Alex Tamkin, Olivia Li, Sandhini Agarwal, Joshua D. Greene, Dan Jurafsky, Christopher Potts, Lauri Karttunen. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019.
Language models have recently achieved strong performance across a wide range of NLP benchmarks. However, unlike benchmarks, real-world tasks are often poorly specified, and agents must deduce the user's intended behavior from a combination of context, instructions, and examples. We investigate how both humans and models behave in the face of such task ambiguity by proposing AmbiBench, a new benchmark of six ambiguously-specified classification tasks. We evaluate humans and models on AmbiBench by seeing how well they identify the intended task using 1) instructions...
When we transfer a pretrained language model to a new language, there are many axes of variation that change at once. To disentangle the impact of different factors like syntactic similarity and vocabulary similarity, we propose a set of controlled transfer studies: we systematically transform the GLUE benchmark, altering one axis of crosslingual variation at a time, and then measure the resulting drops in a pretrained model's downstream performance. We find that models can largely recover from syntactic-style shifts, but cannot recover from misalignment of the embedding matrix...
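Two of the axes described above can be sketched as simple text transformations: one that alters syntax while keeping the vocabulary fixed, and one that swaps vocabulary one-to-one while keeping syntax fixed. These toy transforms are illustrative stand-ins for the paper's controlled transformations:

```python
def syntactic_shift(sentence):
    """One controlled axis: reverse word order, changing syntax
    while keeping the vocabulary (and embeddings) fixed."""
    return " ".join(reversed(sentence.split()))

def vocabulary_shift(sentence, mapping):
    """Another axis: substitute words one-to-one, changing the vocabulary
    (and embedding alignment) while keeping syntax fixed."""
    return " ".join(mapping.get(w, w) for w in sentence.split())

s = "the movie was great"
print(syntactic_shift(s))                                         # great was movie the
print(vocabulary_shift(s, {"movie": "film", "great": "superb"}))  # the film was superb
```

Applying one transform at a time to every example in a benchmark, then refinetuning, isolates each axis's contribution to the transfer gap.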
Self-supervised learning algorithms, including BERT and SimCLR, have enabled significant strides in fields like natural language processing, computer vision, and speech processing. However, these algorithms are domain-specific, meaning that new self-supervised learning algorithms must be developed for each new setting, including myriad healthcare, scientific, and multimodal domains. To catalyze progress toward domain-agnostic methods, we introduce DABS: a Domain-Agnostic Benchmark for Self-supervised learning. To perform well on DABS, an algorithm is...
Analysis of compressible turbulent flows is essential for applications related to propulsion, energy generation, and the environment. Here, we present BLASTNet 2.0, a 2.2 TB network-of-datasets containing 744 full-domain samples from 34 high-fidelity direct numerical simulations, which addresses the current limited availability of 3D reacting and non-reacting flow simulation data. With this data, we benchmark a total of 49 variations of five deep learning approaches for super-resolution - which can be applied to improving...
Language models (LMs) can be directed to perform target tasks by using labeled examples or natural language prompts. But selecting examples or writing prompts can be challenging--especially in tasks that involve unusual edge cases, demand precise articulation of nebulous preferences, or require an accurate mental model of LM behavior. We propose to use *LMs themselves* to guide the task specification process. In this paper, we introduce **Generative Active Task Elicitation (GATE)**: a learning framework in which models elicit and...
As language models (LMs) advance, interest is growing in applying them to high-stakes societal decisions, such as determining financing or housing eligibility. However, their potential for discrimination in such contexts raises ethical concerns, motivating the need for better methods to evaluate these risks. We present a method for proactively evaluating the potential discriminatory impact of LMs in a wide range of use cases, including hypothetical use cases where they have not yet been deployed. Specifically, we use an LM to generate an array...
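A natural way to quantify the discriminatory impact described above is to compare the model's decision probabilities when only a demographic attribute in the prompt changes. A simplified sketch using a log-odds difference (the function names and probabilities are hypothetical, not the paper's exact metric):

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1 - p))

def discrimination_score(p_group_a, p_group_b):
    """Sketch of a demographic-sensitivity score: difference in log-odds of a
    favorable decision when only the demographic attribute in the prompt changes.
    Zero means the attribute did not move the decision; larger magnitude means more impact."""
    return logit(p_group_a) - logit(p_group_b)

# Hypothetical model probabilities of a "yes" (approve) decision for two prompt variants
print(round(discrimination_score(0.80, 0.80), 3))  # 0.0 -> no measured discrimination
print(round(discrimination_score(0.80, 0.60), 3))  # positive -> favors group A
```

Averaging such scores over many generated decision scenarios and demographic variations yields an aggregate picture of where a model's decisions shift.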
How are AI assistants being used in the real world? While model providers in theory have a window into this impact via their users' data, both privacy concerns and practical challenges have made analyzing this data difficult. To address these issues, we present Clio (Claude insights and observations), a privacy-preserving platform that uses AI assistants themselves to analyze and surface aggregated usage patterns across millions of conversations, without the need for human reviewers to read raw conversations. We validate that this can be done...
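One building block of privacy-preserving aggregation like this is a minimum-cluster-size threshold: only topics with enough conversations are reported, so rare (and potentially identifying) topics never appear in the aggregate view. A minimal sketch of that one layer, with illustrative names and data (the real system combines several protections, including model-based summarization):

```python
def aggregate_topics(conversation_topics, min_cluster_size=5):
    """Report only topic clusters with at least `min_cluster_size` conversations;
    smaller clusters are suppressed from the aggregate view."""
    counts = {}
    for topic in conversation_topics:
        counts[topic] = counts.get(topic, 0) + 1
    return {t: c for t, c in counts.items() if c >= min_cluster_size}

# Hypothetical per-conversation topic labels (already extracted upstream)
topics = ["coding help"] * 9 + ["travel planning"] * 6 + ["rare personal issue"] * 2
print(aggregate_topics(topics))  # {'coding help': 9, 'travel planning': 6}
```

The threshold trades a small amount of coverage for the guarantee that no reported pattern traces back to a handful of users.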