NFDI4DS | UHH-SEMS - Publication Details

PaLM: Scaling Language Modeling with Pathways

OPENALEX - Publications

Aakanksha Chowdhery Sharan Narang Jacob Devlin Maarten Bosma Gaurav Mishra and 62 more

Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model PaLM. We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate...

10.48550/arxiv.2204.02311 preprint EN cc-by arXiv (Cornell University) 2022-01-01

LaMDA: Language Models for Dialog Applications

Romal Thoppilan Daniel De Freitas Jamie Hall Noam Shazeer Apoorv Kulshreshtha and 55 more

We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge,...

10.48550/arxiv.2201.08239 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations

Aida Mostafazadeh Davani Mark Díaz Vinodkumar Prabhakaran

Majority voting and averaging are common approaches used to resolve annotator disagreements and derive single ground truth labels from multiple annotations. However, annotators may systematically disagree with one another, often reflecting their individual biases and values, especially in the case of subjective tasks such as detecting affect, aggression, and hate speech. Annotator disagreements may capture important nuances in such tasks that are often ignored while aggregating annotations to a single ground truth. In order to address this, we investigate...

10.1162/tacl_a_00449 article EN cc-by Transactions of the Association for Computational Linguistics 2022-01-01

PaLM 2 Technical Report

Rohan Anil Andrew M. Dai Orhan Fırat Melvin Johnson Dmitry Lepikhin and 95 more

We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM. This improved efficiency enables broader deployment while also...

10.48550/arxiv.2305.10403 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

Power to the People? Opportunities and Challenges for Participatory AI

Abeba Birhane William M. Isaac Vinodkumar Prabhakaran Mark Díaz Madeleine Clare Elish and 2 more

Participatory approaches to artificial intelligence (AI) and machine learning (ML) are gaining momentum: the increased attention comes partly with the view that participation opens the gateway to an inclusive, equitable, robust, responsible and trustworthy AI. Among other benefits, participatory approaches are essential to understanding and adequately representing the needs, desires and perspectives of historically marginalized communities. However, there currently exists a lack of clarity on what meaningful participation entails and what it is expected to do. In...

10.1145/3551624.3555290 preprint EN 2022-10-06

Addressing Age-Related Bias in Sentiment Analysis

Mark Díaz Isaac Johnson Amanda Lazar Anne Marie Piper Darren Gergle

Computational approaches to text analysis are useful in understanding aspects of online interaction, such as opinions and subjectivity in text. Yet, recent studies have identified various forms of bias in language-based models, raising concerns about the risk of propagating social biases against certain groups based on sociodemographic factors (e.g., gender, race, geography). In this study, we contribute a systematic examination of the application of language models to study discourse on aging. We analyze the treatment...

10.1145/3173574.3173986 article EN 2018-04-20

On Releasing Annotator-Level Labels and Information in Datasets

Vinodkumar Prabhakaran Aida Mostafazadeh Davani Mark Díaz

A common practice in building NLP datasets, especially using crowd-sourced annotations, involves obtaining multiple annotator judgements on the same data instances, which are then flattened to produce a single “ground truth” label or score, through majority voting, averaging, or adjudication. While these approaches may be appropriate in certain annotation tasks, such aggregations overlook the socially constructed nature of human perceptions that annotations for relatively more subjective tasks are meant...

10.18653/v1/2021.law-1.14 article EN cc-by 2021-01-01

CrowdWorkSheets: Accounting for Individual and Collective Identities Underlying Crowdsourced Dataset Annotation

Mark Díaz Ian Kivlichan Rachel Rosen Dylan Baker Razvan Amironesei and 2 more

Human annotated data plays a crucial role in machine learning (ML) research and development. However, the ethical considerations around the processes and decisions that go into dataset annotation have not received nearly enough attention. In this paper, we survey an array of literature that provides insights into ethical considerations around crowdsourced dataset annotation. We synthesize these insights, and lay out the challenges in this space along two layers: (1) who the annotator is, and how the annotators' lived experiences can impact their annotations, and (2)...

10.1145/3531146.3534647 article EN 2022 ACM Conference on Fairness, Accountability, and Transparency 2022-06-20

The Illusion of Artificial Inclusion

William S. Agnew A. S. Bergman Jennifer Chien Mark Díaz Seliem El-Sayed and 3 more

Human participants play a central role in the development of modern artificial intelligence (AI) technology, in psychological science, and in user research. Recent advances in generative AI have attracted growing interest to the possibility of replacing human participants in these domains with AI surrogates. We survey several such "substitution proposals" to better understand the arguments for and against substituting human participants with AI. Our scoping review indicates that the recent wave of these proposals is motivated by goals such as reducing the costs of research and development work...

10.1145/3613904.3642703 article EN cc-by-sa 2024-05-11

"The cavalry ain't coming in to save us"

Jessa Dickinson Mark Díaz Christopher A. Le Dantec Sheena Erete

Cities are increasingly integrating sensing and information and communication technologies to improve municipal services, civic engagement, and quality of life for residents. Although these technologies have the potential to affect economic, social, and environmental factors, there has been less focus on residents of lower income communities' involvement in technology design. Based on two public forums held in underserved communities, we describe residents' perceptions of their communities and the challenges that limit technologies'...

10.1145/3359225 article EN Proceedings of the ACM on Human-Computer Interaction 2019-11-07

Going Gray, Failure to Hire, and the Ick Factor

Amanda Lazar Mark Díaz Robin Brewer Chelsea Kim Anne Marie Piper

Ageism is a pervasive, and often invisible, form of discrimination. Though it can affect people of all ages, older adults in particular face age-related stereotypes and bias in their everyday lives. In this paper, we describe the ways in which bloggers articulate a collective narrative on ageism as it appears in their lives, develop a community with anti-ageist interests, and discuss strategies to navigate and change societal views and institutions. Bloggers criticize stereotypical notions that focus exclusively on losses that occur with age...

10.1145/2998181.2998275 article EN 2017-02-14

Addressing Age-Related Bias in Sentiment Analysis

Mark Díaz Isaac Johnson Amanda Lazar Anne Marie Piper Darren Gergle

Recent studies have identified various forms of bias in language-based models, raising concerns about the risk of propagating social biases against certain groups based on sociodemographic factors (e.g., gender, race, geography). In this study, we analyze the treatment of age-related terms across 15 sentiment analysis models and 10 widely-used GloVe word embeddings and attempt to alleviate bias through a method of processing model training data. Our results show significant age bias is encoded in the outputs of many sentiment analysis algorithms...

10.24963/ijcai.2019/852 article EN 2019-07-28

STAR: SocioTechnical Approach to Red Teaming Language Models

Laura Weidinger John W. Mellor Bernat Guillén Pegueroles Nahema Marchal Ravin Kumar and 7 more

10.18653/v1/2024.emnlp-main.1200 article EN Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024-01-01

Platforming Intersectionality: Networked Solidarity and the Limits of Corporate Social Media

Aymar Jean Christian Faithe Day Mark Díaz Chelsea Peterson‐Salahuddin

How do historically marginalized narratives spread on social media platforms? Developing research in collaboration with intersectional artists and community, or what we call “platforming intersectionality,” can reveal the promise and limitations of social media for bridging disparate, segregated communities, or “networked solidarity.” Using case studies of indie TV series about show that intersectionality corporate platforms, but causes are largely visible outside both online offline. Basic conditions spreading...

10.1177/2056305120933301 article EN cc-by-nc Social Media + Society 2020-07-01

Whose Ground Truth? Accounting for Individual and Collective Identities Underlying Dataset Annotation

Emily Denton Mark Díaz Ian Kivlichan Vinodkumar Prabhakaran Rachel Rosen

Human annotations play a crucial role in machine learning (ML) research and development. However, the ethical considerations around the processes and decisions that go into building ML datasets have not received nearly enough attention. In this paper, we survey an array of literature that provides insights into ethical considerations around crowdsourced dataset annotation. We synthesize these insights, and lay out the challenges in this space along two layers: (1) who the annotator is, and how the annotators' lived experiences can impact their annotations, and (2)...

10.48550/arxiv.2112.04554 preprint EN cc-by arXiv (Cornell University) 2021-01-01

DICES Dataset: Diversity in Conversational AI Evaluation for Safety

Lora Aroyo Alex Taylor Mark Díaz Christopher M. Homan Alicia Parrish and 3 more

Machine learning approaches often require training and evaluation datasets with a clear separation between positive and negative examples. This risks simplifying and even obscuring the inherent subjectivity present in many tasks. Preserving such variance in content and diversity is expensive and laborious. This is especially troubling when building safety datasets for conversational AI systems, as safety is both socially and culturally situated. To demonstrate this crucial aspect of safety, and to facilitate in-depth model performance analyses,...

10.48550/arxiv.2306.11247 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Disentangling Perceptions of Offensiveness: Cultural and Moral Correlates

Aida Mostafazadeh Davani Mark Díaz Dylan Baker Vinodkumar Prabhakaran

Recent years have seen substantial investments in AI-based tools designed to detect offensive language at scale, aiming to moderate social media platforms, and to ensure safety of conversational AI technologies such as ChatGPT and Bard. These efforts largely treat this task as a technical endeavor, relying on data annotated for offensiveness by a global crowd workforce, without considering crowd workers' socio-cultural backgrounds or the values their perceptions reflect. Existing research that examines...

10.1145/3630106.3659021 article EN other-oa The 2024 ACM Conference on Fairness, Accountability, and Transparency 2024-06-03

GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives

Vinodkumar Prabhakaran Christopher M. Homan Lora Aroyo Aida Mostafazadeh Davani Alicia Parrish and 4 more

10.18653/v1/2024.naacl-long.190 article EN 2024-01-01

Responsible Crowdsourcing for Responsible Generative AI: Engaging Crowds in AI Auditing and Evaluation

Wesley Hanwen Deng Mireia Yurrita Mark Díaz Jina Suh Nick Judd and 4 more

With the rise of generative AI (GenAI), there has been an increased need for participation by large and diverse user bases in AI evaluation and auditing. GenAI developers are increasingly adopting crowdsourcing approaches to test and audit their AI products and services. However, it remains an open question how to design and deploy responsible and effective crowdsourcing pipelines for AI auditing and evaluation. This workshop aims to take a step towards bridging this gap. Our interdisciplinary team of organizers will work with workshop participants to explore...

10.1609/hcomp.v12i1.31609 article EN Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 2024-10-14

Understanding and Being Understood: User Strategies for Identifying and Recovering From Mistranslations in Machine Translation-Mediated Chat

Samantha Robertson Mark Díaz

Machine translation (MT) is now widely and freely available, and has the potential to greatly improve cross-lingual communication. In order to use MT reliably and safely, end users must be able to assess the quality of system outputs and determine how much they can rely on them to guide their decisions and actions. However, it can be difficult for users to detect and recover from mistranslations due to limited language skills. In this work we collected 19 MT-mediated role-play conversations in housing and employment scenarios, and conducted in-depth...

10.1145/3531146.3534638 article EN 2022 ACM Conference on Fairness, Accountability, and Transparency 2022-06-20

Inclusion of Underserved Residents in City Technology Planning

Jessa Dickinson Sheena Erete Mark Díaz Denise Linn Riedl

Cities are increasingly integrating urban technologies into their infrastructures to improve municipal services, civic engagement, and quality of life for residents. Research suggests that technologies implemented in underserved communities can worsen existing inequalities, yet there is little understanding of what underserved residents think about technologies or how they engage with cities in technology policies and practices. Based on two forums held in underserved communities, we found that residents are motivated to participate in city technology planning because they believe technology impacts the...

10.1145/3170427.3188583 article EN 2018-04-20

Accounting for Offensive Speech as a Practice of Resistance

Mark Díaz Razvan Amironesei Laura Weidinger Iason Gabriel

Tasks such as toxicity detection, hate speech detection, and online harassment detection have been developed for identifying interactions involving offensive speech. In this work we articulate the need for a relational understanding of offensiveness to help distinguish denotative offensive speech from offensive speech serving as a mechanism through which marginalized communities resist oppressive social norms. Using examples from the queer community, we argue that evaluations of offensive speech must focus on the impacts of language use. We call this the cynic perspective– or a characteristic...

10.18653/v1/2022.woah-1.18 article EN cc-by 2022-01-01

Whose Walkability?

Mark Díaz Nicholas Diakopoulos

The Walk Score is a patented algorithm for measuring the walkability of a given geographic area. In addition to its use in real estate, the accompanying API is used in a range of research in public health and urban development. This study explores how neighborhood residents differently understand the notion of walkability as well as the extent to which their personal definitions of walkability are reflected in the Walk Score's underlying algorithm. We find that, while the Walk Score generally aligns with residents' priorities around walkability, there are significant subjective aspects that...

10.1145/3359228 article EN Proceedings of the ACM on Human-Computer Interaction 2019-11-07

The Reasonable Effectiveness of Diverse Evaluation Data

Lora Aroyo Mark Díaz Christopher M. Homan Vinodkumar Prabhakaran Alex Taylor and 1 more

In this paper, we present findings from a semi-experimental exploration of rater diversity and its influence on safety annotations of conversations generated by humans talking to a generative AI-chat bot. We find significant differences in judgments produced by raters from different geographic regions and annotation platforms, and correlate these perspectives with demographic sub-groups. Our work helps define best practices in model development -- specifically human evaluation of generative models -- on the backdrop of growing...

10.48550/arxiv.2301.09406 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

The illusion of artificial inclusion

William S. Agnew A. S. Bergman Jennifer Chien Mark Díaz Seliem El-Sayed and 3 more

Human participants play a central role in the development of modern artificial intelligence (AI) technology, in psychological science, and in user research. Recent advances in generative AI have attracted growing interest to the possibility of replacing human participants in these domains with AI surrogates. We survey several such "substitution proposals" to better understand the arguments for and against substituting human participants with AI. Our scoping review indicates that the recent wave of these proposals is motivated by goals such as reducing the costs of research and development work...

10.48550/arxiv.2401.08572 preprint EN cc-by arXiv (Cornell University) 2024-01-01