Amandalynne Paullada

ORCID: 0000-0002-9585-0125
Research Areas
  • Topic Modeling
  • Explainable Artificial Intelligence (XAI)
  • Natural Language Processing Techniques
  • Text Readability and Simplification
  • Biomedical Text Mining and Ontologies
  • Adversarial Robustness in Machine Learning
  • Privacy-Preserving Technologies in Data
  • Human Mobility and Location-Based Analysis
  • Machine Learning and Data Classification
  • Social Media and Politics
  • Dental Research and COVID-19
  • Hate Speech and Cyberbullying Detection
  • Ethics and Social Impacts of AI
  • Advanced Malware Detection Techniques
  • Computational and Text Analysis Methods
  • Context-Aware Activity Recognition Systems
  • Advanced Text Analysis Techniques
  • Diversity and Career in Medicine
  • Personal Information Management and User Behavior
  • Advances in Oncology and Radiotherapy
  • Genetics, Bioinformatics, and Biomedical Research
  • Housing Market and Economics
  • Human-Automation Interaction and Safety
  • Authorship Attribution and Profiling
  • Urban, Neighborhood, and Segregation Studies

University of Washington
2017-2024

University of Washington Medical Center
2022

Seattle University
2022

Smart Information Flow Technologies (United States)
2016

There is a tendency across different subfields in AI to valorize a small collection of influential benchmarks. These benchmarks operate as stand-ins for a range of anointed common problems that are frequently framed as foundational milestones on the path towards flexible and generalizable AI systems. State-of-the-art performance on these benchmarks is widely understood as indicative of progress towards these long-term goals. In this position paper, we explore the limits of such benchmarks in order to reveal the construct validity issues in their framing as the functionally...

10.48550/arxiv.2111.15366 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Many approaches in biomedical informatics (BMI) rely on the ability to define, gather, and manipulate data to support health through a cyclical research-practice lifecycle. Researchers within this field are often fortunate to work closely with healthcare and public health systems to influence data generation and capture, and to have access to a vast amount of data. Informaticists also have the expertise to engage with stakeholders, develop new methods and applications, and influence policy. However, research and policy that explicitly seeks to address systemic drivers would...

10.1016/j.jbi.2024.104653 article EN cc-by-nc-nd Journal of Biomedical Informatics 2024-05-10

Racial discrimination has been a central driver of residential segregation for many decades, in the Seattle area as well as in the United States as a whole. In addition to redlining and restrictive housing covenants, housing advertisements included explicit racial language until 1968. Since then, housing patterns have remained racialized, despite overt forms of discrimination becoming less prevalent. In this paper, we use Structural Topic Models (STM) and qualitative analysis to investigate how contemporary rental listings from the Seattle-Tacoma...

10.1093/sf/soaa075 article EN Social Forces 2020-06-27

Previous work on classifying Twitter users’ political alignment has mainly focused on lexical and social network features. This study provides evidence that political affiliation is also reflected in features which have been previously overlooked: users’ discourse patterns (the proportion of Tweets that are retweets or replies) and their rate of use of capitalization and punctuation. We find robust differences between politically left- and right-leaning communities with respect to these discourse and sub-lexical features, although they are not enough...

10.18653/v1/w17-2909 article EN cc-by 2017-01-01
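The discourse and sub-lexical features named in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Tweet` record, the function names, and the exact rate definitions (uppercase letters over all letters, punctuation marks over all non-whitespace characters) are assumptions chosen for illustration.

```python
import string
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical record type; real Twitter data exposes these fields differently.
    text: str
    is_retweet: bool = False
    is_reply: bool = False


def discourse_features(tweets):
    """Proportion of a user's tweets that are retweets or replies."""
    n = len(tweets)
    if n == 0:
        return {"retweet_ratio": 0.0, "reply_ratio": 0.0}
    return {
        "retweet_ratio": sum(t.is_retweet for t in tweets) / n,
        "reply_ratio": sum(t.is_reply for t in tweets) / n,
    }


def sublexical_features(tweets):
    """Capitalization and punctuation rates over all of a user's tweet text."""
    text = "".join(t.text for t in tweets)
    letters = [c for c in text if c.isalpha()]
    non_space = [c for c in text if not c.isspace()]
    # Assumed definition: uppercase letters as a share of all letters.
    cap_rate = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    # Assumed definition: ASCII punctuation as a share of non-whitespace characters.
    punct_rate = (
        sum(c in string.punctuation for c in non_space) / len(non_space)
        if non_space
        else 0.0
    )
    return {"capitalization_rate": cap_rate, "punctuation_rate": punct_rate}


def user_features(tweets):
    """Combine both feature groups into one per-user feature dictionary."""
    return {**discourse_features(tweets), **sublexical_features(tweets)}
```

Calling `user_features` on one user's list of `Tweet` objects yields a small feature dictionary that could be passed to any standard classifier; the abstract does not say which classifier the study used.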

Inferring the nature of the relationships between biomedical entities from text is an important problem due to the difficulty of maintaining human-curated knowledge bases in rapidly evolving fields. Neural word embeddings have earned attention for an apparent ability to encode relational information. However, word embedding models that disregard syntax during training are limited in their ability to encode the structural relationships fundamental to cognitive theories of analogy. In this paper, we demonstrate the utility of encoding dependency structure in a model...

10.18653/v1/2020.bionlp-1.4 article EN cc-by 2020-01-01

We propose using a multilabel probing task to assess the morphosyntactic representations of multilingual word embeddings. This tweak on canonical probing makes it easy to explore morphosyntactic representations, both holistically and at the level of individual features (e.g., gender, number, case), and leads more naturally to the study of how language models handle co-occurring features (e.g., agreement phenomena). We demonstrate this task with multilingual BERT (Devlin et al., 2018), training probes for seven typologically diverse languages: Afrikaans, Croatian, Finnish,...

10.18653/v1/2021.findings-emnlp.382 article EN cc-by 2021-01-01

With the growth of Automatic Content Moderation (ACM) on widely used social media platforms, transparency into the design of moderation technology and policy is necessary for online communities to advocate for themselves when harms occur. In this work, we describe a suite of interactive modules to support the exploration of various aspects of this technology, particularly those components that rely on English models and datasets for hate speech detection, a subtask within ACM. We intend for this demo to support the stakeholders of ACM in investigating the definitions...

10.18653/v1/2022.hcinlp-1.2 article EN cc-by 2022-01-01

Many datasets contain personally identifiable information, or PII, which poses privacy risks to individuals. PII masking is commonly used to redact personal information such as names, addresses, and phone numbers from text data. Most modern PII masking pipelines involve machine learning algorithms. However, these systems may vary in performance, such that individuals from particular demographic groups bear a higher risk of having their personal information exposed. In this paper, we evaluate the performance of three off-the-shelf PII masking systems on...

10.48550/arxiv.2205.04505 preprint EN cc-by-sa arXiv (Cornell University) 2022-01-01

Many datasets contain personally identifiable information, or PII, which poses privacy risks to individuals. PII masking is commonly used to redact personal information such as names, addresses, and phone numbers from text data. Most modern PII masking pipelines involve machine learning algorithms. However, these systems may vary in performance, such that individuals from particular demographic groups bear a higher risk of having their personal information exposed. In this paper, we evaluate the performance of three off-the-shelf PII masking systems on...

10.18653/v1/2022.ltedi-1.10 article EN cc-by 2022-01-01

We introduce a multilabel probing task to assess the morphosyntactic representations of word embeddings from multilingual language models. We demonstrate this task with multilingual BERT (Devlin et al., 2018), training probes for seven typologically diverse languages of varying morphological complexity: Afrikaans, Croatian, Finnish, Hebrew, Korean, Spanish, and Turkish. Through this simple but robust paradigm, we show that multilingual BERT renders many morphosyntactic features easily and simultaneously extractable (e.g., gender, grammatical case,...

10.48550/arxiv.2104.08464 preprint EN cc-by arXiv (Cornell University) 2021-01-01