Lucas Monteiro Paes

ORCID: 0000-0003-0129-1420
Research Areas
  • Adversarial Robustness in Machine Learning
  • Natural Language Processing Techniques
  • Hate Speech and Cyberbullying Detection
  • Ethics and Social Impacts of AI
  • Topic Modeling
  • Advanced Statistical Process Monitoring
  • Multimodal Machine Learning Applications
  • Statistical Methods and Inference
  • Pregnancy and preeclampsia studies
  • Explainable Artificial Intelligence (XAI)
  • Fetal and Pediatric Neurological Disorders
  • Imbalanced Data Classification Techniques
  • Neonatal and fetal brain pathology
  • Machine Learning and Data Classification
  • Semantic Web and Ontologies
  • Safety Systems Engineering in Autonomy

Harvard University
2023-2024

Google (United States)
2023

Harvard University Press
2023

In AI alignment, extensive latitude must be granted to annotators, either human or algorithmic, to judge which model outputs are `better' or `safer.' We refer to this as alignment discretion. Such discretion remains largely unexamined, posing two risks: (i) annotators may use their power of discretion arbitrarily, and (ii) models may fail to mimic this discretion. To study this phenomenon, we draw on legal concepts of discretion that structure how decision-making authority is conferred and exercised, particularly in cases where principles conflict...

10.48550/arxiv.2502.10441 preprint EN arXiv (Cornell University) 2025-02-10

Machine learning (ML) is widely used to moderate online content. Despite its scalability relative to human moderation, the use of ML introduces unique challenges to content moderation. One such challenge is predictive multiplicity: multiple competing models for content classification may perform equally well on average, yet assign conflicting predictions to the same content. This multiplicity can result from seemingly innocuous choices made during training, which do not meaningfully change the accuracy of the model, but nevertheless...

10.1145/3630106.3659036 article EN cc-by 2024 ACM Conference on Fairness, Accountability, and Transparency 2024-06-03
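
A minimal sketch of the predictive-multiplicity phenomenon described above, under illustrative assumptions (synthetic data, a small scikit-learn classifier); it trains several equally accurate models that differ only in random seed and counts how often they disagree on individual items. This is not the paper's implementation.

# Sketch: quantify predictive multiplicity as per-item disagreement among
# equally accurate models that differ only in their random seed.
# Data and model choices are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

models = [MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=s).fit(X_tr, y_tr)
          for s in range(5)]

preds = np.stack([m.predict(X_te) for m in models])          # shape: (n_models, n_items)
accs = [(p == y_te).mean() for p in preds]                    # near-identical average accuracy
ambiguity = (preds.min(axis=0) != preds.max(axis=0)).mean()   # items with conflicting labels

print("accuracies:", np.round(accs, 3))
print("fraction of items with conflicting predictions:", round(float(ambiguity), 3))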

Accurately predicting the volume of amniotic fluid is fundamental to assessing pregnancy risks, though the task usually requires many hours of laborious work by medical experts. In this paper, we present AmnioML, a machine learning solution that leverages deep learning and conformal prediction to output fast and accurate volume estimates and segmentation masks from fetal MRIs with a Dice coefficient over 0.9. Also, we make available a novel, curated dataset of fetal MRIs with 853 exams and benchmark the performance of recent deep learning architectures. In addition,...

10.1609/aaai.v37i13.26837 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2023-06-26
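
A minimal sketch of split conformal prediction for a scalar estimate (such as a volume), assuming synthetic data and a generic regressor; the paper's actual deep segmentation pipeline for fetal MRI is not reproduced here.

# Sketch: split conformal prediction producing calibrated intervals around a
# point estimate of a scalar quantity (e.g., a volume). Synthetic data and a
# generic regressor are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1500, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.3, size=1500)

fit, cal, test = slice(0, 800), slice(800, 1200), slice(1200, 1500)
model = GradientBoostingRegressor().fit(X[fit], y[fit])

alpha = 0.1                                             # target 90% coverage
scores = np.abs(y[cal] - model.predict(X[cal]))         # calibration residuals
q = np.quantile(scores, np.ceil((scores.size + 1) * (1 - alpha)) / scores.size)

pred = model.predict(X[test])
lower, upper = pred - q, pred + q
coverage = np.mean((y[test] >= lower) & (y[test] <= upper))
print("empirical coverage:", round(float(coverage), 3))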

The Rashomon effect in machine learning (ML) occurs when multiple distinct models achieve similar average loss on a given task. The set of all models with expected loss smaller than ϵ is called the Rashomon set. A characterization of this set for a given task allows searching for models that satisfy additional constraints (e.g., interpretability, fairness) without compromising accuracy. Though folklore treats the Rashomon set as a collection of indistinguishable "good" models, there are no established theoretical guarantees that these models are statistically indistinguishable. We fill...

10.1109/isit54713.2023.10206657 article EN 2022 IEEE International Symposium on Information Theory (ISIT) 2023-06-25
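
A minimal sketch of how an empirical ϵ-Rashomon set can be enumerated over a finite pool of candidates: keep every model whose test loss is within ϵ of the best one. The candidate pool, data, and ϵ below are illustrative assumptions, not the paper's theoretical construction.

# Sketch: empirical epsilon-Rashomon set over a finite candidate pool.
# Candidate models, data, and epsilon are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, n_features=15, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.4, random_state=1)

candidates = {
    "logreg_C=0.1": LogisticRegression(C=0.1, max_iter=1000),
    "logreg_C=1": LogisticRegression(C=1.0, max_iter=1000),
    "tree_depth=3": DecisionTreeClassifier(max_depth=3, random_state=1),
    "tree_depth=6": DecisionTreeClassifier(max_depth=6, random_state=1),
}
losses = {name: 1 - m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in candidates.items()}

eps = 0.01
best = min(losses.values())
rashomon_set = [name for name, loss in losses.items() if loss <= best + eps]
print("losses:", {k: round(v, 3) for k, v in losses.items()})
print("epsilon-Rashomon set:", rashomon_set)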

Machine learning (ML) is widely used to moderate online content. Despite its scalability relative to human moderation, the use of ML introduces unique challenges to content moderation. One such challenge is predictive multiplicity: multiple competing models for content classification may perform equally well on average, yet assign conflicting predictions to the same content. This multiplicity can result from seemingly innocuous choices during model development, such as random seed selection for parameter initialization. We...

10.48550/arxiv.2402.16979 preprint EN arXiv (Cornell University) 2024-02-26

Perturbation-based explanation methods such as LIME and SHAP are commonly applied to text classification. This work focuses on their extension to generative language models. To address the challenges of text as output and long inputs, we propose a general framework called MExGen that can be instantiated with different attribution algorithms. To handle text output, we introduce the notion of scalarizers for mapping text to real numbers and investigate multiple possibilities. To handle long inputs, we take a multi-level approach, proceeding from coarser levels...

10.48550/arxiv.2403.14459 preprint EN arXiv (Cornell University) 2024-03-21
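
A minimal sketch of the scalarizer idea from the abstract above: map generated text to a real number so a perturbation-based attribution can be applied to the input. The toy "generator" and overlap scalarizer below are illustrative stand-ins, not the MExGen implementation.

# Sketch: leave-one-word-out attribution for a text-to-text model, using a
# scalarizer that maps generated text to a real number (here, token overlap
# with the original output). The "generator" is a toy stand-in.
from collections import Counter

def toy_generate(prompt: str) -> str:
    # Stand-in for a generative language model: echoes content words.
    return " ".join(w for w in prompt.split() if len(w) > 3)

def overlap_scalarizer(reference: str, candidate: str) -> float:
    # Maps a generated text to a real number: fraction of reference tokens kept.
    ref, cand = Counter(reference.split()), Counter(candidate.split())
    kept = sum(min(ref[t], cand[t]) for t in ref)
    return kept / max(sum(ref.values()), 1)

prompt = "the committee approved the controversial funding proposal yesterday"
reference = toy_generate(prompt)

words = prompt.split()
scores = []
for i, w in enumerate(words):
    perturbed = " ".join(words[:i] + words[i + 1:])          # drop one word
    scores.append((w, 1 - overlap_scalarizer(reference, toy_generate(perturbed))))

for word, score in sorted(scores, key=lambda t: -t[1]):
    print(f"{word:>15s}  influence={score:.2f}")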

Machine learning (ML) models used in prediction and classification tasks may display performance disparities across population groups determined by sensitive attributes (e.g., race, sex, age). We consider the problem of evaluating the performance of a fixed ML model across groups defined by multiple sensitive attributes (e.g., race and sex). Here, the sample complexity for estimating the worst-case performance gap across groups (e.g., the largest difference in error rates) increases exponentially with the number of group-denoting attributes. To address this issue, we propose an approach to test for performance disparities based on...

10.1109/jsait.2024.3397741 article EN IEEE Journal on Selected Areas in Information Theory 2024-01-01
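
A minimal sketch showing why the worst-case gap is hard to estimate: the number of intersectional groups grows exponentially with the number of group-denoting attributes, and the brute-force estimator scans all of them. Data and attributes are synthetic assumptions; the paper's testing procedure is not reproduced here.

# Sketch: naive worst-case error-rate gap across intersectional groups defined
# by several binary attributes (2**k groups for k attributes). Synthetic data;
# this is the brute-force baseline, not the paper's test.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, k = 20_000, 4                                   # samples, number of binary attributes
attrs = rng.integers(0, 2, size=(n, k))            # group-denoting attributes
errors = rng.random(n) < 0.1 + 0.05 * attrs[:, 0]  # model errors, slightly skewed by attribute 0

rates = []
for combo in itertools.product([0, 1], repeat=k):  # 2**k intersectional groups
    mask = (attrs == np.array(combo)).all(axis=1)
    if mask.sum() > 0:
        rates.append(errors[mask].mean())

print("number of groups:", 2 ** k)
print("worst-case error-rate gap:", round(max(rates) - min(rates), 3))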

Feature attribution methods explain black-box machine learning (ML) models by assigning importance scores to input features. These methods can be computationally expensive for large ML models. To address this challenge, there have been increasing efforts to develop amortized explainers, where a model is trained to predict feature attribution scores with only one inference. Despite their efficiency, amortized explainers can produce inaccurate predictions and misleading explanations. In this paper, we propose selective explanations, a novel method...

10.48550/arxiv.2405.19562 preprint EN arXiv (Cornell University) 2024-05-29
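
A minimal sketch of the amortized-explainer setting that the abstract above builds on: a small regressor is trained to predict attribution scores in one pass, and the worst predictions are flagged for recomputation by the exact explainer. Everything here is an illustrative assumption, including using a linear model's per-feature contributions as the "exact" attributions; it is not the paper's selective-explanation method.

# Sketch: an amortized explainer trained to predict per-feature attributions in
# one pass, with a fallback to the exact explainer for its worst predictions.
# Exact attributions here are the per-feature contributions x_i * w_i of a
# linear model -- an illustrative assumption. (In a real system the selection
# score would have to be estimated, not computed from the exact attributions.)
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 8))
w = rng.normal(size=8)
y = X @ w + rng.normal(scale=0.1, size=4000)

black_box = LinearRegression().fit(X, y)
exact_attr = X * black_box.coef_                      # "expensive" exact attributions

amortized = MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
amortized.fit(X[:3000], exact_attr[:3000])            # learn to predict attributions

pred_attr = amortized.predict(X[3000:])
per_item_err = np.abs(pred_attr - exact_attr[3000:]).mean(axis=1)

budget = 0.2                                           # recompute exactly for the worst 20%
threshold = np.quantile(per_item_err, 1 - budget)
recompute = per_item_err > threshold                   # items routed to the exact explainer
print("mean amortized error:", round(float(per_item_err.mean()), 4))
print("items selected for exact recomputation:", int(recompute.sum()))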

Image search and retrieval tasks can perpetuate harmful stereotypes, erase cultural identities, and amplify social disparities. Current approaches to mitigate these representational harms balance the number of retrieved items across population groups defined by a small number of (often binary) attributes. However, most existing methods overlook intersectional groups determined by combinations of group attributes, such as gender, race, and ethnicity. We introduce Multi-Group Proportional Representation (MPR), a novel metric...

10.48550/arxiv.2407.08571 preprint EN arXiv (Cornell University) 2024-07-11
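
A minimal sketch of measuring representation over intersectional groups in a retrieved set: compare each group's share among retrieved items with its share in a reference population and report the largest deviation. The synthetic data and the deviation measure are illustrative stand-ins, not the MPR metric defined in the paper.

# Sketch: representation gap over intersectional groups in a retrieved set,
# compared against a reference population. Synthetic data; the deviation
# measure below is an illustrative stand-in for the paper's MPR metric.
import itertools
import numpy as np

rng = np.random.default_rng(0)
population = rng.integers(0, 2, size=(100_000, 3))     # three binary group attributes
retrieved = population[rng.choice(100_000, size=50, replace=False)]

gaps = {}
for combo in itertools.product([0, 1], repeat=3):       # intersectional groups
    combo = np.array(combo)
    pop_share = (population == combo).all(axis=1).mean()
    ret_share = (retrieved == combo).all(axis=1).mean()
    gaps[tuple(combo)] = ret_share - pop_share

worst = max(gaps, key=lambda g: abs(gaps[g]))
print("largest deviation from proportional representation:",
      worst, round(gaps[worst], 3))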

Machine learning (ML) models used in prediction and classification tasks may display performance disparities across population groups determined by sensitive attributes (e.g., race, sex, age). We consider the problem of evaluating the performance of a fixed ML model across groups defined by multiple sensitive attributes (e.g., race and sex). Here, the sample complexity for estimating the worst-case performance gap across groups (e.g., the largest difference in error rates) increases exponentially with the number of group-denoting attributes. To address this issue, we propose an approach to test for performance disparities based on...

10.48550/arxiv.2312.03867 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Text-to-image models take a sentence (i.e., prompt) and generate images associated with this input prompt. These models have created award-winning art, videos, and even synthetic datasets. However, text-to-image (T2I) models can generate images that underrepresent minorities based on race and sex. This paper investigates which words in the input prompt are responsible for bias in generated images. We introduce a method for computing scores for each word in the prompt; these scores represent its influence on biases in the model's output. Our method follows the principle of \emph{explaining...

10.48550/arxiv.2306.05500 preprint EN other-oa arXiv (Cornell University) 2023-01-01
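
A minimal sketch of the explaining-by-removing idea for prompts: score each word by how much a bias measure over the generated images changes when that word is dropped. The "generator" and bias measure below are toy stand-ins; the actual method operates on a real text-to-image model and its generated images.

# Sketch: word-level bias attribution by removal. Each word's score is the
# change in a bias measure when that word is dropped from the prompt.
# toy_generate_demographics and the bias measure are illustrative stand-ins
# for a text-to-image model plus a classifier over its generated images.
import numpy as np

def toy_generate_demographics(prompt: str, n_images: int = 200, seed: int = 0) -> np.ndarray:
    # Stand-in: returns 1 if a generated image depicts the majority group.
    # The word "doctor" is assumed (for illustration) to skew the outputs.
    p_majority = 0.5 + 0.3 * ("doctor" in prompt.lower())
    return np.random.default_rng(seed).random(n_images) < p_majority

def bias(prompt: str) -> float:
    # Deviation of the majority-group rate from parity (0.5).
    return abs(toy_generate_demographics(prompt).mean() - 0.5)

prompt = "a photo of a doctor smiling at the camera"
words = prompt.split()
baseline = bias(prompt)

for i, w in enumerate(words):
    reduced = " ".join(words[:i] + words[i + 1:])       # drop one word
    print(f"{w:>10s}  influence={baseline - bias(reduced):+.2f}")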