- Explainable Artificial Intelligence (XAI)
- Anomaly Detection Techniques and Applications
- Topic Modeling
- Domain Adaptation and Few-Shot Learning
- Advanced Neural Network Applications
- Privacy-Preserving Technologies in Data
- Gaussian Processes and Bayesian Inference
- Ethics and Social Impacts of AI
- Time Series Analysis and Forecasting
- Adversarial Robustness in Machine Learning
- Natural Language Processing Techniques
- Machine Learning in Materials Science
- Privacy, Security, and Data Protection
- Artificial Intelligence in Healthcare and Education
- Machine Learning in Healthcare
- Machine Learning and Data Classification
- Criminal Justice and Corrections Analysis
- Traffic and Road Safety
- Computability, Logic, AI Algorithms
- Data Stream Mining Techniques
- Speech and Dialogue Systems
- Machine Learning and Algorithms
- Advanced Graph Neural Networks
- Scientific Computing and Data Management
- Autonomous Vehicle Technology and Safety
University of Tübingen (2024)
TH Bingen University of Applied Sciences (2022-2023)
Stanford University (1990)
Heterogeneous tabular data are the most commonly used form of data and are essential for numerous critical and computationally demanding applications. On homogeneous data sets, deep neural networks have repeatedly shown excellent performance and have therefore been widely adopted. However, their adaptation to tabular data for inference or generation tasks remains challenging. To facilitate further progress in the field, this work provides an overview of state-of-the-art deep learning methods for tabular data. We categorize these methods into three groups:...
Tabular data is among the oldest and most ubiquitous forms of data. However, the generation of synthetic samples with the original data's characteristics remains a significant challenge for tabular data. While many generative models from the computer vision domain, such as variational autoencoders or generative adversarial networks, have been adapted for tabular data generation, less research has been directed towards recent transformer-based large language models (LLMs), which are also generative in nature. To this end, we propose GReaT (Generation of Realistic...
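As a rough illustration of the textual-encoding idea behind LLM-based tabular generation, the sketch below serializes rows into sentences an LLM can be fine-tuned on. The row format, feature shuffling, and column names are illustrative assumptions, not the exact GReaT pipeline:

```python
import random

# Hypothetical example rows; in an LLM-based pipeline each row is
# serialized into a natural-language sentence before fine-tuning.
rows = [
    {"age": 42, "income": 52000, "job": "teacher"},
    {"age": 29, "income": 61000, "job": "engineer"},
]

def encode_row(row: dict) -> str:
    # Shuffle the feature order so the model does not memorize a fixed
    # column ordering (a common trick in permutation-based encodings).
    items = list(row.items())
    random.shuffle(items)
    return ", ".join(f"{key} is {value}" for key, value in items)

for row in rows:
    print(encode_row(row))
# e.g. "income is 52000, job is teacher, age is 42"
# Sampling sentences from the fine-tuned LLM and parsing the
# "<key> is <value>" pairs back into columns yields synthetic rows.
```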
With a variety of local feature attribution methods being proposed in recent years, follow-up work has suggested several evaluation strategies. To assess the attribution quality across different techniques, the most popular among these strategies in the image domain use pixel perturbations. However, recent advances discovered that these strategies produce conflicting rankings and can be prohibitively expensive to compute. In this work, we present an information-theoretic analysis of evaluation strategies based on pixel perturbations. Our findings reveal that the results are strongly affected by...
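A minimal sketch of a pixel-perturbation evaluation, assuming a scalar-output model and a 2D attribution map; the `model`, `baseline` value, and step schedule below are placeholders, not the paper's protocol:

```python
import numpy as np

def perturbation_curve(model, image, attribution, k_steps=10, baseline=0.0):
    """Mask pixels in order of decreasing attribution and track the
    model's score; a steep drop suggests a faithful attribution map."""
    h, w = attribution.shape
    order = np.argsort(attribution.ravel())[::-1]  # most important first
    perturbed = image.copy()
    scores = [model(perturbed)]
    step = max(1, len(order) // k_steps)
    for i in range(0, len(order), step):
        ys, xs = np.unravel_index(order[i:i + step], (h, w))
        perturbed[ys, xs] = baseline  # replace pixels with a baseline value
        scores.append(model(perturbed))
    return np.array(scores)

# Toy usage: a "model" that sums the top-left quadrant of the image,
# and an attribution map that correctly highlights that quadrant.
model = lambda img: float(img[:14, :14].sum())
image = np.random.rand(28, 28)
attribution = np.zeros((28, 28))
attribution[:14, :14] = 1.0
print(perturbation_curve(model, image, attribution))
```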
Predicting the future trajectories of surrounding vehicles is an important challenge in automated driving, especially in highly interactive environments such as roundabouts. Many works approach the task with behavioral cloning: a single-step prediction model is established by learning a mapping from states to the corresponding actions from a fixed dataset. To achieve long-term trajectory prediction, the model is repeatedly executed. However, models learned by behavioral cloning are unable to compensate for accumulating errors that inevitably...
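The accumulating-error problem can be seen in a toy autoregressive rollout, where a slightly biased one-step model stands in for an imperfect behaviorally cloned policy; the dynamics and the bias term are invented for illustration:

```python
import numpy as np

def rollout(single_step_model, state, horizon):
    """Repeatedly apply a one-step prediction model to its own output.
    Small per-step errors compound because later inputs drift away
    from the states seen during training (covariate shift)."""
    trajectory = [state]
    for _ in range(horizon):
        state = single_step_model(state)
        trajectory.append(state)
    return np.array(trajectory)

# Toy example: the true dynamics rotate the state; the "learned" model
# carries a small constant bias per step.
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
true_step = lambda s: R @ s
learned_step = lambda s: R @ s + 0.01  # systematic single-step error

s0 = np.array([1.0, 0.0])
true_traj = rollout(true_step, s0, 50)
pred_traj = rollout(learned_step, s0, 50)
print(np.linalg.norm(true_traj - pred_traj, axis=1))  # error grows with horizon
```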
Explainable AI (XAI) is widely viewed as a sine qua non for ever-expanding AI research. A better understanding of the needs of XAI users, as well as human-centered evaluations of explainable models, are both a necessity and a challenge. In this paper, we explore how HCI and AI researchers conduct user studies in XAI applications based on a systematic literature review. After identifying and thoroughly analyzing 97 core papers with human-based XAI evaluations over the past five years, we categorize them along the measured characteristics of explanatory...
The streams of research on adversarial examples and counterfactual explanations have largely been growing independently. This has led to several recent works trying to elucidate their similarities and differences. Most prominently, it has been argued that adversarial examples, as opposed to counterfactual explanations, have a unique characteristic in that they lead to misclassification compared to the ground truth. However, the computational goals and methodologies employed by existing counterfactual explanation and adversarial example generation methods often lack alignment with this...
We examine machine learning models in a setup where individuals have the choice to share optional personal information with a decision-making system, as seen in modern insurance pricing models. Some users consent to their data being used whereas others object and keep their data undisclosed. In this work, we show that the decision not to share data can be considered information in itself that should be protected to respect users' privacy. This observation raises the overlooked problem of how to ensure that users who protect their data do not suffer any disadvantages as a result. To address...
We address the critical challenge of applying feature attribution methods to the transformer architecture, which dominates current applications in natural language processing and beyond. Traditional explainable AI (XAI) methods explicitly or implicitly rely on linear additive surrogate models to quantify the impact of input features on a model's output. In this work, we formally prove an alarming incompatibility: transformers are structurally incapable of aligning with popular surrogate models for feature attribution, undermining the grounding of these...
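For reference, the linear additive surrogate form assumed by popular attribution methods is the generic additive feature attribution template (e.g., the one underlying SHAP and LIME); the notation here is ours, not the paper's:

```latex
% Additive feature attribution: the surrogate g assigns one score \phi_i
% per feature and approximates the model f near the input of interest,
g(z) \;=\; \phi_0 + \sum_{i=1}^{d} \phi_i z_i, \qquad z \in \{0,1\}^d,
% where z_i indicates whether feature i is present and g(z) \approx f(x_z).
```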
We present GRACIE (Graph Recalibration and Adaptive Counterfactual Inspection and Explanation), a novel approach for generative classification and counterfactual explanations of dynamically changing graph data. We study these problems through the lens of generative classifiers. We propose a dynamic, self-supervised latent variable model that updates by identifying plausible counterfactuals for input graphs and recalibrating decision boundaries through contrastive optimization. Unlike prior work, we do not rely on linear separability between...
Psychological trauma can manifest following various distressing events and is captured in diverse online contexts. However, studies traditionally focus on a single aspect of trauma, often neglecting the transferability of findings across different scenarios. We address this gap by training language models of progressing complexity on trauma-related datasets, including genocide-related court data, a Reddit dataset on post-traumatic stress disorder (PTSD), counseling conversations, and Incel forum posts....
While retrieval augmented generation (RAG) has been shown to enhance the factuality of large language model (LLM) outputs, LLMs still suffer from hallucination, generating incorrect or irrelevant information. One common detection strategy involves prompting the LLM again to assess whether its response is grounded in the retrieved evidence, but this approach is costly. Alternatively, lightweight natural language inference (NLI) models can be used for efficient grounding verification at inference time. However, existing pre-trained...
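A hedged sketch of NLI-based grounding verification using an off-the-shelf MNLI model from Hugging Face; the model choice and the threshold are assumptions, not the paper's trained verifier:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Model choice is an assumption: any NLI model fine-tuned on MNLI works here.
name = "microsoft/deberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

def is_grounded(evidence: str, response: str, threshold: float = 0.5) -> bool:
    """Treat the retrieved evidence as the premise and the LLM response as
    the hypothesis; a high entailment probability suggests the response
    is grounded in the evidence."""
    inputs = tokenizer(evidence, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    # Look up which output index corresponds to the entailment label.
    entail_id = next(i for i, label in model.config.id2label.items()
                     if "entail" in label.lower())
    return probs[entail_id].item() >= threshold

print(is_grounded("The Eiffel Tower is in Paris.",
                  "The tower is located in Paris."))
```

A single forward pass through a small NLI model is far cheaper than re-prompting the LLM, which is the efficiency argument the abstract alludes to.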
We introduce a novel semi-supervised Graph Counterfactual Explainer (GCE) methodology, the Dynamic GRAph Counterfactual Explainer (DyGRACE). It leverages initial knowledge about the data distribution to search for valid counterfactuals while avoiding using information from potentially outdated decision functions in subsequent time steps. Employing two graph autoencoders (GAEs), DyGRACE learns the representation of each class in a binary classification scenario. The GAEs minimise the reconstruction error between the original graph and its...
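A schematic of the reconstruction-error idea, with tiny dense autoencoders standing in for the paper's graph autoencoders; the architecture and sizes are placeholders:

```python
import torch
import torch.nn as nn

class TinyGAE(nn.Module):
    """Minimal stand-in for a graph autoencoder: encodes a flattened
    adjacency matrix and reconstructs it (a real GAE would use GNN layers)."""
    def __init__(self, n_nodes: int, latent: int = 8):
        super().__init__()
        d = n_nodes * n_nodes
        self.enc = nn.Sequential(nn.Linear(d, latent), nn.ReLU())
        self.dec = nn.Sequential(nn.Linear(latent, d), nn.Sigmoid())

    def forward(self, adj):
        flat = adj.reshape(adj.shape[0], -1)
        return self.dec(self.enc(flat)).reshape(adj.shape)

def classify(adj, gae_class0, gae_class1):
    """One GAE per class: assign the class whose autoencoder reconstructs
    the graph with lower error. A counterfactual would then be a perturbed
    graph that the *other* class's GAE reconstructs well."""
    with torch.no_grad():
        err0 = ((gae_class0(adj) - adj) ** 2).mean().item()
        err1 = ((gae_class1(adj) - adj) ** 2).mean().item()
    return int(err1 < err0), (err0, err1)

adj = (torch.rand(1, 6, 6) > 0.5).float()  # toy binary adjacency matrix
g0, g1 = TinyGAE(6), TinyGAE(6)
print(classify(adj, g0, g1))
```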
As machine learning (ML) models are increasingly being deployed in high-stakes applications, policymakers have suggested tighter data protection regulations (e.g., GDPR, CCPA). One key principle is the "right to be forgotten", which gives users the right to have their data deleted. Another is the right to an actionable explanation, also known as algorithmic recourse, allowing users to reverse unfavorable decisions. To date, it is unknown whether these two principles can be operationalized simultaneously. Therefore, we introduce and study...
Many supervised machine learning tasks, such as future state prediction in dynamical systems, require precise modeling of a forecast's uncertainty. The Multiple Hypotheses Prediction (MHP) approach addresses this problem by providing several hypotheses that represent possible outcomes. Unfortunately, with the common l2 loss function, these hypotheses do not preserve the data distribution's characteristics. We propose an alternative loss for distribution-preserving MHP and review relevant theorems supporting our...
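For context, a common MHP training objective is a relaxed winner-takes-all loss over the hypotheses; the sketch below shows that standard baseline formulation, not the distribution-preserving alternative the paper proposes:

```python
import torch

def relaxed_wta_loss(hypotheses, target, eps=0.05):
    """Relaxed winner-takes-all objective often used for MHP: the hypothesis
    closest to the target receives most of the gradient weight (1 - eps),
    while the remaining hypotheses share a small weight eps."""
    # hypotheses: (K, D) predictions, target: (D,) observed outcome
    dists = ((hypotheses - target) ** 2).sum(dim=1)  # squared l2 per hypothesis
    k = hypotheses.shape[0]
    weights = torch.full((k,), eps / (k - 1))
    weights[dists.argmin()] = 1.0 - eps
    return (weights * dists).sum()

hyps = torch.randn(5, 2, requires_grad=True)   # 5 hypotheses in 2D
target = torch.tensor([0.5, -0.25])
loss = relaxed_wta_loss(hyps, target)
loss.backward()
print(loss.item(), hyps.grad.norm().item())
```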
We propose a novel and practical privacy notion called $f$-Membership Inference Privacy ($f$-MIP), which explicitly considers the capabilities of realistic adversaries under the membership inference attack threat model. Consequently, $f$-MIP offers interpretable privacy guarantees in addition to improved utility (e.g., better classification accuracy). In particular, we derive a parametric family of guarantees that we refer to as $\mu$-Gaussian Membership Inference Privacy ($\mu$-GMIP) by theoretically analyzing likelihood ratio-based attacks on...
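Assuming the notion mirrors the Gaussian trade-off functions of $f$-differential privacy (an analogy, not the paper's exact theorem), a $\mu$-parameterized guarantee of this kind bounds any membership attacker's power as:

```latex
% Gaussian trade-off bound: an attacker testing membership at
% false-positive rate \alpha achieves true-positive rate (power) at most
\mathrm{power}(\alpha) \;\le\; \Phi\!\big(\Phi^{-1}(\alpha) + \mu\big),
% where \Phi is the standard normal CDF; smaller \mu means stronger privacy.
```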
Interest in understanding and factorizing learned embedding spaces through conceptual explanations is steadily growing. When no human concept labels are available, concept discovery methods search trained embedding spaces for interpretable concepts, like object shape or color, that can provide post-hoc explanations of decisions. Unlike previous work, we argue that concept discovery should be identifiable, meaning that a number of known concepts can be provably recovered to guarantee the reliability of the explanations. As a starting point, we explicitly make the connection between classical...