NFDI4DS | UHH-SEMS - Publication Details

Toon Calders

ORCID: 0000-0002-4943-6978

About

Contact & Profiles

A5073211270

Research Areas

Data Mining Algorithms and Applications
Rough Sets and Fuzzy Logic
Data Management and Algorithms
Advanced Database Systems and Queries
Imbalanced Data Classification Techniques
Semantic Web and Ontologies
Data Quality and Management
Ethics and Social Impacts of AI
Machine Learning and Data Classification
Natural Language Processing Techniques
Explainable Artificial Intelligence (XAI)
Topic Modeling
Complex Network Analysis Techniques
Human Mobility and Location-Based Analysis
Data Stream Mining Techniques
Business Process Modeling and Analysis
Time Series Analysis and Forecasting
Online Learning and Analytics
Hate Speech and Cyberbullying Detection
Data Visualization and Analytics
Adversarial Robustness in Machine Learning
Graph Theory and Algorithms
Service-Oriented Architecture and Web Services
Anomaly Detection Techniques and Applications
Software System Performance and Reliability

University of Antwerp
2008-2024

Université Libre de Bruxelles
2013-2020

ZNA Middelheim Hospital
2016-2019

University of Bremen
2019

Département d'Informatique
2013-2015

Eindhoven University of Technology
2006-2013

Siemens (United States)
2012-2013

Tamedia (Switzerland)
2005-2011

Fund for Scientific Research
2002

Data preprocessing techniques for classification without discrimination

OPENALEX - Publications

Faisal Kamiran Toon Calders

Recently, the following Discrimination-Aware Classification Problem was introduced: Suppose we are given training data that exhibit unlawful discrimination; e.g., toward sensitive attributes such as gender or ethnicity. The task is to learn a classifier that optimizes accuracy, but does not have this discrimination in its predictions on test data. This problem is relevant in many settings, such as when the data are generated by a biased decision process or when the sensitive attribute serves as a proxy for unobserved features. In this paper, we concentrate...

10.1007/s10115-011-0463-8 article EN cc-by-nc Knowledge and Information Systems 2011-12-03

Three naive Bayes approaches for discrimination-free classification

Toon Calders Sicco Verwer

In this paper, we investigate how to modify the naive Bayes classifier in order to perform classification that is restricted to be independent with respect to a given sensitive attribute. Such independency restrictions occur naturally when the decision process leading to the labels in the data-set was biased; e.g., due to gender or racial discrimination. This setting is motivated by many cases in which there exist laws that disallow a decision that is partly based on discrimination. Naive application of machine learning techniques would result in huge fines for...

10.1007/s10618-010-0190-x article EN cc-by-nc Data Mining and Knowledge Discovery 2010-07-26

Building Classifiers with Independency Constraints

Toon Calders Faisal Kamiran Mykola Pechenizkiy

In this paper we study the problem of classifier learning where the input data contains unjustified dependencies between some data attributes and the class label. Such cases arise for example when the training data is collected from different sources with different labeling criteria or when the data is generated by a biased decision process. When a classifier is trained directly on such data, these undesirable dependencies will carry over to the classifier's predictions. In order to tackle this problem, we study the classification with independency constraints problem: find an accurate model which...

10.1109/icdmw.2009.83 article EN IEEE ... International Conference on Data Mining workshops 2009-12-01

Classifying without discriminating

Faisal Kamiran Toon Calders

Classification models usually make predictions on the basis of training data. If the training data is biased towards certain groups or classes of objects, e.g., there is racial discrimination towards black people, the learned model will also show discriminatory behavior towards that particular community. This partial attitude of the learned model may lead to biased outcomes when labeling future unlabeled data objects. Often, however, impartial classification results are desired or even required by law for future data objects in spite of having biased training data. In this paper, we tackle this problem...

10.1109/ic4.2009.4909197 article EN 2009-02-01

Discrimination Aware Decision Tree Learning

Faisal Kamiran Toon Calders Mykola Pechenizkiy

Recently, the following discrimination aware classification problem was introduced: given a labeled dataset and an attribute B, find a classifier with high predictive accuracy that at the same time does not discriminate on the basis of the given attribute B. This problem is motivated by the fact that often available historic data is biased due to discrimination, e.g., when B denotes ethnicity. Using the standard learners on this data may lead to wrongfully biased classifiers, even if the attribute B is removed from training data. Existing solutions for this problem consist in "cleaning away"...

10.1109/icdm.2010.50 article EN 2010-12-01

Handling Conditional Discrimination

Indrė Žliobaitė Faisal Kamiran Toon Calders

Historical data used for supervised learning may contain discrimination. We study how to train classifiers on such data, so that they are discrimination free with respect to a given sensitive attribute, e.g., gender. Existing techniques that deal with this problem aim at removing all discrimination and do not take into account that part of the discrimination may be explainable by other attributes, such as, e.g., education level. In this context, we introduce and analyze the issue of conditional non-discrimination in classifier design. We show that some of the differences in decisions...

10.1109/icdm.2011.72 article EN 2011-12-01

Controlling Attribute Effect in Linear Regression

Toon Calders Asim Karim Faisal Kamiran Wasif Ali Xiangliang Zhang

In data mining we often have to learn from biased data, because, for instance, data comes from different batches or there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used for factoring out the part of the bias that can be justified by externally provided explanatory attributes. Then...

10.1109/icdm.2013.114 article EN 2013-12-01

Mining Compressing Sequential Patterns

Hoang Thanh Lam Fabian Mörchen Dmitriy Fradkin Toon Calders

Pattern mining based on data compression has been successfully applied in many data mining tasks. For itemset data, the Krimp algorithm based on the minimum description length (MDL) principle was shown to be very effective in solving the redundancy issue in descriptive pattern mining. However, for sequence data, the redundancy issue of the set of frequent sequential patterns is not fully addressed in the literature. In this article, we study MDL‐based algorithms for mining non‐redundant sets of sequential patterns from a sequence database. First, we propose an encoding scheme for compressing sequence data with sequential patterns....

10.1002/sam.11192 article EN Statistical Analysis and Data Mining: The ASA Data Science Journal 2013-05-23

Measuring Fairness with Biased Rulers: A Comparative Study on Bias Metrics for Pre-trained Language Models

Pieter Delobelle Ewoenam Kwaku Tokpo Toon Calders Bettina Berendt

Pieter Delobelle, Ewoenam Tokpo, Toon Calders, Bettina Berendt. Proceedings of the 2022 Conference North American Chapter Association for Computational Linguistics: Human Language Technologies. 2022.

10.18653/v1/2022.naacl-main.122 article EN cc-by Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2022-01-01

Non-derivable itemset mining

Toon Calders Bart Goethals

All frequent itemset mining algorithms rely heavily on the monotonicity principle for pruning. This principle allows for excluding candidate itemsets from the expensive counting phase. In this paper, we present sound and complete deduction rules to derive bounds on the support of an itemset. Based on these deduction rules, we construct a condensed representation of all frequent itemsets, by removing those itemsets for which the support can be derived, resulting in the so called Non-Derivable Itemsets (NDI) representation. We also present connections between our proposal and recent...

10.1007/s10618-006-0054-6 article EN cc-by-nc Data Mining and Knowledge Discovery 2007-01-25
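
The monotonicity principle mentioned in the abstract above says that every subset of a frequent itemset is itself frequent, so a candidate with an infrequent subset can be skipped before the counting phase. The following is a minimal level-wise sketch of that pruning only; it is not the NDI deduction rules from the paper, and the function names `support` and `frequent_itemsets` are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, minsup):
    """Level-wise search; monotonicity prunes candidates before counting.

    Illustrative sketch only (hypothetical helper, not the paper's method).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    level = {frozenset([i]) for i in items}
    frequent = {}
    while level:
        # Counting phase: only candidates that survived pruning are counted.
        current = {}
        for candidate in level:
            s = support(candidate, transactions)
            if s >= minsup:
                current[candidate] = s
        frequent.update(current)
        # Candidate generation: join frequent k-sets into (k+1)-sets.
        k = len(next(iter(current))) if current else 0
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) != k + 1:
                continue
            # Monotonicity pruning: every k-subset must already be frequent.
            if all(frozenset(s) in current for s in combinations(union, k)):
                candidates.add(union)
        level = candidates
    return frequent
```

The function returns a dictionary that maps each frequent itemset (as a `frozenset`) to its support count.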

Quantifying explainable discrimination and removing illegal discrimination in automated decision making

Faisal Kamiran Indrė Žliobaitė Toon Calders

10.1007/s10115-012-0584-8 article EN Knowledge and Information Systems 2012-11-17

Introduction to the special section on educational data mining

Toon Calders Mykola Pechenizkiy

Educational Data Mining (EDM) is an emerging multidisciplinary research area, in which methods and techniques for exploring data originating from various educational information systems have been developed. EDM is both a learning science, as well as a rich application area for data mining, due to the growing availability of educational data. EDM contributes to the study of how students learn, and the settings in which they learn. It enables data-driven decision making for improving the current educational practice and learning material. We present a brief overview of EDM and introduce four...

10.1145/2207243.2207245 article EN ACM SIGKDD Explorations Newsletter 2012-05-01

Approximation of Frequentness Probability of Itemsets in Uncertain Data

Toon Calders Calin Garboni Bart Goethals

Mining frequent item sets from transactional datasets is a well known problem with good algorithmic solutions. Most of these algorithms assume that the input data is free from errors. Real data, however, is often affected by noise. Such noise can be represented by uncertain datasets in which each item has an existence probability. Recently, Bernecker et al. (2009) proposed the frequentness probability, i.e., the probability that a given item set is frequent, to select item sets in an uncertain database. A dynamic programming approach to evaluate this measure was as...

10.1109/icdm.2010.42 article EN 2010-12-01
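
The frequentness probability described in the abstract above can be illustrated with a small dynamic program. The sketch below assumes that each transaction contains the item set independently with a known probability (an assumption about the uncertain data model, not a detail taken from the abstract). It computes the exact probability that the support reaches a threshold; it does not implement the approximation the paper proposes, and the name `frequentness_probability` is hypothetical.

```python
def frequentness_probability(probs, minsup):
    """Return P(support >= minsup) for one item set.

    probs[i] is the probability that transaction i contains the item set;
    transactions are assumed independent (illustrative assumption).
    """
    # dist[k] = probability that exactly k of the transactions processed
    # so far contain the item set.
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)  # transaction does not contain the set
            nxt[k + 1] += q * p      # transaction contains the set
        dist = nxt
    # Sum the tail of the support distribution from minsup upwards.
    return sum(dist[minsup:])
```

This version keeps the full support distribution, which costs time quadratic in the number of transactions.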

Mining frequent itemsets in a stream

Toon Calders Nele Dexters Joris J. M. Gillis Bart Goethals

10.1016/j.is.2012.01.005 article EN Information Systems 2012-01-30

Applying Webmining Techniques to Execution Traces to Support the Program Comprehension Process

Andy Zaidman Toon Calders Serge Demeyer Jan Paredaens

Well-designed object-oriented programs typically consist of a few key classes that work tightly together to provide the bulk of the functionality. As such, these key classes are excellent starting points for the program comprehension process. We propose a technique that uses Webmining principles on execution traces to discover these important and tightly interacting classes. Based on two medium-scale case studies - Apache Ant and Jakarta JMeter - and detailed architectural information from its developers, we show that our heuristic does in fact find...

10.1109/csmr.2005.12 article EN 2005-03-31

Mining Frequent Itemsets in a Stream

Toon Calders Nele Dexters Bart Goethals

Mining frequent itemsets in a datastream proves to be a difficult problem, as itemsets arrive in rapid succession and storing parts of the stream is typically impossible. Nonetheless, it has many useful applications; e.g., opinion and sentiment analysis from social networks. Current stream mining algorithms are based on approximations. In earlier work, mining frequent items in a stream under the max-frequency measure proved to be effective for items. In this paper, we extended our work from items to itemsets. Firstly, an optimized incremental algorithm is presented. The...

10.1109/icdm.2007.66 article EN 2007-10-01
