- Privacy-Preserving Technologies in Data
- Mobile Crowdsensing and Crowdsourcing
- Privacy, Security, and Data Protection
- Adversarial Robustness in Machine Learning
- Cryptography and Data Security
- Topic Modeling
- Auction Theory and Applications
- Internet Traffic Analysis and Secure E-voting
- Anomaly Detection Techniques and Applications
- Machine Learning and Data Classification
- Stochastic Gradient Optimization Techniques
- Data Quality and Management
- Authorship Attribution and Profiling
- Open Source Software Innovations
- Experimental Behavioral Economics Studies
- Explainable Artificial Intelligence (XAI)
- Machine Learning and Algorithms
- Natural Language Processing Techniques
- Digital and Cyber Forensics
- Data Stream Mining Techniques
- Web Data Mining and Analysis
- Software Testing and Debugging Techniques
- Electronic Health Records Systems
- Advanced Neural Network Applications
- VLSI and Analog Circuit Testing
Amazon (Germany)
2019-2021
Amazon (United States)
2020-2021
University of Southampton
2014-2019
King's College London
2011
Accurately learning from user data while providing quantifiable privacy guarantees provides an opportunity to build better ML models while maintaining user trust. This paper presents a formal approach to carrying out privacy-preserving text perturbation using the notion of d_χ-privacy, originally designed to achieve geo-indistinguishability in location data. Our approach applies carefully calibrated noise to the vector representation of words in a high-dimensional space as defined by word embedding models. We present a proof that the mechanism satisfies d_χ-privacy, where the privacy parameter...
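The mechanism described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the noise here has density proportional to exp(-eps * ||z||) (uniform direction, Gamma-distributed magnitude), and the function and variable names are invented for the example.

```python
import numpy as np

def sample_dx_noise(eps, d, rng):
    # Noise with density proportional to exp(-eps * ||z||):
    # a uniform direction on the unit sphere scaled by a
    # Gamma(d, 1/eps)-distributed magnitude.
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    magnitude = rng.gamma(shape=d, scale=1.0 / eps)
    return magnitude * direction

def perturb_word(word, embeddings, vocab, eps, rng):
    # Add calibrated noise to the word's embedding, then release the
    # nearest vocabulary word to the noised vector.
    noised = embeddings[vocab.index(word)] + sample_dx_noise(
        eps, embeddings.shape[1], rng
    )
    dists = np.linalg.norm(embeddings - noised, axis=1)
    return vocab[int(np.argmin(dists))]
```

With a large eps the noise is small and the mechanism usually returns the input word; with a small eps the output is a nearby, but possibly different, word.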
Crowdsourcing via paid microtasks has been successfully applied in a plethora of domains and tasks. Previous efforts for making such crowdsourcing more effective have considered aspects as diverse as task workflow design, spam detection, quality control, and pricing models. Our work expands upon these efforts by examining the potential of adding gamification to microtask interfaces as a means of improving both worker engagement and effectiveness. We run a series of experiments in image labeling, one of the most common use cases...
Guaranteeing a certain level of user privacy in an arbitrary piece of text is a challenging issue. However, with this challenge comes the potential of unlocking access to vast data stores for training machine learning models and supporting data-driven decisions. We address this problem through the lens of d_χ-privacy, a generalization of Differential Privacy to non-Hamming distance metrics. In this work, we explore word representations in Hyperbolic space as a means of preserving privacy in text. We provide a proof of satisfying d_χ-privacy, then define a probability...
Balancing the privacy-utility tradeoff is a crucial requirement of many practical machine learning systems that deal with sensitive customer data. A popular approach for privacy-preserving text analysis is noise injection, in which text data is first mapped into a continuous embedding space, perturbed by sampling spherical noise from an appropriate distribution, and then projected back to the discrete vocabulary space. While this allows the perturbation to admit the required metric differential privacy, often the utility...
Paid microtask crowdsourcing has traditionally been approached as an individual activity, with units of work created and completed independently by the members of the crowd. Other forms of crowdsourcing have, however, embraced more varied models, which allow for a greater level of participant interaction and collaboration. This article studies the feasibility and uptake of such an approach in the context of paid microtasks. Specifically, we compare the engagement, task output, and accuracy of a paired-worker model against the traditional, single-worker version. Our...
Privacy-preserving data analysis has become essential in Machine Learning (ML), where access to vast amounts of data can provide large gains in the accuracies of tuned models. A proportion of user-contributed data comes from natural language, e.g., text transcriptions from voice assistants. It is therefore important for curated datasets to preserve the privacy of the users whose data is collected, and for models trained on sensitive data to only retain non-identifying (i.e., generalizable) information. The workshop aims to bring together researchers...
Ensuring strong theoretical privacy guarantees on text data is a challenging problem which is usually attained at the expense of utility. However, to improve the practicality of privacy-preserving analyses, it is essential to design algorithms that better optimize this tradeoff. To address this challenge, we propose a release mechanism that takes any (text) embedding vector as input and releases a corresponding private vector. The mechanism satisfies an extension of differential privacy to metric spaces. Our idea is based on first randomly projecting the vectors...
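The release mechanism described above, project randomly to a lower dimension and then privatize, can be sketched as follows. This is a hedged illustration, not the paper's algorithm: the Johnson-Lindenstrauss-style projection, the metric-DP noise distribution, and all names are assumptions for the example.

```python
import numpy as np

def private_release(x, k, eps, rng):
    # Randomly project the d-dimensional embedding x down to k dimensions
    # (Johnson-Lindenstrauss-style Gaussian projection), then add noise with
    # density proportional to exp(-eps * ||z||) in the low-dimensional space
    # before releasing the private vector.
    d = x.shape[0]
    P = rng.normal(size=(k, d)) / np.sqrt(k)  # random projection matrix
    z = rng.normal(size=k)
    z /= np.linalg.norm(z)                    # uniform direction
    z *= rng.gamma(shape=k, scale=1.0 / eps)  # Gamma-distributed magnitude
    return P @ x + z
```

Projecting first reduces the dimension in which noise must be calibrated, which is one way such mechanisms trade privacy cost against utility.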
Differentially-private mechanisms for text generation typically add carefully calibrated noise to input words and use the nearest neighbor of the noised embedding as the output word. When the noise is small in magnitude, these mechanisms are susceptible to reconstruction of the original sensitive text. This is because the nearest neighbor of the noised embedding is likely to be the input word itself. To mitigate this empirical privacy risk, we propose a novel class of differentially private mechanisms that parameterizes the nearest-neighbor selection criterion of traditional mechanisms. Motivated by the Vickrey auction, where only the second highest...
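The selection step this abstract motivates, sometimes releasing the second-nearest word instead of the nearest one, can be sketched as below. This is a minimal sketch under assumptions: the weighting formula `p_second` and all names are hypothetical, chosen only so that a tuning parameter t=0 recovers the traditional nearest-neighbor choice and t=1 always picks the runner-up.

```python
import numpy as np

def vickrey_select(noised, embeddings, vocab, t, rng):
    # Instead of always releasing the nearest neighbor of the noised vector,
    # flip a biased coin between the two nearest words; t in [0, 1] tunes how
    # often the second-nearest word (the "runner-up") is chosen.
    dists = np.linalg.norm(embeddings - noised, axis=1)
    first, second = np.argsort(dists)[:2]
    d1, d2 = dists[first], dists[second]
    # Hypothetical weighting: t = 0 -> always nearest, t = 1 -> always second.
    p_second = t * d1 / (t * d1 + (1 - t) * d2)
    idx = second if rng.random() < p_second else first
    return vocab[int(idx)]
```

Pushing probability mass away from the exact nearest neighbor reduces the chance that the released word is simply the sensitive input word.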
Ensuring the privacy of users whose data are used to train Natural Language Processing (NLP) models is necessary to build and maintain customer trust. Differential Privacy (DP) has emerged as the most successful method to protect the privacy of individuals. However, applying DP to the NLP domain comes with unique challenges. The previous methods use a generalization of DP for metric spaces, and apply privatization by adding noise to inputs in the space of word embeddings. However, these methods assume that one specific distance measure is being used, and ignore...
In this article, we aim to gain a better understanding of how paid microtask crowdsourcing could leverage its appeal and scaling power by using contests to boost crowd performance and engagement. We introduce our microtask-based annotation platform Wordsmith, which features incentives such as points, leaderboards, and badges on top of financial remuneration. Our analysis focuses on a particular type of incentive, contests, as a means to apply crowdsourcing in near-real-time scenarios, in which requesters need labels quickly. We model...
Deep Neural Networks, despite their success in diverse domains, are provably sensitive to small perturbations, which cause the models to return erroneous predictions under minor transformations of the input. Recently, it was proposed that this effect can be addressed in the text domain by optimizing for the worst-case loss function over all possible word substitutions within the training examples. However, this approach is prone to weighing semantically unlikely replacements higher, resulting in accuracy loss. In this paper, we study the robustness...
Hybrid annotation techniques have emerged as a promising approach to carry out named entity recognition on noisy microposts. In this paper, we identify a set of content and crowdsourcing-related features (number and type of entities in a post, average length and sentiment of tweets, composition of skipped tweets, time spent to complete the tasks, and interaction with the user interface) and analyse their impact on correct and incorrect human annotations. We then carried out further studies on extended instructions and disambiguation guidelines as factors...
In this paper, we address the problem of finding Named Entities in very large micropost datasets. We propose methods to generate a representative sample of microposts by discovering tweets that are likely to refer to new entities. Our approach is able to significantly speed up the semantic analysis process by discarding retweets, tweets without pre-identifiable entities, as well as similar and redundant tweets, while retaining the information content.
Ivan Habernal, Fatemehsadat Mireshghallah, Patricia Thaine, Sepideh Ghanavati, Oluwaseyi Feyisetan. Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial Abstracts. 2023.
The existing algorithms for identifying neurons responsible for undesired and harmful behaviors do not consider the effects of confounders such as the topic of conversation. In this work, we show that confounders can create spurious correlations, and we propose a new causal mediation approach that controls for the impact of the topic. In experiments with two large language models, we study the localization hypothesis and show that, after adjusting for the effect of conversation topic, toxicity becomes less localized.
We investigate the use of in-context learning and prompt engineering to estimate the contributions of training data to the outputs of instruction-tuned large language models (LLMs). We propose two novel approaches: (1) a similarity-based approach that measures the difference between LLM outputs with and without the provided context, and (2) a mixture distribution model that frames the problem of identifying contribution scores as a matrix factorization task. Our empirical comparison demonstrates that the mixture model approach is more robust to retrieval noise in in-context learning, providing...
Deep Neural Networks, despite their great success in diverse domains, are provably sensitive to small perturbations on correctly classified examples that lead to erroneous predictions. Recently, it was proposed that this behavior can be combatted by optimizing the worst-case loss function over all possible word substitutions of training examples. However, this approach is prone to weighing unlikely substitutions higher, limiting the accuracy gain. In this paper, we study adversarial robustness through randomized perturbations, which has two...