NFDI4DS | UHH-SEMS - Publication Details

Skyler Speakman

ORCID: 0000-0003-0337-2312

About

Contact & Profiles

OpenAlex author ID: A5029048857

Research Areas

Anomaly Detection Techniques and Applications
Data-Driven Disease Surveillance
Adversarial Robustness in Machine Learning
Global Maternal and Child Health
Cell Image Analysis Techniques
Aesthetic Perception and Analysis
Cutaneous Melanoma Detection and Management
Insurance, Mortality, Demography, Risk Management
Machine Learning in Materials Science
COVID-19 epidemiological studies
Statistical Methods and Inference
Bacillus and Francisella bacterial research
Advanced Statistical Methods and Models
Dermatology and Skin Diseases
Network Security and Intrusion Detection
Creativity in Education and Neuroscience
Domain Adaptation and Few-Shot Learning
Imbalanced Data Classification Techniques
Child Nutrition and Water Access
Explainable Artificial Intelligence (XAI)
Machine Learning and Data Classification
Data Stream Mining Techniques
Digital Media Forensic Detection
Zoonotic diseases and public health
Advanced Causal Inference Techniques

IBM Research - Africa
2020-2023

IBM (United States)
2021

Carnegie Mellon University
2013-2018

Smooth Grad-CAM++: An Enhanced Inference Level Visualization Technique for Deep Convolutional Neural Network Models

OPENALEX - Publications

Daniel Omeiza Skyler Speakman Celia Cintas Komminist Weldemariam

Gaining insight into how deep convolutional neural network models perform image classification, and how to explain their outputs, has been a concern to computer vision researchers and decision makers. These models are often referred to as black boxes due to low comprehension of their internal workings. As an effort toward developing explainable deep learning models, several methods have been proposed, such as finding gradients of the class output with respect to the input (sensitivity maps), class activation map (CAM), and Gradient based Class Activation Maps (Grad-CAM)....

10.48550/arxiv.1908.01224 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Fast generalized subset scan for anomalous pattern detection

Edward McFowland Skyler Speakman Daniel B. Neill

We propose Fast Generalized Subset Scan (FGSS), a new method for detecting anomalous patterns in general categorical data sets. We frame the pattern detection problem as a search over subsets of data records and attributes, maximizing a nonparametric scan statistic over all such subsets. We prove that the nonparametric scan statistics possess a novel property that allows for efficient optimization over the exponentially many subsets of the data without an exhaustive search, enabling FGSS to scale to massive and high-dimensional data sets. We evaluate the performance of FGSS in three real-world application...

10.5555/2567709.2567713 article EN Journal of Machine Learning Research 2013-01-01

Fair Transfer Learning with Missing Protected Attributes

Amanda Coston Karthikeyan Natesan Ramamurthy Dennis Wei Kush R. Varshney Skyler Speakman and 2 more

Risk assessment is a growing use for machine learning models. When used in high-stakes applications, especially ones regulated by anti-discrimination laws or governed by societal norms for fairness, it is important to ensure that learned models do not propagate and scale any biases that may exist in training data. In this paper, we add on an additional challenge beyond fairness: unsupervised domain adaptation to covariate shift between a source and target distribution. Motivated by the real-world problem of risk assessment in new...

10.1145/3306618.3314236 article EN 2019-01-27

Skin Tone Analysis for Representation in Educational Materials (STAR-ED) using machine learning

Girmaw Abebe Tadesse Celia Cintas Kush R. Varshney Peter Staar Chinyere I. Agunwa and 10 more

Images depicting dark skin tones are significantly underrepresented in the educational materials used to teach primary care physicians and dermatologists to recognize skin diseases. This could contribute to disparities in skin disease diagnosis across different racial groups. Previously, domain experts have manually assessed textbooks to estimate the diversity in skin images. Manual assessment does not scale to many educational materials and introduces human errors. To automate this process, we present the Skin Tone Analysis for Representation...

10.1038/s41746-023-00881-0 article EN cc-by npj Digital Medicine 2023-08-18

Scalable Detection of Anomalous Patterns With Connectivity Constraints

Skyler Speakman Edward McFowland Daniel B. Neill

We present GraphScan, a novel method for detecting arbitrarily shaped connected clusters in graph or network data. Given a graph structure, data observed at each node, and a score function defining the anomalousness of a set of nodes, GraphScan can efficiently and exactly identify the most anomalous (highest-scoring) connected subgraph. Kulldorff's spatial scan, which searches over circles consisting of a center location and its k − 1 nearest neighbors, has been extended to include connectivity constraints by FlexScan. However,...

10.1080/10618600.2014.960926 article EN Journal of Computational and Graphical Statistics 2014-10-09

Sub-population identification of multimorbidity in sub-Saharan African populations

Adebayo Oshingbesan Michelle Kamp Phelelani Thokozani Mpangase Kayode E. Adetunji Samuel Iddi and 12 more

This work provides three contributions that straddle the medical literature on multimorbidity and the data science community with an interest in exploratory analysis of health-related research data. First, we propose a definition for multimorbidity as the co-occurrence of (at least) two disease diagnoses from a pre-determined list. This interpretation adds to the growing body of working definitions emerging from the literature. Second, we apply this novel outcome-of-interest to sub-Saharan populations located in Nairobi, Kenya and Agincourt,...

10.1038/s41598-025-96569-4 article EN cc-by Scientific Reports 2025-04-22

Penalized Fast Subset Scanning

Skyler Speakman Sriram Somanchi Edward McFowland Daniel B. Neill

We present the penalized fast subset scan (PFSS), a new and general framework for scalable and accurate pattern detection. PFSS enables exact and efficient identification of the most anomalous subsets of the data, as measured by a likelihood ratio scan statistic. However, PFSS also allows incorporation of prior information about each data element's probability of inclusion, which was not previously possible within the subset scan framework. PFSS builds on two main results: first, we prove that a large class of likelihood ratio statistics satisfy a property that allows additional,...

10.1080/10618600.2015.1029578 article EN Journal of Computational and Graphical Statistics 2015-04-18

Dynamic Pattern Detection with Temporal Consistency and Connectivity Constraints

Skyler Speakman Yating Zhang Daniel B. Neill

We explore scalable and accurate dynamic pattern detection methods in graph-based data sets. We apply our proposed Dynamic Subset Scan method to the task of detecting, tracking, and source-tracing contaminant plumes spreading through a water distribution system equipped with noisy, binary sensors. While static patterns affect the same subset of data over a period of time, dynamic patterns may affect different subsets of the data at each time step. These dynamic patterns require a new approach to define and optimize penalized likelihood ratio statistics in the subset scan framework, as well...

10.1109/icdm.2013.66 article EN 2013-12-01

Bridging the gap: leveraging data science to equip domain experts with the tools to address challenges in maternal, newborn, and child health

Girmaw Abebe Tadesse William Ogallo Celia Cintas Skyler Speakman Aisha Walcott-Bryant and 1 more

The United Nations Sustainable Development Goals (SDGs) advocate for reducing preventable Maternal, Newborn, and Child Health (MNCH) deaths and complications. However, many low- and middle-income countries remain disproportionately affected by high rates of poor MNCH outcomes. Progress towards the 2030 sustainable development targets for MNCH remains stagnated and uneven within and across countries, particularly in sub-Saharan Africa. The current scenario is exacerbated by a multitude of factors, including the COVID-19...

10.1038/s44294-024-00017-z article EN cc-by npj Women's Health 2024-05-10

Out-of-Distribution Detection in Dermatology Using Input Perturbation and Subset Scanning

Hannah Kim Girmaw Abebe Tadesse Celia Cintas Skyler Speakman Kush R. Varshney

Recent advances in deep learning have led to breakthroughs in the development of automated skin disease classification. As we observe an increasing interest in these models in the dermatology space, it is crucial to address aspects such as robustness towards input data distribution shifts. Current models tend to make incorrect inferences for test samples from different hardware devices and clinical settings or for unknown samples, which are out-of-distribution (OOD) from the training samples. To this end, we propose a simple yet...

10.1109/isbi52829.2022.9761412 article EN 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI) 2022-03-28

Detecting Systematic Deviations in Data and Models

Skyler Speakman Girmaw Abebe Tadesse Celia Cintas William Ogallo Tanya Akumu and 1 more

Trustworthy artificial intelligence researchers should seek to better detect and characterize systematic deviations in data and models (that is, bias). This article provides data scientists with motivation, theory, code, and examples on how to perform disciplined discovery of systematic deviations at the subset level.

10.1109/mc.2022.3213209 article EN cc-by-nc-nd Computer 2023-02-01

Detecting Adversarial Attacks via Subset Scanning of Autoencoder Activations and Reconstruction Error

Celia Cintas Skyler Speakman Victor Akinwande William Ogallo Komminist Weldemariam and 2 more

Reliably detecting attacks in a given set of inputs is of high practical relevance because of the vulnerability of neural networks to adversarial examples. These altered inputs create a security risk in applications with real-world consequences, such as self-driving cars, robotics and financial services. We propose an unsupervised method for detecting adversarial attacks in the inner layers of autoencoder (AE) networks by maximizing a non-parametric measure of anomalous node activations. Previous work in this space has shown that AE networks can detect anomalous images by thresholding...

10.24963/ijcai.2020/122 article EN 2020-07-01

Identifying Audio Adversarial Examples via Anomalous Pattern Detection

Victor Akinwande Celia Cintas Skyler Speakman Srihari Sridharan

Audio processing models based on deep neural networks are susceptible to adversarial attacks even when the adversarial audio waveform is 99.9% similar to a benign sample. Given the wide application of DNN-based audio recognition systems, detecting the presence of adversarial examples is of high practical relevance. By applying anomalous pattern detection techniques in the activation space of these models, we show that 2 of the recent and current state-of-the-art adversarial attacks on audio processing systems systematically lead to higher-than-expected activation at some subset of nodes, and we can detect these with up to an...

10.48550/arxiv.2002.05463 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Identifying Factors Associated with Neonatal Mortality in Sub-Saharan Africa using Machine Learning

William Ogallo Skyler Speakman Victor Akinwande Kush R. Varshney Aisha Walcott-Bryant and 4 more

This study aimed at identifying the factors associated with neonatal mortality. We analyzed the Demographic and Health Survey (DHS) datasets from 10 Sub-Saharan countries. For each survey, we trained machine learning models to identify women who had experienced a neonatal death within the 5 years prior to the survey being administered. We then inspected the models by visualizing the features that were important for each model, and how, on average, changing the values of the features affected the risk of neonatal mortality. We confirmed the known positive correlation between birth...

10.1101/2020.10.14.20212225 preprint EN cc-by-nc-nd medRxiv (Cold Spring Harbor Laboratory) 2020-10-16

Three Population Covariate Shift for Mobile Phone-based Credit Scoring

Skyler Speakman Srihari Sridharan Isaac Markus

Mobile money platforms are gaining traction across developing markets as a convenient way of sending and receiving money over mobile phones. Recent joint collaborations between banks and mobile-network operators leverage a customer's past mobile phone transactions in order to create a credit score for the individual. In this work, we address the problem of launching a mobile-phone based credit scoring system in a new market without the marginal distribution of features of borrowers in the new market. This challenge rules out traditional transfer...

10.1145/3209811.3209856 article EN 2018-06-20

Towards Creativity Characterization of Generative Models via Group-Based Subset Scanning

Célia Cintas Payel Das Brian Quanz Girmaw Abebe Tadesse Skyler Speakman and 1 more

Deep generative models, such as Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs), have been employed widely in computational creativity research. However, such models discourage out-of-distribution generation to avoid spurious sample generation, thereby limiting their creativity. Thus, incorporating research on human creativity into generative deep learning techniques presents an opportunity to make their outputs more compelling and human-like. As we see the emergence of generative models directed toward creativity research, a need...

10.24963/ijcai.2022/683 article EN Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence 2022-07-01

Subset Scanning Over Neural Network Activations

Skyler Speakman Srihari Sridharan Sekou L. Remy Komminist Weldemariam Edward McFowland

This work views neural networks as data generating systems and applies anomalous pattern detection techniques on that data in order to detect when a network is processing an anomalous input. Detecting anomalies is a critical component for multiple machine learning problems, including detecting adversarial noise. More broadly, this work is a step towards giving neural networks the ability to recognize an out-of-distribution sample. This is the first work to introduce "Subset Scanning" methods from the anomalous pattern detection domain to the task of detecting anomalous input to neural networks. Subset scanning treats the problem...

10.48550/arxiv.1810.08676 preprint EN cc-by-nc-sa arXiv (Cornell University) 2018-01-01

Pattern detection in the activation space for identifying synthesized content

Celia Cintas Skyler Speakman Girmaw Abebe Tadesse Victor Akinwande Edward McFowland and 1 more

10.1016/j.patrec.2021.12.007 article EN Pattern Recognition Letters 2021-12-17

Spatially Constrained Adversarial Attack Detection and Localization in the Representation Space of Optical Flow Networks

Hannah Halin Kim Celia Cintas Girmaw Abebe Tadesse Skyler Speakman

Optical flow estimation has shown significant improvements with advances in deep neural networks. However, these networks have recently been shown to be vulnerable to patch-based adversarial attacks, which poses security risks in real-world applications, such as self-driving cars and robotics. We propose SADL, a Spatially constrained adversarial Attack Detection and Localization framework, to detect and localize these attacks without requiring dedicated training. The detection of an attacked input sequence is performed via iterative...

10.24963/ijcai.2023/107 article EN 2023-08-01

Weakly Supervised Detection of Hallucinations in LLM Activations

Miriam Rateike Célia Cintas John Wamburu Tanya Akumu Skyler Speakman

We propose an auditing method to identify whether a large language model (LLM) encodes patterns such as hallucinations in its internal states, which may propagate to downstream tasks. We introduce a weakly supervised auditing technique using a subset scanning approach to detect anomalous patterns in LLM activations from pre-trained models. Importantly, our method does not need knowledge of the type of patterns a-priori. Instead, it relies on a reference dataset devoid of anomalies during testing. Further, our approach enables the identification of pivotal nodes...

10.48550/arxiv.2312.02798 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Racial and Neighborhood Disparities in Legal Financial Obligations in Jefferson County, Alabama

Óscar Lara Yejas Arvind K. Joshi Andrew Martinez Leah Nelson Skyler Speakman and 4 more

Legal financial obligations (LFOs) such as court fees and fines are commonly levied on individuals who are convicted of crimes. It is expected that LFO amounts should be similar across social, racial, and geographic subpopulations convicted of the same crime. This work analyzes the distribution of LFOs in Jefferson County, Alabama and highlights disparities across different individual and neighborhood demographic characteristics. Data-driven discovery methods are used to detect subpopulations that experience higher LFOs than the overall population of offenders....

10.1609/aies.v7i1.31682 article EN 2024-10-16

Systematic Discovery of Bias in Data

John Wamburu Girmaw Abebe Tadesse Celia Cintas Adebayo Oshingbesan Tanya Akumu and 1 more

Detecting bias in data is an integral component of trustworthy and responsible ML. For researchers and data scientists, investigating, detecting, and becoming aware of biases present in data is an important step to correcting them and making better ML decisions. Bias exists in the form of subsets of data that deviate from global expectations. Typically, researchers begin with a set of pre-defined protected/sensitive attributes and use them as the basis upon which deviation from expectation is examined. For instance, a researcher may examine the under- or over-representation...

10.1109/bigdata55660.2022.10020781 article EN 2022 IEEE International Conference on Big Data (Big Data) 2022-12-17
