- Privacy-Preserving Technologies in Data
- Ethics in Clinical Research
- Machine Learning in Healthcare
- Electronic Health Records Systems
- Data Quality and Management
- Cryptography and Data Security
- Artificial Intelligence in Healthcare and Education
- Biomedical Text Mining and Ontologies
- Privacy, Security, and Data Protection
- Topic Modeling
- Data-Driven Disease Surveillance
- Genomics and Rare Diseases
- Artificial Intelligence in Healthcare
- Cancer Genomics and Diagnostics
- Network Security and Intrusion Detection
- Chronic Disease Management Strategies
- Biomedical Ethics and Regulation
- Access Control and Trust
- BRCA gene mutations in cancer
- Adversarial Robustness in Machine Learning
- Information and Cyber Security
- Internet Traffic Analysis and Secure E-voting
- Health Literacy and Information Accessibility
- Explainable Artificial Intelligence (XAI)
- Tensor decomposition and applications
Vanderbilt University
2016-2025
Vanderbilt University Medical Center
2016-2025
University of Illinois Urbana-Champaign
2024
The University of Texas Health Science Center at Houston
2013-2024
University of Pennsylvania
2024
University of Ottawa
2022
University of Milan
2013-2022
Santa Clara University
2022
University of Newcastle Australia
2022
Colorado State University
2022
In the context of sharing video surveillance data, a significant threat to privacy is face recognition software, which can automatically identify known people, such as from a database of drivers' license photos, and thereby track people regardless of suspicion. This paper introduces an algorithm to protect the privacy of individuals in video surveillance data by deidentifying faces such that many facial characteristics remain but the face cannot be reliably recognized. A trivial solution to deidentifying faces involves blacking out each face. This thwarts any possible...
Access to electronic health record (EHR) data has motivated computational advances in medical research. However, various concerns, particularly over privacy, can limit access to and collaborative use of EHR data. Sharing synthetic EHR data could mitigate this risk. In this paper, we propose a new approach, medical Generative Adversarial Network (medGAN), to generate realistic synthetic patient records. Based on input real patient records, medGAN can generate high-dimensional discrete variables (e.g., binary and count features) via a combination of an...
Objective: Many healthcare organizations follow data protection policies that specify which patient identifiers must be suppressed to share "de-identified" records. Such policies, however, are often applied without knowledge of the risk of "re-identification". The goals of this work are: (1) to estimate re-identification risk for data sharing policies of the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule; and (2) to evaluate the risk of a specific re-identification attack using voter registration lists. Measurements: We define several...
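The abstract above is cut off before it defines its risk measures, so the following is only a minimal sketch of one common proxy for re-identification risk: the share of records that are unique on a set of quasi-identifiers. The toy population, the `fraction_unique` helper, and the two policy functions are hypothetical illustrations, not the paper's method; the coarser policy is loosely modeled on Safe Harbor-style generalization (birth year, first three ZIP digits).

```python
from collections import Counter


def fraction_unique(records, key):
    """Share of records whose quasi-identifier combination is held by no one else."""
    counts = Counter(key(r) for r in records)
    return sum(1 for r in records if counts[key(r)] == 1) / len(records)


# Hypothetical toy population: (birth date, sex, 5-digit ZIP)
population = [
    ("1980-03-14", "F", "37203"),
    ("1980-07-02", "F", "37203"),
    ("1980-07-30", "F", "37212"),
    ("1975-11-09", "M", "37212"),
    ("1975-01-21", "M", "37215"),
]


def detailed(record):
    """Release full birth date, sex, and 5-digit ZIP (assumed policy)."""
    return record


def generalized(record):
    """Release birth year, sex, and first three ZIP digits (assumed policy)."""
    birth_date, sex, zip_code = record
    return (birth_date[:4], sex, zip_code[:3])


print(fraction_unique(population, detailed))     # 1.0
print(fraction_unique(population, generalized))  # 0.0
```

In this toy population every record is unique under the detailed release and none is unique after generalization. Real estimates would depend on population-scale data and on which external resource, such as a voter registration list, the attacker can access.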
There is a strong movement to share individual patient data for secondary purposes, particularly research. A major obstacle to broad data sharing has been the concern for patient privacy. One of the methods for protecting the privacy of patients in accordance with privacy laws and regulations is to anonymise the data before it is shared. This article describes the key concepts and principles for anonymising health data while ensuring it remains suitable for meaningful analysis.
The potential of artificial intelligence (AI) to reduce health care disparities and inequities is recognized, but it can also exacerbate these issues if not implemented in an equitable manner. This perspective identifies potential biases in each stage of the AI life cycle, including data collection, annotation, machine learning model development, evaluation, deployment, operationalization, monitoring, and feedback integration. To mitigate these biases, we suggest involving a diverse group of stakeholders, using...
Drug repurposing represents an attractive alternative to the costly and time-consuming process of new drug development, particularly for serious, widespread conditions with limited effective treatments, such as Alzheimer's disease (AD). Emerging generative artificial intelligence (GAI) technologies like ChatGPT offer the promise of expediting the review and summary of scientific knowledge. To examine the feasibility of using GAI for identifying drug repurposing candidates, we iteratively tasked ChatGPT with proposing the twenty most promising drugs in...
Computational phenotyping is the process of converting heterogeneous electronic health records (EHRs) into meaningful clinical concepts. Unsupervised phenotyping methods have the potential to leverage a vast amount of labeled EHR data for phenotype discovery. However, existing unsupervised phenotyping methods do not incorporate current medical knowledge and cannot directly handle missing or noisy data. We propose Rubik, a constrained non-negative tensor factorization and completion method for phenotyping. Rubik incorporates 1) guidance...
To support large-scale biomedical research projects, organizations need to share person-specific genomic sequences without violating the privacy of their data subjects. In the past, organizations protected subjects' identities by removing identifiers, such as name and social security number; however, recent investigations illustrate that deidentified genomic data can be "reidentified" to named individuals using simple automated methods. In this paper, we present a novel cryptographic framework that enables data mining without disclosing the raw...
Objective: De-identified clinical data in standardized form (eg, diagnosis codes), derived from electronic medical records, are increasingly combined with research data (eg, DNA sequences) and disseminated to enable scientific investigations. This study examines whether released data can be linked with identified clinical records that are accessible via various resources to jeopardize patients' anonymity, and the ability of popular privacy protection methodologies to prevent such an attack.
Genome-wide association studies (GWAS) facilitate the discovery of genotype–phenotype relations from population-based sequence databases, which is an integral facet of personalized medicine. The increasing adoption of electronic medical records allows large amounts of patients’ standardized clinical features to be combined with the genomic sequences of these patients and shared to support the validation of GWAS findings and to enable novel discoveries. However, disseminating these data “as is” may lead to patient reidentification...
To design and implement a tool that creates a secure, privacy-preserving linkage of electronic health record (EHR) data across multiple sites in a large metropolitan area in the United States (Chicago, IL), for use in clinical research. The authors developed a distributed software application that performs standardized data cleaning, preprocessing, and hashing of patient identifiers to remove all protected health information. The application creates seeded hash code combinations using a Health Insurance Portability and Accountability Act compliant...
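The cleaning, preprocessing, and seeded hashing steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' application: the snippet is truncated before it names the hash algorithm, so HMAC with SHA-512 from Python's standard library is assumed here, and the field names, the seed value, and the helpers `normalize`, `seeded_hash`, and `link_keys` are all hypothetical.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Standardize an identifier: strip accents, drop non-alphanumerics, uppercase."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if ch.isalnum()).upper()


def seeded_hash(seed: bytes, *fields: str) -> str:
    """Keyed hash over a combination of cleaned identifier fields."""
    message = "|".join(normalize(f) for f in fields).encode("utf-8")
    return hmac.new(seed, message, hashlib.sha512).hexdigest()


def link_keys(seed: bytes, record: dict) -> list[str]:
    """Hash several field combinations so one bad field need not block a match."""
    return [
        seeded_hash(seed, record["first"], record["last"], record["dob"]),
        seeded_hash(seed, record["last"], record["dob"], record["ssn4"]),
    ]


seed = b"shared-secret-seed"  # hypothetical; agreed among sites out of band
site_a = {"first": "José", "last": "O'Neil", "dob": "1970-01-02", "ssn4": "1234"}
site_b = {"first": "Jose", "last": "ONEIL", "dob": "1970-01-02", "ssn4": "1234"}

# Two records link if they share at least one hashed combination.
matched = bool(set(link_keys(seed, site_a)) & set(link_keys(seed, site_b)))
print(matched)
```

Each site would run the same cleaning and hashing locally and share only the resulting hash codes, so raw identifiers never leave the site. Using a keyed hash rather than a plain digest means a party without the seed cannot rebuild the codes by hashing guessed names and birth dates.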