Enhancing Small Tabular Clinical Trial Dataset through Hybrid Data Augmentation: Combining SMOTE and WCGAN-GP

Oversampling; Synthetic data; Generative adversarial network
DOI: 10.3390/data8090135 Publication Date: 2023-08-23T14:28:05Z
ABSTRACT
This study addressed the challenge of training generative adversarial networks (GANs) on small tabular clinical trial datasets for data augmentation, which are known to pose difficulties in training due to limited sample sizes. To overcome this obstacle, a hybrid approach is proposed, combining the synthetic minority oversampling technique (SMOTE) to initially augment the original data to a more substantial size, improving the subsequent GAN training, with a Wasserstein conditional generative adversarial network with gradient penalty (WCGAN-GP), proven for its state-of-the-art performance and enhanced stability. The ultimate objective of this research was to demonstrate that the quality of the synthetic tabular data generated by the final WCGAN-GP model maintains the structural integrity and statistical representation of the original small dataset using this hybrid approach. This focus is particularly relevant for clinical trials, where limited data availability due to privacy concerns and restricted accessibility to subject enrollment pose common challenges. Despite the limitation of data, the findings demonstrate that the hybrid approach successfully generates synthetic data that closely preserves the characteristics of the original small dataset. By harnessing the power of this hybrid approach to generate faithful synthetic data, the potential for enhancing data-driven research in drug clinical trials becomes evident. This includes enabling a robust analysis on small datasets, supplementing the lack of clinical trial data, facilitating its utility in machine learning tasks, and even extending to using the model for anomaly detection to ensure better quality control during clinical trial data collection, all while prioritizing data privacy and implementing strict data protection measures.
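The first stage of the hybrid approach, SMOTE-style oversampling, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, treats all rows as a single class, and handles numeric features only. The function name `smote_like_oversample`, the parameter `k`, and the toy rows are hypothetical and chosen for illustration. The second stage, training a WCGAN-GP on the enlarged dataset, is not shown.

```python
import math
import random


def smote_like_oversample(rows, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between a randomly
    chosen row and one of its k nearest neighbours (the core SMOTE idea)."""
    if len(rows) < 2:
        raise ValueError("need at least two rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(rows) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(rows))
        base = rows[i]
        # Rank every other row by Euclidean distance to the base row.
        others = [r for j, r in enumerate(rows) if j != i]
        others.sort(key=lambda r: math.dist(base, r))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two numeric features (not taken from the study).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
augmented = minority + smote_like_oversample(minority, n_new=8)
```

Because each synthetic row lies on a line segment between two real rows, the enlarged table stays within the range of the original features. In the approach described above, a table enlarged in this way would then serve as training data for the GAN.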