Warwick electron microscopy datasets

0301 basic medicine; FOS: Computer and information sciences; Computer Science - Machine Learning; 0303 health sciences; Image and Video Processing (eess.IV); Electrical Engineering and Systems Science - Image and Video Processing; QA76; Machine Learning (cs.LG); 03 medical and health sciences; ZA; FOS: Electrical engineering, electronic engineering, information engineering; QC
DOI: 10.1088/2632-2153/ab9c3c
Publication Date: 2020-06-12T22:16:04Z
ABSTRACT
Large, carefully partitioned datasets are essential to train neural networks and standardize performance benchmarks. As a result, we have set up new repositories to make our electron microscopy datasets available to the wider community. There are three main datasets containing 19769 scanning transmission electron micrographs, 17266 transmission electron micrographs, and 98340 simulated exit wavefunctions, and multiple variants of each dataset for different applications. To visualize image datasets, we trained variational autoencoders to encode data as 64-dimensional multivariate normal distributions, which we cluster in two dimensions by t-distributed stochastic neighbor embedding. In addition, we have improved dataset visualization with variational autoencoders by introducing encoding normalization and regularization, adding an image gradient loss, and extending t-distributed stochastic neighbor embedding to account for encoded standard deviations. Our datasets, source code, pretrained models, and interactive visualizations are openly available at https://github.com/Jeffrey-Ede/datasets.
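The encoding step described in the abstract can be illustrated with a minimal sketch of a textbook variational autoencoder latent layer. It is not the authors' implementation: it assumes a diagonal covariance and a log-variance parameterization, it omits the encoding normalization and regularization the abstract mentions, and the function names `sample_latent` and `kl_to_standard_normal` are hypothetical. Only the latent size of 64 comes from the abstract.

```python
import numpy as np

LATENT_DIM = 64  # encoding size stated in the abstract


def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick).

    Assumes a diagonal covariance, parameterized by its log-variance.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(sigma^2)) to N(0, I), one value per example."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)


# Stand-in encoder outputs for a batch of 8 images (random, for illustration only).
rng = np.random.default_rng(0)
mu = rng.standard_normal((8, LATENT_DIM))
log_var = np.full((8, LATENT_DIM), -2.0)

z = sample_latent(mu, log_var, rng)       # sampled encodings, shape (8, 64)
kl = kl_to_standard_normal(mu, log_var)   # regularization term, shape (8,)
```

In a typical pipeline the encoded means (and, in the extension the abstract describes, the encoded standard deviations) would then be passed to t-distributed stochastic neighbor embedding to obtain two-dimensional coordinates.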