Warwick electron microscopy datasets
0301 basic medicine
FOS: Computer and information sciences
Computer Science - Machine Learning
0303 health sciences
Image and Video Processing (eess.IV)
Electrical Engineering and Systems Science - Image and Video Processing
QA76
Machine Learning (cs.LG)
03 medical and health sciences
ZA
FOS: Electrical engineering, electronic engineering, information engineering
QC
DOI: 10.1088/2632-2153/ab9c3c
Publication Date: 2020-06-12T22:16:04Z
AUTHORS (1)
ABSTRACT
Large, carefully partitioned datasets are essential to train neural networks and standardize performance benchmarks. As a result, we have set up new repositories to make our electron microscopy datasets available to the wider community. There are three main datasets containing 19769 scanning transmission electron micrographs, 17266 transmission electron micrographs, and 98340 simulated exit wavefunctions, and multiple variants of each dataset for different applications. To visualize image datasets, we trained variational autoencoders to encode data as 64-dimensional multivariate normal distributions, which we cluster in two dimensions by t-distributed stochastic neighbor embedding. In addition, we have improved dataset visualization with variational autoencoders by introducing encoding normalization and regularization, adding an image gradient loss, and extending t-distributed stochastic neighbor embedding to account for encoded standard deviations. Our datasets, source code, pretrained models, and interactive visualizations are openly available at https://github.com/Jeffrey-Ede/datasets.
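The encoding step described in the abstract can be illustrated with a minimal sketch. It assumes the standard variational autoencoder formulation, in which each image is encoded as a diagonal Gaussian with a mean and standard deviation per latent dimension; the abstract does not give these details, so the function names, the random stand-in for encoder outputs, and the KL term below are illustrative assumptions rather than the authors' implementation. The encoding normalization, image gradient loss, and t-SNE extension mentioned above are not reproduced here.

```python
import math
import random

LATENT_DIM = 64  # latent dimensionality stated in the abstract


def kl_to_standard_normal(mu, sigma):
    """KL divergence from N(mu, diag(sigma^2)) to N(0, I).

    Standard VAE regularizer for a diagonal Gaussian encoding (assumed form).
    """
    return 0.5 * sum(
        s * s + m * m - 1.0 - 2.0 * math.log(s) for m, s in zip(mu, sigma)
    )


def sample_latent(mu, sigma, rng):
    """Reparameterized sample: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


rng = random.Random(0)

# Hypothetical encoder output for one micrograph; a trained encoder
# network would produce these values instead of a random generator.
mu = [rng.uniform(-1.0, 1.0) for _ in range(LATENT_DIM)]
sigma = [rng.uniform(0.5, 1.5) for _ in range(LATENT_DIM)]

z = sample_latent(mu, sigma, rng)      # one 64-dimensional latent sample
kl = kl_to_standard_normal(mu, sigma)  # non-negative regularization term
```

In a pipeline like the one described, the encoded means (and, in the authors' extension, the standard deviations) for every image would then be passed to t-distributed stochastic neighbor embedding to obtain two-dimensional coordinates for visualization.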
REFERENCES (112)
CITATIONS (11)