Exploring scoring methods for research studies: Accuracy and variability of visual and automated sleep scoring

DOI: 10.1111/jsr.12994 Publication Date: 2020-02-18T08:24:44Z
ABSTRACT
Sleep studies face new challenges in terms of data, objectives and metrics. This requires reappraising the adequacy of existing analysis methods, including scoring methods. Visual and automatic sleep scorings of healthy individuals were compared in terms of reliability (i.e., accuracy and stability) to find a method capable of giving access to the actual data variability without adding exogenous variability. A first dataset (DS1, four recordings) scored by six experts plus an autoscoring algorithm was used to characterize inter-scoring variability; a second dataset (DS2, 88 recordings), rescored a few weeks later, was used to explore intra-expert variability. Percentage agreements and Conger's kappa were derived from epoch-by-epoch comparisons on pairwise and consensus scorings. On DS1, the number of epochs of agreement decreased as the number of experts increased, ranging from 86% (pairwise) to 69% (all experts). Adding the automatic scoring to the visual scorings changed the kappa value from 0.81 to 0.79. Agreement between the automatic scoring and the expert consensus was 93%. On DS2, the intra-expert variability hypothesis was supported by a systematic decrease in kappa scores when each expert's first scoring was taken as the reference for each single dataset (.75-.70). Although visual scoring induces inter- and intra-scorer variability, automatic scoring methods can cope with intra-scorer variability, making them a sensible option to reduce exogenous variability and give access to the endogenous variability of the data.
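The two metrics named in the abstract can be illustrated for the simplest two-scorer case. The sketch below is a minimal, hypothetical example (the stage labels and scorings are invented, and the paper's multi-rater statistic is Conger's kappa, of which Cohen's kappa shown here is the two-rater special case):

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of epochs on which the two scorers assign the same stage."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: agreement between two scorers, corrected for chance."""
    n = len(a)
    p_o = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    # Expected chance agreement from the two scorers' marginal stage frequencies.
    p_e = sum(ca[s] * cb[s] for s in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two hypothetical 30-s epoch scorings (W = wake, N2 = NREM stage 2, R = REM).
s1 = ["W", "W", "N2", "N2", "N2", "R", "R", "W"]
s2 = ["W", "N2", "N2", "N2", "N2", "R", "W", "W"]
print(round(percent_agreement(s1, s2), 3))  # → 0.75
print(round(cohens_kappa(s1, s2), 3))       # → 0.61
```

As the example shows, kappa is systematically lower than raw percentage agreement because agreement expected by chance is subtracted out, which is why the abstract reports both.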