Integrating data-cleaning with data analysis to enhance usability of biodiversity big-data
Robustness
DOI:
10.3897/tdwgproceedings.1.20244
Publication Date:
2017-08-14T06:07:07Z
AUTHORS (2)
ABSTRACT
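Biodiversity big-data (BBD) has the potential to provide answers to some unresolved questions, at spatial and taxonomic swathes that were previously inaccessible. However, BBDs contain serious error and bias. Therefore, any study that uses BBD should ask whether data quality is sufficient to provide a reliable answer to the research question. We propose that the question of data quality and the research question could be addressed simultaneously, by binding data-cleaning to data analysis. The change in signal between the pre- and post-cleaning phases, in addition to the signal itself, can be used to evaluate the findings, their implications, and their robustness. This approach includes five steps: (1) Downloading raw occurrence data from a BBD. (2) Data analysis, statistical and/or simulation modeling, in order to answer the research question, using the raw data after the necessary basic cleaning. This part is similar to common practice. (3) Comprehensive data-cleaning. (4) Repeated data analysis using the cleaned data. (5) Comparing the results of steps 2 and 4 (i.e., before and after data-cleaning). This comparison will address the issue of data quality, as well as the research question itself. The results of step 2 alone may be misleading, due to the error and bias in the BBD. Even the results of step 4 may not be trustworthy, since data-cleaning is never complete, and much of the error and bias may remain in the data. The changes in the signal after data-cleaning are important keys to answering the research question. If the results of step 4 reveal a stronger and clearer signal than the results obtained using the raw data, then most likely the respective hypothesis is confirmed. Conversely, if the results of the cleaned data show a weaker signal than the one obtained using the raw data, then the respective hypothesis, even if confirmed by the original data, needs to be rejected. Lastly, if there is a mixed trend, whereby the signal is stronger in some cases and weaker in others, the data are probably inadequate and the findings cannot be considered conclusive. Thus, we propose that data-cleaning and data analysis be conducted jointly. We present a case study on the effects of environmental factors on species distribution, using GBIF data on all Australian mammals. We used the performance of a species distribution model (SDM) as a proxy for the strength of environmental factors in determining gradients of species richness. We implemented three different SDM algorithms for 190 species in several grid cells that vary in species richness, and examined correlations between species richness and 10 SDM performance indices. Species-environment affinity was weaker in species-rich areas, across all SDM algorithms. The results support the notion that the impact of environmental factors on species distribution at a continental scale decreases with increasing species richness. Seemingly, the results also support the continuum hypothesis, namely that species-poor areas have strong affinities with particular niches, but this structure breaks down in species-rich communities. Furthermore, the case study revealed that joint data-cleaning and data analysis provides a more reliable means of using BBDs.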
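The comparison in step 5 can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "signal strength" is summarized, per performance index, as the absolute Pearson correlation between species richness and that index, which is one possible reading of the case study. The function names (`pearson`, `signal_strength`, `compare_signals`), the `tolerance` parameter, the extra "unchanged" outcome, and all numbers are hypothetical and serve only to show the decision rule.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def signal_strength(richness, performance):
    """Assumed measure of signal: |r| between richness and one SDM index."""
    return abs(pearson(richness, performance))


def compare_signals(raw, cleaned, tolerance=0.0):
    """Apply the before/after rule to per-index signal strengths.

    raw, cleaned: dicts mapping an index name to its signal strength
    in step 2 (raw data) and step 4 (cleaned data), respectively.
    """
    changes = {name: cleaned[name] - raw[name] for name in raw}
    stronger = [n for n, d in changes.items() if d > tolerance]
    weaker = [n for n, d in changes.items() if d < -tolerance]
    if stronger and not weaker:
        return "likely confirmed"   # signal got stronger after cleaning
    if weaker and not stronger:
        return "rejected"           # signal got weaker after cleaning
    if stronger and weaker:
        return "inconclusive"       # mixed trend: data probably inadequate
    return "unchanged"              # assumption: not covered by the abstract


# Made-up signal strengths for two hypothetical indices, for illustration only.
raw_signal = {"index_a": 0.41, "index_b": 0.35}
cleaned_signal = {"index_a": 0.58, "index_b": 0.49}
print(compare_signals(raw_signal, cleaned_signal))
```

In practice each dictionary entry would come from `signal_strength` applied to the richness values of the grid cells and the corresponding index values, computed once on the raw data and once on the cleaned data.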