1.4 Using software in research

Many people use spreadsheets (such as Microsoft Excel) for analysis of data in research.

Using spreadsheets requires great care: expensive and dangerous errors have resulted from their use (AlTarawneh and Thorne 2017), including problems with reporting during the COVID-19 pandemic in 2020.

Problems may emerge for many different reasons.

Spreadsheets can be used for research and analysis… but you must be very careful!

Many of the problems with using spreadsheets are due to human error, but spreadsheets make those errors hard to find. Other errors emerge because Excel is being used for purposes for which it was not really designed, such as scientific analysis.
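As one illustration of a spreadsheet behaving differently from what a researcher might expect, Excel applies the unary minus before exponentiation, so the formula =-2^2 returns 4 rather than the -4 given by the usual mathematical convention (Berger 2007). The following Python sketch is not part of the software used in this course; the two function names are hypothetical, chosen only to make the two readings of the expression explicit.

```python
def negative_square_conventional(x: float) -> float:
    """Usual mathematical convention: square first, then negate."""
    return -(x ** 2)


def negative_square_excel_style(x: float) -> float:
    """Order used by Excel for a formula like =-A1^2: negate first, then square."""
    return (-x) ** 2


x = 2.0
print(negative_square_conventional(x))  # -4.0
print(negative_square_excel_style(x))   # 4.0
```

Writing the calculation out in a short script makes the intended order of operations visible, which may make this kind of error easier to spot than a formula hidden in a spreadsheet cell.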

In this course, we will sometimes show output from the statistical software packages jamovi (The jamovi Project) and SPSS (IBM Corp 2016).

References

AlTarawneh G, Thorne S. A pilot study exploring spreadsheet risk in scientific research. arXiv preprint arXiv:1703.09785. 2017.
Berger RL. Nonstandard operator precedence in Excel. Computational Statistics & Data Analysis. Elsevier; 2007;51(6):2788–91.
Galletta DF, Hartzel KS, Johnson SE, Joseph JL, Rustagi S. Spreadsheet presentation and error detection: An experimental study. Journal of Management Information Systems. Taylor & Francis; 1996;13(3):45–63.
Hargreaves BR, McWilliams TP. Polynomial trendline function flaws in Microsoft Excel. Computational Statistics & Data Analysis. Elsevier; 2010;54(4):1190–6.
IBM Corp. IBM SPSS statistics for Windows, version 24.0. Armonk, NY: IBM Corp; 2016.
Keeling KB, Pavur RJ. Numerical accuracy issues in using Excel for simulation studies. Proceedings of the 2004 winter simulation conference, 2004. IEEE; 2004. p. 1513–8.
London RE, Slagter HA. Statement of retraction: Effects of transcranial direct current stimulation over left dorsolateral pFC on the attentional blink depend on individual baseline performance. Journal of Cognitive Neuroscience. 2021;1.
McCullough BD, Wilson B. On the accuracy of statistical procedures in Microsoft Excel 2000 and Excel XP. Computational Statistics & Data Analysis. Elsevier; 2002;40(4):713–21.
Mélard G. On the accuracy of statistical procedures in Microsoft Excel 2010. Computational Statistics. Springer; 2014;29(5):1095–128.
Panko R. What we don’t know about spreadsheet errors today: The facts, why we don’t believe them, and what we need to do. arXiv preprint arXiv:1602.02601. 2016.
Panko RR, Sprague Jr RH. Hitting the wall: Errors in developing and code inspecting a ‘simple’ spreadsheet model. Decision Support Systems. Elsevier; 1998;22(4):337–53.
Simons JE, Holmes DT. Reproducible research and reports with R. Journal of Applied Laboratory Medicine. Oxford University Press; 2019;4(3):471–3.
The jamovi Project. jamovi (version 1.0) [computer software] [Internet]. Available from: https://www.jamovi.org.
Ziemann M, Eren Y, El-Osta A. Gene name errors are widespread in the scientific literature. Genome Biology. BioMed Central; 2016;17(1):1–3.