Using jamovi and SPSS rather than a spreadsheet

Many people use spreadsheets (such as Microsoft Excel) rather than statistical software. The following information comes from Dunn (2019):

Spreadsheets require extreme care: many costly and dangerous errors have been made through their use (AlTarawneh and Thorne 2017), including problems with reporting during the 2020 COVID-19 pandemic.

These problems emerge for many different reasons:

  • Spreadsheets often change the entered data (for example, reformatting entries as dates if the spreadsheet thinks the data should be a date) in the interest of being ‘user-friendly,’ even when not appropriate. This has had dire consequences (Ziemann, Eren, and El-Osta 2016).
  • Many spreadsheets have errors in formulas (R. R. Panko and Sprague Jr 1998), but these are incredibly difficult to locate and hence to fix (Galletta et al. 1996; R. Panko 2016).
  • Spreadsheets do not leave a record of how the data have been analysed; for example, formulas can be very difficult to understand and parse. Keeping a record of the analysis, any new variables that have been created, and other operations on the data is the basis of reproducible research. Reproducibility ensures, among other advantages, that the results can be checked by others.
  • Excel has bugs (Keeling and Pavur 2004; Mélard 2014), even in very basic operations (Berger 2007; Hargreaves and McWilliams 2010). Attempts to fix these bugs have sometimes made them worse (McCullough and Wilson 2002).
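
As a rough illustration of the first and third points in the list above, the following Python sketch shows what a scripted analysis might look like. It is only an assumption-laden example, not the workflow used by jamovi or SPSS: the data, the column names `gene` and `expression`, and the variable names are all hypothetical. The gene symbols SEPT2 and MARCH1 are of the kind that Ziemann, Eren, and El-Osta (2016) report being converted to dates by spreadsheets.

```python
import csv
import io

# Hypothetical raw data: two gene symbols that a spreadsheet may
# reinterpret as dates when the file is opened.
raw = "gene,expression\nSEPT2,4.1\nMARCH1,2.7\n"

# csv.DictReader returns every field as text, so no entry is
# silently reformatted on import.
rows = list(csv.DictReader(io.StringIO(raw)))
genes = [row["gene"] for row in rows]

# Convert only the column meant to be numeric, and do so explicitly.
values = [float(row["expression"]) for row in rows]

# Each step of the analysis is written down in the script itself,
# so others can rerun and check it.
mean_expression = sum(values) / len(values)

print(genes)
print(round(mean_expression, 2))
```

Because every step is written out, the script itself is the record of the analysis: anyone with the same data file can rerun it and check the result.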

Spreadsheets can be used for research and analysis… but you must be very careful! Many of the problems are due to human error, and some emerge because Excel is being used for purposes it was not really designed for (such as scientific analysis).

References

AlTarawneh, Ghada, and Simon Thorne. 2017. “A Pilot Study Exploring Spreadsheet Risk in Scientific Research.” arXiv Preprint arXiv:1703.09785.
Berger, Roger L. 2007. “Nonstandard Operator Precedence in Excel.” Computational Statistics & Data Analysis 51 (6): 2788–91.
Dunn, Peter K. 2019. Scientific Research Methods: An Introduction to Quantitative Research and Statistics in Science and Health. https://srm-course.netlify.com.
Galletta, Dennis F., Kathleen S. Hartzel, Susan E. Johnson, Jimmie L. Joseph, and Sandeep Rustagi. 1996. “Spreadsheet Presentation and Error Detection: An Experimental Study.” Journal of Management Information Systems 13 (3): 45–63.
Hargreaves, Bruce R., and Thomas P. McWilliams. 2010. “Polynomial Trendline Function Flaws in Microsoft Excel.” Computational Statistics & Data Analysis 54 (4): 1190–96.
Keeling, Kellie B., and Robert J. Pavur. 2004. “Numerical Accuracy Issues in Using Excel for Simulation Studies.” In Proceedings of the 2004 Winter Simulation Conference, 2004, 2:1513–18. IEEE.
McCullough, B. D., and Berry Wilson. 2002. “On the Accuracy of Statistical Procedures in Microsoft Excel 2000 and Excel XP.” Computational Statistics & Data Analysis 40 (4): 713–21.
Mélard, Guy. 2014. “On the Accuracy of Statistical Procedures in Microsoft Excel 2010.” Computational Statistics 29 (5): 1095–1128.
Panko, Ray. 2016. “What We Don’t Know about Spreadsheet Errors Today: The Facts, Why We Don’t Believe Them, and What We Need to Do.” arXiv Preprint arXiv:1602.02601.
Panko, Raymond R., and Ralph H. Sprague Jr. 1998. “Hitting the Wall: Errors in Developing and Code Inspecting a ‘Simple’ Spreadsheet Model.” Decision Support Systems 22 (4): 337–53.
Ziemann, Mark, Yotam Eren, and Assam El-Osta. 2016. “Gene Name Errors Are Widespread in the Scientific Literature.” Genome Biology 17 (1): 1–3.