11 Software and other tools
I am compiling a list of useful resources, focusing on open access and open science, in this collaborative Airtable. (I hope I remember to keep it updated!)
You can sort and filter by any category, especially ‘resource’; I gave an ad-hoc rating of ‘relevance to UG and MSc students’ in the final column.
To interact more with the list, you can get full ‘commenter’ access (link).
11.1 Computers and network resources
Accessing your university network, etc.
11.2 Document preparation systems
Word processors
(e.g., MS Word and the equation editor)
LaTeX and LaTeX interfaces
LaTeX
Overleaf
LyX, SWP, etc.
Markdown-based ‘dynamic documents’ and notebooks
You can now produce ‘dynamic documents’ that integrate the writing of your paper with the code that produces (or at least refers to) the statistics.
knitr, R Markdown, and bookdown (what this book is created with)
- See especially R Markdown: The Definitive Guide
Markdown: Basic syntax for creating content with raw text. Simpler than LaTeX; used in the tools above.
Pandoc: A tool that converts between many different document and markup formats; integral to the tools above.
Other options include:
Jupyter notebooks (which support Julia, Python, R, and other languages)
I believe Stata 16 offers a tool that integrates with MS Word.
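The idea behind a dynamic document can be sketched in a few lines of Python. This is only an illustration of the principle, not how knitr or Jupyter work internally: the `wages` data and the `render_report` function are made up for this example. The point is that the numbers in the sentence are computed from the data, so the text updates whenever the data change.

```python
import statistics

# Hypothetical data, made up purely for illustration.
wages = [12.5, 15.0, 9.75, 20.0, 14.25]


def render_report(values):
    """Fill a sentence template with statistics computed from `values`."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return (
        f"The sample has {len(values)} observations, "
        f"with a mean wage of {mean:.2f} and a standard deviation of {sd:.2f}."
    )


report = render_report(wages)
print(report)
```

If you add or correct an observation in `wages` and rerun the script, the sentence is regenerated with the new figures; nothing is retyped by hand.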
11.3 Text editors
A text editor is just what it sounds like. Unlike a word processor, a text editor shows you the ‘raw text’, without formatting (although it may offer ‘syntax highlighting’). You typically write code in a text editor. I also use one for taking notes and writing ‘markup’ and Markdown. My favorite text editor is Vim. Old-school nerds had a rivalry between Vim and Emacs, I believe. Both have great features, enabling efficient typing, navigation, fancy copy-paste, macros, regular expressions, etc. Vim also has its own language to help you ‘clean’ data, code, and files. The most popular text editor today might be Atom.
Programming environments like Stata and R (RStudio) typically have their own text editors built in, enabling you to write, save, and execute code directly from within them. You can configure these in many ways, changing ‘key bindings’, syntax highlighting, ‘code folding’, etc. (You can also configure other text editors to send code automatically to programs like Stata and R.)
The ‘terminal’/command window/shell in Windows, Mac, and Unix
(I’ll come back to this)
11.4 Citation management tools
You do not need to enter all your references and citations manually: there are many tools for this.
Storing and organizing your references
- Zotero, Mendeley, EndNote, JabRef, etc.
For JabRef, see http://jabref.sourceforge.net/. Note that to use JabRef with Word you should use BibTeX4Word; see http://www.ee.ic.ac.uk/hp/staff/dmb/perl/index.html
Including citations and ‘bibliographies’ in your paper
BibTeX (for LaTeX and Markdown)
Plugins for word processors
11.5 Spreadsheets: just say no!
I advise against doing data cleaning or statistical analysis in spreadsheet tools like Excel. You need a more powerful, systematic tool that keeps a record of the steps you have taken. You need to ‘write code’ in some way.
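As a minimal sketch of what ‘keeping a record of the steps’ can look like, here is a short Python script using only the standard library. The data in `RAW_CSV`, the column names, and the `clean_rows` function are all hypothetical, invented for this illustration. The raw data are never edited by hand: each row that is dropped is written to a log, and rerunning the script reproduces exactly the same result.

```python
import csv
import io

# Hypothetical raw data, standing in for a file you might otherwise open in Excel.
RAW_CSV = """name,age,income
Alice,34,52000
Bob,,41000
Carol,29,not reported
Dan,41,67000
"""


def clean_rows(raw_text):
    """Return (clean rows, log). Rows with non-numeric age or income are dropped and logged."""
    cleaned, log = [], []
    for row in csv.DictReader(io.StringIO(raw_text)):
        try:
            cleaned.append(
                {"name": row["name"], "age": int(row["age"]), "income": float(row["income"])}
            )
        except ValueError:
            # int("") and float("not reported") both raise ValueError.
            log.append(f"dropped {row['name']}: non-numeric age or income")
    return cleaned, log


rows, log = clean_rows(RAW_CSV)
mean_income = sum(r["income"] for r in rows) / len(rows)

for entry in log:
    print(entry)
print(f"mean income over {len(rows)} complete rows: {mean_income:.1f}")
```

In a spreadsheet, deleting Bob's and Carol's rows would leave no trace; here the decision is visible in the code and in the log, and anyone can rerun or question it.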
11.6 Statistical and coding software
Stata
Stata is used by the majority of empirical economists, but this may be changing. It is fairly easy to learn as well as powerful and adaptable. It is not free or open-source (although people contribute a lot of code and scripts): you need a license, which students can get through (most) universities (including Exeter).
It is basically a language for working with data and doing statistics (especially econometrics). It’s not a ‘real programming language’.
Recommended online resources/guides (including more than just coding tips):
StataCorp resources listed here
R
The language that statisticians use, along with more and more people in social science. It’s also a ‘real programming language’, although most people who use it are working with data/statistics. Very big in the booming new field of ‘data science’.
Completely open-source, collaborative and free.
The cutting-edge statistics and research-methods tools usually come out in R first (e.g., new machine-learning packages, the ‘DeclareDesign’ package for experiments, etc.).
Some killer features/tools include:
R Markdown, knitr, bookdown: Dynamic documents that can be made into PDFs, web books, web pages, web-based slides, etc.
ggplot2: the best tool for graphical analysis/presentation of data
‘tidyr’ and the ‘tidyverse’
RStudio
Recommended online resources/guides (including more than just coding tips):
However, R is a bit harder to learn than Stata.
Other stats packages and coding tools
Python: Perhaps the most popular coding language today, and supposed to be easy to learn. It’s a ‘general programming language’, but a great many statistical tools have been integrated. Very big in the booming new field of ‘data science’, maybe even more important than R.
SAS: Old, but known to be very good with large data sets, and thus regaining some popularity.
SPSS (not recommended)
11.7 (Other) Maths software
(Maple, MATLAB, Mathematica)