11 Software and other tools

Check the date of the last update of this file/site! As this is technology, whatever I say here is liable to go stale within a year or so.

I am compiling a list of useful resources, focusing on open access and open science, in this collaborative Airtable. (Hope I remember to keep that updated!)

You can sort and filter by any category, especially ‘resource’; I did an ad-hoc rating of ‘relevance to UG and MSc students’ in the final column.

To interact more with this, you can use the full ‘commenter’ access link.

11.1 Computers and network resources

Accessing your university network, etc

11.2 Document preparation systems

Word processors

(e.g., MS Word and the equation editor)

LaTeX and LaTeX interfaces

  • LaTeX

  • Overleaf

  • LyX, SWP, etc.

Markdown-based and ‘dynamic documents’ and notebooks

Thanks to technology, you can now produce ‘dynamic documents’ that integrate the writing of your paper with the code that produces (or at least refers to) the stats.

  • knitr, R Markdown, and bookdown (what this book is created with)

  • Markdown: Basic syntax for creating content with raw text. Simpler than LaTeX; used in the above

  • Pandoc: A tool that converts between many different document and markup formats; integral to the above

Other options include

  • Jupyter notebooks (which can run Julia, Python, or R code, among other languages)

  • Stata 16 offers a tool that integrates with MS Word, I believe.
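To make the idea of a ‘dynamic document’ concrete, here is a minimal sketch in Python (not the knitr/R Markdown toolchain this book is built with). The numbers in the text are computed from the data each time the document is built, so they cannot drift out of sync with the analysis. The data values and the names `wages` and `render_report` are made up for illustration.

```python
from statistics import mean, median

# Hypothetical data: hourly wages for a small made-up sample.
wages = [9.5, 12.0, 10.25, 15.0, 11.75]


def render_report(values):
    """Return Markdown text whose statistics are computed from `values`."""
    return (
        "# Wage summary\n\n"
        f"The sample contains {len(values)} observations. "
        f"The mean wage is {mean(values):.2f} and "
        f"the median wage is {median(values):.2f}.\n"
    )


report = render_report(wages)
print(report)
```

If the data change, re-running the script regenerates the text. Tools like knitr and Jupyter automate this same pattern for a whole paper.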

11.3 Text editors

A text editor is just what it sounds like. Unlike a word processor, a text editor shows you the ‘raw text’, without formatting (although it may offer ‘syntax highlighting’). You typically write code in a text editor. I also use one for taking notes and for writing ‘markup’ and Markdown.

My favorite text editor is Vim. Old-school nerds had a rivalry between Vim and Emacs, I believe. Both have great features that enable efficient typing, navigation, fancy copy-paste, macros, regular expressions, etc. Vim also has its own language to help you ‘clean’ data, code, and files. The most popular text editor today might be Atom.

Programming package interfaces like Stata and R (RStudio) typically have their own text editors built in, enabling you to write, save, and execute code directly from within them. You can configure these in many ways, changing ‘key bindings’, syntax highlighting, ‘code folding’, etc. (You can also configure other text editors to automatically send the code to programs like Stata and R.)
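As a small taste of what regular expressions do (whether in Vim, Emacs, or a programming language), the sketch below uses Python's `re` module to tidy inconsistently typed variable labels. The labels and the function name `tidy` are made up for illustration.

```python
import re

# Made-up messy column labels, as they might appear in a raw text file.
raw_labels = ["  Household Income ", "age(years)", "Years  of   Schooling"]


def tidy(label):
    """Turn a free-text label into a lowercase snake_case variable name."""
    label = label.strip().lower()
    # Replace every run of non-alphanumeric characters with one underscore.
    label = re.sub(r"[^a-z0-9]+", "_", label)
    # Remove any underscore left at the start or end.
    return label.strip("_")


clean_labels = [tidy(label) for label in raw_labels]
print(clean_labels)
```

The same pattern, `[^a-z0-9]+`, would work in most editors' search-and-replace, although the exact escaping rules differ from tool to tool.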

The ‘terminal’/command window/shell in Windows, Mac, and Unix

(I’ll come back to this)

11.4 Citation management tools

You do not need to enter all your references and citations manually: there are many tools for this.

Storing and organizing your references

  • Zotero, Mendeley, EndNote, JabRef, etc.

For JabRef, see http://jabref.sourceforge.net/. Note that to use JabRef with Word you should use Bibtex4Word; see http://www.ee.ic.ac.uk/hp/staff/dmb/perl/index.html

Including citations and ‘bibliographies’ in your paper

  • BibTeX (for LaTeX and Markdown)

  • Plugins for word processors
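For a sense of what a reference manager stores, here is a rough sketch that formats one reference as a BibTeX entry. The reference itself is an invented placeholder (not a real paper), and the helper name `to_bibtex` is hypothetical; in practice a tool like Zotero or JabRef writes these entries for you.

```python
def to_bibtex(key, entry_type, fields):
    """Format a dictionary of fields as a BibTeX entry string."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        # Each field is written as: name = {value},
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder reference, invented purely for illustration.
entry = to_bibtex(
    "doe2020example",
    "article",
    {
        "author": "Doe, Jane",
        "title": "An Example Title",
        "journal": "Journal of Examples",
        "year": "2020",
    },
)
print(entry)
```

You then cite the entry by its key: `\cite{doe2020example}` in LaTeX, or `[@doe2020example]` in Pandoc-flavoured Markdown, and the bibliography is generated automatically.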

11.5 Spreadsheets: just say no!

You are not advised to do data cleaning or statistical analysis in spreadsheet tools like Excel. You need a more powerful, systematic tool that keeps a record of the steps you have taken. You need to ‘write code’ in some way.
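A minimal sketch of what ‘keeping a record of the steps’ looks like in practice, here in Python using only the standard library. Every cleaning decision is written down in the script, so anyone (including you, six months later) can re-run it and see exactly what was done. The data and the rule for dropping rows are made up for illustration.

```python
import csv
import io

# Made-up raw data, standing in for a file such as a survey export.
RAW_CSV = """id,age,income
1,34,52000
2,,48000
3,29,n/a
4,41,61000
"""


def clean(raw_text):
    """Return the rows that have a valid age and income, as typed values."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        # Step 1: drop rows where age or income is missing or not a whole number.
        if not (row["age"].isdigit() and row["income"].isdigit()):
            continue
        # Step 2: convert the text fields to integers.
        cleaned.append(
            {"id": int(row["id"]), "age": int(row["age"]), "income": int(row["income"])}
        )
    return cleaned


rows = clean(RAW_CSV)
print(rows)
```

The raw data are never edited by hand. If you later decide to treat missing values differently, you change one line of the script and re-run it, which is not possible when the ‘steps’ were a series of manual edits in a spreadsheet.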

11.6 Statistical and coding software

Stata

Stata is used by the majority of empirical economists, but this may be changing. It is fairly easy to learn as well as powerful and adaptable. It is not free or open-source (although people contribute a lot of code and scripts): you need a license, which students can get through (most) universities (including Exeter).

It is basically a language for working with data and doing statistics (especially econometrics). It’s not a ‘real programming language’.

Recommended online resources/guides (including more than just coding tips):

R

The language that statisticians use, along with more and more people in social science. It’s also a ‘real programming language’, although most people who use it are working with data/statistics. Very big in the new booming field of ‘data science’.

Completely open-source, collaborative and free.

The cutting-edge statistics and research-methods tools usually come out in R first (e.g., new machine-learning packages, the ‘DeclareDesign’ package for experiments, etc.).


Some killer features/tools include:

  • R Markdown, knitr, bookdown: Dynamic documents that can be made into PDFs, web-books, web-pages, web-based slides, etc.

  • ggplot2: the best tool for graphical analysis/presentation of data

  • ‘tidyr’ and the ‘tidyverse’

  • RStudio


Recommended online resources/guides (including more than just coding tips):

However, R is a bit harder to learn than Stata.

Other stats packages and coding tools

  • Python: Perhaps the most popular coding language today, and supposed to be easy to learn. It’s a ‘general programming language’, but many statistical tools have been integrated. Very big in the new booming field of ‘data science’, maybe even more important than R.

  • SAS: Old but known to be very good with large data sets and thus has some popularity again

  • SPSS (not recommended)
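To give a flavour of statistical work in a general-purpose language, the sketch below computes summary statistics and a one-variable least-squares fit in plain Python. The data and the function name `ols_slope_intercept` are made up for illustration; in practice you would more likely use a dedicated library such as pandas or statsmodels than write the formula by hand.

```python
from statistics import mean, stdev

# Made-up data: years of schooling and hourly wage for six people.
schooling = [10, 12, 12, 14, 16, 18]
wage = [8.0, 10.0, 11.0, 13.0, 15.0, 18.0]


def ols_slope_intercept(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


slope, intercept = ols_slope_intercept(schooling, wage)
print(f"mean wage: {mean(wage):.2f}, std. dev.: {stdev(wage):.2f}")
print(f"fitted line: wage = {intercept:.2f} + {slope:.2f} * schooling")
```

The equivalent in Stata or R is a one-line regression command; the point here is only that the same steps can be scripted in any of these tools.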

11.7 (Other) Maths software

(Maple, Matlab, Mathematica)

11.8 Software for creating explanatory figures (not data-driven)

11.9 Resources for further study and research

11.10 Backing up, saving/storing your workflow

11.10.1 Backups

Git/Github and version management