Prelude to Econometrics Using R
In preparation for my first semester teaching Econometrics using R, I prepared a series of R Notebooks for use in class. Not only did I use these notebooks to teach the material in class, I also provided them to students for their use, study, and such. At one point, in referring to this collection of notebooks, a student told me that they liked my book, and my first instinct was to reply that it wasn’t really a book, it was just a collection of teaching notes.
But this student’s statement stuck with me, and I realized that, while it wasn’t really presented or written like a book, it was awfully close to being one. The notebooks, when executed, came to something in the realm of 250 pages of almost entirely original work (I think I “borrowed” something like 2 graphs, which I have since recreated from scratch) and relied on data that was freely available via a variety of R packages. Plus, the work of turning it into a book would give me an excuse to expand my knowledge of rmarkdown and learn how to use bookdown, and some revisions were needed anyhow, so I decided that I might as well turn it into a proper book. And so here we are.
This book is written primarily for use in the econometrics course I teach at Methodist University (ECO 3160). This course has basic statistics as a prerequisite, and is a preparatory class for the capstone for Economics majors and for the data mining and data analytics courses for Business Analytics majors. The fact that there is no calculus or computer programming prerequisite for this course greatly informs the approach taken in this text. The calculus and linear algebra underpinnings of econometrics are kept to a minimum, and only appear in a few places (e.g., in Chapter 6 basic calculus is used to demonstrate how to find the maximum or minimum in a quadratic regression, Chapter 6 discusses why the dummy variable trap exists with matrices, etc.). The focus in using R is on scripting, not coding or programming, so relatively little time is spent on coding logic, looping, and such. The approach in this book is much more “cookbook” in nature; the goal is to (1) understand the underlying intuition of the various econometric procedures, (2) identify when each of the procedures is generally appropriate, (3) diagnose some of the most common things that might go wrong, and (4) learn how to use R to make all of these things happen.
While the previous paragraph laid out the specific context for which this text has been written, it seems highly likely to me that the background I assume of the reader is rather common outside of my primary audience. I suspect there are a great many individuals/students who are looking for material that combines basic R scripting with applied econometrics without getting too far into the weeds of linear algebra, calculus, and/or statistical proofs.
Though some may view this as a weird design choice, I’ve opted to print the vast majority of the code as text in the book. I am of course aware that I could hide a lot of the code and have the output look “cleaner,” but since one of the goals of this text is to teach basic R, I believe it is beneficial to the reader to see all of the R code.
This book is freely available for anyone to read. Unfortunately, I have no way of knowing who, if anybody, is actually reading it. So my humble request is to simply drop me a note if you read this book. Let me know how you found out about it. Let me know if you see an error or typo. Let me know if you like or hate it. Let me know if you have a good idea for a new meme to add. Let me know if somebody suggested this book to you. Let me know your opinions on the memes.
If I get a few nice comments, I might even add a section with some of them the next time I revise this book. Anyhow, I’m pretty easy to find; probably the easiest way is on social media, at Twitter or Facebook. I also go by the gmail username of mattdobra should you be an email aficionado.
To aid the reader in following along with the text, I have uploaded RMarkdown notebooks to the companion website that contain nearly all of the code from this book.
While I am making these notebooks available, I would strongly advise you to avoid the temptation to simply copy-and-paste this code into R as you work through the book, or worse, to simply execute the code in these notebooks as-is. Sure, this will produce the correct output, and it is certainly less work than typing the code yourself. But I think you are far more likely to cultivate your own coding skills by typing the code yourself. In teaching myself R, I found that a large part of the learning process comes from screwing something up and figuring out why the thing I was trying to do failed. And this doesn’t happen very often when you copy-and-paste code that already works.
While the text of this book is, to the best of my knowledge, entirely of my own writing, it does make use of a variety of data repositories (created by others) that are part of the following R packages:
- datasets (the datasets built into base R)
Additionally, there are places in which data is retrieved live from the web via such packages as WDI, quantmod, etc.
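To give a flavor of what using packaged data looks like, here is a minimal sketch that loads a data set shipped with the datasets package and fits a simple regression. The choice of mtcars and of the variables mpg and wt is purely an illustrative assumption on my part and is not necessarily an example used later in the book.

```r
# Load a data set that ships with base R's datasets package.
# mtcars is an illustrative choice, not necessarily one used in this book.
data("mtcars", package = "datasets")

# Look at the structure of the data and its first few rows.
str(mtcars)
head(mtcars)

# Regress fuel economy (mpg) on vehicle weight (wt) and print the results.
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)
```

Data from other packages is typically loaded the same way once the package is installed, by naming that package in the package argument of data().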