3 Data collection

  • Sample data should always be organized in matrix form, with observations in rows and variables in columns, and saved in a file format such as .xlsx, .txt or .csv
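
As a minimal sketch of this layout, the Python snippet below reads a small table with pandas. The variable names (gdp, cpi) and all values are hypothetical, and an in-memory string stands in for a real .csv file.

```python
import io

import pandas as pd

# Hypothetical sample: one observation per row, one variable per column.
csv_text = (
    "year,gdp,cpi\n"
    "2019,100.0,98.5\n"
    "2020,97.2,99.1\n"
    "2021,103.4,101.8\n"
)

# pd.read_csv accepts a file path or a file-like object;
# io.StringIO is used here only so that the example is self-contained.
data = pd.read_csv(io.StringIO(csv_text), index_col="year")

# Rows are observations, columns are variables.
n_obs, n_vars = data.shape
print(f"{n_obs} observations, {n_vars} variables")
```

With a real file, the same call would take the file path instead; .xlsx files can be read with pd.read_excel, which needs an additional Excel engine installed.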

  • Most common data issues:

  1. Missing values (NA)
  2. Measurement errors (collected data may not reflect the true values)
  3. Outliers (values unusually far above or below the rest of the data)
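
A quick check for the first and third issue might look like the sketch below. The series is hypothetical, and the 1.5 × IQR rule used to flag outliers is an assumption of this example, one common convention among several. Measurement errors generally cannot be detected from the data alone.

```python
import numpy as np
import pandas as pd

# Hypothetical series with one missing value and one extreme value.
x = pd.Series([10.2, 9.8, np.nan, 10.5, 10.1, 55.0, 9.9, 10.3])

# Missing values: count the NA entries.
n_missing = int(x.isna().sum())

# Outliers (assumed rule): flag values more than 1.5 * IQR
# below the first quartile or above the third quartile.
# quantile() skips missing values by default.
q1, q3 = x.quantile(0.25), x.quantile(0.75)
iqr = q3 - q1
is_outlier = (x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)
outliers = x[is_outlier]

print("missing values:", n_missing)
print("flagged as outliers:", outliers.tolist())
```

Flagged values are candidates for inspection, not automatic deletion; whether an extreme value is an error or a genuine observation depends on the context.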
  • Raw data are usually transformed:
  1. Taking logs, squares, inverse values, square roots, …
  2. Seasonal and/or calendar adjustment
  3. Taking first differences and lagged values (sometimes required)
  4. Deflating nominal values
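
Several of these transformations can be sketched in a few lines of pandas. The series, the column names and the price index (base = 100) are hypothetical. Seasonal and calendar adjustment is left out because it requires a dedicated procedure.

```python
import numpy as np
import pandas as pd

# Hypothetical annual data: a nominal series and a price index (base = 100).
df = pd.DataFrame(
    {
        "nominal": [100.0, 110.0, 121.0, 133.1],
        "price_index": [100.0, 102.0, 104.0, 106.0],
    },
    index=[2018, 2019, 2020, 2021],
)

# Deflating: express nominal values in base-period prices.
df["real"] = df["nominal"] / df["price_index"] * 100

# Log transformation.
df["log_nominal"] = np.log(df["nominal"])

# Lagged values: the value from the previous period (NaN in the first row).
df["lag1"] = df["nominal"].shift(1)

# First differences: x_t - x_{t-1} (NaN in the first row).
df["diff1"] = df["nominal"].diff()

# First difference of logs, which approximates the growth rate.
df["dlog"] = df["log_nominal"].diff()

print(df.round(3))
```

Lags and differences each cost one observation at the start of the sample, which is why the first row of those columns is missing.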