2.1 Stylized Facts
Stylized facts are properties common across many instruments, markets, and time periods that have been observed by independent studies (Cont, 2001; McNeil et al., 2015). Some notable examples include:
Lack of stationarity: The statistical properties of financial time series, such as the mean and variance of returns, change over time, so past returns do not necessarily reflect future performance.
Volatility clustering: Large price changes tend to be followed by large price changes (ignoring the sign), whereas small price changes tend to be followed by small price changes (Fama, 1965; Mandelbrot, 1963).
Absence of autocorrelations: Autocorrelations of returns are often statistically insignificant (Ding and Granger, 1996), which can be explained by the efficient-market hypothesis (Fama, 1970).
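Heavy tails: The Gaussian assumption generally does not hold for financial data; instead, return distributions typically exhibit so-called heavy tails, meaning that extreme returns occur more often than a Gaussian distribution would predict.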
Gain/loss asymmetry: The distribution of the returns is not symmetric.
Positive correlation of assets: Returns are often positively correlated since assets typically move together with the market.
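Several of these facts can be checked with simple sample statistics. The following is a minimal sketch, not a definitive methodology: it assumes a one-dimensional array of returns and, to stay self-contained, generates synthetic returns from a GARCH(1,1) model with Gaussian innovations. The simulator, its parameter values, and all function names are illustrative assumptions rather than part of the text. Near-zero autocorrelation of the returns together with clearly positive autocorrelation of the absolute returns indicates volatility clustering, and positive excess kurtosis indicates heavy tails. Because the simulator is symmetric, it does not reproduce the gain/loss asymmetry; the skewness function is included for use on real data.

```python
import numpy as np


def simulate_garch_returns(n, omega=1e-6, alpha=0.10, beta=0.85, seed=0):
    """Simulate n returns from a GARCH(1,1) model (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    returns = np.empty(n)
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    for t in range(n):
        returns[t] = np.sqrt(var) * z[t]
        # Today's squared return feeds into tomorrow's variance.
        var = omega + alpha * returns[t] ** 2 + beta * var
    return returns


def autocorr(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))


def skewness(x):
    """Sample skewness (zero for a symmetric distribution)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))


def excess_kurtosis(x):
    """Sample kurtosis minus 3 (zero for a Gaussian distribution)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)


returns = simulate_garch_returns(20_000)
print("lag-1 autocorrelation of returns:  ", autocorr(returns, 1))
print("lag-1 autocorrelation of |returns|:", autocorr(np.abs(returns), 1))
print("skewness:                          ", skewness(returns))
print("excess kurtosis:                   ", excess_kurtosis(returns))
```

On real data, the same functions can be applied to an array of observed returns in place of the simulated one.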
It is worth noting that these characteristics may vary with the sampling frequency of the data:
Low frequency (weekly, monthly, quarterly): Gaussian distributions may fit reasonably well after correcting for volatility clustering (except for the asymmetry), but the scarcity of data makes statistical estimation unreliable.
Medium frequency (daily): Heavy tails cannot be ignored (even after correcting for volatility clustering), and the amount of data may be sufficient for statistical significance provided the models do not contain too many parameters; otherwise, overfitting becomes likely.
High frequency (intraday, 30 min, 5 min, tick data): Large amounts of data are available, which makes this regime more amenable to data analytics and machine learning techniques. However, as the frequency of the data increases, the influence of microstructure noise becomes more prominent, which requires alternative models.
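The trade-off between frequency and sample size can be illustrated by aggregating a series to a lower frequency. The sketch below assumes the input consists of log-returns, which add up over time, and sums non-overlapping blocks of k consecutive observations (for example, k = 5 trading days for weekly data, a common convention assumed here). The function name and the choice to drop the oldest leftover observations are illustrative assumptions.

```python
import numpy as np


def aggregate_log_returns(log_returns, k):
    """Sum non-overlapping blocks of k log-returns to get lower-frequency returns.

    The oldest observations are dropped if the length is not a multiple of k.
    """
    x = np.asarray(log_returns, dtype=float)
    x = x[len(x) % k:]  # trim the oldest leftover observations
    return x.reshape(-1, k).sum(axis=1)


# Ten years of daily data (about 2520 observations) shrink to about
# 504 weekly and 120 monthly observations.
daily = np.zeros(2520)  # placeholder values; only the lengths matter here
print(len(daily), len(aggregate_log_returns(daily, 5)), len(aggregate_log_returns(daily, 21)))
```

The sample-based diagnostics of the stylized facts can then be recomputed on the aggregated series, keeping in mind that the much smaller sample makes such estimates noisier.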