CHAPTER 8
Searching for High-Frequency Trading Opportunities
This chapter reviews the most important econometric concepts used in the subsequent parts of the book. The treatment of topics is by no means exhaustive; it is instead intended as a high-level refresher on the core econometric concepts applied to trading at high frequencies. Even so, readers relying on software packages with preconfigured statistical procedures may find the level of detail presented here sufficient for quality analysis of trading opportunities. The depth of the statistical content should also be sufficient for readers to understand the models presented throughout the remainder of this book. Readers interested in a more thorough treatment of statistical models may refer to Tsay (2002); Campbell, Lo, and MacKinlay (1997); and Gouriéroux and Jasiak (2001).
This chapter begins with a review of the fundamental statistical estimators, moves on to linear dependency identification methods and volatility modeling techniques, and concludes with standard nonlinear approaches for identifying and modeling trading opportunities.
STATISTICAL PROPERTIES OF RETURNS
According to Dacorogna et al. (2001, p. 121), "high-frequency data opened up a whole new field of exploration and brought to light some behaviors that could not be observed at lower frequencies." Summary statistics about aggregate behavior of data, known as "stylized facts," help distill particularities of high-frequency data. Dacorogna et al. (2001) review stylized facts for foreign exchange rates, interbank money market rates, and Eurofutures (futures on Eurodollar deposits).
Financial data is typically analyzed using returns. A return is the difference between two subsequent price quotes normalized by the earlier price level. Because they are independent of the price level, returns are convenient for direct performance comparisons across various financial instruments. A simple return measure can be computed as shown in equation (8.1):
$$R_t = \frac{P_t - P_{t-1}}{P_{t-1}} \tag{8.1}$$
where $R_t$ is the return for period $t$, $P_t$ is the price of the financial instrument of interest in period $t$, and $P_{t-1}$ is the price of the financial instrument in period $t - 1$. As discussed previously, determination of prices in high-frequency data may not always be straightforward; quotes arrive at random intervals, but the analysis demands that the data be equally spaced.
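As a minimal sketch of the simple return computation, assuming the prices have already been sampled at equally spaced intervals, the calculation can be written in a few lines of Python. The function name `simple_returns` and the sample prices are illustrative assumptions, not part of the original text.

```python
def simple_returns(prices):
    """Simple returns (P_t - P_{t-1}) / P_{t-1} for an equally spaced price series.

    `prices` is assumed to be ordered in time, oldest first. The result has
    one element fewer than the input, because the first price has no
    predecessor.
    """
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    # Pair each price with its predecessor and normalize by the earlier price.
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


# Hypothetical equally spaced prices, for illustration only.
prices = [100.0, 101.0, 99.99]
returns = simple_returns(prices)  # roughly +1% followed by -1%
```

Because each return is normalized by the earlier price, the same function can be applied to instruments trading at very different price levels and the results compared directly.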
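Despite the intuitiveness of simple returns, much of the financial literature relies on log returns. Log returns are defined as follows:
$$r_t = \ln(P_t) - \ln(P_{t-1}) = \ln\left(\frac{P_t}{P_{t-1}}\right) \tag{8.2}$$
where $r_t$ is the log return for period $t$.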
Log returns are often preferred to simple returns for the following reasons:
1. If log returns are assumed to follow a normal distribution, then the underlying gross returns (one plus the simple return) and the asset prices used to compute them follow a lognormal distribution. Lognormal distributions better reflect the actual distributions of asset prices than do normal distributions. For example, asset prices are generally positive. The lognormal distribution captures this property, since it is defined only for positive values, whereas normal distributions allow values to be negative.
2. Like distributions of asset prices, lognormal distributions have fatter tails than do normal distributions. Although lognormal distributions typically fail to model the fatness of the tails of asset prices exactly, lognormal distributions better approximate observed fat tails than do normal distributions.
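3. Once log prices have been computed, log returns are easy and fast to manipulate: each log return is a difference of two log prices, and a multiperiod log return is the sum of the single-period log returns.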
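The convenience noted in the third point can be illustrated with a short Python sketch. It computes log returns as differences of log prices and checks that they sum to the log return over the whole window. The function name `log_returns` and the price series are assumptions made for this example.

```python
import math


def log_returns(prices):
    """Log returns ln(P_t) - ln(P_{t-1}) for a time-ordered price series."""
    # math.log raises ValueError for non-positive prices, which is the
    # desired behavior: log returns are undefined in that case.
    log_prices = [math.log(p) for p in prices]
    return [curr - prev for prev, curr in zip(log_prices, log_prices[1:])]


# Hypothetical prices, for illustration only.
prices = [100.0, 102.0, 99.0, 103.0]
r = log_returns(prices)

# Log returns are additive over time: the single-period values sum to the
# log return between the first and the last price.
total = math.log(prices[-1] / prices[0])
assert math.isclose(sum(r), total)
```

Simple returns do not share this property; they compound multiplicatively, so aggregating them across periods requires multiplying gross returns rather than adding.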
Returns can be computed on bid prices, ask prices, last trade prices, or mid prices. The mid price is the arithmetic average, or midpoint, of the bid and ask prices at a given point in time. In the absence of synchronous quotes, mid prices can be computed using the last available bid and ask quotes.
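One possible way to compute mid prices from asynchronous quotes is sketched below in Python: at each point of an equally spaced time grid, the most recent bid and the most recent ask are looked up and averaged. The helper names, the representation of quotes as `(timestamp, price)` pairs, and the sample data are all assumptions made for illustration; the sketch also assumes that each quote list is non-empty and sorted by timestamp.

```python
from bisect import bisect_right


def last_quote_at(times, values, t):
    """Most recent value quoted at or before time t, or None if none exists."""
    i = bisect_right(times, t)  # number of quotes with timestamp <= t
    return values[i - 1] if i > 0 else None


def mid_prices_on_grid(bid_quotes, ask_quotes, grid):
    """Mid prices on a time grid, using the last bid and last ask at each point.

    bid_quotes and ask_quotes are non-empty lists of (timestamp, price)
    pairs sorted by timestamp. Grid points that precede the first bid or
    the first ask yield None.
    """
    bid_times, bid_prices = zip(*bid_quotes)
    ask_times, ask_prices = zip(*ask_quotes)
    mids = []
    for t in grid:
        bid = last_quote_at(bid_times, bid_prices, t)
        ask = last_quote_at(ask_times, ask_prices, t)
        mids.append(None if bid is None or ask is None else (bid + ask) / 2.0)
    return mids


# Hypothetical quotes arriving at irregular times (in seconds).
bids = [(0.2, 99.0), (1.7, 100.0), (2.1, 101.0)]
asks = [(0.5, 101.0), (2.6, 102.0)]
grid = [0.0, 1.0, 2.0, 3.0]

print(mid_prices_on_grid(bids, asks, grid))  # [None, 100.0, 100.5, 101.5]
```

Carrying the last quote forward is only one of several sampling conventions; it is used here because it matches the "last bid and ask quotes" rule described above and never looks ahead in time.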
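Both simple and log returns can be averaged over time to obtain lower-frequency return estimates. Averages of simple and log returns can be computed as ordinary arithmetic averages over the $T$ observations in the sample:
$$\bar{R} = \frac{1}{T}\sum_{t=1}^{T} R_t \tag{8.3}$$
$$\bar{r} = \frac{1}{T}\sum_{t=1}^{T} r_t \tag{8.4}$$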
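Variation in sequential returns is known as volatility. Volatility can be measured in a variety of ways. The simplest measure of volatility is the variance of simple or log returns, computed according to equations (8.5) and (8.6):
$$\operatorname{var}[R] = \frac{1}{T-1}\sum_{t=1}^{T}\left(R_t - \bar{R}\right)^2 \tag{8.5}$$
$$\operatorname{var}[r] = \frac{1}{T-1}\sum_{t=1}^{T}\left(r_t - \bar{r}\right)^2 \tag{8.6}$$
where $\bar{R}$ and $\bar{r}$ denote the sample averages of simple and log returns, respectively, and $T$ is the number of observations.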
Note that the division factor in the volatility computation is $(T - 1)$, not $T$. The reduced number of normalizing observations accounts for the reduced number of degrees of freedom: the variance equation includes the average return, which in most cases is itself estimated from the sample data. The standard deviation is the square root of the variance.
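The averaging and variance calculations can be sketched in Python as follows. The function names and the sample returns are assumptions made for this example; the cross-check uses `statistics.variance` from the Python standard library, which also divides by one less than the number of observations.

```python
import math
import statistics


def sample_mean(returns):
    """Arithmetic average of a series of simple or log returns."""
    return sum(returns) / len(returns)


def sample_variance(returns):
    """Variance of returns with the (T - 1) divisor.

    One degree of freedom is used up because the mean is itself
    estimated from the same sample.
    """
    n_obs = len(returns)
    if n_obs < 2:
        raise ValueError("at least two returns are required")
    mean = sample_mean(returns)
    return sum((r - mean) ** 2 for r in returns) / (n_obs - 1)


# Hypothetical returns, for illustration only.
returns = [0.01, -0.02, 0.03, 0.00]
variance = sample_variance(returns)
std_dev = math.sqrt(variance)  # standard deviation is the square root

# The standard library's sample variance uses the same divisor.
assert math.isclose(variance, statistics.variance(returns))
```

Dividing by the full number of observations instead would correspond to `statistics.pvariance`, which is appropriate only when the mean is known rather than estimated from the sample.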
Other common statistics used to describe distributions of prices or simple or log returns are skewness and kurtosis. Skewness measures whether a distribution skews towards either the positive or the negative side of the mean, as compared with the standardized normal distribution. Skewness of the standardized normal distribution is 0. Skewness can be measured as follows...