Part 1
Analysis of Empirical Data
Chapter 1
Estimation of NIG and VG Models for High Frequency Financial Data
José E. Figueroa-López and Steven R. Lancette
Department of Statistics, Purdue University, West Lafayette, IN
Kiseop Lee
Department of Mathematics, University of Louisville, Louisville, KY; Graduate Department of Financial Engineering, Ajou University, Suwon, South Korea
Yanhui Mi
Department of Statistics, Purdue University, West Lafayette, IN
1.1 Introduction
Driven by the necessity to incorporate the observed stylized features of asset prices, continuous-time stochastic modeling has taken a predominant role in the financial literature over the past two decades. Most of the proposed models are particular cases of a stochastic volatility component driven by a Wiener process superposed with a pure-jump component accounting for the discrete arrival of major influential information. Accurate approximation of the complex phenomenon of trading is certainly attained with such a general model. However, accuracy comes with a high cost in the form of hard estimation and implementation issues as well as overparameterized models. In practice, and certainly for the purpose motivating the task of modeling in the first place, a parsimonious model with relatively few parameters is desirable. With this motivation in mind, parametric exponential Lévy models (ELM) are among the most tractable and successful alternatives to both stochastic volatility models and more general Itô semimartingale models with jumps.
The literature on geometric Lévy models is quite extensive (see Cont & Tankov (2004) for a review). Owing to their appealing interpretation and tractability, in this work we concentrate on two of the most popular classes: the variance-gamma (VG) and normal inverse Gaussian (NIG) models proposed by Carr et al. (1998) and Barndorff-Nielsen (1998), respectively. In the “symmetric case” (which is a reasonable assumption for equity prices), both models require only one additional parameter, κ, compared to the two-parameter geometric Brownian motion (also called the Black–Scholes model). This additional parameter can be interpreted as the percentage excess kurtosis relative to the normal distribution and, hence, it mainly governs the tail thickness of the log return distribution. In other words, this parameter determines the frequency of “excessively” large positive or negative returns. Both models are pure-jump models with infinite jump activity (i.e., models with infinitely many jumps during any finite time interval [0, T]). Nevertheless, one of the parameters, denoted by σ, controls the variability of the log returns and, thus, it can be interpreted as the volatility of the price process.
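To make the roles of σ and κ concrete, the following minimal sketch simulates symmetric VG log returns and checks their variance and excess kurtosis. It assumes the usual construction of the symmetric VG process as a Brownian motion with volatility σ time-changed by a gamma subordinator with unit mean rate and variance rate κ; under that assumption, a return over a time span δ has variance σ²δ and excess kurtosis 3κ/δ. The function names, parameter values, and time unit are hypothetical and are not taken from the chapter.

```python
import numpy as np

def simulate_symmetric_vg(n, delta, sigma, kappa, rng):
    """Draw n increments X_delta = sigma * W(tau_delta).

    Assumption: tau is a gamma subordinator with E[tau_delta] = delta
    and Var[tau_delta] = kappa * delta (shape delta/kappa, scale kappa).
    """
    tau = rng.gamma(shape=delta / kappa, scale=kappa, size=n)
    # Conditionally on tau, the increment is N(0, sigma^2 * tau).
    return sigma * np.sqrt(tau) * rng.standard_normal(n)

def excess_kurtosis(x):
    """Sample excess kurtosis (0 for a normal distribution)."""
    c = x - x.mean()
    return np.mean(c**4) / np.mean(c**2) ** 2 - 3.0

rng = np.random.default_rng(0)
sigma, kappa, delta = 0.3, 0.5, 1.0  # hypothetical values, arbitrary time unit

x = simulate_symmetric_vg(400_000, delta, sigma, kappa, rng)
var_hat = x.var()
kurt_hat = excess_kurtosis(x)

print("variance:        sample", var_hat, " theory", sigma**2 * delta)
print("excess kurtosis: sample", kurt_hat, " theory", 3 * kappa / delta)
```

Setting κ close to zero in this sketch makes the gamma time change nearly deterministic, so the returns approach the normal returns of the Black–Scholes model; larger κ produces heavier tails at the same σ.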
Numerous empirical studies have shown that certain parametric ELM, including the VG and the NIG models, are able to fit daily returns extremely well using standard estimation methods such as maximum likelihood estimators (MLE) or method of moment estimators (MME) (cf. Eberlein & Keller (1995); Eberlein & Özkan (2003); Carr et al. (1998); Barndorff-Nielsen (1998); Kou & Wang (2004); Carr et al. (2002); Seneta (2004); Behr & Pötter (2009); Ramezani & Zeng (2007); and others). On the other hand, very few papers have considered intraday data, in spite of its current importance. One of our main motivations in this work is to analyze whether pure Lévy models can still work well to fit the statistical properties of log returns at the intraday level.
Like essentially any other model, a Lévy model will have limitations when working with very high frequency transaction data and, hence, the question is rather to determine the scales at which a Lévy model is a good probabilistic approximation of the underlying (extremely complex and stochastic) trading process. We propose to assess the suitability of the Lévy model by analyzing the signature plots of the point estimates at different sampling frequencies. It is plausible that an apparent stability of the point estimates for certain ranges of sampling frequencies provides evidence of the adequacy of the Lévy model at those scales. An earlier work along these lines is Eberlein & Özkan (2003), where this stability was empirically investigated using hyperbolic Lévy models and MLE (based on hourly data). Concretely, one of the main points therein was to estimate the model's parameters from daily mid-day log returns and then measure the distance between the empirical density based on hourly returns and the 1-h density implied by the estimated parameters. It is found that this distance is approximately minimal among all the implied densities. In other words, if one compares the density of X_δ implied by the parameters estimated from daily mid-day returns with the empirical density based on hourly returns, the distance between the two is minimal when δ is approximately 1 h. Such a property was termed the time consistency of Lévy processes.
In this chapter, we further investigate the consistency of ELM for a wide range of intraday frequencies using intraday data from the US equity market. Although natural differences due to sampling variation are to be expected, our empirical results under both models exhibit some very interesting common features across the different stocks we analyzed. We find that the estimator of the volatility parameter σ is quite stable for sampling frequencies as short as 20 min or less. For higher frequencies, the volatility estimates exhibit an abrupt tendency to increase (see Fig. 1.6 below), presumably due to microstructure effects. In contrast, the kurtosis estimator is more sensitive to microstructure effects and a certain degree of stability is achieved only for mid-range frequencies of 1 h and more (see Fig. 1.6 below). For higher frequencies, the kurtosis decreases abruptly. In fact, in contrast to the smooth signature plot of σ at those scales, the kurtosis estimates consistently change by more than half when going from hourly to 30-min log returns. Again, this phenomenon is presumably due to microstructure effects, since the effect of an unaccounted continuous component would be expected to diminish as the sampling frequency increases.
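As a rough illustration of the signature-plot idea, the following sketch computes moment-based estimates of σ and κ at several sampling intervals from simulated data. It assumes a symmetric VG model built as Brownian motion time-changed by a gamma subordinator with unit mean rate and variance rate κ, under which a δ-return has variance σ²δ and excess kurtosis 3κ/δ. The estimators below simply invert those two identities and are not necessarily the estimators used in this chapter; the function names, parameter values, and time unit are hypothetical.

```python
import numpy as np

def mme_sigma_kappa(returns, delta):
    """Moment-based estimates under the assumed symmetric model:
    Var = sigma^2 * delta and excess kurtosis = 3 * kappa / delta."""
    c = returns - returns.mean()
    m2 = np.mean(c**2)
    m4 = np.mean(c**4)
    sigma_hat = np.sqrt(m2 / delta)
    kappa_hat = delta * (m4 / m2**2 - 3.0) / 3.0
    return sigma_hat, kappa_hat

def aggregate(returns, k):
    """Sum k consecutive fine-scale log returns into one coarser return."""
    n = (len(returns) // k) * k
    return returns[:n].reshape(-1, k).sum(axis=1)

rng = np.random.default_rng(1)
sigma, kappa, delta0 = 0.3, 0.5, 0.05  # hypothetical values, arbitrary time unit
n = 2_000_000

# Fine-scale symmetric VG increments: normal with gamma-distributed variance.
tau = rng.gamma(shape=delta0 / kappa, scale=kappa, size=n)
fine = sigma * np.sqrt(tau) * rng.standard_normal(n)

# "Signature": estimates as a function of the sampling interval k * delta0.
signature = {}
for k in (1, 2, 5, 10, 20):
    signature[k] = mme_sigma_kappa(aggregate(fine, k), k * delta0)

for k, (s_hat, k_hat) in signature.items():
    print(f"interval {k * delta0:5.2f}: sigma_hat={s_hat:.4f} kappa_hat={k_hat:.4f}")
```

Because the simulated data follow an exact Lévy model with no microstructure noise, both estimates should stay roughly flat across sampling intervals. Departures from such flat behavior at short intervals are the kind of instability that the signature plots of real intraday data are meant to reveal.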
One of the main motivations of Lévy models is that log returns follow ideal conditions for statistical inference in that case; namely, under a Lévy model the log returns at any frequency are independent with a common distribution. Owing to this fact, it is arguable that it might be preferable to use a parsimonious model for which efficient estimation is feasible, rather t...