15  Manual estimation

When estimating the volatility models in Chapter 13, we used the rugarch library. It is very comprehensive and provides almost all the functionality needed to estimate volatility models.

However, it can be beneficial to have a deeper understanding of how estimation actually works. That is what we provide in this chapter. We are manually implementing two features that are automatically provided in rugarch.

The first is the actual programming of the likelihood function, and the second is how we use optimisation algorithms to maximise the likelihood.

The benefit of doing such manual optimisation is that we gain familiarity with optimisation procedures, something quite common in quantitative finance. In a lot of cases, we don’t have the benefit of high-quality libraries and instead have to implement the estimation procedure ourselves.-ch-univariate rugarch

15.1 Libraries


15.2 Data

The data we use is returns on the S&P500 index, and for convenience, we put them into the variable y.


15.3 Optimisers

Recall the discussion in Chapter 13 about maximising the likelihood function. The algorithm used to do that maximisation is called an optimiser.

The basic R distribution provides a function called optim() for optimising functions. It provides several algorithms, including the very old and robust Nelder-Mead. That might not be the fastest algorithm, but it does have the advantage of not depending on derivatives, and it will always find the next optimum. Nelder-Mead has at least two disadvantages: it requires more steps to find the optimum than some other algorithms, and it only does local optimisation, meaning it finds the next peak.

There is an alternative library called NLopt that provides a large number of state-of-the-art optimisers. NLopt can be used with almost every programming language, and the R implementation is nloptr.

nloptr requires more steps than the built-in optim.

15.3.1 Optimisers minimise

One thing to keep in mind is that optimisers minimise functions by default instead of maximising them, as we require here. This means that when we return a functional value, we have to invert the sign before returning it.

15.3.2 Ensuring positive parameters

In our case, the parameters that go into the likelihood function must be positive. There are at least two different ways we can ensure that positivity. The easiest is to take the absolute value of the parameters, but we can often also impose constraints when we call the optimising function. We show both below.

15.3.3 Optimisation in R

Suppose we want to minimise some function f(parameters), where we have three parameters. Build-in optim

The easiest way is to use the built-in optim function, which by default uses the Nelder-Mead algorithm. It returns a list that contains the parameter estimates, the function value at the optimum and some other diagnostics.

res = optim(c(0.000001, 0.1, 0.85) ,fn= f) # Optimization NLopt

NLopt is a bit more complicated. We first have to set up control parameters for the estimation, like one of the:

opts =list("algorithm"="NLOPT_GN_CRS2_LM", "xtol_rel"=1.0e-8, maxeval=1e5)
opts =list("algorithm"="NLOPT_GN_DIRECT_L", "xtol_rel"=1.0e-8, maxeval=1e5)
opts =list("algorithm"="NLOPT_LN_NELDERMEAD", "xtol_rel"=1.0e-8, maxeval=1e5)

Here, we see three different algorithm choices. We also set the tolerance xtol_rel and the maximum number of times the function can be called, maxeval.

After that, we call the actual optimisation nloptr. We set the starting values x0, the function name eval_f, the blower and upper bounds on the parameters lb and ub, and the control list.

res = nloptr(x0=c(0.000001, 0.1, 0.85), 

It returns a list that contains the parameter estimates, the function value at the optimum and some other diagnostics.

15.4 Coding the likelihood

15.4.1 Issues

  • Value of \(\sigma_1^2\): We need to assign a value to \(\sigma_1^2\), the first iteration of the conditional volatility. In general, we choose the unconditional variance of the data for this;
  • Initial values of the parameters: Optimisation works iteratively. It would help if you assigned values to initialise the parameters, making sure these are compliant with the function’s restrictions and the parameter bounds. This will have a small effect on the parameter values but can have a large effect on computing time.

We also need to pick an optimisation method and algorithm.

15.4.2 Normal GARCH

The likelihood function and its derivation can be found in Chapter 2 of the book and slides.

\[ \begin{split} \log\mathcal{L} & = \underbrace{{-\dfrac{T-1}{2}\log(2\pi)}}_{\text{Constant }}\\ & - \dfrac{1}{2}\sum\limits_{t=2}^{T}\left({(\log (\omega+\alpha{y^2_{t-1}+\beta{{\hat\sigma^2_{t-1}})})}} + \dfrac{y^2_t}{\omega+\alpha{y^2_{t-1}}+\beta{{{\hat\sigma^2}_{t-1}}}}\right) \end{split} \]

15.4.3 tGARCH

The Student-t density is given by \[ \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\Gamma\left(\frac{\nu}{2}\right)} \left(1+\frac{x^2}{\nu} \right)^{-\frac{\nu+1}{2}} \]

We want to make two alterations.

  1. Make it standardised;
  2. Write it in terms of a time-changing volatility model,

Which gives us

\[ \frac{1}{\sigma_t} \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{(\nu-2)\pi\Gamma\left(\frac{\nu}{2}\right)}} \left(1+\frac{y_t^2}{\sigma_t^2}\frac{1}{\nu-2} \right)^{-\frac{\nu+1}{2}} \]

And take logs. There are two parts: the constant and the time-changing part.

The constant is

\[ \log\Gamma\left(\frac{\nu+1}{2}\right)- \frac{1}{2}\log(\nu\pi)-\log\Gamma\left(\frac{\nu}{2}\right) \] While the time-changing part is

\[ -\log\sigma_t -\frac{\nu+1}{2}\left( \log\left(\sigma_t^2(\nu-2)+y_t^2\right) -\log\left(\sigma_t^2(\nu-2) \right) \right) \]

15.5 Implementation

15.5.1 Issues

Three issues arise.

The first is that we have to ensure the parameters are positive, which we do below by par=abs(par) but also by setting lb and ub in nloptr. We don’t need both, but since we are comparing two different ways of doing optimisation, keep it here.

Second, we want to avoid repeating calculations. We might call the likelihood function a large number of times, so we don’t want to square the returns, count the elements, or take the variance more often than necessary. So, we pre-calculate everything we can. It is especially important to do only the necessary calculations inside the loops.

Finally, we need to get the last volatility. While we can write clever code for that, in this case, we are just doing the simple way of putting it into global memory with <<-, which is not how one is supposed to do things, but it can be very convenient.

15.5.2 Normal GARCH

T = length(y2) 

LL.GARCH = function(par) { 
 omega = par[1] 
 alpha = par[2] 
 beta = par[3] 
 loglikelihood = 0
 for(i in 2:T){
 sigma2 = omega + alpha*y2[i-1] + beta*sigma2 
 loglikelihood = -0.5* loglikelihood -(T-1)/2 * log(2*pi)
 sigma2Last <<- sigma2
res = optim(c(0.000001, 0.1, 0.85) ,fn= LL.GARCH) 

opts =list("algorithm"="NLOPT_LN_NELDERMEAD", "xtol_rel"=1.0e-8, maxeval=1e4)

res = nloptr(x0=c(0.000001, 0.1, 0.85), 

spec.2 = ugarchspec(
 variance.model = list(garchOrder = c(1,1)),
 mean.model = list(armaOrder = c(0,0), include.mean = FALSE)
Res = ugarchfit(spec = spec.2, data = y)

cat("Omega:", res$solution[1], coef(Res)[1],"\n",
 "Alpha:", res$solution[2], coef(Res)[2], "\n",
 "Beta:", res$solution[3], coef(Res)[3],"\n")
Omega: 2.459155e-06 2.45887e-06 
 Alpha: 0.1238671 0.1238025 
 Beta: 0.8557986 0.8558381 

15.5.3 tGARCH

T = length(y2) 

LL.tGARCH = function(par) { 
 omega = par[1] 
 alpha = par[2] 
 beta = par[3] 
 nu = par[4] 
 loglikelihood = 0

 for(i in 2:T){
 sigma2 = omega + alpha*y2[i-1] + beta*sigma2 
 n1*(log(n2*sigma2+y2[i])-log(n2*sigma2) )
 loglikelihood = -loglikelihood/2.0;     
 sigma2Last<<- sigma2
res = optim(c(0.000001, 0.1, 0.85,6) ,fn= LL.tGARCH) # Optimization

opts =list("algorithm"="NLOPT_GN_CRS2_LM", "xtol_rel"=1.0e-6, maxeval=1e6)
opts =list("algorithm"="NLOPT_LN_NELDERMEAD", "xtol_rel"=1.0e-8, maxeval=1e5)

resn = nloptr(x0=c(0.000001, 0.1, 0.85,6), 

sigma2 0.0001619243 
spec.3 = ugarchspec(
 variance.model = list(garchOrder = c(1,1)),
 mean.model = list(armaOrder = c(0,0), include.mean = FALSE),
 distribution.model = "std"
Res = ugarchfit(spec = spec.3, data = y)

cat("Omega:", resn$solution[1], coef(Res)[1],"\n",
 "Alpha:", resn$solution[2], coef(Res)[2], "\n",
 "Beta:", resn$solution[3], coef(Res)[3],"\n",
 "nu:", resn$solution[4], coef(Res)[4],"\n")
Omega: 1.519132e-06 1.507668e-06 
 Alpha: 0.1242821 0.1240843 
 Beta: 0.8703039 0.8706493 
 nu: 6.067725 6.054042 

15.6 Exercise

Apart from rugarch, which we used in this chapter, there are two popular alternatives in R that estimate time series models: fGarch and tseries. Fit a GARCH(1,1) model to the S&P500 return series (y) using these three packages. Are the results identical? If not, what can account for the difference? Discuss.