18 Backtesting
When forecasting risk, it is important to evaluate the quality of the forecasts. Unlike many other applications, such as price or weather forecasting, we cannot compare a forecast directly to an eventual realised outcome because risk is a latent variable. It cannot be measured directly, only inferred, so we need indirect ways of evaluating the forecasts, and that is what backtesting provides.
18.1 Libraries
library(rugarch)
library(zoo)
library(lubridate)
library(microbenchmark)
library(parallel)
library(doParallel)
library(foreach)
source("common/functions.r",chdir=TRUE)
18.2 Data
data=ProcessRawData()
sp500=data$sp500
18.3 Intermediate files
We argued in Section 12.2 that we should normally avoid intermediate files unless absolutely necessary. Backtesting might be the exception. The reason is that the time to estimate some of the models we use is not trivial, and when we have to repeat the estimation for every day in the testing window, the estimation time can become excessive. Since we then have to process the backtesting results, it may be more efficient to run the estimation in a separate file, save the results, and code up the processing separately.
That said, while intermediate files can be sensible when developing the code that reports on the backtest, it is usually best not to rely on them for the final run.
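A minimal sketch of that workflow, using base R's saveRDS() and readRDS() with an illustrative file name (the chapter does not prescribe one):
# In the script that runs the backtest; res is the output of Backtesting() below
saveRDS(res, file = "backtest_results.rds")
# In the script that processes and reports the results
res = readRDS("backtest_results.rds")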
18.4 Setup
Specify the VaR probability, portfolio value, estimation window size and the EWMA weight lambda; the sample length T and testing window length WT follow from them:
par = list()
par$p = 0.05
par$value = 1000
par$WE = 2000
par$T = length(sp500$y)
par$WT = par$T - par$WE
par$lambda = 0.9
18.5 Backtesting functions
We divided the backtesting code into two functions, RunOneDay() and Backtesting(). While not strictly necessary, this separation of functionality is helpful in our case because we want to measure how long the backtesting takes.
18.5.1 RunOneDay()
Note the lines
x=do.call(
  what=paste0("Risk.",Methods[i]),
  args=list(y=y[t1:t2],par=par,Print=Print)
)
in the function RunOneDay(). Here, paste0("Risk.",Methods[i]) takes an estimation method and prefixes it with the string "Risk.". For example, if Methods[i] is HS, we get Risk.HS, which is one of the risk functions we use. do.call() then calls the function named in the what argument, in this case Risk.HS(), with the arguments y=sp500$y[t1:t2], par=par and Print=Print.
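As a minimal base-R illustration of the mechanism, outside the chapter's own code:
do.call(what = "mean", args = list(x = 1:10))   # exactly the same as mean(1:10)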
Note that the vector res holds both the VaR and the ES results. We do it that way to facilitate the multi-core implementation below.
RunOneDay=function(t,y=y,Methods,Print=FALSE) {
  t1=t-1
  t2=t1-par$WE+1
  if(Print) cat(t,t2,t1,"\n")
  res=rep(0,2*length(Methods))
  for(i in 1:length(Methods)){
    x=do.call(what=paste0("Risk.",Methods[i]),args=list(y=y[t1:t2],par=par,Print=Print))
    res[i]=x$VaR
    res[i+length(Methods)]=x$ES
  }
  return(res)
}
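As a quick illustration (using the sp500 data loaded above), calling the function for the last day in the sample returns a vector whose first length(Methods) entries are the VaR forecasts and whose remaining entries are the ES forecasts:
RunOneDay(par$T, y = sp500$y, Methods = c("HS", "EWMA"))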
18.5.2 Backtesting()
We call RunOneDay() from inside the Backtesting() function, which takes four arguments: the returns y, the control parameters par, the estimation Methods and the diagnostic flag Print.
We start by creating a matrix, res, to hold the output and then loop over the testing window, putting each day's results into the corresponding row of res.
We then split the res matrix into the VaR and ES matrices and give them column names.
Finally, we define a profit-and-loss vector, PL, and use it to create the matrices of violations, VaR.V and ES.V.
Backtesting=function(y,par,Methods,Print=FALSE){
  res=matrix(ncol=length(Methods)*2,nrow=par$T)
  for(t in (par$WE+1):par$T){
    res[t,]=RunOneDay(t=t,y=y,Methods=Methods,Print=Print)
  }
  VaR=res[,1:length(Methods)]
  ES=res[,(length(Methods)+1):(2*length(Methods))]
  colnames(ES)=Methods
  colnames(VaR)=Methods
  PL=y*par$value
  VaR.V=-(PL+VaR)
  ES.V=-(PL+ES)
  VaR.V[VaR.V<0]=0
  VaR.V[VaR.V>0]=1
  ES.V[ES.V<0]=0
  ES.V[ES.V>0]=1
  return(list(ES=ES,VaR=VaR,VaR.V=VaR.V,ES.V=ES.V,PL=PL))
}
18.6 Run backtest
We only run the backtest here for HS and EWMA since these are quite quick to run, while the other methods take considerably longer. We run them all below.
res=Backtesting(y=sp500$y,
  par=par,
  Methods=c("HS","EWMA")
)
matplot(sp500$date.ts,cbind(res$PL,-res$VaR),type='l',lty=1)
What you see from the plot is that even though the returns start on day one, we only get backtest results from day par$WE+1 onwards:
print((par$WE -2):(par$WE +2))
[1] 1998 1999 2000 2001 2002
print(res$VaR[(par$WE -2):(par$WE +2),])
HS EWMA
[1,] NA NA
[2,] NA NA
[3,] NA NA
[4,] 19.73436 22.93636
[5,] 19.73436 20.88139
One way to deal with this is to create new variables that start on day par$WE+1, or 2001:
VaR=res$VaR[(par$WE+1):par$T,]
ES=res$ES[(par$WE+1):par$T,]
VaR.V=res$VaR.V[(par$WE+1):par$T,]
ES.V=res$ES.V[(par$WE+1):par$T,]
PL=res$PL[(par$WE+1):par$T]
date.ts=sp500$date.ts[(par$WE+1):par$T]
Then
matplot(date.ts,cbind(PL,-VaR),type='l',lty=1)
And we can then analyse the number of violations.
cat("Expected number of violations",par$p*par$WT,"\n")
Expected number of violations 151.7
print(colSums(VaR.V))
HS EWMA
144 236
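A natural next step, not shown in the chapter's code above, is to express these counts as violation ratios, the observed number of violations divided by the expected number; values close to one suggest a well-calibrated VaR forecast.
print(colSums(VaR.V) / (par$p * par$WT))   # violation ratios for HS and EWMA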
18.7 How long does it take to run?
Backtesting is slow, except for the fastest methods. There are several ways to measure the time it takes to run an R command. The easiest is to take note of Sys.time(), which records the current clock time. We can then run some code and compare the change in Sys.time(). This is sometimes known as wall time because it measures the time that would pass on a wall clock. The benefit is that it is simple.
The downside is that many factors can affect how long the code takes. The computer might be doing something else at the same time, which affects the results. Often, if we call the same function repeatedly, the first call takes the longest, as we see here.
A more accurate approach is to repeat the run several times and use the minimum or median value. The easiest way to do that is with microbenchmark() from the microbenchmark package.
18.7.1 Wall time
y=sp500$y
time1=Sys.time()
res=RunOneDay(par$T,y=y,Methods=c("HS"))
print(Sys.time()-time1)
Time difference of 0.0006859303 secs
time1=Sys.time()
res=RunOneDay(par$T,y=y,Methods=c("EWMA"))
print(Sys.time()-time1)
Time difference of 0.0008900166 secs
time1=Sys.time()
res=RunOneDay(par$T,y=y,Methods=c("GARCH"))
print(Sys.time()-time1)
Time difference of 0.06194687 secs
time1=Sys.time()
res=RunOneDay(par$T,y=y,Methods=c("tGARCH"))
print(Sys.time()-time1)
Time difference of 0.2529449 secs
18.7.2 With microbenchmark
Start by running microbenchmark() for a GARCH model. In this case, it reports the results in milliseconds, or thousandths of a second.
m=microbenchmark(
  RunOneDay(par$T,y=y,Methods=c("GARCH"))
)
m$expr="GARCH"
print(m)
Unit: milliseconds
expr min lq mean median uq max neval
GARCH 34.89965 35.55042 42.15387 36.03652 38.93754 200.4906 100
If we want to compare several methods, it is best to collect the output from microbenchmark in a vector. Note that the raw output is measured in nanoseconds, or billionths of a second.
=c("HS","EWMA","GARCH","tGARCH")
Methods=vector(length=length(Methods))
resfor(i in 1:length(Methods)){
=microbenchmark(
xRunOneDay(par$T,y=y,Methods=Methods[i])
)=median(x$time)
res[i]
}print(res)
[1] 61643.5 372464.5 42252632.0 95180598.0
We can then plot them together, in this case with a bar plot. Since the numbers are very different, a log scale helps.
x=res/1e6
a=barplot(round(x,4),
  log='y',
  las=1,
  border=FALSE,
  ylab="Milliseconds",
  names.arg=Methods,
  col="lightblue"
)
text(a,x*1.14,x,xpd=TRUE)
While 0.088 seconds for a single tGARCH estimation does not seem long, repeated 3034 times over the testing window it adds up to about 4.4 minutes.
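The arithmetic behind that figure, as a quick check:
0.088 * par$WT / 60   # per-call seconds times the number of days in the testing window, in minutes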
18.8 Do it faster
It can be very annoying when a backtest takes a long time. There are several ways to speed it up, and one of the most obvious is to take advantage of the multiple cores in your computer's central processing unit (CPU). By default, R uses a single core, but most computers have at least four cores and some laptops have 16. If you can spread the job across all cores, a laptop with ten cores can, in the best case, run it close to ten times faster.
There are two problems with running jobs on multiple cores.
The first is that it is inevitably more complex. You need to write your code in a way that lets it run on multiple cores, and then call a special function to actually do so.
The second is that running on multiple cores has some overhead, so small jobs can take longer to run on multiple cores than on a single one.
18.8.1 Apply family
R comes with a family of functions, usually called the apply family because they are all variants of a function called apply. There are many pages on the Internet describing the apply family. The idea is that an apply function takes every element of a vector or matrix and applies a function to it. The output is usually a list containing the result of each function call. Since we ultimately need to collapse that list into a data frame or matrix, it is easiest if the function returns a vector rather than a list. This is why we coded RunOneDay() to return a vector.
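A toy illustration of the point, outside the chapter's code:
# Each call returns a named numeric vector, so the list of results
# collapses cleanly into a matrix with one row per input value.
out = lapply(1:3, function(t) c(VaR = t, ES = 2 * t))
do.call(rbind, out)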
18.8.2 Windows, Linux and Macs
An issue arises because of differences between Windows and UNIX-like systems, which include Macs and Linux. The way R does multi-core processing on UNIX-like systems is more efficient, but it does not work on Windows; the Windows approach also runs on Macs and Linux.
We will use mclapply() on Mac and Linux and foreach() on Windows; both approaches work on Mac and Linux.
mclapply() needs library(parallel), while foreach() needs library(foreach) and library(doParallel).
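If you want a single script that works everywhere, one option (a sketch, not something the chapter itself does) is to branch on the operating system:
if (.Platform$OS.type == "unix") {
  # Mac/Linux: use parallel::mclapply() as in Section 18.8.5
} else {
  # Windows: use foreach() with a doParallel cluster as in Section 18.8.6
}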
18.8.3 Number of cores
We need to identify the number of cores in the computer we are using. You might know the value, but different computers have different numbers, so it makes sense to identify the number automatically.
cores=detectCores()
print(cores)
[1] 12
The computer we are now using has 12 cores.
18.8.4 Sequence
When using these functions to do parallel computing, we are not guaranteed to get the answers back in the order of the input vector. The reason is that the scheduler distributes the jobs across the cores in whatever way it finds efficient rather than simply splitting them up in input order. For many applications, such as Monte Carlo simulations, that does not matter. Here it does, since we have to match each risk forecast to a particular day. We therefore take special care below to ensure the results come back in the right order.
18.8.5 Mac/Linux — mclapply()
The mclapply() function takes the input vector, (par$WE+1):par$T, the function we want to run, RunOneDay, and the arguments that function needs, y=y and Methods=Methods. As a benchmark, we first drop the mc prefix and use lapply(), which runs the same job on a single core.
=c("HS","EWMA","GARCH","tGARCH") Methods
time1=proc.time()
res1=lapply(
  (par$WE+1):par$T,
  RunOneDay,
  y=y,
  Methods=Methods
)
time1=proc.time()-time1
print(time1)
user system elapsed
408.516 22.907 431.617
print(res1[1:2])
[[1]]
[1] 19.73436 22.93636 21.85700 21.42719 33.80597 28.76312 27.40956 28.99466
[[2]]
[1] 19.73436 20.88139 22.78245 22.37896 33.80597 26.18611 28.57012 30.28393
The output goes into a list, res1, where each element holds the risk forecasts for a particular day. We print the first two elements so we can check that the different multi-core implementations below give the same answers.
By setting mc.cores=cores, we tell mclapply() to use as many cores as the variable cores holds, 12 in our case, and mc.preschedule=TRUE ensures we get the estimates back in the right order. We use proc.time() to measure execution time.
time2=proc.time()
res2=mclapply(
  (par$WE+1):par$T,
  RunOneDay,
  y=y,
  Methods=Methods,
  mc.cores=cores,
  mc.preschedule=TRUE
)
time2=proc.time()-time2
print(time2)
user system elapsed
522.167 21.531 59.526
print(res2[1:2])
[[1]]
[1] 19.73436 22.93636 21.85700 21.42719 33.80597 28.76312 27.40956 28.99466
[[2]]
[1] 19.73436 20.88139 22.78245 22.37896 33.80597 26.18611 28.57012 30.28393
We see that the multi-core run is not quite 12 times faster; there is considerable overhead. Note also that the reported user time now includes the CPU time spent across all the cores, which is why it exceeds the elapsed wall-clock time.
18.8.6 Windows (and Mac/Linux) — foreach()
mclapply() only runs on UNIX-like systems, so we need something different for Windows. Unfortunately, that is quite a bit more complicated: we have to both create a cluster and give each core in the cluster a copy of all the data needed to run the jobs.
The variable cl is the cluster we create:
cl <- makeCluster(cores)
registerDoParallel(cl)
We then have to give each core a copy of all the packages and functions with clusterEvalQ():
x=clusterEvalQ(cl, {
  library(reshape2)
  library(rugarch)
  library(lubridate)
  library(zoo)
  source("common/functions.r",chdir=TRUE)
})
And we use clusterExport() to give it our particular data:
clusterExport(cl,c("par", "y","Methods"))
We run the code on the cluster with
res3=foreach(i = (par$WE+1):par$T) %dopar% {
  RunOneDay(i,y=y,Methods=Methods)
}
And we need to stop the cluster when we are done:
stopCluster(cl)
time3=proc.time()
cl <- makeCluster(cores[1]-1)
registerDoParallel(cl)
x=clusterEvalQ(cl, {
  library(reshape2)
  suppressPackageStartupMessages(library(rugarch))
  suppressPackageStartupMessages(library(lubridate))
  suppressPackageStartupMessages(library(zoo))
  source("common/functions.r",chdir=TRUE)
})
clusterExport(cl,c("par", "y","Methods"))
r=(par$WE+1):par$T
res3=foreach(i = (par$WE+1):par$T) %dopar% {
  RunOneDay(i,y=y,Methods=Methods)
}
stopCluster(cl)
time3=proc.time()-time3
print(time3)
user system elapsed
0.686 0.170 59.773
print(res3[1:2])
[[1]]
[1] 19.73436 22.93636 21.85700 21.42719 33.80597 28.76312 27.40956 28.99466
[[2]]
[1] 19.73436 20.88139 22.78245 22.37896 33.80597 26.18611 28.57012 30.28393
18.8.7 Compare
df=data.frame(cbind(
  c("lapply","mclapply","foreach"),
  c(time1["elapsed"],time2["elapsed"],time3["elapsed"])
))
rownames(df)=NULL
names(df)=c("Approach","time")
print(df)
Approach time
1 lapply 431.617
2 mclapply 59.526
3 foreach 59.773
Are the answers identical?
identical(res1,res2)
[1] TRUE
identical(res2,res3)
[1] TRUE
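To use the parallel output for the violation analysis of Section 18.6, the list has to be turned back into matrices. A minimal sketch, assuming res2 from the mclapply() run above; the names VaR.mc and ES.mc are just illustrative:
m = do.call(rbind, res2)                                # one row per day in the testing window
VaR.mc = m[, 1:length(Methods)]                         # first block of columns: VaR forecasts
ES.mc = m[, (length(Methods)+1):(2*length(Methods))]    # second block: ES forecasts
colnames(VaR.mc) = colnames(ES.mc) = Methods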