```
library(moments)
suppressPackageStartupMessages(library(tseries))
suppressPackageStartupMessages(library(car))
```

# 9 Descriptive statistics

Below we do some descriptive statistics on our data, mostly for dependence and tail shape. We opted for the SP-500. ## Libraries

## 9.1 Data

```
load('data/data.RData')
=data$sp500 sp500
```

## 9.2 Statistical analysis

### 9.2.1 Sample stats

```
mean(sp500$y)
sd(sp500$y)
skewness(sp500$y)
kurtosis(sp500$y)
```

We can present these results a bit nicer by putting them into a data frame.

```
cat('Sample statistics for the SP-500 from',min(sp500$date),'to',max(sp500$date),'\n')
=data.frame()
stats=c('mean:',paste0(round(mean(sp500$y)*100,3),'%'))
stats=rbind(stats,c('sd:',paste0(round(sd(sp500$y)*100,2),'%')))
stats=rbind(stats,c('skewness:',round(skewness(sp500$y),1)))
stats=rbind(stats,c('kurtosis:',round(kurtosis(sp500$y),1)))
stats stats
```

`Sample statistics for the SP-500 from 20030103 to 20221230 `

stats | mean: | 0.029% |
---|---|---|

sd: | 1.22% | |

skewness: | -0.5 | |

kurtosis: | 15.8 |

Or printing them with formatting.

```
for(i in 1:dim(stats)[1])
cat(stats[i,1],rep("",10-nchar(stats[i,1])),stats[i,2],"\n")
```

```
mean: 0.029%
sd: 1.22%
skewness: -0.5
kurtosis: 15.8
```

### 9.2.2 Jarque-Bera

The Jarque-Bera (JB) test in the `tseries`

library uses the third and fourth central moments of the sample data to check if it has a skewness and kurtosis matching a normal distribution.

`jarque.bera.test(sp500$y)`

```
Jarque Bera Test
data: sp500$y
X-squared = 34398, df = 2, p-value < 2.2e-16
```

The p-value presented is the inverse of the JB statistic under the CDF of the asymptotic distribution. The p-value of this test is virtually zero, which means that we have enough evidence to reject the null hypothesis that our data has been drawn from a normal distribution.

### 9.2.3 Ljung-Box

The Ljung-Box (LB) test checks if the presented data exhibit serial correlation. We can implement the test using the `Box.test()`

function and specifying `type = "Ljung-Box"`

:

```
Box.test(sp500$y, type = "Ljung-Box")
Box.test(sp500$y^2, type = "Ljung-Box")
```

```
Box-Ljung test
data: sp500$y
X-squared = 81.253, df = 1, p-value < 2.2e-16
```

```
Box-Ljung test
data: sp500$y^2
X-squared = 551.08, df = 1, p-value < 2.2e-16
```

### 9.2.4 Autocorrelation

The autocorrelation function shows us the linear correlation between a value in our time series with its different lags. The `acf`

function plots the autocorrelation of an ordered vector. The horizontal lines are confidence intervals, meaning that if a value is outside of the interval, it is considered significantly different from zero.

```
par(mar=c(4,4,1,0))
acf(sp500$y, main = "Autocorrelation of returns")
```

The plots made by the acf function don’t not look all the nice. While there are other versions available for R, including one for `ggplot`

, for now we will simply modify the builtin `acf()`

. Call that `myACF()`

and put it into `functions.r`

.

```
=function (x, n = 50, lwd = 2, col1 = co2[2], col2 = co2[1], main = NULL)
myACF
{= acf(x, n, plot = FALSE)
a = qnorm((1 + 0.95)/2)/sqrt(sum(!is.na(x)))
significance_level barplot(a$acf[2:n, , 1], las = 1, col = col1, border = FALSE,
las = 1, xlab = "lag", ylab = "ACF", main = main)
abline(significance_level, 0, col = col2, lwd = lwd)
abline(-significance_level, 0, col = col2, lwd = lwd)
axis(side = 1)
}par(mar=c(4,4,1,0))
myACF(sp500$y, main = "Autocorrelation of SP-500 returns")
myACF(sp500$y^2, main = "Autocorrelation of squared SP-500 returns")
```

### 9.2.5 QQ-Plots

We can use the quantile-quantile (QQ) plot to see how our data fits particular distributions. To make the QQ plots, we will use the `qqPlot()`

function from the `car`

package:

Start with the normal plot, indicated by `distribution = "norm"`

which is the default.

```
par(mar=c(4,4,1,0))
=qqPlot(sp500$y, distribution = "norm", envelope = FALSE) x
```

We can also use QQ-Plots to see how our data fits tthe t distributions with various degrees of freedom.

```
par(mfrow=c(2,2), mar=c(4,4,1,0))
=qqPlot(sp500$y, distribution = "norm", envelope = FALSE,xlab="normal")
x=qqPlot(sp500$y, distribution = "t", df = 4, envelope = FALSE,xlab="t(4)")
x=qqPlot(sp500$y, distribution = "t", df = 3.5, envelope = FALSE,xlab="t(3.5)")
x=qqPlot(sp500$y, distribution = "t", df = 3, envelope = FALSE,xlab="t(3)") x
```