Week 5: Simulations and Functions

Version 4.0 - August 2025

Author: Jon Danielsson

Affiliation: London School of Economics

5 Simulations and functions

5.1 Why simulations and functions matter for financial risk

Simulations enable risk managers to model complex scenarios that cannot be solved analytically, such as portfolio behaviour under extreme market conditions or derivative pricing with path dependency. Functions allow us to create reusable, testable code that ensures consistency across risk calculations and reduces errors in production systems.

These programming techniques become essential for Monte Carlo VaR, stress testing and building robust risk management systems that can handle real-world complexity.

For more detail, see Simulations in the R notebook.

5.2 The plan for this week

Make random numbers
Compare distributions
Make functions
Advanced simulation techniques

5.3 Loading data and libraries

load('Returns.RData')
load('Prices.RData')

5.4 Random numbers

We can get random numbers drawn from a specific distribution by using the prefix r before the distribution name:

rnorm(1)

[1] -1.158346

rnorm(10, mean = 10, sd = 0.5)

 [1]  9.289770  9.250706 10.430462  9.791079  9.278798 10.302128 10.762259
 [8] 10.508337 10.071334  9.967363

rt(1, df = 5)

[1] -2.158852
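
The r prefix is one of four that base R provides for each distribution: d gives the density, p the cumulative distribution function and q the quantile function. A minimal sketch of the pattern follows; the seed and the parameter values are arbitrary choices made for illustration only.

```r
# The other three prefixes, shown for the normal distribution (norm)
dnorm(0)       # density at 0, about 0.399
pnorm(1.96)    # P(X <= 1.96), about 0.975
qnorm(0.975)   # quantile with 97.5% of the mass below it, about 1.96

# The r prefix works the same way for other distributions
set.seed(1)                            # arbitrary seed, for illustration only
u = runif(5, min = 0, max = 1)         # uniform draws
e = rexp(5, rate = 2)                  # exponential draws
b = rbinom(5, size = 10, prob = 0.3)   # binomial draws
```

The quantile function is the inverse of the cumulative distribution function, so pnorm(qnorm(0.975)) returns 0.975.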

Each call to a function that generates random numbers produces different values. To make results replicable, we set a seed first; the same seed always yields the same sequence of draws:

set.seed(888)
rnorm(5)

[1] -1.9513433 -1.5443662  0.7298327 -0.2775818 -1.6562843

set.seed(888)
rnorm(5)

[1] -1.9513433 -1.5443662  0.7298327 -0.2775818 -1.6562843

set.seed(666)
rnorm(5)

[1]  0.7533110  2.0143547 -0.3551345  2.0281678 -2.2168745
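
The same behaviour can be checked in code rather than by comparing printed output. The sketch below wraps the seed and the draw in a helper; simulate_with_seed is a hypothetical name introduced here, not a function from the notebook.

```r
# Hypothetical helper: draw n standard normal numbers under a given seed
simulate_with_seed = function(seed, n = 5){
  set.seed(seed)
  rnorm(n)
}

x1 = simulate_with_seed(888)
x2 = simulate_with_seed(888)
x3 = simulate_with_seed(666)

identical(x1, x2)  # TRUE: same seed, same draws
identical(x1, x3)  # FALSE: different seed, different draws
```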

5.5 Comparing distributions using random numbers

To show the fat tails of the Student-t compared to the normal distribution, we will draw 1000 points from each distribution:

rnd_normal = rnorm(1000)
rnd_t = rt(1000, df = 3)
plot(rnd_t,
  col = "red",
  pch = 16,
  main = "Random points from a Normal and Student-t"
)
points(rnd_normal, col = "blue", pch = 16)
legend("bottomright",
  legend = c("Student-t", "Normal"),
  pch = 16,
  col = c("red", "blue")
)

Repeat these commands a few times. Because no seed is set, each run produces a different set of points.

We see that the points from a Student-t distribution take on more extreme values compared to the Normal. This is a consequence of fat tails.
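
The visual impression can be backed up with numbers by counting how often each sample falls far from zero. This is a sketch under assumptions: the threshold of 3, the seed and the larger sample size are illustrative choices, not values from the notebook.

```r
set.seed(123)   # arbitrary seed so the sketch is reproducible
n = 100000      # larger than the 1000 used above, to stabilise the tail estimates
tail_normal = rnorm(n)
tail_t = rt(n, df = 3)

threshold = 3   # hypothetical cut-off for an "extreme" value

# Share of simulated draws beyond the threshold in absolute value
share_normal = mean(abs(tail_normal) > threshold)
share_t = mean(abs(tail_t) > threshold)
share_normal
share_t

# Exact two-sided tail probabilities for comparison
2 * pnorm(-threshold)         # about 0.003
2 * pt(-threshold, df = 3)    # about 0.058
```

Under these assumptions the Student-t with 3 degrees of freedom puts roughly twenty times more probability beyond 3 in absolute value than the normal does.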

5.6 Discrete random sampling in credit risk

The concepts we have learned about random number generation become important in financial applications, particularly in backtesting risk models. When testing how well our risk models perform, we often need to randomly sample from discrete sets of data or create bootstrap samples from historical observations.

To illustrate discrete sampling in a financial context, suppose we want to simulate credit rating assignments for a bond portfolio. Credit ratings are discrete categories and we need to sample from these categories based on historical probabilities. Drawing numbers from continuous distributions will not produce the discrete categories we need, so we use sample() which can handle discrete outcomes.

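sample(x, size, replace, prob) takes a vector x, the number of draws size, a flag replace for sampling with replacement (FALSE by default) and optional probabilities prob, returning a vector of random draws. Because we draw 100 bonds from only 8 categories, we need replace = TRUE: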

# Credit ratings from best to worst
ratings = c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D")

# Historical probabilities for new bond issues
probs = c(0.02, 0.05, 0.15, 0.35, 0.25, 0.12, 0.05, 0.01)

# Simulate ratings for 100 bonds in a portfolio
bond_ratings = sample(ratings, 100, replace = TRUE, prob = probs)
table(bond_ratings)

bond_ratings
  A  AA AAA   B  BB BBB CCC   D 
 21   6   2  15  26  24   4   2

How many bonds are investment grade (BBB or better)?

investment_grade = sum(bond_ratings %in% c("AAA", "AA", "A", "BBB"))
cat("Investment grade bonds:", investment_grade, "out of", length(bond_ratings))

Investment grade bonds: 53 out of 100

We can visualize the distribution of ratings in our simulated portfolio:

barplot(table(bond_ratings),
        main = "Simulated Bond Portfolio by Credit Rating",
        ylab = "Number of Bonds",
        xlab = "Credit Rating",
        col = "lightblue",
        las = 1)

This approach is used in credit risk modelling to simulate portfolio compositions and assess concentration risk.
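
A single simulated portfolio is only one possible outcome. One way to see how much the composition varies is to repeat the simulation many times with replicate() from base R. The sketch below reuses the ratings and probabilities from above; simulate_ig_share, the seed and the number of repetitions are assumptions made for illustration.

```r
ratings = c("AAA", "AA", "A", "BBB", "BB", "B", "CCC", "D")
probs = c(0.02, 0.05, 0.15, 0.35, 0.25, 0.12, 0.05, 0.01)
investment_grade_set = c("AAA", "AA", "A", "BBB")

# Hypothetical helper: share of investment grade bonds in one simulated portfolio
simulate_ig_share = function(n_bonds){
  draws = sample(ratings, n_bonds, replace = TRUE, prob = probs)
  mean(draws %in% investment_grade_set)
}

set.seed(42)  # arbitrary seed
ig_share = replicate(1000, simulate_ig_share(100))

mean(ig_share)                      # average share across simulated portfolios
sum(probs[1:4])                     # share implied by probs: 0.57
quantile(ig_share, c(0.05, 0.95))   # range covering 90% of simulated portfolios
```

The 53 investment grade bonds found above are one draw from this distribution, which is centred on 57.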

5.7 Small and large sample properties

We now explore how random samples of different sizes, drawn from a standard normal distribution, compare with the true density (the red line in each plot). The larger the sample, the closer the histogram is to the density:

par(mfrow=c(2,2))
norm1 = rnorm(60)
norm2 = rnorm(100)
norm3 = rnorm(100000)
x = seq(-3,3,0.1)
hist(norm1,
  freq = FALSE,
  breaks = 20,
  main = "Sample size 60",
  col = "lightgrey",
  ylim = c(0, 0.5)
)
lines(x,dnorm(x), lwd = 3, col = "red")
hist(norm2,
  freq = FALSE,
  breaks = 20,
  main = "Sample size 100",
  col = "lightgrey",
  ylim = c(0, 0.5)
)
lines(x,dnorm(x), lwd = 3, col = "red")
hist(norm3,
  freq = FALSE,
  breaks = 20,
  main = "Sample size 100000",
  col = "lightgrey",
  ylim = c(0, 0.5)
)
lines(x,dnorm(x), lwd = 3, col = "red")
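
The histograms can be complemented with summary statistics. The following sketch draws one sample for each of the three sizes and records its mean and standard deviation, which should approach 0 and 1 as the sample grows. The seed and the column labels are arbitrary choices.

```r
set.seed(2025)  # arbitrary seed
sample_sizes = c(60, 100, 100000)

# For each sample size, draw from N(0,1) and record the sample mean and sd
results = sapply(sample_sizes, function(n){
  draws = rnorm(n)
  c(mean = mean(draws), sd = sd(draws))
})
colnames(results) = c("n = 60", "n = 100", "n = 100000")
results

# Standard error of the sample mean, 1 / sqrt(n), shrinks as n grows
1 / sqrt(sample_sizes)
```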

5.8 Make our own functions

Suppose we want to calculate annualised volatility repeatedly for different return series. Done directly, the calculation looks like this:

test_returns = c(0.01, -0.02, 0.015, -0.008, 0.003)
vol = sd(test_returns, na.rm = TRUE) * sqrt(252)

We can create a reusable function for this:

calculate_volatility = function(returns, annualize = TRUE){
  vol = sd(returns, na.rm = TRUE)
  if(annualize){
    vol = vol * sqrt(252)  # Annualize using trading days
  }
  return(vol)
}

calculate_volatility(test_returns)

[1] 0.2242186

calculate_volatility(test_returns, annualize = FALSE)

[1] 0.01412445
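
The benefit of a function shows when it is applied to many series at once. The sketch below repeats calculate_volatility so that it runs on its own and applies it to each column of a data frame with sapply(). The three assets are simulated and hypothetical, not taken from Returns.RData.

```r
# Function from the text, repeated so the block runs on its own
calculate_volatility = function(returns, annualize = TRUE){
  vol = sd(returns, na.rm = TRUE)
  if(annualize){
    vol = vol * sqrt(252)  # Annualize using trading days
  }
  return(vol)
}

# Simulated daily returns for three hypothetical assets (not real data)
set.seed(10)  # arbitrary seed
sim_returns = data.frame(
  asset_a = rnorm(250, mean = 0, sd = 0.01),
  asset_b = rnorm(250, mean = 0, sd = 0.02),
  asset_c = rt(250, df = 3) * 0.01
)

# Apply the function to every column; extra arguments are passed through
vol_annual = sapply(sim_returns, calculate_volatility)
vol_daily = sapply(sim_returns, calculate_volatility, annualize = FALSE)
vol_annual
vol_daily
```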

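We can do the same for our returns data, with a function that checks the name, prints summary statistics, plots the series and returns the mean: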

load('Returns.RData')  # load once, outside the function
ReturnAnalysis = function(name){
  # Stop early if the requested series is not in the data
  if(! name %in% names(Returns)) {
    cat("Do not make mistakes,", name, "is not in the data set.\n\n")
    stop("Put the correct name in!")
  }
  y = Returns[[name]]
  cat(name, "mean", mean(y), "sd", sd(y), "\n")  # print summary statistics
  plot(y, main = name)
  return(mean(y))
}

ReturnAnalysis(name="AAPL")

AAPL mean 0.0008842013 sd 0.02524584

[1] 0.0008842013

ReturnAnalysis(name="GE")

GE mean 0.0001114734 sd 0.02092539

[1] 0.0001114734
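
Functions and simulations combine naturally. As an illustration of the Monte Carlo VaR mentioned in the introduction, the sketch below wraps a simulation in a function. It is a minimal example under strong assumptions, not the notebook's method: returns are taken to be normal, and simulate_var, the portfolio value of 1000, the 1% probability and the values of mu and sigma are all hypothetical choices.

```r
# Hypothetical function: one-day Monte Carlo VaR under a normal assumption
simulate_var = function(mu, sigma, p = 0.01, value = 1000, n_sim = 100000){
  sim_returns = rnorm(n_sim, mean = mu, sd = sigma)  # simulated one-day returns
  sim_pl = value * sim_returns                       # simulated profit and loss
  -quantile(sim_pl, p, names = FALSE)                # VaR reported as a positive number
}

set.seed(99)  # arbitrary seed
mc_var = simulate_var(mu = 0, sigma = 0.02)

# Analytical VaR under the same assumptions, for comparison
analytical_var = -1000 * qnorm(0.01, mean = 0, sd = 0.02)

mc_var
analytical_var
```

With normal returns the analytical answer is available, so the simulation is not needed here; the comparison only confirms that the simulated VaR converges to the known value. Simulation becomes useful when no such formula exists.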

5.9 Recap

5.9.1 In this seminar, we have covered:

Drawing random numbers from specified distributions
Using seeds for replicability
Comparing distributions visually using random samples
Small and large sample properties of random numbers
Creating custom functions in R
Discrete sampling techniques for financial applications
Function development for financial analysis

5.9.2 Some new functions used:

rnorm() — generate random numbers from normal distribution
rt() — generate random numbers from Student-t distribution
set.seed() — set random seed for reproducibility
sample() — random sampling from a vector with optional probabilities
hist() — create histograms
barplot() — create bar charts
table() — create frequency tables
if() — conditional statements
%in% — check if element is in vector
stop() — stop execution with error message
sum() — calculate sum of values
points() — add points to existing plot
sd() — calculate standard deviation
sqrt() — calculate square root

5.10 Optional exercises

Credit risk analysis:
- Modify the credit rating simulation to use different probability distributions
- Create a horizontal barplot showing rating percentages
- Calculate the percentage of bonds in each rating category

Advanced sampling techniques:
- Implement stratified sampling for credit ratings by industry sector
- Compare simple random sampling vs stratified sampling results
- Create visualizations showing the difference between methods

Bootstrap simulation:
- Create a function that bootstrap samples from historical returns
- Use bootstrap to estimate confidence intervals for volatility
- Compare bootstrap results with theoretical confidence intervals

Distribution comparison exercise:
- Generate 10,000 random returns using a normal distribution with mean and sd from actual stock returns
- Generate 10,000 random returns using Student-t distribution with df=3
- Create side-by-side QQ plots comparing each to the actual returns
- Which distribution better captures the tail behaviour of real stock returns?