2  The programming language we use here is R

We do not assume you have any knowledge of programming, but if you want to make use of these notes, we expect you are willing to learn a programming language.

2.1 Language choices

2.1.1 Don’t use Excel

While the obvious programming language for forecasting risk might be Excel, it is generally useless for what we are doing here. You might use VBA, but that would be very cumbersome in the types of applications you see in this book.

It is bad practice to process data with Excel, however tempting it may be. The data we get from data vendors is quite often in a format different from what we need for work, so it has to be processed. While Excel might seem a sensible choice for this, it is a bad idea for several reasons:

  1. Every time you update your data, you have to repeat the Excel manipulation;
  2. Manipulating data in Excel is not transparent: you might not remember what you did a few months back, and a collaborator might never know what you did in Excel;
  3. By contrast, data manipulation in R is transparent and repeatable. You know exactly how data is transformed, and you can repeat the analysis every time you update your data.
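
To make the contrast concrete, here is a minimal sketch of what a scripted, repeatable transformation might look like in R. The data frame `raw`, its column names, and the function `process_prices` are hypothetical and made up for illustration; in practice the raw data would come from a vendor file, read with something like `read.csv`.

```r
# Hypothetical raw vendor data: prices in "long" format, dates stored as text.
raw <- data.frame(
  date   = c("2020-01-02", "2020-01-03", "2020-01-06",
             "2020-01-02", "2020-01-03", "2020-01-06"),
  ticker = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  price  = c(100, 101, 99, 50, 51, 52),
  stringsAsFactors = FALSE
)

# Every transformation lives in one function, so it can be rerun unchanged
# whenever the raw data is updated, and a collaborator can read what was done.
process_prices <- function(raw) {
  raw$date <- as.Date(raw$date)
  raw <- raw[order(raw$ticker, raw$date), ]
  # Log returns within each ticker; the first observation has no return.
  raw$ret <- ave(raw$price, raw$ticker,
                 FUN = function(p) c(NA, diff(log(p))))
  raw[!is.na(raw$ret), c("date", "ticker", "ret")]
}

returns <- process_prices(raw)
print(returns)
```

When the data is updated, rerunning the script reproduces every step, and the script itself documents what was done.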

2.1.2 Options

While a large number of programming languages are available, ranging from general-purpose languages such as C, C++ and Rust to mathematics languages such as Fortran, these are designed for other purposes and are not recommended here unless there is a special need for them, in which case you will know you need them.

There are four main software choices, all of which would be very useful for risk forecasting. We show sample code in all four of those here.

  1. Matlab;
  2. Python (NumPy);
  3. Julia;
  4. R.

We recently compared them to some commonly used alternatives in Choosing a numerical programming language for economic research: Julia, MATLAB, Python or R.

2.1.3 Our choice: R

We opt for R, a widely used open-source language that is especially good for statistical analysis of the type we do here. The reason is that, at the time of writing, it has better statistical libraries than the other three languages, the best user interface, and a large number of resources available for learning it.

2.2 Problems with R

R, like any other language, has problems. It is 40 years old and comes with a huge number of design decisions that might have made sense decades ago but are bizarre, or worse, today. Patrick Burns, in The R Inferno, does an excellent job of exposing those problems.
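
As one small illustration of the kind of surprise meant here (our own choice of example, not necessarily one that Burns highlights), consider looping over a vector that happens to be empty. The sketch below assumes nothing beyond base R; the variable names are made up for illustration.

```r
x <- numeric(0)   # an empty vector, e.g. the result of a filter that matched nothing

# 1:length(x) is 1:0, which counts down and gives c(1, 0) rather than an
# empty sequence, so this loop runs even though there is nothing to loop over.
naive_iterations <- 0
for (i in 1:length(x)) naive_iterations <- naive_iterations + 1

# seq_along(x) returns an empty sequence, so the loop body never runs.
safe_iterations <- 0
for (i in seq_along(x)) safe_iterations <- safe_iterations + 1

print(c(naive = naive_iterations, safe = safe_iterations))
```

The first loop runs twice and the second not at all, which is why `seq_along` is usually the safer way to write such loops.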

Matlab and Python are not much better in this regard. Matlab is the same age as R, and Python with NumPy is almost as old; both come with a lot of unfortunate design decisions.

One might then say, “why not use Julia, a modern and much better designed language?” Julia might be a contender in the future, but not yet. R has a richer ecosystem, with better libraries, documentation and development environments.