Thursday, July 23, 2015

GARCH model estimation, Backtesting the risk model and Forecasting

In this blog I will show you the GARCH model estimation, Backtesting the risk model and Forecasting. Here is my Video:

STEP-1: Installing required packages and loading them as library in R

We will need following packages so if you have not installed them use the following command.
install.packages(c("tseries", "zoo", "forecast", "FinTS", "rugarch"))

Now, load these packages in your R.

STEP-2: Downloading the Data

Let’s download the daily adjusted close data of Dow Jones Industrial Average (^DJI) from first of January 2010 till 23rd of July 2015 (today is 24th July) using Yahoo then return that to zoo class object named = get.hist.quote(instrument="^DJI", start="2010-01-01",  end="2015-07-23", quote="AdjClose", provider="yahoo", compression="d", retclass="zoo")

You can use following command to plot that object.
plot(, main = "DJIE Closing Prices on NASDAQ", ylab = "Price (USD)", xlab = "Date")


STEP-3: Getting the Returns of the Data

Let’s take the log return of and name the object ans DJI.ret and plot the data. You can see in the Fig-2 that high volatile days are followed by high volatile days and low volatile days by low volatile days.

DJI.ret <- diff(log( * 100
plot(DJI.ret, main = "Monthly Compound Returns", xlab = "Date", ylab = "Return in percent")


STEP-4: Finding Best Mean Model Using ARIMA

Now, let’s try to fit a best ARIMA (p,d,q) model based upon the Bayesian Information Criterions. Use the following command. Note that returns are always reverting to some mean returns hence you should expect the returns to be I(0). Hence, the ARIMA is actual only an ARMA (p,q) process. Get the intuition on stationary (here).

fit1 <- auto.arima(DJI.ret, trace=TRUE, test="kpss",  ic="bic")

The above command gave us an ARIMA(1,0,1) or ARMA(1,1) model. Usually, the high frequency data fails the basic econometric assumption of normality, non autocorrelation and heteroskedasticity.

STEP-5: Test for ARCH Effect

However to proceed ahead with GARCH model, the residual of ARIMA must need to have the ARCH effect. To test the ARCH, lets perform the Ljung-Box test on the first 12 lags of the squared residuals of the best ARIMA model under the null hypothesis of no ARCH effects.

Box.test(fit1$residuals^2,lag=12, type="Ljung-Box")

If p-value of Ljung-Box test is smaller than 5% level of significance then there exist the ARCH effect which shows the green light to proceed ahead to GARCH.

STEP-6: Developing a GARCH Model

Instead to fit a GARCH(P,Q), in this blog, I will only fit the GARCH(1,1) for the sake of simplicity. For GARCH theory, check my blog (here).  Lets specify object called res_garch11_spec in which we want to develop a GARCH(1,1) on ARIMA(1,0,1) or in our case ARMA(1,1).

res_garch11_spec <- ugarchspec(variance.model = list(garchOrder = c(1, 1)), mean.model = list(armaOrder = c(1, 1)))

Then, lets fit such specification in the data.

res_garch11_fit <- ugarchfit(spec = res_garch11_spec, data = DJI.ret)

STEP-7: Backtesting the risk model

Once we have fitted the GARCH which modeled the risk. We would like to check the model performance by performing historical backtest. For that, we can compare the estimated VaR (value at risk) with the actual return over the period. If the return is more negative than the VaR, we have a VaR exceedance. In our case, a VaR exceedance should only occur in 1% of the cases  if we specified a 99% confidence level. VaR at 1% shows what 1% probability of my extreme loss.

For that lets implement a rolling window (refit.window = "moving") with start period of the backtest of 120 i.e n.start = 120, reestimated each day (refit.every = 1) implementing hybrid solver (solver = "hybrid") and calculate the VaR (calculate.VaR = TRUE) at the 99% VaR tail level (VaR.alpha = 0.01). And finally lets keep the coefficients (keep.coef = TRUE). There can be chances of model not to be converged so lets implement the ctrl = list(tol = 1e-7, delta = 1e-9) command prior to estimate by a rolling window.

ctrl = list(tol = 1e-7, delta = 1e-9)
res_garch11_roll <- ugarchroll(res_garch11_spec, DJI.ret, n.start = 120, refit.every = 1, refit.window = "moving", solver = "hybrid", calculate.VaR = TRUE, VaR.alpha = 0.01, keep.coef = TRUE, solver.control = ctrl, fit.control = list(scale = 1))

STEP-8: Kupiec's Unconditional Coverage and the Christoffersen Test

We can examine the backtesting report using the report function. By specifying the type argument as VaR, the function executes the unconditional and conditional coverage tests for exceedances. VaR.alpha is the tail probability and conf.level is the confidence level on which the conditional coverage hypothesis test will be based.

report(res_garch11_roll, type = "VaR", VaR.alpha = 0.01, conf.level = 0.99)

To compare the number of expected versus actual exceedances given the tail probability of VaR, we can see the Kupiec's unconditional coverage, while the Christoffersen test tests a joint test of the unconditional coverage and the independence of the exceedances. If the actual exceedances are greater than the expected exceedances then we can't reject the null hypothesis that the exceedances are correct and independent.

STEP-9: Plotting the VaR Limits

A plot of the backtesting performance can also be generated easily using following command and selecting 2.

selection: 2


Here, in the figure the returns of the data (blue) hits the 1% Var (Red) 47 times compare to 13 times expected. Hence we can state the rejection of the null hypothesis that the exceedances are correct and independent. The p-values of Kupiec's unconditional coverage is also less than 5% suggests same. Hence model shouldn't be used in real life. So forecasting is very hazardious however we will forecast for the sake of learning. To mitigate the problem, probably, we should find best GARCH(P,Q) model rather than GARCH(1,1) or may be we need to model other extensions of GARCH.

STEP-10: Forecasting Risk and VaR

For forecasting we can implement following command. Here we just forecast for 12 time period ahead.

res_garch11_fcst <- ugarchforecast(res_garch11_fit, n.ahead = 12)

The forecast values for 12 period are given below.

 Series  Sigma
T+1  0.09602 0.7462
T+2  0.06433 0.7537
T+3  0.06650 0.7609
T+4  0.06635 0.7677
T+5  0.06636 0.7742
T+6  0.06636 0.7804
T+7  0.06636 0.7863
T+8  0.06636 0.7919
T+9  0.06636 0.7973
T+10 0.06636 0.8025
T+11 0.06636 0.8074
T+12 0.06636 0.8121

The forecasts for the volatility are given by sigma and 1% VaR is given just by the side for each period. For the period T+3, the expected volatility is 0.0665 and the 1% VaR is 0.7609.

I used and modified the R codes of book "Introduction to R for Quantitative Finance" which is combinely written by Gergely Daróczi, Michael Puhle, Edina Berlinger, Péter Csóka, Dániel Havran, Márton Michaletzky, Zsolt Tulassay, Kata Váradi and Agnes Vidovics-Dancs

Wednesday, July 22, 2015

Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) Process

High frequency time series generally exhibits the volatility. The Volatility in general means the risk in econometrics and it’s synonymous to standard deviation of data.  To model the volatility of any high frequency time series data Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) Process is modeled. Here is a figure of such volatility.

To run a GARCH first develop a best ARIMA process. For all the tutorial blogs on ARIMA see (here). In ARIMA, we first make data stationary (find the intuition of stationary (here)), then we find an AR (to justify the grounds of habit persistence of any time series) and MA process (to justify the grounds of habit persistence of any errors left after the AR process). 

An ARIMA (p,q,d) process is given as: φp(B)Δdyt = θq(B)εt   (see my previous post on ARIMA (here)). Say a series is ${{Y}_{t}}$ and let it is either transform or differenced appropriately to be stationary and such stationary series of ${{Y}_{t}}$ is ${{R}_{t}}$. The ARMA $(p,q)$  process of ${{R}_{t}}$ can be alternatively written as:
\[{{R}_{t}}={{a}_{0}}+\sum\limits_{i=1}^{p}{{{R}_{t-i}}+}\sum\limits_{j=1}^{q}{{{\varepsilon }_{t-j}}+}{{\varepsilon }_{t}}\]
Now once we fit the best ARIMA (What is best see (here, here, here)) we are again left with residuals of ARIMA i.e. ${{\varepsilon }_{t}}$ . Unconditionally, the error term (${{\varepsilon }_{t}}$) is a zero mean white noise process. The conditional distribution of ${{\varepsilon }_{t}}$ is normal, $N(0,{{h}_{t}})$. The residual should be free from non-normality, autocorrelation and heteroskedasticity. The residual mostly becomes well behaved if we find the best ARIMA. 

However, for the high frequency data, the Autoregressive Conditional Heteroskedasticity (ARCH) effect persists because the high volatility are followed by high volatility times and low with low times. Hence second step is to check whether such ARCH effect persist in residual of ARIMA or not. If ARCH persist then we move ahead to GARCH process. A GARCH $(P,Q)$ can be modeled with the $N(0,{{h}_{t}})$ white noised error term (${{\varepsilon }_{t}}$).
\[{{h}_{t}}={{a}_{1}}+\sum\limits_{m=1}^{P}{{{\alpha }_{m}}\varepsilon _{t-m}^{2}}+\sum\limits_{n=1}^{Q}{{{\beta }_{n}}{{\varepsilon }_{t-n}}}\]
where ${{a}_{1}}>0$, ${{\alpha }_{m}},{{\beta }_{n}}\ge 0$ for all $m$ and $n$ and $\sum\limits_{m=1}^{P}{{{\alpha }_{m}}}+\sum\limits_{n=1}^{Q}{{{\beta }_{n}}<1}$.

Here is the my tutorial video of GARCH.

Here is quick workflow diagram for GARCH process.

Saturday, July 18, 2015

The CGE Simulations with Labor Market Shock

We have an hypothetical economy in which there are two industry which produces two commodities. These industries employ the labor and capital as factor of production. Household provides these factors of production to the firms. In return firm pays the payment to Household. The household consumes all their income in this hypothetical economy. The household also demands and consume the commodities produced by firm. The utility of Household is Cobb Douglas and the production function of Firms is also modeled in Cobb Douglas function.

These relationships in values are given in following IO table.

We develop the Header Array file based on this IO table. See the previous blog (here). The mathematical relationships are discussed and derived (here). The programming of such mathematical relation is modeled (here).

Now, in this blog I want to show you the simulation on this hypothetical economy. I will specifically discuss on what will happen to this hypothetical economy when there is labor market shock? Or Say the labor is expected to grow 10% or reduced by 5% then what happen to total income of household? What happens to the prices of factors of production? What can happen in the industry, which industry will specialized (or grow) and which industry will fade out? To find the detail see this tutorial video.

Saturday, July 4, 2015

Programming in GEMPACK

Well, I took some time to write this blog. Anyways, Let's start to find the logic of GEMPACK programing. To understand this blog you should understand all my previous blogs on CGE. Therefore, this blog can be a bit rigorous.

I have explained about these codes and their logic in the video below.

I suggest you to watch this video in full scree and in HD. I also request you to write all these codes by your own. But for those who are lazy here is the file (here).

In the programming we define
1. sets and subsets
2. loading input output table or HAR file
3. define the variables
4. coefficients
5. read the data from the HAR file
6. define the formula
7. define the equations

Points to be noted:
1. save the file you have to use the quotation. Say if you want to save the file as you have to do: "".

2. To check if your codes have syntax error, use the Tablo Check options

3. Make sure you end your codes with semicolon

In the next blog I will develop a command file to impose the shock in this hypothetical economy.
See you.