Vector Autoregression (VAR) - An Outline

Estimation and Analysis of VAR Models


 

Disclaimer: Data, charts and commentary displayed are for information purposes only; no obligation or guarantee is assumed.

 


The VAR methodology

Vector Autoregressive (VAR) models are an important tool for analysing macroeconomic and financial data, and are specifically useful for studying the dependence and dynamics of variables. VAR models explain a group of endogenous (to be explained) variables based solely on their common data history. Using a VAR permits building a time series model of a group of variables without having to specify a theoretical economic model for the relationships among them.

The simplest VAR model, with one lag of each endogenous variable, is the VAR(1) model

\[y_{1,t} = a_{11} \cdot y_{1,t-1} + a_{12} \cdot y_{2,t-1} + \epsilon_{1,t}\] \[y_{2,t} = a_{21} \cdot y_{1,t-1} + a_{22} \cdot y_{2,t-1} + \epsilon_{2,t}\]

VAR(1) model in matrix notation

\[\begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{pmatrix} \epsilon_{1,t} \\\epsilon_{2,t}\end{pmatrix}\]

A VAR model of order p, VAR(p), contains for each variable in the vector on the left-hand side of the equation system p lagged variables on the right-hand side. Coefficients with magnitude close to 1 are noteworthy, as they indicate persistence of shocks or even non-stationarity. The VAR(p) model in compact matrix notation is

\[Y_{t} = \boldsymbol{A}_{1} \cdot Y_{t-1} + \boldsymbol{A}_{2} \cdot Y_{t-2} + ... + \boldsymbol{A}_{p} \cdot Y_{t-p} + \epsilon_{t}\]

It is important to note that the VAR methodology does not explain causal dependencies between economic variables, but rather provides insights into their dynamic relationships. The latter is conceptualised by means of Granger causality, e.g. \(y_{2}\) is not Granger causal for \(y_{1}\) if the lagged value \(y_{2, t-1}\) does not affect the current value \(y_{1, t}\). Consequently, if the hypothesis \(a_{12} = 0\) is rejected, \(y_{2}\) is Granger causal for \(y_{1}\).
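As an illustration, a Granger causality hypothesis of this kind can be tested directly on a fitted VAR. The following is a minimal sketch in Python using statsmodels; the simulated series and the column names y1 and y2 are illustrative placeholders, not the data analysed later in this note.

    # Minimal sketch: Granger causality in a bivariate VAR(1) (illustrative data)
    import numpy as np
    import pandas as pd
    from statsmodels.tsa.api import VAR

    rng = np.random.default_rng(42)
    n = 200
    A = np.array([[0.6, 0.3],    # a12 != 0: y2 should Granger-cause y1
                  [0.0, 0.5]])   # a21  = 0: y1 should not Granger-cause y2
    y = np.zeros((n, 2))
    for t in range(1, n):
        y[t] = A @ y[t - 1] + rng.normal(size=2)

    data = pd.DataFrame(y, columns=["y1", "y2"])
    res = VAR(data).fit(1)

    # H0: the lags of y2 do not enter the y1 equation (a12 = 0)
    test = res.test_causality(caused="y1", causing="y2", kind="f")
    print(test.summary())   # a small p-value rejects H0: y2 is Granger causal for y1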


Stationarity
  • The estimation of VAR model parameters requires stationarity of the variables, i.e. for a VAR model with endogenous variables only (no exogenous variables included) the left-hand side variable vector should be covariance-stationary in all variables.
  • Precisely, a stationary VAR model generates stationary time series for each of the endogenous variables, with time-invariant mean, variance and covariance. A simple approach to verify stationarity is to visually inspect the fluctuations of each endogenous variable around its mean.
  • The stationarity condition of a VAR(p) model with K endogenous variables can be checked by evaluating the characteristic polynomial. Specifically, the VAR(p) is stable if the roots of the characteristic polynomial lie outside the unit circle. \[det (\boldsymbol{I}_{K} - \boldsymbol{A}_{1} \cdot z - ... - \boldsymbol{A}_{p} \cdot z^{p} ) = 0\]
  • According to Pfaff, the stationarity of a VAR(p) model is verified in practice by calculating the eigenvalues of the coefficient matrix A associated with the stacked VAR(1) representation of the VAR(p) model. If the moduli of all eigenvalues are less than one, the VAR(p) process is stable; eigenvalues with modulus greater than or equal to one indicate non-stationarity (a code sketch of this check follows the list below).

\[V_{t} = \boldsymbol{A} \cdot V_{t-1} \] \[\boldsymbol{A} = \begin{bmatrix} A_{1} & A_{2} & ... & A_{p-1} & A_{p} \\ I & 0 & ... & 0 & 0 \\ 0 & I & ... & 0 & 0 \\ ... & ... & ... & ... & ... \\ 0 & 0 & ... & I & 0 \end{bmatrix} \]

  • Another test used to determine stationarity is the Augmented Dickey-Fuller (ADF) test.
  • Should a time series be non-stationary, as evidenced by a unit root, differencing the data transforms the non-stationary series into a stationary one. However, in the case of unit roots and cointegration an Error Correction Model needs to be applied.
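The following is a minimal sketch of both practical checks in Python (numpy for the companion-matrix eigenvalues, statsmodels for the ADF test); the coefficient matrices A1 and A2 and the simulated series are illustrative placeholders.

    # Stationarity checks for a VAR(2): companion-matrix eigenvalues and ADF test
    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    A1 = np.array([[0.5, 0.2], [-0.2, -0.5]])    # illustrative coefficient matrices
    A2 = np.array([[-0.3, -0.7], [-0.1, 0.3]])
    k, p = 2, 2

    # Stacked VAR(1) ("companion") matrix of the VAR(2)
    companion = np.zeros((k * p, k * p))
    companion[:k, :k] = A1
    companion[:k, k:] = A2
    companion[k:, :k] = np.eye(k)

    moduli = np.abs(np.linalg.eigvals(companion))
    print("eigenvalue moduli:", np.round(moduli, 3))
    print("stationary VAR:", bool(np.all(moduli < 1)))

    # Univariate ADF test on a simulated random walk (unit root expected)
    rng = np.random.default_rng(0)
    series = np.cumsum(rng.normal(size=200))
    adf_stat, pvalue = adfuller(series)[:2]
    print(f"ADF statistic = {adf_stat:.3f}, p-value = {pvalue:.3f}")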

Estimation of VAR Models

The estimation of VAR models involves several aspects that require consideration.

  • The lag length p of the VAR(p) model needs to be determined. An information criterion such as Akaike's (AIC) or the Schwarz-Bayesian (BIC) should be applied to select an appropriate lag length (a code sketch follows this list).
  • To balance the number of endogenous variables included in the model against the history of data available (note that the number of coefficients, and hence the amount of data required for estimation, grows with the square of the number of variables).
  • To transform raw variables, e.g. taking logs to stabilise variances and using rates or differences to ensure stationarity.
  • To include deterministic terms (e.g. a constant) unless the time series have already been demeaned.
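A minimal sketch of lag-order selection with statsmodels follows; the simulated bivariate series is an illustrative placeholder for the data to be modelled.

    # Lag-order selection for a VAR via information criteria (illustrative data)
    import numpy as np
    from statsmodels.tsa.api import VAR

    rng = np.random.default_rng(1)
    A = np.array([[0.7, 0.2], [0.2, 0.7]])
    y = np.zeros((200, 2))
    for t in range(1, 200):
        y[t] = A @ y[t - 1] + rng.normal(size=2)

    sel = VAR(y).select_order(maxlags=8)
    print(sel.summary())            # AIC, BIC, FPE and HQIC for lag orders 0..8
    print(sel.selected_orders)      # lag order preferred by each criterion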

Error terms and diagnostic testing
  • Due to the Least Squares estimation method, there are requirements on the error terms. Specifically, the expectation of each error term must be zero, \(E[\epsilon_{i}] = 0\) (this requirement is satisfied if the time series have a mean of zero; the equations must be expanded by a constant should the empirical mean differ from zero).
  • Furthermore, autocorrelation (= serial correlation) of the error terms is not permitted, \(E[\epsilon_{i,t_{1}} \cdot \epsilon_{i,t_{2}}] = 0\) for \(t_{1} \neq t_{2}\) (autocorrelation of error terms typically leads to biased estimators).
  • The error terms \(\epsilon_{i}\) may, however, be contemporaneously correlated, with the dependency structure represented by a covariance matrix \(\Omega\). Typically, error terms are correlated when exogenous shocks impact several variables simultaneously. This occurs in situations where variables are not fully explained by their own past.

Impulse-Response Function

To evaluate the response of the VAR system to shocks in any of the variables, the Impulse-Response Function (IRF) approach is used. The IRF approach is helpful since the coefficients of the VAR model by themselves tell little about the dynamics. Specifically, a variable's residual receives a unit shock at an initial time, and the dynamics of the shocked VAR are compared to the dynamics of the VAR without any shock. This is best illustrated with a VAR(1) model
\[y_{t} = \boldsymbol{A} \cdot y_{t-1} + \epsilon_{t}\] Using this setting, the impact of shocking one variable can be traced: at time \(t = 1\) variable \(y_{1}\) is shocked by \(\epsilon_{1} = 1\) while variable \(y_{2}\) is not shocked, i.e. \(\epsilon_{2} = 0\).
At time \(t = 2\) the impact on variable \(y_{1}\) is \(y_{1,2} = a_{11} \cdot 1\) and the impact on variable \(y_{2}\) is \(y_{2,2} = a_{21} \cdot 1\). The approach is repeated for subsequent time periods.
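A minimal numpy sketch of this propagation is shown below; the coefficient matrix A is an illustrative placeholder (a fitted statsmodels VAR also provides an irf() method for the same purpose).

    # Impulse responses of a VAR(1) to a unit shock in y1 (illustrative coefficients)
    import numpy as np

    A = np.array([[0.7, 0.2],
                  [0.2, 0.7]])
    response = np.array([1.0, 0.0])       # unit shock to y1 at the initial time, none to y2

    for h in range(6):                    # horizons 0, 1, ..., 5
        print(f"h={h}: response of (y1, y2) = {np.round(response, 3)}")
        response = A @ response           # next-period impact of the propagated shock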


Structural VAR (SVAR)

Contemporaneous variables that appear on the right-hand side (RHS) of the equation system indicate contemporaneous feedback terms. This is illustrated with two endogenous variables, where each variable impacts the other contemporaneously.

\[y_{1,t} = a_{11} \cdot y_{1,t-1} + a_{12} \cdot y_{2,t-1} + b_{12} \cdot y_{2,t} + \epsilon_{1,t}\] \[y_{2,t} = a_{21} \cdot y_{1,t-1} + a_{22} \cdot y_{2,t-1} + b_{21} \cdot y_{1,t} + \epsilon_{2,t}\]

The contemporaneous terms can be moved to the left-hand side, and the standard VAR form is obtained after multiplying by the inverse coefficient matrix.

\[\begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{bmatrix} 0 & b_{12} \\ b_{21} & 0 \end{bmatrix} \begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix} + \begin{pmatrix} \epsilon_{1,t} \\\epsilon_{2,t}\end{pmatrix}\]

To ensure that the model can be estimated, an SVAR(1) model needs restrictions on the coefficients of the contemporaneous terms, e.g. one coefficient has to be set to 0 to ensure a valid definition of the LHS.
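A minimal numpy sketch of the mapping from the structural to the reduced (standard VAR) form is shown below; the matrices A, B and the shock covariance are illustrative placeholders, with b21 restricted to 0.

    # Structural form: y_t = B y_t + A y_{t-1} + e_t
    #   => y_t = (I - B)^{-1} A y_{t-1} + (I - B)^{-1} e_t   (reduced form)
    import numpy as np

    A = np.array([[0.7, 0.2],
                  [0.2, 0.7]])
    B = np.array([[0.0, 0.4],     # b12: contemporaneous effect of y2 on y1
                  [0.0, 0.0]])    # b21 = 0: identification restriction
    Sigma_e = np.eye(2)           # covariance of the structural shocks

    inv = np.linalg.inv(np.eye(2) - B)
    A_reduced = inv @ A                   # reduced-form coefficient matrix
    Sigma_u = inv @ Sigma_e @ inv.T       # covariance of the reduced-form errors

    print("reduced-form A:\n", np.round(A_reduced, 3))
    print("reduced-form error covariance:\n", np.round(Sigma_u, 3))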


VAR with exogeneous variables (VARX)

Extensions of the VAR include deterministic terms (constants, trends) and exogenous variables; a VAR with exogenous regressors is denoted VARX. A system of VAR equations that contains a vector of exogenous (= not explained) variables X is

\[y_{t} = \textbf{A} \cdot y_{t-1} + \boldsymbol{B} \cdot X_{t} + \epsilon_{t}\]
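In statsmodels, the VAR class accepts an exog argument for such exogenous regressors (in reasonably recent versions); the sketch below uses simulated placeholder series for y and X.

    # VAR with an exogenous regressor (VARX), sketched with statsmodels (placeholder data)
    import numpy as np
    from statsmodels.tsa.api import VAR

    rng = np.random.default_rng(2)
    T = 200
    x = rng.normal(size=(T, 1))                      # exogenous variable X
    A = np.array([[0.5, 0.1], [0.1, 0.5]])
    b = np.array([0.8, -0.3])                        # loading of X on each equation
    y = np.zeros((T, 2))
    for t in range(1, T):
        y[t] = A @ y[t - 1] + b * x[t, 0] + rng.normal(size=2)

    res = VAR(y, exog=x).fit(1)                      # exogenous terms enter every equation
    print(res.summary())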

Vector Error Correction Models (VECM)

If the variables that make up the left-hand side vector y (and any exogenous variables) are non-stationary, the VAR estimation approach is no longer valid. An estimation is nonetheless possible using the Vector Error Correction Model (VECM), provided the endogenous variables in y are difference-stationary. Simply using the first differences of I(1) integrated variables in a VAR model may seem a plausible approach, but it would disregard the long-run relationship of the variables, i.e. the long-run responses of variables to shocks in each variable. Therefore, a VECM has to be used, adding a lagged error-correction term to the VAR model to capture long-run tendencies. An example of a long-run relationship may be found in the dividend/price ratio of equity shares, where the ratio is low and the price is high during bubbles, with reversion towards a long-run relationship. Another example often cited in the literature is Purchasing Power Parity (PPP), which states a convergence towards the law of one price when comparing the prices of goods and services expressed in foreign and domestic currency.
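A minimal sketch of a VECM fit on two cointegrated I(1) series using statsmodels follows; the data-generating process and the choices k_ar_diff=1, coint_rank=1 and deterministic="ci" are illustrative assumptions rather than a recommendation.

    # Sketch: Johansen test and VECM fit on two cointegrated random walks (illustrative data)
    import numpy as np
    from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen

    rng = np.random.default_rng(3)
    T = 300
    common = np.cumsum(rng.normal(size=T))           # shared stochastic trend
    y1 = common + rng.normal(scale=0.5, size=T)
    y2 = 0.8 * common + rng.normal(scale=0.5, size=T)
    data = np.column_stack([y1, y2])                 # both series are I(1) and cointegrated

    # Johansen test for the cointegration rank
    jres = coint_johansen(data, det_order=0, k_ar_diff=1)
    print("trace statistics:", np.round(jres.lr1, 2))
    print("5% critical values:", jres.cvt[:, 1])

    # VECM with one lagged difference and cointegration rank 1
    res = VECM(data, k_ar_diff=1, coint_rank=1, deterministic="ci").fit()
    print("alpha (loadings):\n", np.round(res.alpha, 3))
    print("beta (cointegration vector):\n", np.round(res.beta, 3))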


 

Simulation of VAR processes

 

Given the specification of a VAR model, data can be simulated by sampling from the distribution of the error terms.

 

Simulating a VAR(1) Model

  • The simulation of the bivariate VAR(1) process with correlated normal error terms has been executed using 200 data points. The correlation of the error terms is specified by the following covariance matrix (a simulation sketch in code follows the model specification below).

\[\boldsymbol{\Sigma} = \begin{pmatrix} 1.0 & 0.5 \\ 0.5 & 1.0 \end{pmatrix}\]

  • The VAR(1) model variables are stationary, given that the eigenvalues 0.9772002 and 0.1227998 obtained from the coefficient matrix are less than one.

\[\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} -0.7 \\1.3\end{pmatrix} + \begin{bmatrix} 0.7 & 0.2 \\ 0.2 & 0.7 \end{bmatrix} \begin{pmatrix} y_{1t-1} \\ y_{2t-1} \end{pmatrix} + \begin{pmatrix} \epsilon_{1t} \\\epsilon_{2t}\end{pmatrix}\]
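A minimal numpy sketch of this simulation with the parameters stated above is shown below; the random seed is an arbitrary choice.

    # Simulation of the bivariate VAR(1) above with correlated normal errors
    import numpy as np

    rng = np.random.default_rng(123)                 # arbitrary seed
    T = 200
    c = np.array([-0.7, 1.3])
    A = np.array([[0.7, 0.2],
                  [0.2, 0.7]])
    Sigma = np.array([[1.0, 0.5],
                      [0.5, 1.0]])

    eps = rng.multivariate_normal(mean=np.zeros(2), cov=Sigma, size=T)
    y = np.zeros((T, 2))
    for t in range(1, T):
        y[t] = c + A @ y[t - 1] + eps[t]             # y holds the 200 simulated observations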

 

Simulating a VAR(2) Model

  • The simulation of the bivariate VAR(2) process with uncorrelated standard normal error terms has been executed using 200 data points (a simulation sketch in code follows the model specification below).

\[\begin{pmatrix} y_{1,t} \\ y_{2,t} \end{pmatrix} = \begin{pmatrix} 5.0 \\ 10.0 \end{pmatrix} + \begin{bmatrix} 0.5 & 0.2 \\ -0.2 & -0.5 \end{bmatrix} \begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix} + \begin{bmatrix} -0.3 & -0.7 \\ -0.1 & 0.3 \end{bmatrix} \begin{pmatrix} y_{1,t-2} \\ y_{2,t-2} \end{pmatrix} + \begin{pmatrix} \epsilon_{1,t} \\\epsilon_{2,t}\end{pmatrix}\]
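The same approach extends directly to two lags; a minimal numpy sketch with the parameters stated above (seed again arbitrary):

    # Simulation of the bivariate VAR(2) above with uncorrelated standard normal errors
    import numpy as np

    rng = np.random.default_rng(456)                 # arbitrary seed
    T = 200
    c = np.array([5.0, 10.0])
    A1 = np.array([[0.5, 0.2],
                   [-0.2, -0.5]])
    A2 = np.array([[-0.3, -0.7],
                   [-0.1, 0.3]])

    y = np.zeros((T, 2))
    for t in range(2, T):
        y[t] = c + A1 @ y[t - 1] + A2 @ y[t - 2] + rng.standard_normal(2)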

 


 

Estimation of VAR Models

Estimating a VAR(1) Model

  • We estimate a VAR(1) model using the 200 simulated data points of the bivariate VAR(1) specified above.

  • For the selection of the lag order, i.e. the optimal lag length, the Akaike (AIC) and Schwarz (SC) information criteria are compared for varying lag orders (cf. table 1). Comparing the AIC and SC values, we observe the minimum of the SC criterion at a lag length of 1. Note that the minimum of an information criterion represents an optimal balance between minimising the residual variance and penalising the number of parameters.

  • The coefficients of the estimated VAR(1) model are shown together with associated statistics in tables 2 and 3 (a comparable estimation sketch in code follows the tables).

TABLE 1: Information criteria (AIC, HQ, SC, FPE) for various lag orders

Lag       1       2       3       4       5       6       7       8
AIC(n)  -2.082  -2.048  -2.04   -2.016  -1.981  -1.955  -1.951  -1.973
HQ(n)   -2.041  -1.979  -1.95   -1.893  -1.830  -1.777  -1.745  -1.740
SC(n)   -1.980  -1.878  -1.80   -1.711  -1.608  -1.514  -1.442  -1.396
FPE(n)   0.125   0.129   0.13    0.133   0.138   0.142   0.142   0.139
TABLE 2: Coefficient statistics of variable y1

        Estimate  Std. Error  t value  Pr(>|t|)
y1.l1      0.762       0.074    10.33     0.000
y2.l1     -0.338       0.103    -3.27     0.001
TABLE 3: Coefficient statistics of variable y2

        Estimate  Std. Error  t value  Pr(>|t|)
y1.l1     -0.207       0.041    -5.08     0
y2.l1      0.714       0.057    12.49     0
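The output labels above (y1.l1, AIC(n), SC(n)) suggest the tables were produced with the R vars package; a comparable estimation sketch in Python with statsmodels is shown below, where the simulated input simply re-creates data of the kind specified in the VAR(1) simulation above.

    # Fitting a VAR(1) and inspecting coefficient statistics with statsmodels
    import numpy as np
    import pandas as pd
    from statsmodels.tsa.api import VAR

    rng = np.random.default_rng(123)
    T = 200
    c = np.array([-0.7, 1.3])
    A = np.array([[0.7, 0.2], [0.2, 0.7]])
    Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
    eps = rng.multivariate_normal(np.zeros(2), Sigma, size=T)
    y = np.zeros((T, 2))
    for t in range(1, T):
        y[t] = c + A @ y[t - 1] + eps[t]
    data = pd.DataFrame(y, columns=["y1", "y2"])

    model = VAR(data)
    print(model.select_order(maxlags=8).summary())   # information criteria per lag order
    res = model.fit(1)                               # VAR(1), as preferred by SC/BIC
    print(res.summary())                             # estimates, std. errors, t values, p-values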

 

Diagnostics of VAR Models

 

Diagnostic Testing of the VAR(2)

  • We first fit the following VAR(2) model to the 200 data points obtained from simulating the bivariate VAR(2) model specified above.

 

\[Y_{t} = \boldsymbol{C} + \boldsymbol{A}_{1} \cdot Y_{t-1} + \boldsymbol{A}_{2} \cdot Y_{t-2}+ \boldsymbol{\epsilon}_{t}\]

\[\boldsymbol{C} = \begin{pmatrix} 4.449 \\ 10.275 \end{pmatrix}, \boldsymbol{A}_{1} = \begin{bmatrix} 0.496 & 0.206 \\ -0.241 & -0.512 \end{bmatrix}, \boldsymbol{A}_{2} = \begin{bmatrix} -0.272 & -0.642 \\ -0.122 & 0.292 \end{bmatrix}\]

 

  • The residuals of variable y1 are plotted together with the empirical distribution and autocorrelation functions (only y1 is displayed). A basic assumption of the VAR model is that the residuals exhibit no serial autocorrelation. Therefore, the lag order of the estimated model should be sufficient to ensure the absence of autocorrelation.

  • Autocorrelation in the residuals is tested using the asymptotic Portmanteau test, with a test statistic (p-value) of 20.264 (0.682) obtained.

  • Heteroskedasticity in the residuals is tested using the ARCH test, with a test statistic (p-value) of 48.377 (0.338) obtained.

  • Normality of the residuals is tested by applying the Jarque-Bera test to the residuals of each variable. For variable y1 the test statistic (p-value) obtained is 2.497 (0.287); for variable y2 it is 0.47 (0.79). Non-normality of the residuals may be caused by outliers, which could indicate a misspecification of the VAR model.

  • Skewness and kurtosis of the residuals are tested on a multivariate basis, with test statistic values (p-values) of 2.662 (0.264) and 0.581 (0.748), respectively.

  • The stationarity condition is tested based on the eigenvalues of the stacked coefficient matrices (cf. table 5; a code sketch of these diagnostics follows the table).

TABLE 5: Roots (moduli of the eigenvalues of the stacked coefficient matrix)

        1st root  2nd root  3rd root  4th root
value       0.83       0.6     0.521     0.521
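A sketch of comparable residual diagnostics with statsmodels is shown below: the Portmanteau (whiteness) test, the multivariate Jarque-Bera normality test and the stability check via the eigenvalues of the stacked coefficient matrix; the multivariate ARCH test is not reproduced here. The simulated input stands in for the 200 VAR(2) observations above.

    # Residual diagnostics for a fitted VAR(2) with statsmodels (placeholder data)
    import numpy as np
    from statsmodels.tsa.api import VAR

    rng = np.random.default_rng(456)
    T = 200
    c = np.array([5.0, 10.0])
    A1 = np.array([[0.5, 0.2], [-0.2, -0.5]])
    A2 = np.array([[-0.3, -0.7], [-0.1, 0.3]])
    y = np.zeros((T, 2))
    for t in range(2, T):
        y[t] = c + A1 @ y[t - 1] + A2 @ y[t - 2] + rng.standard_normal(2)

    res = VAR(y).fit(2)

    print(res.test_whiteness(nlags=10).summary())    # Portmanteau test for residual autocorrelation
    print(res.test_normality().summary())            # multivariate Jarque-Bera test
    print("stable VAR:", res.is_stable())            # all companion eigenvalues inside the unit circle

    # Moduli of the companion-matrix eigenvalues (comparable to the roots in table 5)
    k, p = res.neqs, res.k_ar
    companion = np.zeros((k * p, k * p))
    companion[:k, :] = np.hstack([res.coefs[i] for i in range(p)])
    companion[k:, :-k] = np.eye(k * (p - 1))
    print("eigenvalue moduli:", np.round(np.abs(np.linalg.eigvals(companion)), 3))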