OLS


The OLS command performs linear regression (ordinary least squares) on an equation, optionally with linear restrictions on the parameters.

 

Note: a constant term (intercept) is added automatically, unless suppressed with <constant = no>.

 

The OLS output contains links that can be clicked to show, for instance, how the equation fits the data, a decomposition with respect to the right-hand side variables, and parameter stability across different estimation periods.

 

 


 

Syntax

 

OLS <period  CONSTANT=... DUMP=... DUMPOPTIONS=...> name  leftside = var1, var2, ...  IMPOSE=... ;

 

period

(Optional). Local period, for instance 2010 2020, 2010q1 2020q4 or %per1 %per2+1.

CONSTANT=

(Optional). With <constant = no>, no constant term is added to the equation.

DUMP=

(Optional). Dumps the results as a FRML equation for use in models. OLS<dump> produces a file called ols.frm, whereas OLS<dump=eqs.frm> uses the filename eqs.frm instead. There is no firm guarantee that a subsequent MODEL statement can load the file, but in most cases it can (FRML statements only support a limited subset of general Gekko expressions). If the equation loads, you may consider a SIM<res> to check its residuals. Gekko puts parentheses around all expressions that contain a + or -, which introduces superfluous parentheses in expressions like * (+ c) or exp(- b). [New in 3.0.6]

DUMPOPTIONS=

(Optional). With OLS<dump=eqs.frm dumpoptions='append'>, the results are appended to an existing eqs.frm file. In the future, these options will be extended with styling, FRML code, etc. [New in 3.0.6]

name

(Optional). A name for the equation, used to name the results. If no name is given, the name ols is used.

leftside

The left-hand side variable (may be an expression).

var1, ...

A list of variable names or expressions. A constant term is added automatically, unless you use option <constant = no>.

IMPOSE

(Optional). You can impose linear restrictions on the parameters, via a suitable matrix. One restriction per row of the matrix, cf. example below.
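
As an illustration of how the options can be combined, here is a minimal sketch using the series from the example further down. The equation name e1 and the filename eqs.frm are only placeholders, and the three statements are independent of each other.

// Named equation: the results are stored as e1_predict, e1_residual, #e1_param, etc.
OLS <2000 2010> e1 dlog(lna1) = dlog(pcp), dlog(pcp.1), bul1, bul1.1;
// The same regression without the automatically added constant term
OLS <2000 2010 constant = no> dlog(lna1) = dlog(pcp), dlog(pcp.1), bul1, bul1.1;
// Append the estimated equation to an existing eqs.frm file
OLS <2000 2010 dump=eqs.frm dumpoptions='append'> dlog(lna1) = dlog(pcp), dlog(pcp.1), bul1, bul1.1;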

 

 

Results:

 

Note that if a name is given, ols is replaced with that name.

 

ols_predict

A timeseries with the predicted values

ols_residual

A timeseries with the residuals

#ols_param

A matrix with estimated parameters

#ols_se

A matrix with standard errors on parameters

#ols_t

A matrix with t-values on parameters

#ols_covar

A matrix with the variance-covariance matrix (of parameters)

#ols_corr

A matrix with the correlation matrix (of parameters)

#ols_stats

A matrix containing different measures (analogous to the .stats matrix in AREMOS):

 

1: Residual sum of squares

2: Standard error

3: Residual mean

4: Root mean square error (RMSE)

5: R squared

6: R bar squared

7: [empty]

8: Dependent variable mean

9: Durbin-Watson with lag 1

 

(At some point, a map will be used for these measures instead of a matrix.)
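
A minimal sketch of picking out single measures after an estimation. It assumes that #ols_stats can be indexed by position in the same way as #ols_param[1] in the Note section below. The scalar names %r2 and %dw are made up for the illustration, and the series are those from the example further down.

OLS <2000 2010> dlog(lna1) = dlog(pcp), dlog(pcp.1), bul1, bul1.1;
// Assumption: element 5 is R squared and element 9 is Durbin-Watson, following the list above
%r2 = #ols_stats[5];
%dw = #ols_stats[9];
TELL 'R2 = {format(%r2, '0.000000')}, DW = {format(%dw, '0.0000')}';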

 

 

 


 

Example

 

This example estimates a linear model with five parameters. You may consult the MATRIX section to see the same parameters calculated with linear algebra, or the R_RUN section to see the same parameters calculated via the R interface.

 

RESET;
CREATE lna1, pcp, bul1;
SERIES <1998 2010> lna1 = data(' 166.223000  173.221000  179.571000  187.343000  194.888000  202.959000 
  209.426000  215.134000  222.716000  230.520000  238.518000  246.654000  254.991000') ;
SERIES <1998 2010> pcp  = data(' 0.9502030   0.9699920   1.0000000   1.0235000   1.0401100   1.0605400   
  1.0754700   1.0977800   1.1121200   1.1314800   1.1513000   1.1717600   1.1871600')  ;
SERIES <1998 2010> bul1 = data(' 0.0684791   0.0591698   0.0560344   0.0535439   0.0535003   0.0631703   
  0.0649875   0.0578112   0.0473207   0.0404508   0.0467488   0.0472923   0.0475191')  ;
OLS <2000 2010> dlog(lna1) = dlog(pcp), dlog(pcp.1), bul1, bul1.1;

 

The commands produce the following screen output:

 

 OLS estimation 2000-2010 (n = 11)
 dlog(lna1)
 -----------------------------------------------------------------
       Variable          Estimate         Std error        T-stat 
 -----------------------------------------------------------------
  dlog(pcp)              0.144517          0.227011          0.64 
  dlog(pcp.1)            0.613875          0.236473          2.60 
  bul1                   0.186740          0.202534          0.92 
  bul1.1                -0.350908          0.203182          1.73 
  CONSTANT              0.0298039         0.0089418          3.33 
 -----------------------------------------------------------------
 R2: 0.625034    SEE: 0.00346154    DW: 1.8651

 

In addition to the screen output, the timeseries ols_predict and ols_residual are produced, together with the matrices #ols_param, #ols_se, #ols_t, #ols_covar, #ols_corr, and #ols_stats. The matrices can be printed out with the PRT command.

 

In the example above, you may, for example, restrict the first two parameters to sum to 0.80, and the third and fourth to be equal like this (cf. the MATRIX command):

 

#r = [1, 1, 0, 0, 0, 0.80; 0, 0, 1, -1, 0, 0];
OLS <2000 2010> dlog(lna1) = dlog(pcp), dlog(pcp.1), bul1, bul1.1 IMPOSE = #r;

 

If the parameters are called b{i}, the first restriction is equivalent to 1*b1 + 1*b2 + 0*b3 + 0*b4 + 0*b5 = 0.80, or b1 + b2 = 0.80. The second restriction is equivalent to 0*b1 + 0*b2 + 1*b3 + (-1)*b4 + 0*b5 = 0, or b3 = b4. So the last column of the #r matrix contains the values that the linear restrictions should sum up to. The restrictions produce the following:

 

 OLS estimation 2000-2010 (n = 11)
 dlog(lna1)
 -----------------------------------------------------------------
       Variable          Estimate         Std error        T-stat 
 -----------------------------------------------------------------
  dlog(pcp)              0.167642          0.180625          0.93 
  dlog(pcp.1)            0.632358          0.180625          3.50 
  bul1                 -0.0863480         0.0794747          1.09 
  bul1.1               -0.0863480         0.0794747          1.09 
  CONSTANT              0.0291952         0.0085164          3.43 
 -----------------------------------------------------------------
 R2: 0.491156    SEE: 0.00349218    DW: 1.6847
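
The printed estimates can be checked against the restrictions directly: 0.167642 + 0.632358 = 0.800000, and the third and fourth estimates are both -0.0863480. The same check can be sketched on the stored results, assuming that #ols_param can be indexed by position as in the Note section below, and that the parameters are stored in the order shown in the table (constant last):

// Should reproduce the right-hand side of the first restriction (0.80)
TELL 'b1 + b2 = {format(#ols_param[1] + #ols_param[2], '0.000000')}';
// Should be zero, since the second restriction imposes b3 = b4
TELL 'b3 - b4 = {format(#ols_param[3] - #ols_param[4], '0.000000')}';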

 

 


 

Note

 

For econometrics, you may consider using R. Gekko also has good interfaces to TSP (with its rock-solid LSQ estimator).

 

The variables do not need to have similar magnitude to obtain precise parameter estimates (pre-scaling is performed internally).

 

Instead of OLS<dump>, some people prefer to compose FRML equations for models by hand, using TELL and PIPE. In this way, the equations can be formatted exactly as the user prefers. To control the formatting of parameters, you may use the built-in format() function, for instance TELL 'FRML y = {format(#ols_param[1], '0.000000')} * x + ({format(#ols_param[2], '0.000000')})';. The last pair of parentheses handles the case where #ols_param[2] is negative. See more on formatting of strings in the TELL section.

 

After an OLS, you may use the Copy button in the main Gekko window to copy/paste the matrix of parameter values and standard errors (with full precision) to Excel or other spreadsheets.

 

 


 

Related commands

 

ANALYZE, MATRIX, MODEL, R_RUN