
Before starting the modeling user guide proper, this section explains some basic concepts that are relevant when modeling in Gekko. You may skip this part if you prefer and move directly to the next section, which has a more practical simulation example. (Conversely, to read more about databanks, timeseries, lists, and related topics, you may consult the Data Management User Guide, which has much more information and many more examples.)

Databanks

Databanks are containers of variables, for instance timeseries. Gekko always starts out with two empty databanks (in memory): Work and Ref (reference). The Work databank is where data is normally changed, unless otherwise stated. For instance, simulations are always performed on Work databank data. The Ref databank can be thought of as a background databank, which is particularly handy when comparing two scenarios. Try pressing the F2 key to see the open databanks (note: if the Ref databank is empty, it does not show up in the F2 list). Since a READ statement wipes out the contents of the Work and Ref databanks, it may be practical to put settings (for instance paths and time periods) into the so-called Global databank. Databanks can contain many variables, so note that the keys [Tab] or [Ctrl+Space] offer autocompletion on timeseries names.

Timeseries

Timeseries reside in databanks. A timeseries may have annual, quarterly, monthly, weekly, daily or undated frequency. If data has been read for timeseries y for the period 2015-2020, printing y for 2021 will show a missing value ('M').
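The missing-value behavior can be pictured with a small toy model. This is plain Python, not Gekko code; the series values and the helper names observation and show are invented for illustration only.

```python
import math

# Hypothetical annual series y with data for 2015-2020 only.
y = {year: 100.0 + (year - 2015) for year in range(2015, 2021)}

def observation(series, year):
    """Return the stored value, or NaN when the period has no data."""
    return series.get(year, math.nan)

def show(value):
    """Format a value for printing, using 'M' for a missing observation."""
    return "M" if math.isnan(value) else f"{value:.1f}"

for year in (2020, 2021):
    print(year, show(observation(y, year)))
```

The year 2020 prints as a number, whereas 2021 lies outside the data period and prints as M.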

When doing modeling (model simulations) with Gekko, timeseries are often called "variables". Model files (.frm files) contain equations that describe relationships between these variables (timeseries), and therefore timeseries are fundamental when doing modeling.

Note that statements involving timeseries can have a local time period indicated, for instance printing with prt <2020 2030> var1, var2;. Global time can be altered with the TIME statement.

Ref databank and comparisons

When a databank is read (the READ statement), the Work databank is cleared, and all the variables from the file are put into the Work databank (it is possible to merge databanks if this behavior is preferred). After this, the Ref databank is also cleared, and all variables are copied from Work to Ref. So after reading a databank file with READ, the Work and Ref databanks are always identical (there are other ways of opening databanks, but for now we focus on READ).

The Ref databank is typically used for multiplier analysis (i.e., experiments). Say you read a databank and then perform some experiment. This experiment will only alter variables in the Work databank, so after the experiment is finished, you can compare the variables (timeseries) in the Work and Ref databanks (Gekko has a lot of statements to do such comparisons, for instance MULPRT, DECOMP etc.).

So-called operators are used for comparisons. For instance, m means absolute multiplier (Work compared with Ref), whereas d means absolute time differences (change from the previous period); q and p are their relative (percentage) versions. You may consult the PRT (print) statement regarding this, but suffice it to say that you may write for instance prt <m> var1;, plot <d> var1;, sheet <q> var1;, etc.
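As a worked example of the arithmetic behind these operators, here is a small plain-Python sketch (not Gekko code). It assumes the usual definitions: m and q compare Work with Ref in the same period, while d and p compare a period with the previous period in Work. The numbers are invented; the PRT help file remains the authoritative description.

```python
# Hypothetical values of var1 in the Work and Ref databanks.
work = {2020: 80.0, 2021: 120.0}
ref = {2020: 80.0, 2021: 96.0}

def m(t):
    """Absolute multiplier: Work minus Ref."""
    return work[t] - ref[t]

def q(t):
    """Relative multiplier: Work relative to Ref, in percent."""
    return (work[t] / ref[t] - 1) * 100

def d(t):
    """Absolute time difference within Work."""
    return work[t] - work[t - 1]

def p(t):
    """Relative time difference (growth) within Work, in percent."""
    return (work[t] / work[t - 1] - 1) * 100

print(m(2021), q(2021), d(2021), p(2021))
```

With these numbers, m is 24, q is 25 percent, d is 40 and p is 50 percent for 2021.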

If, at some point, you wish to make sure that the Work and Ref databanks are identical (for instance after a simulation), you can use the CLONE statement. This statement clears the Ref databank, and copies the Work databank into it (in memory). CLONE is typically used just after simulating (SIM) a baseline/reference scenario.
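The interplay between READ, an experiment and CLONE can be sketched as a toy model in plain Python (not Gekko code). The two dictionaries stand in for the Work and Ref databanks, and the functions read and clone only mimic the behavior described above; the bank contents are invented.

```python
import copy

work, ref = {}, {}

def clone():
    """Mimic CLONE: clear Ref and copy Work into it."""
    ref.clear()
    ref.update(copy.deepcopy(work))

def read(bank_file):
    """Mimic READ: clear Work, load the file into it, then copy Work to Ref."""
    work.clear()
    work.update(copy.deepcopy(bank_file))
    clone()

# Hypothetical databank file with a single series, var1.
read({"var1": {2020: 100.0, 2021: 102.0}})
identical_after_read = (work == ref)

# Stand-in for an experiment: only the Work databank is altered.
work["var1"][2021] = 105.0
gap = work["var1"][2021] - ref["var1"][2021]

# After CLONE the two databanks agree again.
clone()
identical_after_clone = (work == ref)

print(identical_after_read, gap, identical_after_clone)
```

The deep copies matter in the sketch: Ref must hold its own copy of the data, otherwise changes made to Work would show up in Ref as well and no comparison would be possible.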

Cleanup and restart

There is a cleanup statement: RESTART. This statement clears the Work and Ref databanks, in addition to clearing models, lists, options and other things. This provides a clean state of Gekko, as if it had just been closed and reopened. If a file with the name gekko.ini is present in the working folder, the Gekko statements in this file will be run by RESTART, so gekko.ini can be used to contain options and other statements (for instance a MODEL statement) that the user wishes to "survive" a RESTART statement. You may also use CLS ("clear screen") to clear the output window.

In general, when you are doing simulations (in so-called sim-mode) and want to define a new timeseries variable not already present in the model or databank, you will have to CREATE it first (unless the name of the timeseries starts with the letters xx). However, it should be noted that when a databank is read, any model variables not present in the databank will be auto-created as timeseries (with all observations set to missing values). Because of this, it is often most convenient to put MODEL statements before READ statements. Preferably use this order in Gekko program files: first the RESTART statement, then the MODEL statement, and then the READ statement (or, in the gekko.ini file, put the MODEL statement before the READ statement).

Lists, filenames, etc

Regarding models, it should also be noted that the list of endogenous variables in a Gekko model is simply the set of all the variables on the left-hand side of the equations. This may be changed afterwards by means of the ENDO and EXO statements. If you need more information on equation syntax, you may consult the latter part of the MODEL help file.
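The left-hand-side rule can be illustrated with a toy example in plain Python (not Gekko code). The two equations are invented and written in a simplified "lhs = rhs" form; real .frm equation syntax differs, and the ENDO and EXO statements are not modeled here.

```python
import re

# Hypothetical equations in a simplified "lhs = rhs" form.
equations = ["y = c + i + g", "c = 0.8 * y"]

endogenous = set()
mentioned = set()
for eq in equations:
    lhs, rhs = eq.split("=", 1)
    endogenous.add(lhs.strip())
    # Collect every variable name that appears on the right-hand side.
    mentioned.update(re.findall(r"[A-Za-z_]\w*", rhs))

# Variables that are used but never appear on a left-hand side.
exogenous = mentioned - endogenous

print(sorted(endogenous))
print(sorted(exogenous))
```

Here y and c are endogenous because each has its own equation, while i and g are only used on right-hand sides and are therefore exogenous.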

The hash sign (#) is used for collections (lists, maps and matrices), and the percent sign (%) is used for scalar variables (value, date and string). So in general you refer to these with #x or %x, but note that in name composition using string/name scalars, it is advised to use {}-curlies, for instance fK{%type}{%sector}, where %type and %sector could be type and sector names (stored in strings).
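Name composition works much like string interpolation in other languages. The following plain-Python sketch (not Gekko code) shows the idea; the values 'a', 'b', 'x' and 'y' are invented stand-ins for the contents of %type and %sector.

```python
# Stand-ins for the string scalars %type and %sector.
type_name = "a"
sector = "b"

# Counterpart of fK{%type}{%sector}: the pieces are pasted together.
name = f"fK{type_name}{sector}"
print(name)

# Composing names over several hypothetical types and sectors.
names = [f"fK{t}{s}" for t in ("a", "b") for s in ("x", "y")]
print(names)
```

The curlies mark exactly where each inserted piece begins and ends, which keeps the composed name unambiguous when the pieces sit next to other characters.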

Generally, list items are separated by commas, e.g. #mylist = var1, var2, var3; (this stores the three strings 'var1', 'var2' and 'var3' in the list). This is also the case when the list of items contains expressions: prt x/y, w/z;. Lags and fixed dates are put inside brackets, for instance: var1[-1] or var1[2020]. You may use x.1 or x.2 as short-hand for x[-1] or x[-2] and so on. Square brackets are also used for wildcard-lists, so instead of a standard list (#mylist), you may use for instance ['fX*'] to obtain a list (of strings) of all variables in the Work databank beginning with fX.
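Lags, fixed dates and wildcard lists can be pictured with a short plain-Python sketch (not Gekko code). The series values, the variable names in work_names and the helper lag are invented, and the sketch matches names case-sensitively, which may differ from how Gekko treats names.

```python
from fnmatch import fnmatchcase

# Hypothetical annual series var1.
var1 = {2019: 1.5, 2020: 2.0}

def lag(series, period, k=1):
    """Value k periods before `period`, or None if there is no such observation."""
    return series.get(period - k)

# Counterpart of var1[-1] (or var1.1) evaluated in 2020.
lagged = lag(var1, 2020)

# Counterpart of the fixed date var1[2020].
fixed = var1[2020]

# Counterpart of the wildcard list ['fX*'] over hypothetical Work names.
work_names = ["fXa", "fXb", "fY", "var1"]
fx_list = [n for n in work_names if fnmatchcase(n, "fX*")]

print(lagged, fixed, fx_list)
```

A lag is relative to the period being evaluated, whereas a fixed date always picks the same observation regardless of the current period.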

Regarding file names, you may use relative paths like \subfolder\filename.txt. Using relative paths makes it easier to move a system of program files (using sub-folders) to another location/computer if needed. Special user-paths can also be designated by means of the option folder … settings.

Modeling basics
