The official Gekko 3.0 is now released.
This is a long post, but the intent is to try to explain what Gekko 3.0 is really about. That is actually not so easy to boil down: it is perhaps best to think of it as a long-term vision, born out of the realization that Gekko 2.0 (or adaptations thereof) was not really up to the task of realizing that vision. And what is that vision? It can perhaps be summed up as trying to move Gekko into the realm of more general data wrangling and manipulation, while still keeping a firm timeseries-oriented foundation.
Gekko versions up to and including 1.8 were inspired by PCIM, versions 2.0-2.2 were inspired by AREMOS, Gekko 2.4 was inspired by GAMS, and it could perhaps be said that Gekko 3.0 is still inspired by GAMS, but also to a large extent by Python. Does a Gekko 3.0 program look like Python? No. But take for instance a look at lists in Gekko 3.0, including the use of a trailing comma for singleton lists, or the built-in list functions (most of these are borrowed from Python). It should be mentioned that R, Matlab and Julia have also been looked at regarding version 3.0, but the main inspiration probably stems from Python, especially regarding the list and map variable types. Also, the syntax of Python has been an inspiration (the syntax of R is quite peculiar and somehow feels old-fashioned compared to Python).
Similar to the birth of Gekko 2.0, the birth of 3.0 was a prolonged affair. Around the summer of 2017, it was evident that there was some unfortunate fuzziness regarding the Gekko 2.0 syntax. Sometimes Gekko would accept a name (like x), sometimes a bank and name (like b:x), sometimes an expression (like x + y), and sometimes the ‘%’ or ‘#’ type symbols should be omitted. In order to save a few keystrokes or symbols here and there, the syntax logic had become needlessly complicated in Gekko 2.0, and a string scalar like for instance %x could refer to a timeseries in some contexts, or a string in other contexts. So it was quite evident that the syntax needed some sort of cleanup.
It was also clear that it was natural to allow other variable types than timeseries into the databanks, so that scalars, lists, matrices, etc. could be stored together with timeseries. In addition, the so-called array-series (series callable by index, for instance x[‘a’] instead of xa, or x[‘a’, ‘b’] instead of xab) inspired by GAMS syntax needed to be further developed, because in reality the array-series were squeezed into Gekko 2.4 as an afterthought. Since databanks store all kinds of variables in Gekko 3.0, it was natural to also generalize lists, so that these can also store all kinds of variables (including lists), and there is even a databank-like variable type (map) in Gekko 3.0 that stores variables by name (such a map is called a dictionary in Python, or a named list in R).
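As a rough illustration of the container ideas above, here is a minimal Python sketch (Python being the language the map type is compared to). Everything in it is a hypothetical stand-in: the series are plain dicts from year to value, and the names xa, xb, x and m are made up for the example. It is not how Gekko stores anything internally.

```python
# Hypothetical stand-ins only: plain Python containers, not Gekko objects.
# A "series" is modelled here as a dict from year to value.
xa = {2010: 1.0, 2011: 2.0}
xb = {2010: 3.0, 2011: 4.0}

# An array-series: one name, with sub-series looked up by string index,
# so x[("a",)] plays the role of x['a'] instead of a separate series xa.
x = {("a",): xa, ("b",): xb}

# A map-like container that stores variables of different types by name,
# including a list that itself contains a list.
m = {
    "%v": 100.0,                      # a scalar
    "#names": ["xa", "xb"],           # a list of strings
    "#nested": ["xa", ["xb", "xc"]],  # a list containing a list
    "x": x,                           # an array-series
}

value_2011 = m["x"][("a",)][2011]
inner_list = m["#nested"][1]
```

The point is only that one container can hold scalars, lists, nested lists and series side by side, looked up by name.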
As a last but not least point, the way timeseries were handled internally had also grown problematic. In versions prior to 3.0, series statements were handled differently from other assignments. A series assignment like “SERIES y = x1 + x2;” would be run n times in an outer loop, where n was the number of periods in the active time period (for instance y[2010] = x1[2010] + x2[2010], followed by y[2011] = x1[2011] + x2[2011], etc.). In Gekko 3.0, a statement like y = x1 + x2 runs more like vector addition, without any outer time loop (more on this here). This means that expressions containing series variables can flow much more smoothly in and out of functions and procedures, and it is no longer necessary to sustain an artificial distinction between series expressions and non-series expressions (or series expressions versus array-series expressions). In Gekko 3.0, an expression is an expression, and expressions can be used anywhere. So assignments in Gekko 3.0 are in reality of only one kind: assigning a right-hand side expression (of any kind) to a left-hand side variable (of any kind). This does not mean that you can assign, say, a list to a scalar (which will issue a type error), but syntactically, all assignments are equal in Gekko 3.0.
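The difference between the two evaluation styles can be sketched in Python. This is only an illustration under assumptions: the Series class, add_with_outer_loop and total are hypothetical names invented here, and the code says nothing about how Gekko is actually implemented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Series:
    """Hypothetical annual series: values[0] belongs to year `start`."""
    start: int
    values: List[float]

    def __getitem__(self, year: int) -> float:
        return self.values[year - self.start]

    def __add__(self, other: "Series") -> "Series":
        # Whole-series addition over the overlapping years.
        first = max(self.start, other.start)
        last = min(self.start + len(self.values),
                   other.start + len(other.values))
        return Series(first, [self[t] + other[t] for t in range(first, last)])


def add_with_outer_loop(a: Series, b: Series, first: int, last: int) -> Series:
    # Old style: the statement is evaluated once per period.
    y = Series(first, [0.0] * (last - first + 1))
    for t in range(first, last + 1):
        y.values[t - first] = a[t] + b[t]
    return y


def total(a: Series, b: Series) -> Series:
    # New style: the expression is evaluated once, on whole series objects,
    # so it can be passed into and returned from an ordinary function.
    return a + b


x1 = Series(2010, [1.0, 2.0, 3.0])
x2 = Series(2010, [10.0, 20.0, 30.0])
y_loop = add_with_outer_loop(x1, x2, 2010, 2012)
y_vec = total(x1, x2)
```

Both approaches give the same numbers here. What changes is that in the second style the expression a + b is an ordinary value, which is why it can move in and out of functions without any special series machinery.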
So Gekko 3.0 started out as a modest wish of cleaning up the syntax, handling series and array-series more like objects/vectors, redesigning the databank structure, and allowing more flexible use of lists etc. But soon the complications began to creep in. For instance, if lists are no longer just lists of strings, a Gekko 2.0 statement like “LIST m = a, #m1, b;” is no longer straightforward to interpret. How can we know that a and b are supposed to represent strings (they could be series objects), and how can we know that it is the elements of a list #m1 that are to be added, and not the list itself? So the flexibility of the 3.0 lists somehow had to be reflected in the syntax (see more on lists and so-called naked lists here). Likewise, in Gekko 2.0, a statement like “PRT b:#m;” is pretty straightforward: find the list #m, and print the series corresponding to the string names in #m, where all of these series are taken from the b databank. But in Gekko 3.0, a databank may contain lists, so b:#m actually refers to the list #m in the databank b. Therefore, in Gekko 3.0, we need to use {}-curlies to do this: “PRT b:{#m};”. This indicates that it is not the list #m itself that we want to have printed: it is the variable names referred to by the string elements in #m that we are interested in (and these names are to be taken from the b databank). Note incidentally that “PRT {b:#m};” is still different: in that case we will be fetching the #m list from the b databank, and printing the variables corresponding to these string names (see more on {}-curlies here).
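The three readings of b:#m, b:{#m} and {b:#m} can be mimicked with nested Python dicts. This is a sketch under loudly stated assumptions: the bank contents, the helper function names, and in particular the idea that a name without a bank prefix is looked up in a default bank called "work" are all hypothetical, chosen only to make the distinction concrete.

```python
# Hypothetical databanks: each bank maps a name to a variable.
# Lists live in the banks next to the series, as in Gekko 3.0.
banks = {
    "work": {"#m": ["x", "y"], "x": [1.0, 2.0], "y": [3.0, 4.0]},
    "b":    {"#m": ["y"],      "x": [10.0, 20.0], "y": [30.0, 40.0]},
}
DEFAULT_BANK = "work"  # assumption: where a name without bank prefix is found


def the_list_itself(bank, list_name):
    # Like b:#m -> the list object stored in bank b.
    return banks[bank][list_name]


def names_from_default_list(bank, list_name):
    # Like b:{#m} -> take the names from #m (default bank),
    # then fetch the variables with those names from bank b.
    names = banks[DEFAULT_BANK][list_name]
    return {name: banks[bank][name] for name in names}


def names_from_bank_list(bank, list_name):
    # Like {b:#m} -> take the names from the #m stored in bank b,
    # then fetch the variables with those names from the default bank.
    names = banks[bank][list_name]
    return {name: banks[DEFAULT_BANK][name] for name in names}


case_1 = the_list_itself("b", "#m")
case_2 = names_from_default_list("b", "#m")
case_3 = names_from_bank_list("b", "#m")
```

With these made-up banks the three cases return three different things: a list of strings, two series from b, and one series from the default bank.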
Syntax questions also arose regarding the special type symbols (‘%’ and ‘#’, also called sigils, more on those here and here) and where they should be used, and regarding the choice of frequency indicator (‘!’ symbol, more on this here). Worth noting is also that the syntax and logic of wildcards had to be adjusted too, to reflect the fact that variable names in databanks may now contain ‘%’, ‘#’ and ‘!’ symbols. All this also had to comply rather tightly with GAMS syntax, in order to ease the use of array-series and to be able to interface with GAMS in a convenient way.
All in all, these changes had many impacts throughout the Gekko source code. The source code has become quite a lot clearer in Gekko 3.0, but large parts of it had to be rewritten, especially those parts that accepted series, series expressions or wildcards as input. Stumbling blocks on the way were also the lag problem and the dynamic problem (regarding the latter: in Gekko 3.0, an expression like “x = x[-1] + 1;” will not accumulate period by period, unless a <dyn> option is used). But all in all, treating series expressions more like vector algebra is a big advantage that will benefit the further development of Gekko tremendously. Without this, user-defined functions and procedures would never become a smooth experience, and perhaps this is the very reason that neither EViews nor AREMOS implement fully featured user-defined functions?
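The dynamic problem can be made concrete with a small Python sketch. It only illustrates the two possible meanings of a statement like x = x[-1] + 1 and is not Gekko's implementation; the function names assign_vector and assign_dynamic and the sample numbers are hypothetical.

```python
def assign_vector(x, periods):
    # Vector-style: the whole right-hand side is computed first, from the
    # values x had before the assignment, and only then stored.
    rhs = {t: x[t - 1] + 1 for t in periods}
    result = dict(x)
    result.update(rhs)
    return result


def assign_dynamic(x, periods):
    # Period-by-period: each period sees the value just written for the
    # previous period, so the increments accumulate.
    result = dict(x)
    for t in periods:
        result[t] = result[t - 1] + 1
    return result


# A series that is 10 in every year, including the pre-sample year 2009.
x = {2009: 10, 2010: 10, 2011: 10, 2012: 10}
periods = [2010, 2011, 2012]

x_vector = assign_vector(x, periods)
x_dynamic = assign_dynamic(x, periods)
```

In the vector-style version every year in the sample ends up at 11, because each lagged value is still the old 10. In the period-by-period version the result is 11, 12, 13, which is the accumulating behavior that has to be requested explicitly.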
For the casual user, the syntax and functionality changes of Gekko 3.0 are probably not earth-shattering. The user has to remember to always use type symbols ‘%’ and ‘#’ for scalars and collections, and {}-curlies must be used in some cases where they could be omitted in Gekko 2.0. There are other small differences, and the more prevalent {}-curlies will probably take some getting used to. But on the flip side, Gekko 3.0 is much more flexible regarding what can be put into, for instance, a list or a databank, and there are a lot of new functions and capabilities that benefit from the tightened syntax of 3.0. But what is perhaps most important of all: the changes in 3.0 make the use of functions, procedures and larger systems of command files a much more seamless experience, and interfacing with packages such as Matlab, GAMS, R, Python and Julia will be much easier to implement in Gekko 3.0 than would have been the case with the 2.0 series.
To sum it up, to this author, Gekko 3.0 feels like a fully consistent system regarding syntax, functionality and internal logic. In contrast, the 2.0/2.2/2.4 versions are less consistent, and also have some quirks and peculiarities that would sooner or later show themselves, limit the capabilities of the software, and hamper the future interoperability of Gekko and other software packages. There was a need for modernization of the core parts of Gekko, which involved rewriting large parts of the Gekko source code, but if this had not been done now, it might not have been feasible to do later on. This is both because the core parts of the Gekko 2.0/2.2/2.4 source code would grow more and more complicated, with exceptions to rules and exceptions to exceptions to rules, and because the users would begin to build larger systems on 2.0/2.2/2.4 that would later become a major headache to migrate to a modernized Gekko.
In the future, Gekko will probably implement dataframes, table objects and other convenient data structures too, and combined with the flexible and consistent nature of Gekko 3.0, it is the hope that general data wrangling, manipulation and aggregation could become one of Gekko’s strong points, too.