Gekko 2.4 to 3.x syntax reasons
Migrating from Gekko 2.4 to 3.x can be frustrating syntax-wise, because both versions have their syntax peculiarities. Existing Gekko 2.4 users therefore have to unlearn some peculiarities, and learn other peculiarities, in order to migrate to Gekko 3.x.
There are the following points to note about Gekko 2.4 vs. 3.x:
1. In Gekko 3.x, the symbols % and # in variable names (for scalars and collections) are interpreted as if they were just normal characters alongside the normal a ... z alphabet, so the "characters" % and # are an integrated part of the variable name. In Gekko 2.4, the % and # parts of variable names worked more like a kind of operator.
2. Databanks in Gekko 3.x can contain any kind of variable, not just timeseries.
3. Lists in Gekko 3.x can contain any kind of variable, not just strings.
The flexibilities of (2) and (3) imply a lot of advantages. Databanks can store all kinds of stuff, and lists can represent a lot of different things. For instance, you can import spreadsheet cells into a nested Gekko list (list of lists), where each cell is represented as a Gekko value, date, or string. Lists in Gekko 2.4 can only contain simple strings.
There are some drawbacks to the flexibility, too. For instance, more {}-curlies are needed in Gekko 3.x than in 2.4, which is a consequence of points (2) and (3). To understand the use of {}-curlies better, let us envision a completely different Python-like language:
//"Python"
x = 2
y = 3
a = 1
b = ('x', 'y')
c = 'y'
print a, b, c
In this language, variable names cannot contain the special symbols % or #, and the code above defines three numerical values (variables a, x, y), a list of strings (variable b), and a string (c). When printing a, b, c, you would expect 1, ('x', 'y'), and 'y' to show up on the screen (a value, a list of strings, and a string).
But what if you intended print a, b, c to mean "print a", then "print the variables referred to by b", and finally "print the variable referred to by c"? In that case, "the variables in b" should be understood as the variables x and y, and "the variable in c" should be understood as the variable y. And if that was the intention, Gekko should print out the four values 1, 2, 3 and 3 (corresponding to a, x, y and y). But how would Gekko know that you intended this, and not the printing of raw strings?
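The two readings can be made concrete in real Python. The following is only a sketch: the dict `bank` and the helpers `raw` and `deref` are hypothetical names invented for illustration, standing in for a databank and for the two possible interpretations of a name.

```python
# Hypothetical stand-in for a databank: a plain dict from names to values.
bank = {
    "a": 1,
    "x": 2,
    "y": 3,
    "b": ("x", "y"),  # a list of strings
    "c": "y",         # a string
}


def raw(name):
    """First reading: return the variable itself."""
    return [bank[name]]


def deref(name):
    """Second reading: treat the content as the name(s) of other variables."""
    content = bank[name]
    names = [content] if isinstance(content, str) else list(content)
    return [bank[n] for n in names]


literal = raw("a") + raw("b") + raw("c")
referenced = raw("a") + deref("b") + deref("c")
print(literal)     # [1, ('x', 'y'), 'y']
print(referenced)  # [1, 2, 3, 3]
```

Both readings are legitimate, so the language needs some marker to tell them apart.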
To get 1, 2, 3, 3 printed, Gekko operates with {}-curlies, which in this case would look like the following:
//"Python"
print a, {b}, {c}
This would print 1, 2, 3, 3, because the {}-curlies tell the program that the inside of the curlies is to be understood as a reference to other variables. Now, let us return to Gekko syntax. Consider this statement in Gekko:
//Gekko
print a, #b, %c;
Here, a is a timeseries, #b is a list of strings containing ('x', 'y'), and %c is a string 'y'. Here, Gekko 3.x will print out the data of the series a, followed by the list ('x', 'y'), followed by the string 'y'. If you intend to print out the variables that the list #b and the string %c refer to, you must use {}-curlies:
//Gekko
print a, {#b}, {%c};
In Gekko 3.x, this prints out the data of the timeseries a, x, y, y. A second example is copying. In Gekko 3.x, we can do the following:
//Gekko 3.x
copy #b to #b2;
This means make a copy of the list #b, and call this copy #b2 (if #b contains the two strings 'x' and 'y', so will the #b2 list). The following does something very different:
//Gekko 3.x
copy {#b} to {#b2};
Here, Gekko looks for the two existing lists #b and #b2 and uses the string names inside them as variable references. If #b contains the strings ('x', 'y'), and #b2 contains the strings ('z', 'w'), the statement amounts to copying the series x into z, and the series y into w. Again, the user might think that when using copy #b to #b2; it is obvious that it is the two series x and y that are to be copied into something else, but when databanks may contain lists, how can Gekko know that this is the intention, without the user indicating it somehow?
This is different in Gekko 2.4, because in 2.4, databanks cannot contain anything besides timeseries, and therefore, in Gekko 2.4, the statement copy #b to #b2; actually copies the two timeseries x and y, into z and w, because Gekko "knows" it does not make sense to talk about copying a list in a databank. Is the 2.4 syntax easier to write in this case? Absolutely, but the databanks are also much less flexible.
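The difference between the two copy statements can be sketched in Python. This is an illustration under simplifying assumptions, not Gekko's implementation: the dict `bank` and the functions `copy_variable` and `copy_referenced` are hypothetical, and series are reduced to short lists of numbers.

```python
import copy

# Hypothetical databank: series are lists of numbers, lists are lists of names.
bank = {
    "x": [1.0, 2.0],
    "y": [3.0, 4.0],
    "#b": ["x", "y"],
    "#b2": ["z", "w"],
}


def copy_variable(bank, source, target):
    """Mimics `copy #b to #b2;`: the variable itself is duplicated."""
    bank[target] = copy.deepcopy(bank[source])


def copy_referenced(bank, source_list, target_list):
    """Mimics `copy {#b} to {#b2};`: the names inside the lists are paired up."""
    for src, dst in zip(bank[source_list], bank[target_list]):
        bank[dst] = copy.deepcopy(bank[src])


by_reference = copy.deepcopy(bank)
copy_referenced(by_reference, "#b", "#b2")  # creates the series z and w

by_name = copy.deepcopy(bank)
copy_variable(by_name, "#b", "#b2")         # #b2 becomes ['x', 'y']
```

Once a databank can hold lists as well as series, both operations are meaningful, which is why the curlies are needed to choose between them.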
A third example regarding differences is list construction. Consider this example in Gekko 3.x:
//Gekko 3.x
#b = x, y;
#b2 = a, {#b}, c;
Here, #b = x, y; is a so-called naked list definition, avoiding the more tedious #b = ('x', 'y');. After the second statement is executed, the list #b2 will contain the four strings ('a', 'x', 'y', 'c'), because the {}-curlies look inside #b and fetch its contents. But then what about this?
//Gekko 3.x
#b = x, y;
#b2 = a, #b, c;
In Gekko 3.x, #b2 = a, {#b}, c; creates the list #b2 as the four strings ('a', 'x', 'y', 'c'), whereas #b2 = a, #b, c; fails. This is intentional, to avoid too much confusion (in principle it could produce the list of three strings ('a', '#b', 'c'), but the confusion would outweigh the advantages).
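As a loose analogy, Python's unpacking operator plays a role similar to the {}-curlies in a naked list definition. The variable names below are hypothetical, and the "literal" variant only illustrates the three-string result that the text says Gekko deliberately refuses to produce.

```python
b = ["x", "y"]

# Analogue of `#b2 = a, {#b}, c;`: the elements of b are spliced in.
b2 = ["a", *b, "c"]

# The literal reading of `#b2 = a, #b, c;` would keep the name '#b' as text.
b2_literal = ["a", "#b", "c"]

print(b2)          # ['a', 'x', 'y', 'c']
print(b2_literal)  # ['a', '#b', 'c']
```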
As the fourth and final example, consider this syntax in Gekko 3.x:
//Gekko 3.x
#b = x, y;
{#b} = 100;
This sets the two series x and y to the constant value 100. In Gekko 2.4, you can omit the curlies:
//Gekko 2.4 |
Again, this syntax may seem simpler, because the {}-curlies can be omitted. We may try to do something similar in Gekko 3.x:
//Gekko 3.x
#b = x, y;
#b = 100;
series #b = 100;
In the second statement, #b on the left-hand side is just interpreted as a variable name consisting of two characters from the special "Gekko alphabet". So Gekko first calculates the right-hand side (= scalar value 100) and next tries to assign the scalar to the variable name #b. This fails with a type error, since variables starting with # in Gekko can only be a list, a matrix, or a map. The last statement fails, too, but for a different reason. Here, it is stated explicitly that the variable #b must be of series type, but Gekko refuses to construct a series name that starts with # (which is not legal).
Now, why does Gekko 3.x not just "know" that the intention of #b = 100; or series #b = 100; is to define the two timeseries variables corresponding to the two strings inside #b? In principle, this could be done, but unfortunate side-effects would creep in. When a variable name like #b can also be for instance a matrix, couldn't the intention of #b = 100; rather be a simplified way to define a 1x1 matrix? Also, when using the type indicator, series #b = 100; could make more sense, but then we have the problem of series #b = 100; working, but #b = 100; not working, which seems inconsistent.
Omitting the {}-curlies around list names on the left-hand side opens up other ambiguities, too. In Gekko 3.x, you can for instance do the following:
//Gekko 3.x |
Here, #m1 is a list of three strings, and #m2 is a list of three values. The convenient Gekko 3.x statement {#m1} = #m2; assigns the three values to the three series a, b, and c, one value per series. This is practical in many cases, but what if we instead used the assignment #m1 = #m2;? Shouldn't Gekko just know that #m1 = #m2; intends to update three timeseries? But how should it know? The statement #m1 = #m2; looks much more like assigning the elements of the #m2 list into the list #m1, so that #m1 = (1, 2, 3) afterwards.
But then what about a statement like series #m1 = #m2; in Gekko 3.x? This could be rewired to produce three timeseries, instead of failing. But again, there would be side-effects. What if someone removes the series type indicator, because series type indicators are in general superfluous in Gekko 3.x? In that case, the statement still runs, but does something altogether different (namely defining a list #m1). This is bound to create confusion, and perhaps errors.
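The two possible meanings can be sketched in Python as follows. Everything here is hypothetical illustration: the dict-based bank, the function names, and the length check are assumptions of this sketch, and each series is reduced to a single number for brevity.

```python
def assign_referenced(bank, names_var, values_var):
    """Analogue of `{#m1} = #m2;`: each named series receives its own value."""
    names, values = bank[names_var], bank[values_var]
    if len(names) != len(values):
        raise ValueError("lists must have the same length")
    for name, value in zip(names, values):
        bank[name] = value


def assign_variable(bank, target, source):
    """Analogue of `#m1 = #m2;`: the list variable itself is overwritten."""
    bank[target] = list(bank[source])


start = {"#m1": ["a", "b", "c"], "#m2": [1, 2, 3]}

with_curlies = dict(start)
assign_referenced(with_curlies, "#m1", "#m2")  # defines a, b and c

without_curlies = dict(start)
assign_variable(without_curlies, "#m1", "#m2")  # #m1 becomes [1, 2, 3]
```

The two calls leave the bank in very different states, which is why silently switching between them depending on a type indicator would be error-prone.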
Note that simplified names like for instance x%i%j still work in Gekko 3.x, where %i and %j are two strings that are in-substituted. In that case, you do not have to use the more formal x{%i}{%j}. This former way of succinct writing is no longer officially endorsed in Gekko 3.x, but it is still legal. One of the reasons why it is no longer endorsed is that the string 'x{%i}{%j}' will perform the same in-substitution of %i and %j (so-called string interpolation), whereas 'x%i%j' will not.
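The asymmetry between the two spellings inside a quoted string can be imitated with a small Python function. This is a sketch only: `interpolate` and the `scalars` dict are hypothetical, and the regular expression is an assumption about what counts as a scalar name, not Gekko's actual rule.

```python
import re


def interpolate(text, scalars):
    """Replace each {%name} in text with the string stored under %name."""
    return re.sub(r"\{(%\w+)\}", lambda m: scalars[m.group(1)], text)


scalars = {"%i": "a", "%j": "b"}

composed = interpolate("x{%i}{%j}", scalars)
untouched = interpolate("x%i%j", scalars)
print(composed)   # xab
print(untouched)  # x%i%j
```

Only the curly form is recognized inside the string, so the curly form behaves the same way in names and in strings.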
Wildcards
Another place where Gekko 3.x differs from Gekko 2.4 syntax-wise is wildcards. Gekko 2.4 allows a statement like index a*b m;, which will match timeseries in the databank starting with 'a' and ending with 'b', and create a list #m to store the results. Incidentally, this is a good case of a # character ambiguity in Gekko 2.4: should the index statement end with m or #m? It ends with m because this corresponds better with the statement list m = [a*b]; where the name is without #-symbol, because it is on the left-hand side. That the legal statement is not index a*b #m; in Gekko 2.4 is basically a quite arbitrary choice.
In places where the * can mean multiplication, Gekko 2.4 demands the use of []-brackets, for instance print [a*b];. If omitted, Gekko 2.4 will just print the product of the two timeseries a and b.
Gekko 3.x takes a different approach to wildcards, demanding that these are in principle quoted inside []-brackets, for instance print ['a*b'];. Adding the quotes is a bit tedious, but omitting them as in Gekko 2.4 creates all sorts of syntax ambiguities for the parser, and also makes it next to impossible to compose a wildcard from other strings (one problem with a wildcard like [a*b] is that, parser-wise, it also looks like a 1x1 matrix definition).
The expression ['a*b'] returns a list of strings corresponding to the matching timeseries, so print ['a*b'] prints strings, not timeseries. To print the data of the matching timeseries, you should use print {'a*b'};, using curlies instead. So the Gekko 3.x statement print {'a*b'}; corresponds to the Gekko 2.4 statement print [a*b];. This costs a couple of single quotes, but many statements in Gekko 3.x allow simplified wildcards (which is also the case for Gekko 2.4). So you can write for instance the simple index a*b to #m; or copy a*b to x*; in Gekko 3.x, which is very similar syntax-wise to Gekko 2.4.
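The distinction between ['a*b'] (names) and {'a*b'} (data) can be mimicked in Python with the standard `fnmatch` module. The dict `bank` and the two helper functions are hypothetical. Note that `fnmatchcase` matches case-sensitively, which is a property of this sketch and says nothing about how Gekko matches names.

```python
from fnmatch import fnmatchcase

# Hypothetical databank with four series.
bank = {
    "a1b": [1.0, 2.0],
    "axb": [3.0, 4.0],
    "abc": [5.0, 6.0],
    "b1a": [7.0, 8.0],
}


def match_names(pattern):
    """Analogue of ['a*b']: the matching names, as strings."""
    return sorted(name for name in bank if fnmatchcase(name, pattern))


def match_series(pattern):
    """Analogue of {'a*b'}: the data behind the matching names."""
    return [bank[name] for name in match_names(pattern)]


names = match_names("a*b")
series = match_series("a*b")
print(names)   # ['a1b', 'axb']
print(series)  # [[1.0, 2.0], [3.0, 4.0]]
```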
There are some other differences between wildcard search in Gekko 2.4 and Gekko 3.x. But these are more semantic and are caused by the fact that Gekko 3.x databanks can contain variables that contain % and # symbols (and ! for frequency for that matter).
Miscellaneous annoyances
•In Gekko 2.4, you can write for instance exo #b; to exogenize the two variables (series) x and y, presupposing that #b = ('x', 'y'). In Gekko 3.x, you have to write exo {#b}; to indicate that you are referring to the elements of #b. Since you cannot use a raw list of strings directly for exogenization, something like exo #b; could be made legal in Gekko 3.x, too. But even though this saves some typing of {}-curlies, it also affects the consistency of the language. The users may ask themselves why they should write print {#b}; to print the variables that the strings refer to, whereas exo #b; is enough?
•Gekko 3.x demands a = symbol in any assignment statement. Therefore, it is no longer possible to write something like the Gekko 2.4 syntax series x % 2; to make the series x grow by 2% per period. In Gekko 3.x, this instead reads x %= 2;, which is also more in line with modern programming syntax. Omitting the = symbol would be hard syntax-wise in Gekko, because the symbol helps identify assignment statements, also for the human reader (remember that in Gekko 3.x, it is no longer mandatory to decorate assignment statements with type indicators like series, val, date, etc.).
•In Gekko 3.x series definitions, numerical values must in general be separated by commas (blanks not allowed). However, you may use the data() function to handle blank-separated data, for instance x = data('1 2 3'); instead of x = 1, 2, 3;.
•In Gekko 3.x, dynamic statements like x = x[-1] + 1; have to be decorated with a <dyn> tag, for instance x <dyn> = x[-1] + 1;. The reason for this is long-winded, but newer Gekko 3.1.x versions will abort with an error if the tag is unintentionally forgotten. And some users may even appreciate that dynamic statements are clearly visible via the tag.
•As mentioned above, in Gekko 3.x, you cannot use for instance #b = 1; and expect Gekko to understand that this means that you want to set the elements of #b (the series x and y) to the value 1 (instead you must use {#b} = 1;).
•In Gekko 3.x, you have to use for instance x{%i}{%j} for a composed name using the strings %i and %j, whereas you can no longer write it as x{i}{j}. The reason for this (and for other syntax changes) is explained at the bottom of this page. On the positive side, variable names like x{#i}{#j} can be used in Gekko 3.x, where #i and #j are lists of strings (Gekko will automatically combine and unfold the two lists into composed variable names).
•The READ statement and the Global databank. In Gekko 3.x (as was the case for Gekko 2.4), the READ statement first clears the first-position databank (typically Work), before reading in a databank. READ is often used when modeling, but since scalars and collections now live in databanks, this means that a READ statement also wipes out all "settings"-variables that the user may have defined at the top of his/her Gekko program file. This could be file paths, start/end dates, scale factors, lists containing names of sectors, etc. It seems arbitrary to require that READ only clears timeseries variables in the first-position databank, so instead, regarding "settings"-variables, these can profitably be put into the so-called Global databank. This databank is always open and always searched last, and survives READ, CLEAR, etc. So at the top of your program file, you just define for instance global:%path = 'c:\forecasts\bank1';, or global:#sectors = a, nz, qz, o;, and you can subsequently refer to %path and #sectors without worrying about them being wiped out by READ or CLEAR.
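The interplay between READ and the Global databank described in the last point can be modeled roughly in Python. This is a deliberately simplified sketch: the class `Session` and its methods are hypothetical, and only two banks are modeled (a first-position bank and Global), whereas a real session may have further open databanks in between.

```python
class Session:
    """Hypothetical model: a first-position bank plus a Global bank searched last."""

    def __init__(self):
        self.work = {}
        self.global_bank = {}

    def read(self, new_content):
        """Analogue of READ: clear the first-position bank, then load the data."""
        self.work.clear()
        self.work.update(new_content)

    def lookup(self, name):
        """Search the first-position bank first and the Global bank last."""
        if name in self.work:
            return self.work[name]
        return self.global_bank[name]


session = Session()
session.work["%path"] = "defined-in-work"  # a setting kept in Work
session.global_bank["#sectors"] = ["a", "nz", "qz", "o"]

session.read({"x": [1.0, 2.0]})

print("%path" in session.work)      # False: wiped out by the read
print(session.lookup("#sectors"))   # ['a', 'nz', 'qz', 'o']
```

Settings stored in the first-position bank disappear on the read, while those placed in the Global bank remain reachable by their plain names.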