Wildcards
Wildcards are used to search for variables in one or more databanks. Internally in Gekko, a wildcard search returns a list of variable names in the form of strings, possibly including databank names and frequencies. Special rules govern how wildcards treat the type symbols (% and #) and the frequency symbol (!).
Array-series can be searched quite straightforwardly with syntax like x[a*], cf. the INDEX statement. Wildcards can also be used to search inside a list of strings (cf. LIST).
Note that the keys [Tab] and [Ctrl+Space] offer autocompletion on timeseries names, which implicitly amounts to a wildcard search.
In the following, the logic of databank search is described.
Basics
Wildcards for bank searching come in three flavors (here matching variables starting with 'x' and ending with 'y'):
•String wildcards: ['x*y']. Returns matching strings.
•Name wildcards: {'x*y'}. Returns matching names. Actually short for {['x*y']}, cf. the note at the end.
•Naked wildcards: x*y. Returns matching names, short for {'x*y'}.
Note that wildcards can be used for array-subseries too, for instance index x[x*y]; (cf. INDEX).
Ranges and single character matches are possible too, for instance 'x1a..x2z' or 'x?y'.
The difference between returning strings or names can be seen in this example:
time 2010 2012;
Result:
['x*y'] 'x1y', 'x2y' [2 items]
             x1y        %        x2y        %
2010      1.0000        M     2.0000        M
2011      1.0000     0.00     2.0000     0.00
2012      1.0000     0.00     2.0000     0.00
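So the first PRT, using the string wildcard, prints a list of strings equal to ('x1y', 'x2y'), whereas the second PRT, using the name wildcard, prints the two timeseries x1y and x2y. The third PRT, using the naked wildcard, fails: PRT accepts mathematical expressions, so x*y is read as the product of two timeseries x and y.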
However, in some statements like COPY, RENAME, INDEX, DISP, etc., naked wildcards are allowed, for instance index x*y; to get a list of variables starting with x and ending with y. These statements do not allow mathematical expressions, and therefore the {'...'} syntax can be omitted.
time 2010 2012; x1y = 1; x2y = 2;
Both INDEX statements print out x1y, x2y as matching items, whereas DISP would print the two series.
Wildcards without bank indicator only search for the variables in the first-position databank. To search in all databanks, use for instance index *:x*y; or prt {'*:x*y'};. As an example, consider the case where there are the following databanks present:
| Databank | Variables |
| --- | --- |
| 1. Work | x1y*, x2y* |
| 2. bk1 | x1y, x3y* |
| 3. bk2 | x2y, x3y |

Note: the variables in red (marked with * here) are the ones that are found first in a databank search.
time 2010 2012;
Examples:
•prt {'x*y'}; prints Work:x1y, Work:x2y (only variables from the Work bank)
•prt {'*:x*y'}; prints Work:x1y, Work:x2y, bk1:x1y, bk1:x3y, bk2:x2y, bk2:x3y (all 6 variables in the table)
•prt x1y, x2y, x3y; prints the variables Work:x1y, Work:x2y, bk1:x3y (shown in red, provided that databank searching is active, else the statement fails regarding x3y).
While databank searching has advantages regarding concrete variables like x1y, x2y, x3y, using such a search logic regarding wildcards would be both confusing and error-prone.
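The difference between the two search logics can be sketched in a few lines of Python. This is only an illustrative model, not Gekko code and not Gekko's implementation: the BANKS dictionary and the function names wildcard_search and databank_search are hypothetical, frequencies are ignored, and bk2 is assumed to hold x2y and x3y, as in the printed result of prt {'*:x*y'}; above.

```python
from fnmatch import fnmatchcase

# Hypothetical model of the three open databanks, in search order
# (the first-position databank comes first).
BANKS = {
    "Work": ["x1y", "x2y"],
    "bk1": ["x1y", "x3y"],
    "bk2": ["x2y", "x3y"],
}


def wildcard_search(pattern):
    """Return 'bank:name' strings matching a pattern like 'x*y' or '*:x*y'."""
    if ":" in pattern:
        bank_pattern, name_pattern = pattern.split(":", 1)
    else:
        # No bank indicator: only the first-position databank is searched.
        bank_pattern, name_pattern = next(iter(BANKS)), pattern
    return [
        f"{bank}:{name}"
        for bank, names in BANKS.items()
        if fnmatchcase(bank, bank_pattern)
        for name in names
        if fnmatchcase(name, name_pattern)
    ]


def databank_search(name):
    """Resolve a concrete name to the first databank that contains it."""
    for bank, names in BANKS.items():
        if name in names:
            return f"{bank}:{name}"
    raise KeyError(name)
```

In this model, wildcard_search('x*y') never leaves the Work bank, whereas databank_search('x3y') walks through the banks in order and stops at bk1.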
Use of '%', '#', '!', and stars
In their strictest form, wildcards for bank searching are stated like this:
#m = ['x*y'];
This particular wildcard will return a list of strings containing the names that match the 'x*y' wildcard, that is, names that start with x and end with y. This wildcard only matches variables from the first-position databank, with the current frequency. So if the first-position databank is b1, and the current frequency is annual (!a), the wildcard matches the same variables as ['b1:x*y!a']. If you need to match all series of all frequencies (in all open databanks), you can use ['*:*!*']. All scalars and collections are matched with ['*:%*'] and ['*:#*'], respectively, so to match scalars or collections, you need to use % or # in the wildcard. However, to match all variables in a given databank, you may use the special '**' wildcard, so ['*:**'] matches all variables in all databanks.
The following finds all variables in all banks (as a list of string names):
#a = ['*:*!*'] + ['*:%*'] + ['*:#*']; // the + operator concatenates
Similarly, the following will match all items in the first-position databank:
#w = ['*!*'] + ['%*'] + ['#*'];
whereas
#ws = ['*'];
matches all series with the same frequency as the current frequency in the first-position databank.
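The defaults described in this section can be summarized as a small expansion rule. The Python sketch below is a model built on assumptions rather than Gekko's own logic: the function name qualify and its parameters are hypothetical, and it only rewrites a wildcard into its explicit 'bank:name!freq' form without searching any databank.

```python
def qualify(pattern, first_bank="b1", freq="a"):
    """Rewrite a bank wildcard into an explicit 'bank:name!freq' form.

    first_bank and freq stand in for the first-position databank and the
    current frequency (both hypothetical parameters of this sketch).
    """
    if pattern == "***":
        # The '***' shortcut stands for '*:**' (everything in all banks).
        return "*:**"
    bank, sep, name = pattern.partition(":")
    if not sep:
        # No bank indicator: the first-position databank is implied.
        bank, name = first_bank, pattern
    if name == "**":
        # Special wildcard: all variables of the bank, whatever their type.
        return f"{bank}:**"
    if name.startswith(("%", "#")) or "!" in name:
        # Scalars and collections carry no frequency, and an explicit
        # frequency indicator is kept as written.
        return f"{bank}:{name}"
    # Series wildcard without frequency: the current frequency is implied.
    return f"{bank}:{name}!{freq}"
```

For instance, qualify('x*y') gives 'b1:x*y!a', which is the equivalence stated in the text for a first-position databank b1 and annual frequency.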
Bank ranges
Ranges work much like wildcards, using dots in a ['start' .. 'stop']-range. For instance:
#az1 = ['xa'..'xz'];
will match all series of the current frequency in the alphabetical range xa-xz in the first-position databank. To match a range in another databank, use for instance:
#az2 = ['b1:xa'..'b1:xz'];
Note that you must state the bank name both before and after the dots.
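A range can be thought of as an alphabetical filter on the names of one databank. The following Python sketch illustrates that idea under several assumptions that the text does not confirm: the bounds are treated as inclusive, names are compared case-insensitively as plain strings, frequencies are ignored, and a mismatch between the two bank names raises an error. The names split_bank and bank_range, and the layout of the banks dictionary, are hypothetical.

```python
def split_bank(item, first_bank):
    """Split 'bank:name' into (bank, name), defaulting to the first bank."""
    bank, sep, name = item.partition(":")
    return (bank, name) if sep else (first_bank, item)


def bank_range(banks, start, stop, first_bank="Work"):
    """Return the names of one bank lying in the range start..stop.

    banks maps a bank name to a list of variable names. Bounds are
    assumed to be inclusive (an assumption of this sketch).
    """
    start_bank, low = split_bank(start, first_bank)
    stop_bank, high = split_bank(stop, first_bank)
    if start_bank != stop_bank:
        # Assumed behavior: the bank must be stated on both sides.
        raise ValueError("state the same bank name before and after the dots")
    names = banks.get(start_bank, [])
    selected = [n for n in names if low.lower() <= n.lower() <= high.lower()]
    return sorted(selected, key=str.lower)
```

Calling bank_range(banks, 'b1:xa', 'xz') fails in this model because the second bound falls back to the first-position databank, which mirrors the requirement to repeat the bank name.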
List searching
You may use wildcards and ranges on lists of strings, for instance:
#m = xa, xay, xdy, xey; //or: #m = ('xa', 'xay', 'xdy', 'xey');
When used on lists, wildcards and ranges work normally, that is, there are no special rules regarding bank colon :, frequency ! or type symbols (% and #). The strings in the list are matched as they are.
Details: why the special logic?
The reader may wonder why wildcard search in databanks follows special rules for the symbols %, #, and !. This is explained below. Imagine a databank containing these variables:
•fy!a, an annual series
•fy!q, a quarterly series
•%y, a string
•#y, a list
If we use 'naive' wildcards without special rules, we get this (for instance with INDEX *;):
index *; // --> fy!a, fy!q, %y, #y
Everything is matched. This may seem ok, but then what about this:
index f*y; // --> nothing
Here, the user may wonder why nothing is matched, but this is because of the frequency symbols !. If, instead, the search pattern ended on a star:
index f*; // --> fy!a, fy!q
Suddenly the two series match again, because the star matches !a and !q. But if the star is first, we get:
index *y; // --> %y, #y
Now !a and !q are not matched, but on the contrary, the star matches % and #, so the string and list are matched.
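The four 'naive' outcomes above can be reproduced with ordinary string globbing. The Python sketch below is only an illustration of that naive logic (it is not how Gekko 3.x behaves): the full names, including type and frequency symbols, are matched as plain strings. The names VARIABLES and naive_index are hypothetical.

```python
from fnmatch import fnmatchcase

# The four variables of the example, written with their symbols.
VARIABLES = ["fy!a", "fy!q", "%y", "#y"]


def naive_index(pattern):
    """Match the pattern against the full names, symbols included."""
    return [v for v in VARIABLES if fnmatchcase(v, pattern)]
```

Here naive_index('f*y') is empty because every series name ends in a frequency suffix rather than in y, which is exactly the surprise described above.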
The reader might object that one could just end the wildcard with !*, and the timeseries would be matched as expected. But the user has become accustomed to not having to write frequency indicators on timeseries of the same frequency as the global frequency. This is one of the advantages of Gekko, being able to write prt fy; and imply fy!a (if the global frequency is annual), so there would be the risk of users forgetting about frequencies when using wildcards (especially if they work in the same frequency most of the time).
Therefore, in Gekko 3.x, the !, % and # symbols are treated in a special manner when matching wildcards. In Gekko 3.x, the following is the case (with the same four variables as above):
index *; // --> fy!a
Only the active frequency is matched (we assume it is annual). No starting % or # are matched.
index *!*; // --> fy!a, fy!q
Above, all frequencies are matched.
index %*; // --> %y
This is how to match scalars. Collections match with #*.
The rationale behind these rules is that most wildcard searches concern series of a given frequency, and it is therefore beneficial that such searches work as expected. Users would want to be able to write for instance prt {'f*'}; or prt {'*y'}; without worrying about frequency indicators and scalar/collection types.
Instead of the tedious ['*!*'] + ['%*'] + ['#*'], matching all series, scalars and collections in a bank, ['**'] is offered as a shortcut to match all variables in a databank. In the same vein, ['***'] is a shortcut to ['*:**'], matching all variables in all databanks, that is, 'everything'.
To match all series of all frequencies and all scalars and collections in the first-position databank, you may use **:
index **; // --> All variables in the first-position databank
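The Gekko 3.x rules for a single databank can be modelled in Python as follows. This is a sketch under assumptions, not Gekko's actual implementation: the names VARIABLES and index_gekko3 are hypothetical, bank prefixes are left out, and the current frequency is passed as a parameter.

```python
from fnmatch import fnmatchcase

# The four variables of the example, written with their symbols.
VARIABLES = ["fy!a", "fy!q", "%y", "#y"]


def index_gekko3(pattern, current_freq="a"):
    """Model of the symbol-aware wildcard rules within one databank."""
    if pattern == "**":
        # Special wildcard: every variable, whatever its type or frequency.
        return list(VARIABLES)
    if pattern.startswith(("%", "#")):
        # Scalars and collections are matched only when asked for explicitly.
        return [v for v in VARIABLES if fnmatchcase(v, pattern)]
    if "!" not in pattern:
        # No frequency indicator: the current frequency is implied.
        pattern = f"{pattern}!{current_freq}"
    # Series wildcards never match names starting with % or #.
    return [
        v
        for v in VARIABLES
        if not v.startswith(("%", "#")) and fnmatchcase(v, pattern)
    ]
```

In this model both index_gekko3('f*y') and index_gekko3('*y') return the annual series fy!a, so the user does not have to think about frequency suffixes or type symbols.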
Note
The form {'a*b'} is actually short for {['a*b']}. In the latter form, the inside of {} is seen to be a list of strings which is converted into a list of names, just like {#m} converts a list of strings #m into a list of names. For example, #m = ['a*b']; prt {#m}; illustrates this, where prt #m; would just print the list itself, not the variables referred to by the list elements. Therefore, prt {['a*b']}; prints the variables, and as noted, Gekko allows prt {'a*b'}; as short for prt {['a*b']};.
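The two-step nature of {['a*b']} can be illustrated with a small Python analogy. It is a conceptual sketch only: the BANK dictionary, its contents, and the function names string_wildcard, names and name_wildcard are all hypothetical and do not correspond to anything in Gekko itself.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for a databank: variable name -> series values.
BANK = {"a1b": [1.0, 1.0], "a2b": [2.0, 2.0], "c": [3.0, 3.0]}


def string_wildcard(pattern):
    """Model of ['a*b']: return the matching names as strings."""
    return [name for name in BANK if fnmatchcase(name, pattern)]


def names(strings):
    """Model of {...}: turn a list of strings into the variables they name."""
    return [BANK[s] for s in strings]


def name_wildcard(pattern):
    """Model of {'a*b'}, that is, shorthand for {['a*b']}."""
    return names(string_wildcard(pattern))
```

Printing string_wildcard('a*b') corresponds to prt #m; (the list itself), while name_wildcard('a*b') corresponds to prt {#m}; (the variables behind the list).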