Array-timeseries and implicit loops


Array-timeseries (array-series for short) are a bit more advanced than normal timeseries. This section also explains implicit looping over lists, which works for normal timeseries as well.

 

An array-series q may for instance represent production quantities in different sectors #i (where #i is a list of sector names/strings), and another array-series p may represent price levels in the same sectors. Using array-series, you may use expressions like v[#i] = p[#i]*q[#i] to calculate the nominal production values in the different #i sectors, or vtot = sum(#i, p[#i]*q[#i]) to calculate the sum of these values. Such syntax should be easy to follow, since it resembles normal mathematical expressions like the following:

 

$$v_i = p_i \, q_i, \qquad v^{tot} = \sum_i p_i \, q_i$$

 

Using array-series, it is typically possible to avoid explicit looping over indexes, and perform the looping implicitly in one statement. The Gekko sum() syntax also resembles GAMS syntax for summation, which is no coincidence.

 

But before we move on to array-series, it must be emphasized that much of the same loop-avoidance can be achieved with normal timeseries, too. An example will illustrate this.

 

 

Implicit loops with name-composition

 

As mentioned in previous sections (for instance the elevator pitch), you may use {}-name-substitution in series names. Inside the {}-curlies, it is allowed to place a list of strings, like for instance:

 

reset; time 2010 2012;
#i = a, b, c; //list of three strings 'a', 'b', 'c'
pa = 1.1; pb = 1.2; pc = 1.3;
qa = 200; qb = 300; qc = 400;
v{#i} = p{#i}*q{#i};
vtot = sum(#i, p{#i}*q{#i});
prt <n> v{#i}, vtot;  // <n> omits percentage growth
 
//                   va             vb             vc           vtot 
//  2010       220.0000       360.0000       520.0000      1100.0000 
//  2011       220.0000       360.0000       520.0000      1100.0000 
//  2012       220.0000       360.0000       520.0000      1100.0000 

 

In the statement v{#i} = p{#i}*q{#i} above, implicit looping over the elements of #i is performed, so the statement amounts to va = pa*qa; vb = pb*qb; vc = pc*qc;. Similarly, vtot = sum(#i, p{#i}*q{#i}) amounts to vtot = pa*qa + pb*qb + pc*qc.
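
 

For comparison, the implicit loop can be set next to an explicit one. The following is only a sketch: it assumes the FOR statement with a string loop variable (here called %i, a name picked for illustration), and since FOR syntax is not shown in this section, it should be checked against the FOR statement's own help page.

```
reset; time 2010 2012;
#i = a, b, c;
pa = 1.1; pb = 1.2; pc = 1.3;
qa = 200; qb = 300; qc = 400;

// explicit loop (assumed FOR syntax): %i is 'a', then 'b', then 'c'
for string %i = #i;
  v{%i} = p{%i}*q{%i};
end;

// implicit loop: the same three assignments in one statement
v{#i} = p{#i}*q{#i};
```

Both variants should leave the same va, vb and vc in the databank; the implicit form simply saves the loop bookkeeping.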

 

A naming scheme like the one above can work pretty well with normal timeseries (and it works in multiple dimensions, too). One problem, however, is that the three series va, vb, vc are not coupled in any way, except by their names. For instance, if you need to delete all these nominal production values from the databank, how can you be sure that you really delete all of them? There might be stray vd and ve series lying around, but do these represent sectors d or e of the same kind of variable, or are their names just coincidences? So there is a looseness about using naming conventions that the concept of array-series tries to resolve.

 

 

Implicit loops with array-timeseries

 

Array-timeseries provide a tighter coupling of dimensional timeseries data, and also provide other benefits and conveniences. Let us try to reconstruct the above example with array-series instead:

 

reset; time 2010 2012;
#i = a, b, c; //list of three strings 'a', 'b', 'c'
v = series(1); p = series(1); q = series(1);  //1 for 1-dimensional
p[a] = 1.1; p[b] = 1.2; p[c] = 1.3;  //sub-series
q[a] = 200; q[b] = 300; q[c] = 400;  //sub-series
v[#i] = p[#i]*q[#i];
vtot = sum(#i, p[#i]*q[#i]);
prt <n> v[#i], vtot;  // <n> omits percentage growth
index v[**]; //see all sub-series inside v
 
//                v[a]           v[b]           v[c]           vtot 
// 2010       220.0000       360.0000       520.0000      1100.0000 
// 2011       220.0000       360.0000       520.0000      1100.0000 
// 2012       220.0000       360.0000       520.0000      1100.0000 

//
// v[a], v[b], v[c]
//
// Found 3 matching items

 

This replicates the previous example, where normal timeseries and name-composition were used. So instead of for instance the normal timeseries va, vb, vc, we now have an array-series v with sub-series v[a], v[b], v[c]. Instead of index v[**] here, v[*] could have been used (since v is 1-dimensional), cf. the INDEX statement.

 

What is the big deal about that, except for the use of []-brackets to access the sub-series? The following examples try to illustrate the point of using array-series:

 

// ...continued
 
prt <n> v;  //prints out all of the sub-series
 
//                 v[a]           v[b]           v[c] 
//  2010       220.0000       360.0000       520.0000 
//  2011       220.0000       360.0000       520.0000 
//  2012       220.0000       360.0000       520.0000 
 
disp v;  //get info on dimensionality etc.
 
//  ==========================================================================================
//  SERIES Work: v
//  Annual series has 3 elements in 1 dimensions (period 2010 - 2012)
//  Dimension 1 (3 elements): a, b, c
//  First/last elements (alphabetically): v[a] ... v[c]
//  ==========================================================================================
 
prt <n> p*q/1000;  //some algebra is legal directly on array-series --
                   //this presupposes that the p and q sub-series 
                   //are compatible (contain the same elements)
 
//         p*q/1000 [a]   p*q/1000 [b]   p*q/1000 [c] 
//  2010         0.2200         0.3600         0.5200 
//  2011         0.2200         0.3600         0.5200 
//  2012         0.2200         0.3600         0.5200 
 
v = p*q;    //this also works, if the sub-series are compatible.
delete v;   //deletes v and all its sub-series in one go.

 

As can be seen from the above examples, an array-series like v can be used more like a vector/matrix/array, for instance when printing it (which prints all its sub-series), using DISP on it (which shows dimensionality, size and element information), or deleting it.

 

Simple algebra on the array-series themselves is also supported, allowing index-free syntax (like p*q/1000 instead of p[#i]*q[#i]/1000). This can be thought of as being a bit similar to matrix algebra, performing bulk operations on the sub-elements.

 

In addition, there are other capabilities:

 

When you are looping implicitly or summing up over dimensions, you may omit/skip list elements via the $ conditional operator (similar to GAMS).

You may assign domains to the dimensions, so that you do not accidentally use an illegal element name (for instance mistyping a sector name).

Multiple dimensions work too, and it is quite easy to aggregate dimensions (roll-up), or pick an element from a dimension (slicing), etc. You just use commas to separate the dimensions, like x[a, k] or x[#i, #j]. You may for instance eliminate (aggregate) the second dimension from the array-series x like this: y[#i] = sum(#j, x[#i, #j]).

You may assign "default sets", so that prt v; only prints out some of the (most important) elements.

You can use special rules regarding the printing of and calculation with array-series, for instance if there are "holes"/missings in these. You can use the options option series array print missing = ... ; and option series array calc missing = ... ; to control this.

There is no risk of name-collisions with array-series. With name-composition there is always a risk of this. For instance if Kb means buildings capital, and q is a sector name, Kbq could mean buildings in the q sector. But what if there was another sector with the name bq? Couldn't Kbq then mean total capital K in the bq sector? This is often "solved" with underscores like Kb_q vs. K_bq, but array-series are impervious to this problem, because Kb[q] and K[bq] are necessarily different.

Array-series can be sparse. If the first dimension of the array-series x has 100 possible elements (for instance 100 sectors), and the second dimension has 100 possible elements (for instance export categories), there can be up to 10,000 different sub-series inside x. But if there are fewer sub-series than that, corresponding to data being sparse, these "holes" (zero values) will not take up any RAM or file space.

Array-series containing ages (like population data) can be "rotated", so that for instance age profiles can be plotted; see the rotate() function and the age profiles section under PLOT.
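
 

The roll-up of a second dimension mentioned in the list above can be sketched as follows. The element names (a, b, k1, k2) and the numbers are made up for illustration, and series(2) is assumed to create a 2-dimensional array-series in the same way that series(1) creates a 1-dimensional one.

```
reset; time 2010 2012;
#i = a, b;       //first dimension (illustrative sector names)
#j = k1, k2;     //second dimension (illustrative category names)
x = series(2);   //2 for 2-dimensional (assumed, by analogy with series(1))
x[a, k1] = 10; x[a, k2] = 20;  //sub-series
x[b, k1] = 30; x[b, k2] = 40;  //sub-series
y = series(1);
y[#i] = sum(#j, x[#i, #j]);    //y[a] = 10 + 20, y[b] = 30 + 40
prt <n> y;       //prints the aggregated sub-series y[a] and y[b]
```

Here the implicit loop runs over #i on the left-hand side, while sum() collapses #j, so y has one dimension fewer than x.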

 

More on array-series under the SERIES statement.

 

Array-series do not yet work in model files (.frm), because Gekko models do not support dimensionality. But Gekko supports loading GAMS models, so in that way Gekko array-series can be used together with dimensional models (with for instance DECOMP).