COMPARE
COMPARE compares variables in the first-position and Ref (reference) databanks. The comparison is only done for timeseries of the same frequency as the global frequency setting. The comparison is done over the given period (or the global period if a period is not provided), and the user may provide a list of series that are checked (if no list is given, all series are checked).
By default, COMPARE writes its output to the file compare_databanks.txt (the filename can be changed with FILE=). You may set thresholds for absolute or relative differences (options ABS, REL and PCH), and you may dump a list #dif containing the names of the series that differ (cf. DUMP).
The COMPARE statement is similar to the menu item Utilities --> Compare two databanks... in the Gekko user interface.
compare < period ABS=... MISSING=... REL=... SORT=... PCH=... TYPE=... DUMP > variables FILE=... ;
period |
(Optional). Local period, for instance 2010 2020, 2010q1 2020q4 or %per1 %per2+1. |
ABS= |
(Optional). Absolute differences smaller than the value are not shown, for instance <abs = 150>. |
MISSING= |
(Optional). Choose m (default) or zero. If zero, any missing value (NaN, shown in PRT as M) is treated as if it were 0. The zero setting can be useful when comparing two databanks if you do not want a missing value in one databank and a 0 in the other to count as a difference. (If both databanks have a missing value, or both have a 0 value, this is never counted as a difference, regardless of the setting). |
REL= |
(Optional). Relative differences smaller than the value are not shown, for instance <rel = 0.01> equivalent to 1%. You may alternatively use PCH for the same purpose. |
SORT= |
(Optional). Choose between alpha (default), abs or rel. The first sorts alphabetically, the second sorts by absolute differences, and the last sorts by relative differences. The sorting and the use of ABS=, REL=, and PCH= are independent of each other. |
PCH= |
(Optional). Percentage differences smaller than the value are not shown, for instance <pch = 1.0> corresponding to 1%. You may alternatively use REL for the same purpose. |
TYPE= |
(Optional). Choose between type=normal (default) or type=hist (history). The two settings compute relative differences differently. If x is a timeseries from the first-position databank and @x is the corresponding timeseries from the reference databank, type=normal computes the relative difference as rel = abs(x - @x) / @x. With type=hist, the relative difference is instead computed as rel = abs(x - @x) / ((abs(@x - @x[-1]) + abs(@x[-1] - @x[-2])) / 2). The numerator is the same, but the denominator is the average of the absolute time-changes in the current period (abs(@x - @x[-1])) and the previous period (abs(@x[-1] - @x[-2])). So with type=hist, the variability of @x determines what counts as a "large" difference, and the denominator has some similarities with a standard deviation measure. The two methods will normally return different results for some variables. (The relative convergence check in SIM Gauss-Seidel uses a procedure very similar to type=hist). |
DUMP |
(Optional). If this option is set, a list #dif is constructed, containing the names of the timeseries that differ. |
variables |
A list of variable names. If no variables are given, the full databanks are compared. The names are separated by commas (like x, y, z), and a list #x of names should be used with {}-braces: {#x}. For array-series, you may either state the name of the array-series itself (x), in which case all sub-series are checked, or you may state individual elements (like x[a,k]). |
FILE= |
(Optional). Name of the output file (default: compare_databanks.txt). Filenames may contain an absolute path like c:\projects\gekko\bank.gbk, or a relative path \gekko\bank.gbk. Filenames containing blanks and special characters should be put inside quotes. Regarding reading of files, files in libraries can be referred to with colon (for instance lib1:bank.gbk), and "zip paths" are allowed too (for instance c:\projects\data.zip\bank.gbk). See more on filenames here. |
• If no period is given inside the <...> angle brackets, the global period is used (cf. TIME).
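The two relative-difference formulas given under TYPE= can be illustrated outside Gekko. The following Python sketch is not Gekko code and not Gekko's implementation: the function names and the data are hypothetical, and the formulas are copied from the TYPE= description. How Gekko treats a zero denominator is not described here, so the sketch does not handle that case.

```python
def rel_normal(x, ref, t):
    """type=normal: rel = abs(x - @x) / @x (formula from the TYPE= description)."""
    return abs(x[t] - ref[t]) / ref[t]


def rel_hist(x, ref, t):
    """type=hist: same numerator, but divided by the average absolute
    time-change of @x in the current and the previous period."""
    avg_change = (abs(ref[t] - ref[t - 1]) + abs(ref[t - 1] - ref[t - 2])) / 2
    return abs(x[t] - ref[t]) / avg_change


# Hypothetical data: positions 0, 1, 2 are three consecutive periods.
ref = [100.0, 102.0, 106.0]  # series from the reference databank (@x)
x = [100.0, 102.0, 107.0]    # series from the first-position databank (x)

normal = rel_normal(x, ref, 2)  # 1 / 106
hist = rel_hist(x, ref, 2)      # 1 / ((4 + 2) / 2) = 1 / 3
```

In this made-up case the same absolute difference of 1 is below 1% under type=normal, but about 33% under type=hist, because @x itself only changes by 2 to 4 units per period.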
Example
Compare all variables for the global period, or a given period:
compare; //global period |
Do the same, with a user-chosen filename:
compare <2010 2020> file=dif.txt; |
Sort the result by relative differences:
compare <sort=rel>; |
Only compare series names from the list #x:
#x = x1, x2, x3; |
Do not show relative differences smaller than 0.02 (that is, 2%):
compare <2010 2020 rel=0.02>; |
You may 'dump' a list #dif containing the names of the timeseries that are different:
compare <dump>; |
Array-series are supported, consider this example:
reset; |
The file compare_databanks.txt will contain the following output:
Comparing first-position and reference databanks |
At the right of each comparison, the value used for sorting is shown (max), and the largest differences are shown first. In this case, max = 0.50 means that the maximal percentage difference is 0.50% (in 2002) for the array-series xx[b,y].
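The idea of ranking series by their maximal percentage difference can be sketched in Python as follows. This is only an illustration, not Gekko code: the series names, the data values and the helper max_pch are all invented for the purpose, and the percentage difference is assumed to be 100 times the type=normal relative difference.

```python
def max_pch(x, ref):
    """Largest percentage difference between two series over the periods in ref."""
    return max(100.0 * abs(x[t] - ref[t]) / ref[t] for t in ref)


# Hypothetical databanks: series name -> {period: value}.
ref_bank = {
    "x1": {2001: 100.0, 2002: 100.0},
    "x2": {2001: 200.0, 2002: 200.0},
    "x3": {2001: 50.0, 2002: 50.0},
}
first_bank = {
    "x1": {2001: 100.0, 2002: 100.2},
    "x2": {2001: 200.0, 2002: 201.0},
    "x3": {2001: 50.0, 2002: 50.0},
}

# One (name, max) pair per series, largest difference first.
ranked = sorted(
    ((name, max_pch(first_bank[name], ref_bank[name])) for name in ref_bank),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, biggest in ranked:
    print(f"{name}  max = {biggest:.2f}")
```

Here x2 comes first because its largest deviation (1 out of 200, that is 0.50%) exceeds that of x1 (about 0.20%), while x3 is identical in both banks.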
Note
The local options <rel> and <pch> cannot be used at the same time.
If <abs> and <rel>/<pch> are used at the same time, only differences that exceed both the abs and the rel/pch criteria are shown.
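How the thresholds combine can be sketched in Python. This is an illustration under stated assumptions, not Gekko's implementation: the function names are hypothetical, the relative difference follows the type=normal formula, a pch value is taken to be 100 times the corresponding rel value (so pch = 1.0 matches rel = 0.01), and a difference is assumed to be shown only when it is not smaller than any of the limits that are set.

```python
def pch_to_rel(pch):
    """Convert a percentage threshold to a relative one (pch = 1.0 -> rel = 0.01)."""
    return pch / 100.0


def is_shown(x, ref, abs_limit=0.0, rel_limit=0.0):
    """Return True if the difference between x and ref would be reported.

    Assumption: differences smaller than a limit are hidden, and when both
    limits are set, the difference must pass both of them.
    """
    abs_dif = abs(x - ref)
    if abs_dif == 0:
        return False  # identical values are never a difference
    rel_dif = abs_dif / ref  # type=normal formula
    return abs_dif >= abs_limit and rel_dif >= rel_limit


# Hypothetical values: absolute difference 10, relative difference 0.01.
hidden_by_abs = is_shown(1010.0, 1000.0, abs_limit=150)
passes_rel = is_shown(1010.0, 1000.0, rel_limit=pch_to_rel(0.5))
hidden_by_rel = is_shown(1010.0, 1000.0, abs_limit=5, rel_limit=0.02)
passes_both = is_shown(1200.0, 1000.0, abs_limit=150, rel_limit=0.1)
```

In the third case the difference passes the abs limit but not the rel limit, so under the assumed reading it is not shown.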
compareFolders()
Related statements