Rewriting a software system

Gekko 3.0 is due to be officially released in the autumn of 2018. Version 3.0 entails a rewrite of large parts of Gekko, and the reader may well ask why this is really necessary.

As explained in this blog post, Gekko started out (in 2008) just interpreting characters and strings, and later on tokens (‘words’). In 2010, a real parser (ANTLR) was introduced, where the parser spits out a tree structure that is converted into runnable C# code. In 2014, when Gekko 2.0 was introduced, both the parser and the converter were rewritten from scratch, mostly because Gekko 2.0 introduced a lot of syntax changes. Gekko 3.0 also introduces some syntax changes, but not that many; it is more a question of new functionality. Still, both the parser and the converter have been rewritten again for Gekko 3.0. Why?

In principle, the new functionality of Gekko 3.0 could have been introduced using the ‘old’ parser/converter of Gekko 2.0. But as mentioned in this blog post, a rewrite was deemed essential for the long-term well-being of Gekko. Continuing to build upon the engine of 2.0 would have made the source code a mess in the longer run, with work-arounds and exceptions to general rules (and exceptions to these exceptions).

Rewriting the parser and converter from the ground up is a major hassle, and as a software system develops, doing so becomes more and more difficult and time-consuming. Also, if the syntax changes, the users will have to rewrite code, too. This partly explains why some computer languages somehow seem frozen in time. An example could be R, which has a fantastic array of user-supplied libraries, but where some of the syntax and structures seem pretty strange and arcane at first sight. R has a lot of advantages, and a lot of mature libraries for statistics and plotting, but in the very long run a language like Python may win out simply because of its clean syntax and generally “easy” and modern feel.

It is probably the case that the longer a software system lives on, the harder it gets to rewrite its core parts, including the syntax, not least because the users then have to rewrite their code, too. Rewriting the core parts of Gekko for version 3.0 had a bit of a suicidal feel to it at the beginning, but there was a simultaneous feeling that this rewrite would be important, and perhaps the last chance before the users start to build really large systems with user-defined functions etc.

So is this the last major rewrite of the inner parts of Gekko? Regarding the core syntax, it may be. It is my belief that the syntax is now (as of version 3.0) really logical and consistent, and has become, in a sense, the best version of itself. Gekko 3.0 deals smoothly with timeseries of differing frequencies, handles the time dimension and sample settings equally smoothly, uses a very convenient databank scheme, and handles multidimensional timeseries as a built-in feature.

So how is version 3.0 doing? Since the winter of 2017/18, some adventurous people have been using it, and in that sense it is alive and doing well. But in order to release it for general use, a lot of test cases from version 2.2/2.4 need to be checked/fixed in 3.0, and this is mainly what is going on at the moment. The hard parts of 3.0 have been done, but a lot of details (and a translator from 2.2/2.4 to 3.0) still need to be finished.

The reward? Consistent syntax and data structures, and a healthy foundation for the further development of Gekko (instead of a syntax with ambiguities and gotchas).
