Disclaimer

Copyright (C) 2005, 2006 Jacques de Hooge, Geatec Engineering

This manual and the software it belongs to are free. The manual attempts to describe what the (still incomplete) software should do eventually and ideally, rather than what it actually does or ever will do. In other words, it contains fiction, just like the books of Jules Verne. You can use, redistribute and/or modify manual and software, but only under the terms stated in the QQuickLicence.

Both manual and software are distributed in the hope that they will be useful, but WITHOUT ANY WARRANTY, without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the QQuickLicence for details.

What is Wave

Wave takes a set of computational algorithms, and turns them into a scriptable, GUI driven application. Typically each computational algorithm will be embodied by an executable that takes a number of command line arguments. Some of these arguments are parameters to the computation. Other arguments are names of data items. In many cases these data items will be just files. So for simplicity the term input and output files will be used throughout this document. Wave keeps an administration of lineage relations between all data in an XML file called the Data Map. It keeps track of which algorithms were used, what the input and output files were and which parameters were used for the computation.

Browsing relations between data and starting automatic recomputation

A user can use the Data Manager, which is part of Wave, to browse the information in the Data Map interactively. In this way the user can find out how the data are related, and which algorithms and parameters were used to compute them. The user can change parameters and compute alternatives. All dependent results will be adapted automatically in a so-called update wave.

Script recording and playback

Wave contains a Script Recorder. In recording mode, each computation started by the user will also result in the emission of a corresponding script statement in the programming language Python. When a script is played back, the result will be exactly the same as with interactive operation. This includes updating the Data Map to reflect new relations between data as a result of running the script. After running a script, the information from the Data Map may again be browsed interactively, parameters changed and update waves started.

Script programming

Each script is a set of well-formed Python function calls passing named parameters. So scripts are in plain Python, rather than having a command-line-like syntax. Scripts may be edited or written by hand to feature loops and other flow control constructs. This enables working through large sets of parameter combinations unattended. Scripts run in the background.
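As an illustration of what such a script might look like, here is a minimal sketch. The function name smooth, its parameters and the file names are all hypothetical; in Wave the function would be supplied by a Processor, while here a stub only records the calls so that the sketch runs on its own.

```python
# Hypothetical stand-in for a Processor entry point. In Wave this function
# would come from the Processor, it would not be defined in the script.
calls = []


def smooth(inputFile, outputFile, windowSize):
    # Only record the call, so the sketch runs without any executable.
    calls.append((inputFile, outputFile, windowSize))


# A recorded script would contain one such call per computation.
smooth(inputFile='raw.dat', outputFile='smooth_3.dat', windowSize=3)

# A hand-edited script can sweep a range of parameter values instead.
for windowSize in (5, 7, 9):
    smooth(
        inputFile='raw.dat',
        outputFile='smooth_%d.dat' % windowSize,
        windowSize=windowSize,
    )
```

Because every call passes named parameters, a recorded statement can be wrapped in a loop without reordering its arguments.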

Processors

To connect an algorithm to Wave, the algorithm must be encapsulated in a Processor. A Processor is a small piece of Python code that will
  1. Provide a GUI for entering parameters and names of input and output data, the latter typically by drag and drop from the Data Manager. The GUI of each Processor is built up using the Eden library that, just like Wave, is part of QQuick. In fact the whole of Wave is written in Eden. GUIs of processors are usually simple, just showing some parameters, input and output names. However, if needed, they may feature modal and modeless dialogs, tree controls, local menus and so on.
  2. Generate the appropriate script statement, each time the computation underlying that processor is done with a particular set of parameters.
  3. Provide an entry point for executing that same script statement, once the script is run.
  4. Spawn the executable that contains the algorithm for that processor, and monitor progress of the computation at hand.
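Tasks 2 to 4 can be pictured with the following rough sketch. It is not the actual Wave or Eden API: the class SketchProcessor, its method names and the way arguments are passed to the executable are assumptions made for illustration, and the GUI of task 1 is left out entirely. The Python interpreter itself stands in for the executable that contains the algorithm.

```python
import subprocess
import sys


class SketchProcessor:
    """Hypothetical Processor skeleton, not the actual Wave base class."""

    name = 'smooth'  # hypothetical name of the script function

    def __init__(self, command):
        self.command = command  # argument list that starts the executable

    def scriptStatement(self, **parameters):
        # Task 2: emit a Python call that passes named parameters.
        arguments = ', '.join(
            '%s=%r' % (key, parameters[key]) for key in sorted(parameters)
        )
        return '%s(%s)' % (self.name, arguments)

    def run(self, **parameters):
        # Tasks 3 and 4: entry point that spawns the executable, waits for
        # it to finish, and returns its exit code and captured output.
        arguments = [
            '%s=%s' % (key, parameters[key]) for key in sorted(parameters)
        ]
        completed = subprocess.run(
            self.command + arguments, capture_output=True, text=True
        )
        return completed.returncode, completed.stdout.strip()


# Stand-in executable: a Python one-liner that echoes its arguments.
echo = [sys.executable, '-c', 'import sys; print(" ".join(sys.argv[1:]))']
processor = SketchProcessor(echo)

statement = processor.scriptStatement(inputFile='raw.dat', windowSize=3)
returnCode, output = processor.run(inputFile='raw.dat', windowSize=3)
```

A real Processor would monitor progress while the executable runs, rather than simply waiting for it to finish.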

Full runs, trial runs and dry runs

There are three ways to start a computation in Wave.
  1. A full run will perform the complete computation on non-decimated data. After starting a full run the user waits for the run to complete, before starting the next computational step.
  2. A trial run will either perform a reduced accuracy computation or use decimated data. This enables a fast check of the effect of a particular parameter combination. Meanwhile the appropriate relations between data are already laid down. Simply switching to full run mode and starting an update wave will redo the whole sequence of computations with full accuracy on all data, unattended, in the foreground. Alternatively the script that was generated during the trial runs can be executed in the background.
  3. A dry run will not perform any computation, but it will lay down relations between data and generate a script if the script recorder is running. So after a series of dry runs, computations may be done unattended, either by starting an update wave in the foreground, or by executing the generated script in the background.
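The difference between the three kinds of run can be sketched as follows. The function runSketch, the decimation factor and the mean computation are assumptions chosen for illustration only; what matters is that relations are recorded at every level, while the amount of computation differs.

```python
FULL, TRIAL, DRY = 'Full', 'Trial', 'Dry'


def runSketch(data, updateLevel, relations, decimation=10):
    """Hypothetical run: compute a mean, depending on the update level."""
    # Relations between data are laid down at every level.
    relations.append(('raw', 'mean'))

    if updateLevel == DRY:
        return None  # no computation at all

    if updateLevel == TRIAL:
        data = data[::decimation]  # use decimated data for a quick check

    return sum(data) / len(data)


relations = []
samples = list(range(100))

dryResult = runSketch(samples, DRY, relations)
trialResult = runSketch(samples, TRIAL, relations)
fullResult = runSketch(samples, FULL, relations)
```

Here the trial run gives a quick, approximate answer, and the full run gives the exact one from the same recorded relations.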

Safe concurrency

Data that will be changed by a script is locked in advance. So while a lengthy script is running, a user can work interactively. Data touched by a user is also locked. This means that multiple users can work on the same data simultaneously while multiple scripts are running, without the risk of destroying each other's work.
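One way to picture locking in advance is an all-or-nothing lock table, sketched below. The class LockTable and its methods are hypothetical and say nothing about how Wave stores its locks. The sketch assumes that a script claims all items it will change before it starts, and is refused if any of them is held by someone else.

```python
class LockTable:
    """Hypothetical lock table: item name -> owner (a user or a script)."""

    def __init__(self):
        self.owners = {}

    def lockAll(self, owner, itemNames):
        # Refuse the whole request if any item is held by another owner.
        conflicts = [
            name for name in itemNames
            if self.owners.get(name, owner) != owner
        ]
        if conflicts:
            return False
        for name in itemNames:
            self.owners[name] = owner
        return True

    def release(self, owner):
        # Free every item held by this owner.
        for name in [n for n, o in self.owners.items() if o == owner]:
            del self.owners[name]


locks = LockTable()
scriptLocked = locks.lockAll('script 1', ['a.dat', 'b.dat'])
userBlocked = locks.lockAll('user', ['b.dat', 'c.dat'])  # b.dat is taken
userFree = locks.lockAll('user', ['c.dat'])              # untouched data
locks.release('script 1')
userLater = locks.lockAll('user', ['b.dat'])             # now available
```

Claiming everything up front means a script can never get halfway and then find that data it needs is held by a user.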

Viewing data graphically

Apart from computational algorithms, viewers can also be connected to Wave. The demo that goes with Wave shows how to use GnuPlot for this purpose.

Internals

What happens when a user starts an interactive Run

A user can start an interactive Run by selecting Full Run, Trial Run or Dry Run from the menu of a Processor. This will cause a Run Transactor Function (RTF) to be constructed from function Processor.wave, by filling in Full, Trial or Dry for its updateLevel parameter. The RTF has one free parameter left, a reference to the DataMap. The RTF is pushed into application.repertoire.runTransactorNode by calling its follow member function. Note that the value of runTransactorNode is a piece of code, rather than a piece of data.
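Filling in all parameters but one is what functools.partial does in plain Python, so the construction of an RTF can be sketched as below. The function wave shown here is a hypothetical stand-in with an assumed signature, and the DataMap is modelled as a plain dictionary; neither reflects the real Wave code.

```python
from functools import partial


def wave(processorName, updateLevel, dataMap):
    """Hypothetical stand-in: return a new DataMap that records the run."""
    newDataMap = dict(dataMap)  # leave the old DataMap untouched
    newDataMap[processorName] = updateLevel
    return newDataMap


# Fill in everything except the DataMap. What remains is a function of
# one free parameter, which plays the role of the RTF.
runTransactor = partial(wave, 'smooth', 'Trial')

# Later, whoever holds the DataMap supplies the last parameter.
oldDataMap = {'load': 'Full'}
newDataMap = runTransactor(oldDataMap)
```

The value handed over is thus a callable, which matches the remark that runTransactorNode holds code rather than data.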

Node application.dataMapNode depends on application.repertoire.runTransactorNode, so it will recompute its value by calling its getter function transactDataMap. By use of touch testing, transactDataMap finds out that it was activated by runTransactorNode. Hence transactDataMap will call the RTF contained in runTransactorNode.new to compute the new DataMap. Since the RTF uses BaseRepertoire.wave to obtain an updated version of the DataMap, it will at the same time run the appropriate computations, if the updateLevel passed to BaseRepertoire.wave by the RTF is Full or Trial. If the ScriptRecorder is running, this will also lead to insertion of one function call per computation into the script.

Conceptually the transaction started by the user does not only work upon the DataMap in a limited sense. It works on the computational state as a whole, even including the generated script. Saying that application.dataMapNode refers only to the DataMap is hiding the bulk of the iceberg. Just as dataMapNode only contains a reference to the DataMap, the DataMap in turn only holds references to the computational data. DataMap, computational data and script together add up to the state that is altered by the RTF.

The RTF delegates the updating of the state to BaseRepertoire.wave; it just fills in the appropriate updateLevel parameter. BaseRepertoire.wave will go through the following steps:

  1. Perform an insulatedRun on the particular Processor from whose menu the user started the Wave. This will outdate all dependent Items and subsequently update only the direct outputs of that Processor.
  2. Build up, once, a list of Runs that have to be redone: pendingRunKeys.
  3. Repeatedly call run.processor.insulatedRun for a Run referred to by that list for which all inputs are up to date, until there are no pendingRunKeys left.
If the script recorder is running, each call to insulatedRun will generate a script statement as well.
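The three steps can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the code of BaseRepertoire.wave: Runs are represented as a dictionary mapping a run key to its input and output item names, up to date items as a set, and insulatedRun merely marks outputs as up to date. Only the name pendingRunKeys is taken from the text.

```python
def waveSketch(runs, startKey, upToDate):
    """runs: key -> (inputs, outputs); upToDate: set of item names."""
    order = []

    def insulatedRun(key):
        # Stand-in for the real computation: mark direct outputs up to date.
        upToDate.update(runs[key][1])
        order.append(key)

    # Step 1: outdate everything that depends on the start run's outputs,
    # then update only the direct outputs of the start run.
    outdated = set()
    frontier = list(runs[startKey][1])
    while frontier:
        item = frontier.pop()
        if item in outdated:
            continue
        outdated.add(item)
        for inputs, outputs in runs.values():
            if item in inputs:
                frontier.extend(outputs)
    upToDate.difference_update(outdated)
    insulatedRun(startKey)

    # Step 2: build the list of runs to redo, once.
    pendingRunKeys = [
        key for key, (inputs, outputs) in runs.items()
        if key != startKey and outdated.intersection(outputs)
    ]

    # Step 3: keep running a pending run whose inputs are all up to date.
    while pendingRunKeys:
        ready = [
            key for key in pendingRunKeys
            if all(item in upToDate for item in runs[key][0])
        ]
        if not ready:
            raise RuntimeError('no pending run has up to date inputs')
        insulatedRun(ready[0])
        pendingRunKeys.remove(ready[0])

    return order


runs = {
    'a': (['raw'], ['x']),
    'c': (['x', 'y'], ['z']),
    'b': (['x'], ['y']),
}
order = waveSketch(runs, 'a', {'raw', 'x', 'y', 'z'})
```

In this example run c is listed before run b, yet b is executed first because c has to wait until y is up to date.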

What happens if a script is executed

While running a script, the DataMap is not needed. The order of computations is determined explicitly by the script, rather than having to find out which runs have up to date inputs. This fact can be used to keep the DataMap locked only briefly, even if the script itself takes a long time to run.

At the beginning of script execution, the DataMap is locked. Performing the script as a sequence of Dry Runs will correctly update the topology of the DataMap, even though the items touched are not yet up to date. Each Run involved in the script is marked with the one and only unique waveNr of this preliminary sequence of Dry Runs to lay down that these runs are locked by the script. (Is this also done by the executor and a new interpreter instance???)

The script, together with a stripped-down version of Wave called the Executor, is fed to a new (or pooled) instance of the Python interpreter, which will execute it in parallel with interactive operation and with other scripts that may be running.
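Handing a script to a fresh interpreter instance can be sketched with the standard subprocess module, as below. The Executor itself is not modelled, interpreter pooling is left out, and the helper startScript is hypothetical; the sketch only shows that the caller continues immediately while the script runs in a separate process.

```python
import os
import subprocess
import sys
import tempfile


def startScript(scriptText):
    """Hypothetical helper: run scriptText in a new Python interpreter."""
    with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as f:
        f.write(scriptText)
    # Popen returns at once, so interactive work can go on meanwhile.
    process = subprocess.Popen(
        [sys.executable, f.name], stdout=subprocess.PIPE, text=True
    )
    return process, f.name


process, scriptPath = startScript("print('wave done')")

# ... interactive operation would continue here ...

output, _ = process.communicate()  # wait for the script to finish
os.remove(scriptPath)
```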