Sunday, December 19, 2010

Making Software Literate: The Parser, The Interpreter, and The Literal

In my last post, the title suggested "teaching code to read itself", but all I actually wrote about was the Interpreter.

For code to read itself, it must be able to generate a model from itself. To do that, you need the metamodel of the programming language the software is written in, and the grammar of that language. With those two ingredients and the proper tools, you can generate a model from the software.

An example of a comprehensive tool for doing this is Eclipse MoDisco. It comes with complete tooling for discovering / reflecting / introspecting a Java project.
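To make the idea concrete, here is a deliberately tiny "discoverer" sketch. It is not MoDisco's actual API; it only recognizes top-level class declarations with a regular expression, whereas a real tool parses with the full grammar and populates a proper metamodel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy discoverer: builds a minimal "model" (a list of class names)
// from Java source text. Real discovery uses the language's grammar
// and metamodel; this sketch only spots class declarations.
public class Discoverer {
    private static final Pattern CLASS_DECL =
        Pattern.compile("\\bclass\\s+(\\w+)");

    public static List<String> discoverClasses(String source) {
        List<String> model = new ArrayList<>();
        Matcher m = CLASS_DECL.matcher(source);
        while (m.find()) {
            model.add(m.group(1));
        }
        return model;
    }

    public static void main(String[] args) {
        String source = "public class Car { int wheels; }";
        System.out.println(discoverClasses(source)); // [Car]
    }
}
```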

However, for software to understand the model, it must also have an Interpreter/Evaluator, which can do something with the model.

In a way, a generator is a specialized kind of interpreter which simply outputs the concrete representation ("artifact") of the model being processed.
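A sketch of that idea, with a made-up model shape (class name plus field names): the "interpretation" of each model element is simply emitting its concrete textual form.

```java
import java.util.List;

// A generator is an interpreter whose only action is emitting the
// concrete representation ("artifact") of the model it walks.
public class Generator {
    // Hypothetical model: a class name and its field names
    // (all typed int for brevity).
    public static String generate(String className, List<String> fields) {
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(className).append(" {\n");
        for (String field : fields) {
            out.append("    int ").append(field).append(";\n");
        }
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(generate("Car", List.of("wheels", "doors")));
    }
}
```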

To not only read a model but also make changes to it, we need a Manipulator (what a scary name): a kind of interpreter that performs actions on a model. A sample action: delete an EClass node named 'Car'.
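Sketched with the same toy model as before (a list of class names rather than a real Ecore resource), the sample action looks like this:

```java
import java.util.ArrayList;
import java.util.List;

// A Manipulator is an interpreter whose actions mutate the model.
// The model here is just a list of class names, standing in for a
// real EMF resource; the sample action deletes the node named "Car".
public class Manipulator {
    public static void deleteClass(List<String> model, String name) {
        model.removeIf(name::equals);
    }

    public static void main(String[] args) {
        List<String> model = new ArrayList<>(List.of("Car", "Garage"));
        deleteClass(model, "Car");
        System.out.println(model); // [Garage]
    }
}
```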

After making changes, the resulting model can be generated back into file artifacts using a generator. The project can then be rebuilt; only the changed artifacts need to be regenerated.

To rebuild the project from scratch though, we need a complete set of project files.

A typical software project consists not only of a single language (hence metamodel) but of several other artifacts, including:
- build.xml (Ant build)
- plugin.xml, MANIFEST.MF (PDE project)
- pom.xml (Maven project)
- build.gradle (Gradle build)
- .project, .classpath (Eclipse JDT project)

Depending on requirements, it may not be necessary (or even desirable) to model all of those artifacts properly. Sometimes it's enough to model a file as a 'Literal':

File: EClass
name: EString
directory: EString
contents: EString

Which in practice means that these artifacts are not part of the model-driven lifecycle at all (i.e. you can actually ignore them and it won't even matter).

Model-driven development is all about transformation, or processing, or shapeshifting, or (meta)morphing. If an artifact or model stays the same throughout the lifecycle, and it's not being used as a metamodel for transforming its instances, then it's the same as if modeling weren't used at all.

When all project artifacts are understood, 'literalled', or generated, the project can be rebuilt from scratch using the model. With a good headless build system such as Maven or Gradle, this should be simple.

The other part to include is the information "in the programmer's head". We deal with this information every day, so it seldom occurs to us that it *is* information.

Things like:
- the directory location of the project
- project name
- location and name of the generated binaries
- SCM user, password, URL
- SCM tool
- test status (whether the tests pass, how many tests, how many fail, how many pass)

This information should be modeled, and a specialized interpreter can be created to act on the project.
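One possible shape for such a descriptor model, with hypothetical field names covering the list above:

```java
// Sketch of a descriptor model capturing the "in the programmer's head"
// information listed above. All field names here are made up.
public class ProjectDescriptor {
    public String projectDirectory;
    public String projectName;
    public String binaryLocation;
    public String scmTool;
    public String scmUrl;
    public String scmUser;
    public String scmPassword;
    public int testsTotal;
    public int testsFailed;

    public int testsPassed() {
        return testsTotal - testsFailed;
    }

    public boolean testsPass() {
        return testsFailed == 0;
    }

    public static void main(String[] args) {
        ProjectDescriptor d = new ProjectDescriptor();
        d.projectName = "prototype";
        d.testsTotal = 10;
        d.testsFailed = 0;
        System.out.println(d.testsPass()); // true
    }
}
```

A specialized interpreter could then read such a descriptor and, say, refuse to deploy the binaries unless testsPass() holds.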

A final behavior is 'replaceSelf', which requires the following information:
1. self source location
2. self binary location
3. self descriptor model
4. location of *this* self descriptor model
5. prototype source location
6. prototype binary location
7. prototype descriptor model

where 'prototype' is the project that we've discussed and built above.

The replaceSelf behavior, given the self descriptor model, will update/replace the running software using the prototype locations, and also update the self descriptor model (e.g. bump the version number).
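An in-memory sketch of that behavior (real code would copy files on disk and restart the process; here the "filesystem" and descriptor are plain maps, and every name is hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// In-memory sketch of 'replaceSelf': a "location" maps to its contents,
// and the self descriptor is a simple key/value model.
public class ReplaceSelf {
    public static void replaceSelf(Map<String, String> filesystem,
                                   String selfBinary, String prototypeBinary,
                                   Map<String, String> selfDescriptor,
                                   String newVersion) {
        // 1. Replace our own binary with the prototype's build output.
        filesystem.put(selfBinary, filesystem.get(prototypeBinary));
        // 2. Update the self descriptor model, e.g. bump the version.
        selfDescriptor.put("version", newVersion);
    }

    public static void main(String[] args) {
        Map<String, String> fs = new HashMap<>();
        fs.put("/opt/app/self.jar", "binary-v1");
        fs.put("/builds/proto/app.jar", "binary-v2");
        Map<String, String> descriptor = new HashMap<>();
        descriptor.put("version", "1.0");
        replaceSelf(fs, "/opt/app/self.jar", "/builds/proto/app.jar",
                    descriptor, "2.0");
        System.out.println(fs.get("/opt/app/self.jar")); // binary-v2
        System.out.println(descriptor.get("version"));   // 2.0
    }
}
```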

If the software runs continuously (as a server/daemon), it can then fork a new version of itself and terminate its own instance.
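A hedged sketch of that final hand-off, assuming the new version ships as a jar (the path is hypothetical); the launch command is built in a separate method so it can be inspected without actually forking:

```java
import java.util.List;

// Final hand-off for a long-running server: start the new binary,
// then exit the current process.
public class SelfRestart {
    public static List<String> restartCommand(String newBinary) {
        // Assumes the new version is a runnable jar.
        return List.of("java", "-jar", newBinary);
    }

    public static void forkAndTerminate(String newBinary) throws Exception {
        new ProcessBuilder(restartCommand(newBinary))
                .inheritIO()
                .start();   // fork the new version of ourselves...
        System.exit(0);     // ...then terminate this instance
    }

    public static void main(String[] args) {
        System.out.println(restartCommand("/opt/app/self.jar"));
        // [java, -jar, /opt/app/self.jar]
    }
}
```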

I guess now the lifecycle is complete. ;-)