CAL, XML, XSLT

Currently, we are using XML to represent the result of parsing CAL source files (the abstract syntax tree, AST). The benefit of this is that we can directly use a lot of infrastructure that has been developed in the context of XML, and which simplifies the task of implementing CAL. In particular, the XML community has developed a language (called XSLT) and tools for transforming XML documents into either other XML documents, or into text files.

Why use XML and XSLT?

Using XML to represent the abstract syntax tree (and intermediate results) of a CAL program is not without cost: XML is a verbose format, and it is restricted to tree-like data structures. DOM trees (the internal data structure often used to represent XML documents) consume a lot of memory, and XSLT transformations are likely slower than corresponding handwritten transformations on a custom data structure would be. So why use XML and XSLT?

The main reason is openness. XML, XSLT, XML Schema, and all the other tools and formats we use for manipulating CAL actors are open industry standards. This means that users have access to a wide selection of tools on practically every current and future platform for manipulating the data, or running other people's scripts, code generators, etc. It also means that there is a public process that updates these standards, so any data committed to these formats is unlikely to become inaccessible due to backward-incompatible changes.

Of course, in many cases, specialized tools with proprietary input and output formats provide useful functionality, and it would make little sense to re-implement them on top of XML. Because XSLT makes it easy to transform XML documents into other formats, actor descriptions, parts of them, or analysis results can be handed to such tools whenever that turns out to be useful.
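As a rough illustration of such an export, the following Python sketch flattens a made-up actor description into plain text lines of the kind an external tool might accept. The element and attribute names (Actor, Port, kind, name) and the output layout are assumptions made for this example, since the CalML format is still being defined. The sketch uses Python's standard xml.etree.ElementTree module rather than XSLT only to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical actor description; these element and attribute names are
# illustrative assumptions, not the actual CalML vocabulary.
SAMPLE = """<Actor name="Add">
  <Port kind="Input" name="A"/>
  <Port kind="Input" name="B"/>
  <Port kind="Output" name="Result"/>
</Actor>"""


def ports_to_text(xml_source):
    """Return one 'actor kind port' line per Port element."""
    actor = ET.fromstring(xml_source)
    lines = []
    for port in actor.iter("Port"):
        # The line layout is invented for this sketch.
        kind = port.get("kind").lower()
        lines.append(f"{actor.get('name')} {kind} {port.get('name')}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(ports_to_text(SAMPLE))
```

An XSLT stylesheet with a single template matching the port elements and the text output method could express the same export.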

More specific reasons for using XML are the following:

Persistence. XML provides a canonical format for storing data -- not only the AST, but also any intermediate result that transformations and annotations produce as DOM trees. These can therefore easily be communicated between parts of a program, even if those are written in different languages (see next point). In addition, persistence is effortless -- since existing XML parsers and serializers can be used, they need not be written by someone who wants to work on CAL.

Language independence. CAL processors can be written in any language that has an XML parser, which these days is essentially every popular programming language. In fact, different parts of the processing may be written in different languages, as long as they exchange their information in a common XML format. This ties into the next two points.

Accessibility, low barrier of entry. The ready availability of all data concerning an actor lowers the buy-in required to use CAL. If you are interested in working on CAL, you can use whatever language you like, on whatever platform, as long as you have some minimal XML infrastructure. XML parsers and XSLT transformation engines are widely available these days.

Longevity. Because XML is a standard format, it won't go away anytime soon. Any knowledge and work committed to it has a good chance of being available for a long time.

Terseness and simplicity. XSLT is a language specifically designed to express transformations on XML documents. As a result, these tend to be much simpler and a lot shorter than equivalent programs in a general-purpose language -- in spite of XSLT's rather verbose syntax! (An improved syntax for XSLT, called NiceXSL, has been designed by Ed Willink.) As these transformations represent a considerable part of the software we produce, making them easy to maintain and easy to understand is key to our work.
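To make the persistence and language-independence points above more concrete, here is a minimal Python sketch of two processing stages that communicate only through serialized XML. The tree fragment, the element names, and the "ports" annotation are all hypothetical, as the real CalML vocabulary has not been fixed. Only the standard xml.etree.ElementTree module is assumed.

```python
import xml.etree.ElementTree as ET

# Hypothetical AST fragment; the element names are assumptions made for
# this example and do not reflect the real CalML vocabulary.
AST_SOURCE = (
    '<Actor name="Add">'
    '<Port kind="Input" name="A"/>'
    '<Port kind="Output" name="Result"/>'
    '</Actor>'
)


def annotate_port_count(xml_text):
    """Stage 1: parse the tree, attach a made-up 'ports' annotation, serialize."""
    root = ET.fromstring(xml_text)
    root.set("ports", str(len(root.findall("Port"))))
    return ET.tostring(root, encoding="unicode")


def read_port_count(xml_text):
    """Stage 2: read the annotation back from the serialized text alone."""
    return int(ET.fromstring(xml_text).get("ports"))


if __name__ == "__main__":
    annotated = annotate_port_count(AST_SOURCE)
    print(read_port_count(annotated))
```

Because the second stage sees nothing but the serialized text, it could just as well be written in another language, or run later against a file saved to disk.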

CAL XML (CalML)

We are still defining the details of the XML representation of CAL (working title: CalML). Eventually, we will produce an XML Schema for CalML. For now, have a look at the actor library for examples of CAL and the corresponding CalML.

CalML to CalML transformations

<to be done>

CalML to text transformations

<to be done>
