1.1 Using Termite

Depending on the desired architecture, there are several ways to integrate Termite into the work flow of a larger tool. For a flexible recombination of several analyses and/or transformations, it is best to treat Termite programs as interpreted scripts that read AST terms from standard input and write them to standard output. If performance and stability are the priority, it is also possible to call Termite programs transparently from a SATIrE analyzer.

1.1.1 Using Termite for a standalone process

The most flexible and convenient way to work with the Termite library is by using it to define filter operations on streams of source code. This way one can follow the UNIX tradition of having a collection of small self-contained programs that can be combined to create larger work flows. Depending on the expected input and generated output, several types of Termite programs can be distinguished. Typical examples are:
A source-to-source transformer
    is a program that reads in an AST, performs some transformation, and outputs the transformed AST. (Example: loop unrolling)
An analyzer
    is a program that reads in an AST, performs some analysis, and outputs the analysis result as attributes of the AST. (Example: loop bound analysis)
A visualization
    is a program that reads in an AST and outputs a visualization, e.g., in a GUI window or a PostScript file. (Example: call graph to Graphviz (DOT) output)
A source generator
    is a program that reads in a specification and outputs an AST in Termite format.
A compiler
    is a program that translates an AST into a different language, e.g., melmac (http://www.complang.tuwien.ac.at/gergo/melmac/) or wcetcc.

In order to generate a Termite term from one or more source files, a compiler front end must be invoked. Two possibilities are supported and available in the SATIrE distribution:

1.1.1.1 EDG C/C++ front end from the ROSE compiler

If SATIrE was configured with the ROSE connection enabled (ROSE must be installed separately beforehand and is available at http://www.rosecompiler.org), conversion tools are available to translate source code to term files and vice versa. To translate source code into a Termite term, the program c2term is available:

> c2term

Usage: c2term [FRONTEND OPTIONS] [--dot] [--pdf] src1.c src2.cpp ... [-o termfile.pl]
  Parse one or more source files and convert them into a TERMITE file.
  Header files will be included in the term representation.

Options:
  [FRONTENT OPTIONS] will be passed to the C/C++ frontend.

  --rose-help
    Display the help for the C/C++ frontend.

  -o, --output <termfile.pl>
    Write the output to <termifile.pl> instead of stdout.

  --dot
    Create a dotty graph of the syntax tree.

  --pdf
    Create a PDF printout of the syntax tree.

This program was built against SATIrE 0.8.6-rc4,
please report bugs to <adrian@complang.tuwien.ac.at>.

The c2term program invokes the commercial EDG C++ front end embedded in the ROSE compiler to parse one or more source files. The abstract syntax tree (AST) is then translated into the ROSE intermediate representation, which in turn is converted into the textual term serialization. Any additional options are passed on to the EDG front end.

The opposite direction is managed by the term2c conversion utility. It works by reading in a term file and then rebuilding the ROSE intermediate representation. Finally, this data structure is passed to the ROSE unparser. The EDG front end is not involved in this step any more.

> term2c

Usage: term2c [OPTION]... [FILE.term]
Unparse a term file to its original source representation.

Options:
  -o, --output sourcefile.c
    If specified, the contents of all files will be concatenated
    into the sourcefile.

  -s, --suffix '.suffix'  Default: '.unparsed'
    Use the original file names with the additional suffix.

  -d, --dir DIRECTORY
    Create the unparsed files in DIRECTORY.

  --dot
    Create a dotty graph of the syntax tree.

  --pdf
    Create a PDF printout of the syntax tree.

This program was built against SATIrE 0.8.6-rc4,
please report bugs to <adrian@complang.tuwien.ac.at>.
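
The round trip through both converters can be scripted. The following is a minimal sketch, not part of SATIrE: the function name roundtrip and the naming of the intermediate term file are assumptions made for illustration, and only the -o and -d options listed in the usage messages above are used.

```shell
#!/bin/sh
# roundtrip SRC OUTDIR  (hypothetical helper, not shipped with SATIrE)
# Translate SRC into a Termite term file and unparse it again into OUTDIR.
roundtrip() {
    src=$1
    outdir=$2
    # Assumed naming scheme for the intermediate term file.
    termfile="$outdir/$(basename "$src").pl"

    mkdir -p "$outdir" || return 1
    # Front end: write the term to a file instead of stdout.
    c2term "$src" -o "$termfile" || return 1
    # Unparser: create the unparsed files in OUTDIR.
    term2c -d "$outdir" "$termfile"
}

# Example call (requires c2term and term2c on the PATH):
#   roundtrip a.c out
```

Keeping the term file on disk makes it possible to inspect the intermediate representation when a transformation misbehaves.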

Since both converters use standard input and output by default, multiple Termite programs can be chained with UNIX pipes. This makes it possible to build new chains of program transformations or analyzers on the fly without having to recompile the whole project.

Example:

c2term a.c b.c | ./transform1.pl | term2c -s '.transformed'

In this example pipeline, two C source files are joined into one project, which is dumped to a stream in the Termite format. The stream is then transformed by a Prolog program. Finally, the two source files are unparsed by the term2c converter with the new suffix '.transformed' attached to the file names.
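
Longer chains can be assembled in the same way. As a sketch, the helper below pipes its standard input through an arbitrary list of filter programs in order. The name termite_chain is hypothetical and not part of SATIrE, and transform2.pl in the usage comment is a made-up second filter.

```shell
#!/bin/sh
# termite_chain FILTER...  (hypothetical helper, not shipped with SATIrE)
# Pipe stdin through each FILTER in the given order and write the result
# to stdout. Each FILTER is a command that reads a term from stdin and
# writes a term to stdout, such as a Termite script.
termite_chain() {
    if [ "$#" -eq 0 ]; then
        cat             # no filters left: copy the stream unchanged
        return
    fi
    filter=$1
    shift
    "$filter" | termite_chain "$@"
}

# Example call (requires c2term, term2c and the two scripts):
#   c2term a.c b.c | termite_chain ./transform1.pl ./transform2.pl \
#       | term2c -s '.transformed'
```

Because the list of filters is an ordinary argument list, the order of transformations can be changed from the command line without touching any of the programs involved.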

1.1.1.2 Using the Clang C/Objective C front end

While the commercial EDG front end offers a high-quality C++ parser, license restrictions encumber its free distribution together with other tools. Most notably, the ROSE compiler redistributes only a 32-bit precompiled binary version of the EDG front end. It is, however, possible to buy other licenses from the Edison Design Group.

If C++ support is not needed, there is a free alternative available from the LLVM compiler project. Designed especially for use with LLVM, a front end for C-like languages called clang is published under a BSD-style license. The clang front end can be downloaded at http://clang.llvm.org/. The front end is written in C++ and creates an intermediate representation very similar to that of ROSE, which makes it a good candidate to replace the EDG front end in SATIrE. The C99 and Objective-C languages are supported very well by clang, whereas C++ support is still under development.

In order to connect SATIrE with the clang front end, we decided to take the route via the Termite representation. This way, the front end is cleanly decoupled from the rest of the system and uses the Termite terms as a stable interface. The Termite term generator is implemented as a pass over the clang intermediate representation and is available via the -emit-term command line option. The term generator is not integrated with upstream clang, but distributed as a patch against a current SVN version together with SATIrE.

To build the clang front end for use with SATIrE, a special make clang target is available at the top level. It fetches the needed version of clang from the subversion repository, applies the patch, and compiles and installs the patched front end to $prefix/bin.
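
The build step can be wrapped in a small script. This is a sketch under stated assumptions: the helper name build_satire_clang is hypothetical, its first argument is assumed to be the top-level SATIrE source directory, its second argument the installation prefix SATIrE was configured with, and the installed executable is assumed to be named clang.

```shell
#!/bin/sh
# build_satire_clang SRCDIR PREFIX  (hypothetical helper)
# Run the top-level "make clang" target in SRCDIR and check that an
# executable named clang (assumed name) ended up in PREFIX/bin.
build_satire_clang() {
    srcdir=$1
    prefix=$2

    # The target fetches, patches, compiles and installs the front end.
    ( cd "$srcdir" && make clang ) || return 1

    if [ -x "$prefix/bin/clang" ]; then
        echo "patched clang installed: $prefix/bin/clang"
    else
        echo "clang not found in $prefix/bin" >&2
        return 1
    fi
}

# Example call (paths are placeholders):
#   build_satire_clang "$HOME/src/satire" /usr/local
```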

1.1.1.3 Unparsing Termite terms without SATIrE

Invoking the term2c program is sometimes too cumbersome, for example, when only a few expressions should be unparsed for debugging purposes. For these occasions an independent term->C converter is implemented in pure Prolog and available both in the Termite library and as a stand-alone script. The predicate is called unparse/1 and expects a Termite term as argument.

1.1.2 As part of a SATIrE analyzer

If execution speed is an issue, the steps of writing the Termite representation to disk (or a pipe) and parsing the terms (which, when output as text, are significantly larger than the original source files) can be eliminated. If SATIrE was configured with SWI-Prolog support enabled, the term representation will be built in memory using the external interface of an embedded SWI-Prolog interpreter. Using this in-memory term, a Termite program can be executed without leaving the current process. The resulting term can again be translated to the ROSE intermediate representation directly from memory using the SWI-Prolog interface.

Using this work flow, the whole analyzer (or transformer, ...) can be distributed as a single self-contained executable.