Compilation of Stack-Based Languages, status April 1999

This project was supported by the FWF for two years starting in February 1996 (research project P11231). Since February 1998, the project has stalled, but I expect to continue it some time in the future.

Goal

The goal of the project Compilation of Stack-Based Languages is to advance the state of the art in compiling languages like Forth, Postscript and JavaVM to fast machine code. These languages have properties that make traditional compilation techniques badly suited for them:
Stack
items have to be allocated effectively to registers to get fast machine code.
Very fast compilation
is required, because of the interactive nature (Forth, Postscript), or because of the compile-and-run infrastructure (JavaVM just-in-time compilation).

Current Results

We currently have a working compiler from Forth to machine code, written in Forth: The main results up to now are:

Publications

Our final report [ertl&pirker98] is a short paper written for a computer science audience. For more specific information, read our EuroForth papers [ertl&pirker96, ertl&pirker97] (but they assume a Forth background). There is also a position paper that gives an overview of the ideas [ertl92]. Finally, my dissertation [ertl96diss] discusses RAFTS and other Forth implementation techniques.

Future Work

Interprocedural register allocation
We will probably first do a standard graph colouring register allocator. It may be too slow, so we also think about a fast register allocation method that exploits the nice liveness properties of stack items. While this method may not produce as good code as established graph colouring techniques, we hope that the improved compile time will be worth this price.
Code replication
The main purpose of code replication is to produce longer basic blocks, that allow our basic block code generation techniques to be more effective. Currently the plan is to combine a conservative full inliner with an aggressive partial inliner and a control flow join delaying optimization. This should be pretty easy with our new linear intermediate representation.
Tail-call optimization
turns a call followed by an exit into a jump. It should work even better in combination with partial inlining (because partial inlining turns some calls into tail-calls).

References

[bell73] James R. Bell.
Threaded code. Communications of the ACM, 16 no. 6 pp. 370-372, 1973.
[ertl92] M. Anton Ertl.
A new approach to Forth native code generation. In EuroForth '92, pages 73-78, Southampton, England, 1992. MicroProcessor Engineering.
[ertl96diss] M. Anton Ertl.
Implementation of Stack-Based Languages on Register Machines. Dissertation, Technische Universit\"at Wien, Austria, 1996.
[ertl&maierhofer95] M. Anton Ertl and Martin Maierhofer.
Translating Forth to efficient C. In EuroForth '95 Conference Proceedings, Schlo\ss Dagstuhl, Germany, 1995.
[ertl&pirker96] M. Anton Ertl and Christian Pirker.
RAFTS for basic blocks: A progress report on Forth native code compilation. In EuroForth '96 Conference Proceedings, St. Petersburg, Russia, 1996.
[ertl&pirker97] M. Anton Ertl and Christian Pirker.
The structure of a Forth native code compiler. In EuroForth '97 Conference Proceedings, pages 107-116, Oxford, 1997.
[ertl&pirker98] M. Anton Ertl and Christian Pirker.
Compilation of stack-based languages (abschlu\ssbericht). Final report to fwf for research project p11231, Institut f\"ur Computersprachen, Technische Universit\"at Wien, 1998.
[kogge82] Peter M. Kogge.
An architectural trail to threaded-code systems. Computer, pages 22-32, March 1982.

Anton Ertl