Parag Patel writes:
We (CodeGen, Inc.) sell a C-to-Fcode compiler. Well, it actually generates IEEE-1275 Forth that then must be run through a tokenizer. Really, it generates pretty ugly Forth code. It's easy to generate lousy Forth, but it's very difficult to generate nice clean optimized Forth. C and stack-based languages don't mix too well. I end up faking a C variable stack-frame using a Forth $frame variable for local vars.
Stephen Pelc writes:
MPE has produced a C to stack-machine compiler. This generates tokens for a 2-stack virtual machine. The code quality is such that the token space used by compiled programs is better than that of the commercial C compilers we have tested against. This is a consequence of the virtual machine design. However, to achieve this the virtual machine design has local variable support. The tokens can then be back end interpreted, or translated to a Forth system. The translator can be written in high level Forth, and is largely portable, except for the target architecture sections.
These are not shareware tools, and were written to support a portable binary system.
An unsupported prototype Forth-to-C compiler is available at http://www.complang.tuwien.ac.at/forth/forth2c.tar.gz. It is described in the EuroForth'95 paper http://www.complang.tuwien.ac.at/papers/ertl&maierhofer95.ps.gz. Another Forth-to-C compiler is supplied with Rob Chapman's Timbre system.
Many packages for data structuring facilities like Pascal's RECORDs and C's structs have been posted, e.g., the structures of the Forth Scientific Library (http://www.taygeta.com/fsl/fsl_structs.html) or the structures supplied with Gforth (http://www.complang.tuwien.ac.at/forth/struct.fs).
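To give a flavour of how such packages work, here is a minimal sketch in standard Forth. It is not the interface of the Forth Scientific Library or of the Gforth package; the names field, p.x, p.y, /point and pt are hypothetical and chosen only for illustration. Each field word simply adds its offset to a base address.

```forth
\ Hypothetical field-defining word: remembers the current offset,
\ then advances the running offset by the field size.
: field ( offset size "name" -- offset' )
  create over , +
  does> ( addr -- addr' ) @ + ;

0                       \ running offset starts at 0
  1 cells field p.x
  1 cells field p.y
constant /point         \ total size of the structure

create pt /point allot  \ one instance
3 pt p.x !  4 pt p.y !
pt p.x @ pt p.y @ + .   \ prints 7
```

The real packages add alignment handling and nicer syntax on top of the same idea.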
Some people find the way THEN is used in Forth unnatural, others do not.
According to Webster's New Encyclopedic Dictionary, "then" (adv.) has the following meanings:
... 2b: following next after in order ... 3d: as a necessary consequence (if you were there, then you saw them).
Forth's THEN has the meaning 2b, whereas THEN in Pascal and other programming languages has the meaning 3d.
If you don't like to use THEN in this way, you can easily define ENDIF as a replacement:
: ENDIF POSTPONE THEN ; IMMEDIATE
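A short usage sketch (my-abs is a hypothetical name; the ENDIF definition is repeated so that the example is self-contained). The code after ENDIF is what follows next in order, whether or not the IF part was executed:

```forth
: ENDIF  POSTPONE THEN ; IMMEDIATE

\ Negate only if negative; execution continues after ENDIF either way.
: my-abs ( n -- u )  dup 0< IF negate ENDIF ;

-5 my-abs .   \ prints 5
```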
Threaded code is a way of implementing virtual machine interpreters. You can find a more in-depth explanation at http://www.complang.tuwien.ac.at/forth/threaded-code.html.
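As a rough illustration of the idea (not an excerpt from the page linked above): in threaded code, the virtual machine code is a sequence of addresses of routines, and the interpreter fetches and runs each one in turn. The sketch below mimics this in standard Forth with a table of execution tokens. The names run-thread and thread are hypothetical, and a real implementation performs this dispatch in machine code rather than with EXECUTE.

```forth
\ Execute n execution tokens stored in consecutive cells starting at addr.
: run-thread ( i*x addr n -- j*x )
  cells over + swap ?do
    i @ execute
  1 cells +loop ;

\ A two-element "thread" that squares the top of the stack.
create thread  ' dup ,  ' * ,

5 thread 2 run-thread .   \ prints 25
```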
Paul Curtis writes:
The JVM, although a stack machine, can't really be used to compile Forth efficiently. Why? Well, there are a number of reasons:
That said, it is possible to write something Forth-like using JVM bytecodes, but you can't use the JVM stack to implement the Forth stack. ...
If you're serious, try getting Jasmin and programming directly on the JVM.
Some of the non-trivial pieces in translating JavaVM to Forth, that we have identified, are:
Postscript is similar to Forth in having a data stack, being interactive, and supporting wordlists. Postscript differs from Forth in using run-time name binding, run-time typing for type-checking and overloading resolution, implementing control structures through words that take anonymous definitions as parameters, in terminology (I have used Forth terminology here), and in other respects.
Concerning the question of whether Forth influenced Postscript, the Postscript manual (first edition) claims that Postscript and its predecessors were conceived and developed independently of Forth. However, according to John Warnock, Postscript's "syntax looks a little bit like Forth, because it is derived from Forth". Jim Bowery's Genesis of Postscript mentions Forth.
A Forth system running on the bare hardware is also known as a native system (in contrast to a hosted system, which runs on an OS). Don't confuse this with native-code systems (which means that the system compiles Forth code to machine code); hosted native-code systems exist as well as native threaded-code systems.
In the beginning Forth systems were native and performed the functions of an OS (from talking to hardware to multi-user multi-tasking). On embedded controllers Forth systems are usually still native. For servers and desktops most Forth systems nowadays are hosted, because this avoids the necessity to write drivers for the wide variety of hardware available for these systems, and because it makes it easier for the user to use both Forth and his favourite other software on the host OS. A notable exception to this trend are the native systems from Athena.
Native Forth systems can be seen as OSs written in Forth, so it is certainly possible. Several times projects to write an OS in Forth were proposed. Other posters mentioned the following reasons why they do not participate in such a project:
If you want to write an OS in Forth for desktop or server systems, the problems are the same as for native Forth systems (and any other effort to write a new OS): the need to write drivers for a wide variety of hardware, and few applications running on the OS.
To get around the application problem, some posters have suggested writing an OS that is API or even ABI compatible with an existing OS like Linux. If the purpose of the project is to provide an exercise, the resulting amount of work seems excessively large; if the purpose is to get an OS, this variant would be pretty pointless, as there is already the other OS. And if the purpose is to show off Forth (e.g., by having smaller code size), there are easier projects for that, the compatibility requirement eliminates some of the potential advantages, and not that many people care about the code size of an OS kernel enough to be impressed.
A tethered Forth system is a cross-development environment where the host and the target are connected at run-time (during development), allowing full interactive use of the target system without requiring all the space that a full-blown Forth system would require on the target. E.g., the headers can be kept completely in the host. Tethered systems may also provide the compilation speed and some of the conveniences of a full-blown Forth system on the host.
Tethered systems are also called umbilical systems.
Such ideas have been proposed several times, to allow using control structures interpretively, among other benefits. They have also been implemented in some systems (e.g., Christophe Lavarenne's Free-Forth). In most proposals a line would be compiled and then executed.
However, such systems behave quite differently from ordinary Forth systems in some respects, in particular when dealing with parsing words. E.g., consider:
' + . : my-' ' ; my-' + .
In classical Forth ' parses + in both cases. This behaviour is hard to achieve in a compile-then-execute Forth system, unless it works a word at a time, but then it would have none of the benefits, either.
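For the common case of wanting a control structure at the interpreter prompt, a system without such a feature can wrap the code in an anonymous definition and execute it at once. A minimal sketch, assuming the system provides :NONAME from the Core extension wordset:

```forth
\ Sum the loop indices 0..4 without defining a named word.
:noname ( -- sum )  0  5 0 do i + loop ; execute .   \ prints 10
```

Parsing words inside the anonymous definition behave as in any other colon definition, so this leaves the classical behaviour shown above unchanged.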
In the old days Forth did not have floating-point numbers; instead, fixed-point arithmetic was used, usually on double-cell numbers. So, a decimal point indicated a double number (the position of the decimal point was stored in the variable DPL for potential use by fixed-point software).
In ANS Forth, a decimal point at the end indicates a double-cell number, and an E in the number indicates a floating-point number (when BASE is decimal).
All other ways to write numbers are system-dependent. However, most systems still interpret decimal points within a number as indicating double-cell numbers.
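The two standard notations can be tried as follows. This is a small sketch that assumes the system provides the Double-Number and Floating-Point wordsets (for D. and F.); the exact output format of F. varies between systems.

```forth
decimal
1234.  d.     \ trailing decimal point: double-cell number, printed with D.
1E3    f.     \ an E in the number: floating-point number, printed with F.
```

A number such as 12.34 is deliberately left out here, because its interpretation is system-dependent.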
eForth was written by Bill Muench, and was originally metacompiled. On request from C. H. Ting he also produced a version that was written in MASM, and had many words removed (the user should add them back in as an educational exercise).
More specifically, why do we put doubles, addresses, string descriptors etc. on the data stack? Why not FP values as well?