constant
. Likewise, there is no difference
between instance variables that contain objects and those
that contain other data.
:noname
); however, such
words are not as bad as many other parsing words, because they are not
state-smart.
This paper assumes (in some places) that you have read the paper on structures.
draw
for displaying it on the
screen. However, draw
has to do something different for
every kind of object.
We could implement draw
as a big CASE
control structure that executes the appropriate code depending on the
kind of object to be drawn. This would be not be very elegant, and,
moreover, we would have to change draw
every time we add
a new kind of graphical object (say, a spaceship).
What we would rather do is: When defining spaceships, we would tell
the system: "Here's how you draw
a spaceship; you figure
out the rest."
This is the problem that all systems solve that (rightfully) call themselves object-oriented, and the object-oriented package I present here also solves this problem (and not much else).
draw
) for performing an operation on a variety of data
structures (classes). A selector describes what operation to
perform. In C++ terminology: a (pure) virtual function.
object class \ "object" is the parent class selector draw ( x y graphical -- ) end-class graphicalThis code defines a class
graphical
with an
operation draw
. We can perform the operation
draw
on any graphical
object, e.g.:
100 100 t-rex drawwhere
t-rex
is a word (say, a constant) that produces a
graphical object.
How do we create a graphical object? With the present definitions,
we cannot create a useful graphical object. The class
graphical
describes graphical objects in general, but not
any concrete graphical object type (C++ users would call it an
abstract class); e.g., there is no method for the selector
draw
in the class graphical
.
For concrete graphical objects, we define child classes of the
class graphical
, e.g.:
graphical class \ "graphical" is the parent class cell% field circle-radius :noname ( x y circle -- ) circle-radius @ draw-circle ; overrides draw :noname ( n-radius circle -- ) circle-radius ! ; overrides construct end-class circleHere we define a class
circle
as a child of
graphical
, with a field circle-radius
(which
behaves just like a field in the structure
package); it defines new methods for the selectors
draw
and construct
(construct
is defined in object
, the parent class of
graphical
).
Now we can create a circle on the heap (i.e.,
allocate
d memory) with
50 circle heap-new constant my-circle
heap-new
invokes construct
, thus
initializing the field circle-radius
with 50. We can draw
this new circle at (100,100) with
100 100 my-circle drawNote: You can invoke a selector only if the object on the TOS (the receiving object) belongs to the class where the selector was defined or one of its descendents; e.g., you can invoke
draw
only for objects belonging to graphical
or its descendents (e.g., circle
). Immediately before
end-class
, the search order has to be the same as
immediately after class
.
object
classobject
. You can use it as ancestor for all
classes. It is the only class that has no parent. It has two
selectors: construct
and print
.
heap-new
( ... class -- object ) and in the dictionary
(allocation with allot
) with dict-new
(
... class -- object ). Both words invoke construct
, which
consumes the stack items indicated by "..." above.
If you want to allocate memory for an object yourself, you can get its
alignment and size with class-inst-size 2@
( class --
align size ). Once you have memory for an object, you can initialize
it with init-object
( ... class object -- );
construct
does only a part of the necessary work.
In general, it is a good idea to ensure that all methods for the same selector have the same stack effect: when you invoke a selector, you often have no idea which method will be invoked, so, unless all methods have the same stack effect, you will not know the stack effect of the selector invocation.
One exception to this rule is methods for the selector
construct
. We know which method is invoked, because we
specify the class to be constructed at the same place. Actually, I
defined construct
as a selector only to give the users a
convenient way to specify initialization. The way it is used, a
mechanism different from selector invocation would be more natural
(but probably would take more code and more space to explain).
Sometimes we want to invoke a different method. E.g., assume that
you want to use the simple method for print
ing
object
s instead of the possibly long-winded
print
method of the receiver class. You can achieve this
by replacing the invocation of print
with
[bind] object printin compiled code or
bind object printin interpreted code. Alternatively, you can define the method with a name (e.g.,
print-object
), and then invoke it through the
name. Class binding is just a (often more convenient) way to achieve
the same effect; it avoids name clutter and allows you to invoke
methods directly without naming them first.
A frequent use of class binding is this: When we define a method
for a selector, we often want the method to do what the selector does
in the parent class, and a little more. There is a special word for
this purpose: [parent]
; [parent]
selector
is equivalent to [bind] parent
selector
, where parent
is the parent
class of the current class. E.g., a method definition might look like:
:noname dup [parent] foo \ do parent's foo on the receiving object ... \ do some more ; overrides foo
[mckewan97] presents class binding as an optimization technique. I recommend not using it for this purpose unless you are in an emergency. Late binding is pretty fast with this model anyway, so the benefit of using class binding is small; the cost of using class binding where it is not appropriate is reduced maintainability.
While we are at programming style questions: You should bind
selectors only to ancestor classes of the receiving object. E.g., say,
you know that the receiving object is of class foo
or its
descendents; then you should bind only to foo
and its
ancestors.
:noname
), you may have to do a lot of stack
gymnastics. To avoid this, you can define the method with m:
... ;m
. E.g., you could define the method for
draw
ing a circle
with
m: ( x y circle -- ) ( x y ) this circle-radius @ draw-circle ;mWhen this method is executed, the receiver object is removed from the stack; you can access it with
this
(admittedly, in this
example the use of m: ... ;m
offers no advantage). Note
that I specify the stack effect for the whole method (i.e. including
the receiver object), not just for the code between m:
and ;m
. You cannot use exit
in
m:...;m
; instead, use
exitm
. Moreover, for any word that calls
catch
and was defined before loading
objects.fs
, you have to redefine it like I redefined
catch
:
: catch this >r catch r> to-this ;
You will frequently use sequences of the form this
field
(in the example above: this
circle-radius
). If you use the field only in this way, you can
define it with inst-var
and eliminate the
this
before the field name. E.g., the circle
class above could also be defined with:
graphical class cell% inst-var radius m: ( x y circle -- ) radius @ draw-circle ;m overrides draw m: ( n-radius circle -- ) radius ! ;m overrides construct end-class circle
radius
can only be used in circle
and its
descendent classes and inside m:...;m
.
You can also define fields with inst-value
, which is
to inst-var
what value
is to
variable
. You can change the value of such a field with
[to-inst]
. E.g., we could also define the class
circle
like this:
graphical class inst-value radius m: ( x y circle -- ) radius draw-circle ;m overrides draw m: ( n-radius circle -- ) [to-inst] radius ;m overrides construct end-class circle
To solve this problem, I added a scoping mechanism (which was not
in my original charter): A field defined with inst-var
is
visible only in the class where it is defined and in the descendent
classes of this class. Using such fields only makes sense in
m:
-defined methods in these classes anyway.
This scoping mechanism allows us to use the unadorned field name, because name clashes with unrelated words become much less likely.
Once we have this mechanism, we can also use it for controlling the
visibility of other words: All words defined after
protected
are visible only in the current class and its
descendents. public
restores the compilation
(i.e. current
) wordlist that was in effect before. If you
have several protected
s without an intervening
public
or set-current
, public
will restore the compilation wordlist in effect before the first of
these protected
s.
Now consider the case when you want to have a selector (or several)
available in two classes: You would have to add the selector to a
common ancestor class, in the worst case to object
. You
may not want to do this, e.g., because someone else is responsible for
this ancestor class.
The solution for this problem is interfaces. An interface is a collection of selectors. If a class implements an interface, the selectors become available to the class and its descendents. A class can implement an unlimited number of interfaces. For the problem discussed above, we would define an interface for the selector(s), and both classes would implement the interface.
As an example, consider an interface storage
for
writing objects to disk and getting them back, and a class
foo
foo that implements it. The code for this would look
like this:
interface selector write ( file object -- ) selector read1 ( file object -- ) end-interface storage bar class storage implementation ... overrides write ... overrides read ... end-class foo(I would add a word
read
( file -- object ) that uses
read1
internally, but that's beyond the point illustrated
here.)
Note that you cannot use protected
in an interface; and
of course you cannot define fields.
In the Neon model, all selectors are available for all classes; therefore it does not need interfaces. The price you pay in this model is slower late binding, and therefore, added complexity to avoid late binding.
struct...end-struct
. It has a field
object-map
that points to the method map for the object's
class.
The method map (this is Self [chambers&ungar89] terminology; in C++ terminology: virtual function table) is an array that contains the execution tokens (XTs) of the methods for the object's class. Each selector contains an offset into the method maps.
selector
is a defining word that uses
create
and does>
. The body of the
selector contains the offset; the does>
action for a
class selector is, basically:
( object addr ) @ over object-map @ + @ executeSince
object-map
is the first field of the object, it
does not generate any code. As you can see, calling a selector has a
small, constant cost.
A class is basically a struct
combined with a method
map. During the class definition the alignment and size of the class
are passed on the stack, just as with struct
s, so
field
can also be used for defining class
fields. However, passing more items on the stack would be
inconvenient, so class
builds a data structure in memory,
which is accessed through the variable
current-interface
. After its definition is complete, the
class is represented on the stack by a pointer (e.g., as parameter for
a child class definition).
At the start, a new class has the alignment and size of its parent,
and a copy of the parent's method map. Defining new fields extends the
size and alignment; likewise, defining new selectors extends the
method map. overrides
just stores a new XT in the method
map at the offset given by the selector.
Class binding just gets the XT at the offset given by the selector
from the class's method map and compile,
s (in the case of
[bind]
) it.
I implemented this
as a value
. At the
start of an m:...;m
method the old this
is
stored to the return stack and restored at the end; and the object on
the TOS is stored TO this
. This technique has one
disadvantage: If the user does not leave the method via
;m
, but via throw
or exit
,
this
is not restored (and exit
may
crash). To deal with the throw
problem, I have redefined
catch
to save and restore this
; the same
should be done with any word that can catch an exception. As for
exit
, I simply forbid it (as a replacement, there is
exitm
).
inst-var
is just the same as field
, with
a different does>
action:
@ this +Similar for
inst-value
.
Each class also has a wordlist that contains the words defined with
inst-var
and inst-value
, and its protected
words. It also has a pointer to its parent. class
pushes
the wordlists of the class an all its ancestors on the search order,
and end-class
drops them.
An interface is like a class without fields, parent and protected words; i.e., it just has a method map. If a class implements an interface, its method map contains a pointer to the method map of the interface. The positive offsets in the map are reserved for class methods, therefore interface map pointers have negative offsets. Interfaces have offsets that are unique throughout the system, unlike class selectors, whose offsets are only unique for the classes where the selector is available (invokable).
This structure means that interface selectors have to perform one
indirection more than class selectors to find their method. Their body
contains the interface map pointer offset in the class method map, and
the method offset in the interface method map. The
does>
action for an interface selector is, basically:
( object selector-body ) 2dup selector-interface @ ( object selector-body object interface-offset ) swap object-map @ + @ ( object selector-body map ) swap selector-offset @ + @ executewhere
object-map
and selector-offset
are
first fields and generate no code.
As a concrete example, consider the following code:
interface selector if1sel1 selector if1sel2 end-interface if1 object class if1 implementation selector cl1sel1 cell% inst-var cl1iv1 ' m1 overrides construct ' m2 overrides if1sel1 ' m3 overrides if1sel2 ' m4 overrides cl1sel2 end-class cl1 create obj1 object dict-new drop create obj2 cl1 dict-new dropThe data structure created by this code (including the data structure for
object
) is shown in the figure, assuming a cell size of
4.
Another well-known publication is [pountain87]. However, it is not really about object-oriented programming, because it hardly deals with late binding. Instead, it focuses on features like information hiding and overloading that are characteristic of modular languages like Ada (83).
There are also many other papers on object-oriented Forth extensions; E.g., [rodriguez&poehlman96] lists 17 and [mckewan97] lists 6. In the rest of this section I will discuss two systems that have the implementation using method maps in common with the package discussed here.
The model of [zsoter96] makes heavy use of an active object (like
this
in my model): The active object is not only used for
accessing all fields, but also specifies the receiving object of every
selector invocation; you have to change the active object explicitly
with { ... }
, whereas in my model it changes more or less
implicitly at m: ... ;m
. Such a change at the method
entry point is unnecessary with the [zsoter96] model, because the
receiving object is the active object already; OTOH, the explicit
change is absolutely necessary in that model, because otherwise no one
could ever change the active object.
The model of [paysan94] combines
information hiding and overloading resolution (by keeping names in
various wordlists) with object-oriented programming. It sets the
active object implicitly on method entry, but also allows explicit
changing (with >o...o>
or with
with...endwith
). It uses parsing and state-smart objects
and classes for resolving overloading and for early binding: the
object or class parses the selector and determines the method from
this. If the selector is not parsed by an object or class, it performs
a call to the selector for the active object (late binding), like [zsoter96]. Fields are always accessed through
the active object. The big disadvantage of this model is the parsing
and the state-smartness, which reduces extensibility and increases the
opportunities for subtle bugs; essentially, you are only safe if you
never tick or postpone
an object or class.
bind ( ... "class" "selector" -- ... )
selector
in class
.
<bind> ( class selector-xt -- xt )
xt
is the method for the selector
selector-xt
in class
.
bind' ( "class" "selector" -- xt )
xt
is the method for selector
in class
.
[bind] ( compile-time: "class" "selector" -- ; run-time: ... -- ... )
selector
in class
.
class ( parent-class -- align offset )
parent-class
.
align offset
are for use by field
etc.
class->map ( class -- map )
map
is the pointer to class
's method map; it points to the
place in the map to which the selector offsets refer (i.e., where
object-map
s point to).
class-inst-size ( class -- addr )
class-inst-size 2@ ( class -- align size )
,
gives the size specification for an instance (i.e. an object) of
class
.
class-override! ( xt sel-xt class-map -- )
xt
is the new method for the selector
sel-xt
in class-map
.
construct ( ... object -- )
object
. The method for the class
object
just does nothing ( object -- )
.
current' ( "selector" -- xt )
xt
is the method for selector
in the current class.
[current] ( compile-time: "selector" -- ; run-time: ... -- ... )
selector
in the current class.
current-interface ( -- addr )
dict-new ( ... class -- object )
allot
and initialize an
object of class class
in the dictionary.
drop-order ( class -- )
class
's wordlists from the search
order. No checking is made whether class
's wordlists are actually
on the search order.
end-class ( align offset "name" -- )
name
execution: -- class
ends a class definition. The resulting class is class
.
end-class-noname ( align offset -- class )
class
.
end-interface ( "name" -- )
name
execution: -- interface
ends an interface definition. The resulting interface is
interface
.
end-interface-noname ( -- interface )
interface
.
exitm ( -- )
exit
from a method; restore old this
.
heap-new ( ... class -- object )
allocate
and initialize an object of class class
.
implementation ( interface -- )
interface
. I.e., you can use all selectors of the
interface in the current class and its descendents.
init-object ( ... class object -- )
object
) to an object of class
class
; then performs construct
.
inst-value ( align1 offset1 "name" -- align2 offset2 )
name
execution: -- w
w
is the value of the field name
in
this
object.
inst-var ( align1 offset1 align size "name" -- align2 offset2 )
name
execution: -- addr
addr
is the address of the field name
in
this
object.
interface ( -- )
;m ( colon-sys -- ) ( run-time: -- )
this
.
m: ( -- xt colon-sys ) ( run-time: object -- )
object
becomes new this
.
method ( xt "name" -- )
name
execution: ... object -- ...
creates selector name
and makes xt
its method in the current class.
object ( -- class )
overrides ( xt "selector" -- )
selector
in the current class with
xt
. overrides
must not be used during an
interface definition.
[parent] ( compile-time: "selector" -- ; run-time: ... object -- ... )
selector
in the parent of the
current class.
print ( object -- )
object
prints the address of the
object and the address of its class.
protected ( -- )
public ( -- )
protected
that
actually changed the compilation wordlist.
push-order ( class -- )
class
's wordlists to the search-order (in front)
selector ( "name" -- )
name
execution: ... object -- ...
creates selector name
for the current class and its
descendents; you can set a method for the selector in the current
class with overrides
.
this ( -- object )
<to-inst> ( w xt -- )
w
into the field xt
in this
object.
[to-inst] ( compile-time: "name" -- ; run-time: w --
)
w
into field name
in
this
object.
to-this ( object -- )
this
(used internally, but useful when debugging).
xt-new ( ... class xt -- object )
xt ( align size -- addr )
to get memory.