While this is possible in many cases without resorting to deep
stacks and without using pick
and roll
, in
other cases I failed to find such a factoring, and I posted some of
these cases as a challenge to those who hold the anti-locals position.
I finally collected the challenges and the solutions posted up to now
on this page.
Note that Chuck Moore uses variable
s to reduce the
stack depth. An advantage of this approach over using locals is
that the word can be split into factors without ado. It also has
disadvantages: The words involved now have parameters and/or results
that are not passed on the stack, which makes testing of individual
words harder; there may be non-obvious data flow between words,
which makes the program harder to understand; and the resulting code
is not reentrant.
In my EuroForth 2011 paper I give an overview of techniques programmers can use to reduce the stack depth.
The challenges are:
The challenge is to implement the words without using locals, using the stacks instead, or maybe temporarily allocated memory. I.e., no USERs or VARIABLEs at least for F>BUF-RDP-TRY; for comp-i.fs VARIABLEs are less of a problem, but to me still a sign that stacks are not the answer. And the result should be at least as understandable and maintainable as the original; using factoring to achieve the latter is probably a good idea.
If you can make it more readable by factoring while still using locals, this would show the merits of factoring, but not the demerits of using locals.
A problem with these challenges is that they are taken from real code, call non-standard words (you have to look up the stack effects of these), and are unwieldy in other ways. This is in some way unavoidable: A designed (rather than real-world) example would appear contrived.
F>BUF-RDP-TRY
f>buf-rdp-try ( rf c-addr ur nd up um1 -- um2 )
is a
factor of f.rdp
, a floating-point output word. It tries
to convert the FP number rf into a string in the buffer c-addr ur.
The result is in either fixed-point or exponential notation, depending
on the FP number and the other parameters (see f.rdp
documentation). ur corresponds to the nr parameter
of f.rdp
, nd to the nd parameter, and up to the np
parameter. um1 is the mantissa length to try, um2 is the proper
mantissa length as determined by this try. Make another try with that
mantissa length, and you get the proper conversion. The source code
with its surroundings
are here.
\ uses the following non-standard non-obvious words: \ push-right ( c-addr u1 u2 cfill -- ) \ move string at c-addr u1 right by u2 chars (without exceeding \ the original bound); fill the gap with cfill \ <<# ( -- ) \ Start a hold area that ends with '#>>'. \ #>> ( -- ) \ Release the hold area started with '<<#'. : f>buf-rdp-try { f: rf c-addr ur nd up um1 -- um2 } \ um1 is the mantissa length to try, um2 is the actual mantissa length c-addr ur um1 /string '0 fill rf c-addr um1 represent if { nexp fsign } nd nexp + up >= up 0= or ur nd - 1- dup { beforep } fsign + nexp 0 max >= and if \ fixed-point notation c-addr ur beforep nexp - dup { befored } '0 push-right befored 1+ ur >= if \ <=1 digit left, will be pushed out by '.' rf fabs f2* 0.1e nd s>d d>f f** f> if \ round last digit '1 c-addr befored + 1- c! endif endif c-addr beforep 1- befored min dup { beforez } 0 max bl fill fsign if '- c-addr beforez 1- 0 max + c! endif c-addr ur beforep /string 1 '. push-right nexp nd + else \ exponential notation c-addr ur 1 /string 1 '. push-right fsign if c-addr ur 1 '- push-right endif nexp 1- s>d tuck dabs <<# #s rot sign 'E hold #> { explen } ur explen - 1- fsign + { mantlen } mantlen 0< if \ exponent too large drop c-addr ur '* fill else c-addr ur + 0 explen negate /string move endif #>> mantlen endif else \ inf or nan \ don't rely on REPRESENT result 2drop rf f0< if s" -Inf" else rf f0>= if s" Inf" else s" NaN" endif endif c-addr ur rot umin dup >r move c-addr ur r> /string blank ur endif 1 max ur min ;The problem is the big number of values around. One could factor out the three cases (fixed-point, exponential, and inf/nan), but the number of values around would stay the same, and we would still need locals.
comp-i.fs
The code uses knowledge about the image file format, which is not
really documented; this makes the code hard to grok. There are two
long words with locals in this
code: compare-images
and comp-image
.
\ non-standard words used: \ image-data ( i-field expected-offset -- base offset ) \ image1 ( -- addr ) \ image2 ( -- addr ) \ write-cell ( x file-id -- ior ) \ reloc-bits ( -- addr ) \ set-bit ( u addr -- ) \ write-symbol ( acell mask file-id u -- ) : compare-images { size file-id -- } \G compares image1 and image2 (of size cells) and sets reloc-bits. \G offset is the difference for relocated addresses \ this definition is certainly to long and too complex, but is \ hard to factor. cr ." code" cell 26 cells image-data { cbase coffset } ." xt" 13 cells 22 cells image-data { xbase xoffset } ." label" 14 cells 18 cells image-data { lbase loffset } size 0 u+do image1 i th @ image2 i th @ { cell1 cell2 } case cell1 cell2 = ?of cell1 file-id write-cell throw endof cell1 im-sects1 sect-reloc cell2 im-sects2 sect-reloc over = ?of file-id write-cell throw i reloc-bits set-bit endof drop cell1 coffset + cell2 = ?of cell1 cbase - $4000 file-id i write-symbol endof cell1 xoffset + cell2 = ?of cell1 xbase - 0 file-id i write-symbol endof cell1 loffset + cell2 = ?of cell1 lbase - $8000 file-id i write-symbol endof cell1 file-id write-cell throw cell1 cell2 <> if 0 i th 9 u.r cell1 17 u.r cell2 17 u.r cr endif 0 endcase loop ;Again, there is a large number of values in play here.
\ additional non-standard words used: \ slurp-file ( c-addr1 u1 -- c-addr2 u2 ) \ size1 ( -- u ) \ size2 ( -- u ) \ prepare-sections ( -- ) \ reloc-size ( -- u ) \ reloc-bits ( -- addr ) : comp-image ( "image-file1" "image-file2" "new-image" -- ) parse-name slurp-file { file1 fsize1 } file1 fsize1 s" Gforth5" search 0= abort" not a Gforth image" drop 8 + file1 - { header-offset } file1 fsize1 header-offset /string to size1 to image1 size1 aligned size1 <> abort" unaligned image size" parse-name slurp-file header-offset /string to size2 to image2 size1 size2 <> abort" image sizes differ" prepare-sections parse-name ( "new-image" ) w/o bin create-file throw { outfile } size1 1- cell/ bits/au / 1+ to reloc-size reloc-size allocate throw to reloc-bits reloc-bits reloc-size erase file1 header-offset outfile write-file throw base @ hex size1 aligned cell/ outfile compare-images base ! reloc-bits reloc-size outfile write-file throw outfile close-file throw ;This word is just a straight-line word, which makes it relatively easy to understand despite its length (if you understand the components).
map-array
(mostly solved): map-array ( ... addr u xt -- ... ) \ executes xt ( ... x -- ... ) for every element of the array starting \ at addr and containing u elements { xt } cells over + swap ?do i @ xt execute 1 cells +loop ;Here the standard restrictions of
?do...+loop
on return
stack usage do not allow us to keep xt on the return stack the whole
time, while we can keep it in a local the whole time. It is possible
to push the xt on the return stack only during the EXECUTE:
: map-array ( ... addr u xt -- ... ) rot rot cells over + swap ?do ( xt ) i @ swap dup >r execute r> 1 cells +loop drop ;But that would mean quite a bit of stack busy work; not really an example that shows the superiority of not using locals. Andrew Haley suggests using macros instead:
: (map) ( a n - a a') cells over + swap ; : map[ postpone (map) postpone ?do ; immediate : ]map postpone cell postpone +loop ; immediateThis is used like
map[ i @ + ]map
instead of ['] +
map-array
and can only be used in a colon definition.