Defining Words in Forth

It has been said that one does not write a program in Forth. Rather, one extends Forth to make a new language specifically designed for the application at hand. An important part of this process is the defining word, by which it is possible to combine a data structure with an action to create multiple instances that differ only in detail. One thinks of a cookie-cutter; all the cookies are the same shape but have different-colored icing.

The basics of create ... does>

Defining words are based on the Forth construct create ... does>, which beginners can apply mechanically. The steps are:

1. Start a colon definition.
2. Write create.
3. Follow with words that lay down data or allot RAM, thus creating the body.
4. Write does>.
5. Follow with words that act on the body.
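
As a minimal sketch of this recipe, here is about the smallest defining word one can write. The names my-constant and seven are hypothetical, chosen only for illustration; the word stores one cell in the body and fetches it back when the child runs.

```forth
\ Hypothetical defining word: each child holds one cell and returns its contents.
: my-constant ( n -- ) ( -- n )
     create ,        \ create the child, then lay down n as its body
     does> @ ;       \ when the child runs, fetch the value from its body

7 my-constant seven  \ make a child named seven
seven .              \ displays 7
```

Every child made by my-constant has the same action (fetch) but its own stored value, which is the cookie-cutter idea in miniature.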

These steps are fairly simple, but understanding them is complex because there are three stages in the action of a defining word.

An example

Our example will be indexed-array, which allots an area of RAM. At run time, it takes an index, i, and returns the address of the ith cell. If i=0, the address of the first cell is returned because Forth conventionally starts numbering at 0. If you don't like that, rewrite indexed-array. After all, this is Forth.

: indexed-array ( n -- ) ( i -- a )
     create cells allot        \ make the child and reserve n cells for it
     does> swap cells + ;      \ child: data-field address plus i cells

20 indexed-array foo  \ Make a 1-dimensional array with 20 cells
 3 foo                \ Put addr of fourth element on the stack
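
For readers who do prefer numbering from 1, a possible rewrite is sketched below. The names indexed-array-1 and bar are hypothetical; the only change is the 1- that shifts the index before it is scaled.

```forth
\ Hypothetical variant: indices run from 1 to n instead of 0 to n-1.
: indexed-array-1 ( n -- ) ( i -- a )
     create cells allot
     does> swap 1- cells + ;   \ subtract 1 so that index 1 maps to the first cell

20 indexed-array-1 bar         \ bar is an illustrative name
42 1 bar !                     \ store 42 in the first cell
 1 bar @ .                     \ displays 42
```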

Stage 1: Compiling the defining word

The first phase is in effect during the compilation of indexed-array, that is, between the colon and semicolon. The colon sets up a header. Then, execution tokens of ordinary Forth words are laid down, while immediate words are executed at once. The process is terminated by the semicolon.

Apart from the terminating semicolon, the only immediate word in indexed-array is does>. It lays down code that will act later in stage 2.

Stage 2: Creating a "child"

The second phase is in effect when indexed-array is used to create foo.

create sets up a header
cells allot reserves n cells, forming the "data field" of foo (older Forths called it the "parameter field" or "body").
The code that was laid down by does> now comes into action. It changes the execution of foo so that it will:
1. Put the address of its data field on the stack, and then
2. Execute the Forth words between does> and semicolon.
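
The effect of stage 2 can be inspected from the keyboard. The sketch below assumes a system in which data space is contiguous and nothing else has been allotted since foo was made, so that here sits immediately after foo's data field; on other systems the second check may not hold.

```forth
: indexed-array ( n -- ) ( i -- a )
     create cells allot
     does> swap cells + ;

20 indexed-array foo

' foo >body  0 foo  = .    \ displays -1: element 0 lies at the start of the data field

' foo >body                \ address of foo's data field
here swap -                \ distance from the data field to the next free location
20 cells = .               \ displays -1 if exactly 20 cells were reserved (assumption above)
```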

Stage 3: Executing the child

In the third phase, we execute foo.

i is already on the stack, and the origin of the data field is put on top of that
swap rearranges the stack
cells multiplies i by the cell length
+ adds the result to the origin of the data field.
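
The same steps can be typed by hand to confirm what foo does. This is only a trace, not how the system implements the child: ' foo >body stands in for the address that foo pushes automatically.

```forth
: indexed-array ( n -- ) ( i -- a )
     create cells allot
     does> swap cells + ;

20 indexed-array foo

3                  \ ( i )          the index is already on the stack
' foo >body        \ ( i a0 )       origin of the data field on top
swap               \ ( a0 i )
cells              \ ( a0 offset )  i times the cell length
+                  \ ( a )          address of element 3
3 foo = .          \ displays -1: the hand trace and foo agree

99 3 foo !         \ store into element 3
3 foo @ .          \ displays 99
```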

Create without does>

In F83, create will create a dictionary entry that returns the address of the next available location in data space. The ANSI standard is essentially the same, adding specifications as to alignment.

Therefore, if all that is wanted is to return an address, does> is not needed. Adding a does> with nothing after it will not change the results, but will cost memory and time. To create an 80-byte buffer:

create buffer 80 allot

Now, executing buffer returns the first address of the allotted area.
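
A short usage sketch follows. The name line-buf is hypothetical, used here instead of buffer only because some systems already define a word of that name. It assumes one character occupies one address unit, as on most byte-addressed systems.

```forth
create line-buf 80 allot          \ line-buf is a hypothetical name

line-buf 80 char - fill           \ fill all 80 bytes with dashes
line-buf 10 type                  \ displays ----------
```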

Miscellaneous topics

Important issues such as range checking and multi-dimensional arrays are not discussed here.

In many Forths, for example F83 and F-PC, it is possible for defining words to create defining words, which in turn create other defining words. The nesting, in theory, can be carried on indefinitely. However, this is not permitted in an ANS standard program (section 3.6).

Why is there a right angle-bracket in does>? It originated in early Forths in which create was followed by <builds . . . does>. Later, the action of <builds was incorporated in create, but the spelling of does> was not changed.

The code meets the requirements for an ANS Forth standard program.
Portions have been extracted from an article in Forth Dimensions, May, June 1992.
