Difference between revisions of "Forthcompiler"

From ScienceZero
Jump to: navigation, search
Line 16: Line 16:
 
   '''( a b c -- c b a )'''
 
   '''( a b c -- c b a )'''
 
   Reading threee values from the stack and write them back in the opposite order.
 
   Reading threee values from the stack and write them back in the opposite order.
 +
 +
 +
==Compiler stage 1===
 +
#Tokenize the source by splitting it at whitespace, turning ": *2 dup + ;" into ":" "*2" "dup" "+" ";"
 +
 +
 +
 +
  
 
==Words==
 
==Words==
Line 39: Line 47:
 
   >r            ( n -- )          ( fundamental )
 
   >r            ( n -- )          ( fundamental )
 
   r>            ( -- n )          ( fundamental )
 
   r>            ( -- n )          ( fundamental )
 
 
== Compiler ==
 
=== Stage 1 (not implemented in version 1) ===
 
#Definitions are expanded
 
#Macros are executed
 
=== Stage 2 ===
 
#Functions are added to the dictionary
 
=== Stage 3 ===
 
#All words are looked up in the dictionary and the opcode stored in the program array
 
#The address of each function is stored in the function jump table in the same order as the functions were added to the dictionary
 
 
== The dictionary ==
 
The dictionary used to convert a command into an opcode. It is a zero terminated string like "setcolour,readcolour,setpos,readpos,",0 that is searched linearly from start to end. A comma marks the end of a word. All functions are added to the end of this table in stage 2 of the compiling process.
 
 
== The function jump table ==
 
This table contains the address of each function, in the same order as the functions appear in the dictionary. It is filled in during stage 3 when the actual code position is known.
 
 
== Interpreter ==
 
The opcode pointer always points to the byte after the one being processed.
 
 
do
 
  opcode = program(opcodepointer)
 
  opcodepointer += 1
 
 
  switch opcode
 
    case 0-4
 
      value = 0
 
      for n = 0 to opcode
 
          value = value or (program(opcodepointer) << (n * 8))
 
          opcodepointer = opcodepointer + 1
 
      push value
 
   
 
    case op_loadString
 
      push opcodepointer
 
      skip string data
 
 
    case command
 
      execute command
 
 
    case function
 
      pushr localpointer
 
      pushr opcodepointer
 
      opcodepointer = jumptable(opcode)
 
 
    case op_return
 
      opcodepointer = pullr
 
      localpointer = pullr
 
    end switch
 
 
again
 
 
=== Opcodes===
 
enum operands {op_ldc0, op_ldc8, op_ldc16, op_ldc24, op_ldc32, op_lds, op_end, op_setcolour, op_readcolour, op_setpos, op_readpos, op_load8, op_load16, op_load32, op_store8, op_store16, op_store32, op_var, op_unvar, op_ldl, op_stl, op_penup, op_pendown, op_forward, op_backward, op_left, op_right, op_call, op_for, op_next, op_do, op_again, op_if, op_else, op_endif, op_add, op_sub, op_mul, op_umul, op_div, op_udiv, op_lsl, op_lsr, op_asr, op_and, op_or, op_eor, op_not, op_drop, op_dup, op_swap, op_over, op_rot, op_nrot, op_dtor, op_rtod, op_nop};
 
*DIV, MUL and LSR are unsigned.
 
*/, *, asr are signed.
 
*Opcodes larger than op_nop are local function calls.
 
 
=== Constants ===
 
op_ldc0, op_ldc8, op_ldc16, op_ldc24, op_ldc32
 
*If the two or three most significant bytes are 0xff then shorter code is generated by:
 
**op_ldc16 0xnn 0xnn op_not
 
**op_ldc8 0xnn op_not
 
 
== Optimisations ==
 
The bytecode is optimised in several separate steps after the compilation. Unused instruction slots are padded with nops. After each stage the code is compacted and the nops removed.
 
 
=== Arithmetic and logic peephole optimising ===
 
*Constant folding
 
**1 8 lsl → 256 (for all arithmetic and logic operations)
 
 
*Strength reduction
 
**0 div → -1 (TODO: check if this is what the interpreter would normally do...)
 
**1 div → nop
 
**pow2(n) div → (n - 1) asr
 
**pow2(n) udiv → (n - 1) lsr
 
**1 mul → nop
 
**2 mul → dup +
 
 
*No operations
 
**not not → nop
 
**dup and → nop
 
**dup or → nop
 
 
=== Loading of constants ===
 
*Repeating numbers are shortened by using op_dup and op_over, 1 3 3 1 3 can be shortened to:
 
**op_ldc8 0x01 op_ldc8 0x03 op_dup op_ldc8 0x01 op_over
 
*Applying an eight bit constant and an operation on a copy of one of the two previous numbers.
 
*Applying an operation on copies of the two previous numbers.
 
 
=== Stack reordering ===
 
Optimisation is done by a brute force search generating optimal fragments of up to 7 instructions.
 
*[[Optimal stack reordering in Forth]]
 
 
=== Common Subexpression Elimination ===
 
Identify common sub-expressions in the code, and use the result for each instance, rather than re-evaluating them each time.
 
*Not done
 
 
=== Loop Invariant Motion ===
 
This is the 'lifting of expressions' from loops. The compiler can identify that a particular expression inside a loop actually does not change as the loop is running.
 
*Not done
 
 
===  Tail Call Optimization and Tail Recursion ===
 
A tailcall is a call immediately before a return. Normally this function will be called, and when it returns to the caller, the caller returns again. Tailcall optimization avoids this by restoring the saved registers before jumping to the tailcall. The called function will now return directly to the caller's caller, saving a return sequence.
 
*Not done, will require a jump op-code
 
 
== Just In Time compiler ==
 
The JIT compiler can be made very fast if the code quality is not very important. So loops can be compiled without considering how many times they will be executed. If they are executed more than 3 times there will be a benefit.
 
 
=== Register allocation ===
 
There are three register pools, stack, return stack and free. The pools are implemented as stacks with special functions.
 
 
=== Stack manipulation ===
 
All stack manipulations are done by register renaming so no code is generated.
 
 
=== Branching code ===
 
*
 

Revision as of 11:57, 24 January 2016

A simple Forth compiler

Definitions

Everything inside ( ) is a comment

When stack content is written on one line, the top of the stack is to the right
  bottom -> 1 2 3 <- top

Stack diagrams are comments that show what is taken from the stack and what is returned ot the stack
( items read from the stack -- items written on the stack )
Examples:
  ( n n -- n )
  This means that two numbers were read from the stack, then one number was written back.
  For example a simple addition reads two numbers and writes back the result.

  ( a b c -- c b a )
  Reading threee values from the stack and write them back in the opposite order.


Compiler stage 1=

  1. Tokenize the source by splitting it at whitespace, turning ": *2 dup + ;" into ":" "*2" "dup" "+" ";"



Words

Data processing:
  +             ( n n -- n )        ( : +  >r 0 r> - - ; )
  -             ( n n -- n )        ( fundamental )
  *             ( n n -- n )        ( : * 0 do over 1 and if rot dup >r -rot r> + endif rot 1 lsl rot 1 lsr rot over if again endif >r drop -rot drop drop ; ) 
  /             ( n n -- n )        ( ? )
  lsl           ( n n -- n )        ( : lsl for dup + next ; )
  lsr           ( n n -- n )        ( : lsr for 31 for dup 0x80000000 and swap 1 lsl swap if 1 + endif next 0x7fffffff and next ; )
  and           ( n n -- n )        ( fundamental )
  or            ( n n -- n )        ( : or over not and + ; )
  eor           ( n n -- n )        ( : eor over over or -rot and not and ; )
  not           ( n -- n )          ( : not >r -1 r> - ; )

Stack manipulation:
 drop           ( n -- )           ( : drop if endif ; )
 dup            ( n -- n n )       ( fundamental )
 swap           ( a b -- b a )     ( : swap dup >r >r dup r> eor dup >r eor r> r> eor ; )
 over           ( a b -- a b a )   ( : over >r dup r> swap ; ) 
 rot            ( a b c -- b c a ) ( : rot >r swap r> swap ; ) 
 -rot           ( a b c -- c a b ) ( : -rot swap >r swap r> ; )
 >r             ( n -- )           ( fundamental )
 r>             ( -- n )           ( fundamental )