Virtual machine

From ScienceZero
Jump to: navigation, search

Function definition syntax

Functions are defined using reverse Polish notation where the operator follows the operands.

 :  <name> Define a new function
 :: <name> Define a new inline function
           An inline function is inserted directly into the program without a call
           This is important for functions that work on the return stack
 :!        Anonymous function that has no name, it leaves the pointer to the function on the stack
 ;         End definition
 ><name>   Write to variable
 <name>    Call function or read variable
 (         Start comment
 )         End comment

 Example functions
   : inc 1 + ;            ( Increase by 1 )
   : dup >a a a ;         ( Duplicate top of stack using a variable )
   : dup >r @r r ;        ( Duplicate top of stack using return stack )
   : rol 32 swap - ror ;  ( Rotate left )
   : neg 0 swap - ;       ( Negative )

 After being defined a function can be used just like any other instruction
 Write numbers directly without using LDC/LDN/LDE since they are implied
 Variables are inferred and don't have to be decleared

Stacks

The virtual machine has two stacks, one for data and one for return addresses and variables.

  • The data stack is used for general data processing
  • The return stack is used for function calls, flow control and temporary variables

When writing down the stack on one line, the top of the stack is to the right on the line. SP and RP points to the first empty cell.

A stack with four items, 7 is the topmost item
 1 4 2 7 

Performing an ADD instruction will result in this stack
 1 4 9

 2 and 7 are pulled off the stack, added and the result 9 is written back to the stack.

Stack diagrams

 A stack diagram shows how many items a function/instruction take from the stack and put back
 Show what is taken off the stack then what is put back on the stack separated by --
 n n -- n

 Example, a function called mul that takes two numbers off the stack and writes one number back
   : mul ( n n -- n ) 0 rot for over add next nip ;

 Use n for general numbers but name the items if it makes sense
 For example a function that calculates power from voltage and current
   : power ( volt ampere -- watt ) mul ;
 Use | to show conditional stack diagrams.
 This conditional stack diagram show that the function can return one or two items
   ( n n - n | n n -- n n )

Registers

The registers and stack cells are 32 bit each, the registers are internal to the virtual machine The PC and LP registers will be put on the return stack when a function is called so the machine can return safely and continue where it left off when the function completes.


Registers

 PC Program counter        - Points to the currently executing instruction
 DP Data stack pointer     - Points to the next free item on the data stack
 RP Return stack pointer   - Points to the next free item on the return stack
 LP Local variable pointer - Points to the first local variable or local array element on the return stack
                             This is also used when unwinding a stack frame

Machine instructions

Each instruction is one byte (8 bits). The 4 most significant bits is the type of instruction and the 4 least significant bits is the data.

                          7       0
Most significant bit -->  tttt dddd  <-- least significant bit

When viewed in hexadecimal the first digit is the type and the second is the data. So the byte 0xDA is the OR instruction

Type     Data
   0 LDC #imm4             - Load 4 bit immediate constant
   1 LDN #imm4             - Load 4 bit immediate negative constant
   2 LDE #imm4             - Load extend as TOP = TOP:imm4
   3 LSL #imm4             - Logical shift left by imm4 + 1
   4 DIM #imm4             - Reserve TOP:imm4 elements on the return stack
   5 LDL #imm4             - Load local variable from the return stack
   6 STL #imm4             - Store local variable on the return stack
   7 SYS #'aaaa'           - Call system function as TOP:imm4
   8 LEA #'aaaa'           - Load effective address as TOP:imm4
   9 JUMP #'aaaa'          - Jump to TOP:imm4
   A CALL #'aaaa'          - Call function at TOP:imm4
   B OPR0 *RESERVED*       - Reserved for future expansion

   C OPR1 #imm4
           0 ADD.          - 32 bit IEEE 754 floating-point operation
           1 SUB.
           2 MUL.
           3 DIV.
           4 SQRT.
           5 TOF.          - Convert from integer to floating-point
           6 TOI.          - Convert from floating-point to integer 
           7 *RESERVED*    - Reserved for future expansion of floating-point
           8 *RESERVED*   
           9 *RESERVED*   
           A *RESERVED*   
           B *RESERVED*   
           C *RESERVED*   
           D *RESERVED*   
           E *RESERVED*   
           F *RESERVED*
   
   D OPR2 #imm4
           0 ADD           - Addition                 ( n n -- n )
           1 SUB           - Subtraction              ( n n -- n )
           2 MUL           - Multiplication           ( n n -- n )
           3 DIV           - Unsigned division        ( n n -- n )
           4 SDIV          - Signed division          ( n n -- n )  
           5 LSL           - Logical shift left       ( n n -- n )
           6 LSR           - Logical shift right      ( n n -- n )
           7 ASR           - Arithmetic shift right   ( n n -- n )
           8 ROR           - Rotate right             ( n n -- n )
           9 AND           - Bitwise AND              ( n n -- n )
           A OR            - Bitwise OR               ( n n -- n )
           B EOR           - Bitwise Exclusive OR     ( n n -- n )
           C NOT           - Bitwise invert           ( n -- n )
           D NEG           - Negative                 ( n -- n )
           E INC           - Increase                 ( n -- n )
           F DEC           - Decrease                 ( n -- n )

   E OPR3 #imm4
           0 DUP           - Duplicate top of stack                              ( a -- a a )
           1 DROP          - Drop top of stack                                   ( a b -- a )
           2 SWAP          - Swap the two topmost items                          ( a b -- b a )
           3 OVER          - Copy the second item                                ( a b -- a b a )
           4 MATCH         - Compare TOP with N values and call the corresponding function if true
           5 LP            - Read local pointer
           6 R             - Move the top of the return stack to the data stack  ( r -- n )
           7 >R            - Move top of data stack to the return stack          ( n -- r )
           8 @R            - Copy the top of the return stack to the data stack  ( -- n )
           9 LD32          - Load a 32 bit value from memory                     ( address -- data )
           A ST32          - Store a 32 bit value to memory                      ( data address -- )
           B LD16          - Load a 16 bit value from memory                     ( address -- data )
           C ST16          - Store a 16 bit value to memory                      ( data address -- )
           D LD8           - Load a 8 bit value from memory                      ( address -- data )
           E ST8           - Store a 8 bit value to memory                       ( data address -- )
           F NOP           - No-operation                                        ( -- )

   F OPR4 #imm4
           0 FOR           - Start a loop of a fixed number of iterations        ( n -- r r )
           1 NEXT          - Decrease loop counter and exit if 0                 ( r r -- r r | r r -- )
           2 DO            - Start a loop                                        ( -- r )
           3 WHILE         - If not 0 then loop                                  ( n -- )
           4 UNTIL         - if 0 then loop                                      ( n -- )
           5 AGAIN         - Loop                                                ( r -- )
           6 RP            - Read return pointer                                 ( rp -- address )
           7 >RP           - Write return pointer                                ( address -- rp )
           8 FLAG          - No-operation for 0, else return -1                  ( n -- flag ) 
           9 NFLAG         - Not Flag                                            ( n -- /flag )
           A IF            - If false then skip until ELSE or ENDIF              ( n -- )
           B ELSE          - Skip until ENDIF on the same level                  ( -- )
           C ENDIF         - No-operation                                        ( -- )
           D JUMP          - Jump to address TOP                                 ( address -- )
           E CALL          - Calls a function at address TOP                     ( address -- r r )
           F RETURN        - Return from function                                ( r r -- )

Function call

When a function call is made the program pointer PC and the local pointer LP is written to the return stack. The local pointer is loaded with the address of the first empty cell on the return stack. The LDL and STL instructions will use LP as the base address for local variables. A RETURN instruction will use the LP as the base address and use negative offsets to get the old LP and the old PC so execution can continue after the function call.

Function call stack frame
Showing the top of the return stack after a function call is taken, before the function has executed any instructions
          V0      V1      V2  <--  Local variables on the return stack accessed with the LDL and STL instructions
  PC LP <empty> <empty> <empty> ...
            \    
             \     
              The new LP points to the first local variable for the function
              RP points to the first free cell
  
  DIM #2 will cause the next two cells to be filled with 0
          V0      V1      V2 
  PC LP    0       0    <empty> ...
            \               \                
             \               RP is advanced by two to the first free cell
              LP still points to the same position but the two first variables now contains 0


Questions with answers

Why are immediate values loaded 4 bit at a time using LDC/LDN/LDE instructions instead of 8 bit at a time as raw data?

It makes it much simpler to implement the virtual machine since each byte can be executed and parsed without any context.


Why is there both CALL/JUMP and CALL#/JUMP# instructions?

CALL/JUMP is required for efficiently supporting higher-order functions. Immediate call is a very common instruction so CALL# improves code density. JUMP# is for tail call optimisation.
A higher order function takes one or more functions as an input or outputs a function.


What is the point of the LEA# instruction?

Load effective address calculates the absolute address of a function so that it can be called with the CALL instruction.
This is useful when you pass a function as a parameter or store functions in collections.


Why is there a nop instruction?

It makes it easier to manipulate code, the compiler can insert NOP to reserve space or blank out code during optimising for later deletion.


Why the DUP/DROP/SWAP/OVER/ROT/-ROT instructions when there are variables?

Usually it appears to be better to name the action rather than to name the data.
At some level of complexity variables become a better solution. When?


Why is ENDIF a no-operation?

ENDIF is just a end of construct marker used by the IF and ELSE instructions.

Namespace and scope

Variables

Functions can read and write variables in the same scope, they can read variables in the parent scopes.


Questions that have to be answered

Library

Memory manager
Suggestion: Simple bit mapped memory manager, 1 bit for each 32 bit word. Automatic defragmentation and extreme simplicity at the cost of less than 4% waste and some reduction in performance.
Floating-point emulation, someone has to write it..
List, Set, Hash-table

Arrays

What should the syntax be?
Stack or heap?
Both


Reference cells

Variables on the stack gets deleted when a function returns so we may need variables stored on the heap.
Syntax?


Name

Current name is FIRE which mean four. :: FIRE ::


How to load DP?


Do the stacks grow in a defined direction?

Suggestion - RP grows form high to low address, DP grows from low to high address.


SYSCALL

Rules, structure...


Safety

The current safety model is based on the VM being isolated. When running an OS the VM must be able to reach the hardware.
What is the cost of software memory management?


Self modifying code?


Data blocks, where to put them and how to mark them?

Suggestion - FALSE IF <put data here> ENDIF


The future

64 bit?

Just in time compiler for ARM

The just in time compiler will convert the bytecode to ARM machine code at load time.

Five step plan

Step 1

Make an interpreter for a stack machine.
Each instruction is 8 bit long.
Use a 256 entry call table to call functions for each VM instruction
Cache the top of stack in R0

Step 2

Replace each VM instruction with a call to the corresponding ARM function
The program counter will cause some trouble because ARM code will be 2 or 4 times longer than the bytecode
Use abstract syntax trees to minimize or eliminate the use of a program counter
 opcode -> BL opcode

Step 3

Replace instructions that only access the top of the stack with single ARM instructions
 not -> MVN R0,R0
 inc -> ADD R0,R0,#1
 neg -> RSB R0,R0,#0
 nop -> MOV R0,R0
Possibly merge with constants
 ldc #3 -> MOV R0,R0
 sub    -> SUB r0,r0,#3

Step 4

Cache the stack in registers
Stack manipulation is done by register renaming
The tricky bit is branching
Flush the registers to the stack before every branch and every branch target
Or better, flush the longest rename stack until both are equal in length, then flush both stack until they have identical content.

Step 5

Avoid flushing...
  1. This requires the two possible execution paths in IF THEN ELSE to agree on what registers to use for what purpose
  2. If a function is called from two different places the parameters must be in the same registers
    Small functions can be inlined
    Large functions can have the registers flushed to the stack without a large loss of performance
  3. The end of a loop must use the same registers for the same things as the start of the loop