Difference between revisions of "Abstracted assembler"

Latest revision as of 20:49, 9 October 2017

Most low-level details can be found in the ARM Architecture Reference Manual for ARMv8.

Goals

Let the compiler allocate registers
Make it possible to use readable variable names in a safe way
Add a thin level of abstraction that does not require a full compiler to generate good code

Open questions

What is the default data size, 32 or 64 bit?
- Pros for 32-bit:
  - ints are usually 32-bit in C and C++, even on 64-bit machines, so this may be more in line with developer expectations
  - may yield faster code
- Pros for 64-bit:
  - Addresses and pointers need to be 64-bit, so a 64-bit int may make address and pointer calculations easier and more logical
  - An attempt to use a 32-bit int as a pointer should probably give a compiler error. With 64-bit ints, such errors may be rarer

How to specify floating-point operations?
- Options:
  - Number with . infers float
    - myvar = 1.0
  - .f forces float
    - myvar.f = 1

How to handle expressions involving float?
- Options:
  - Handle like in C and C++, where any int that becomes involved in float calculation is "upgraded" to a float first and the result becomes float

Which instruction sets should be supported?
- Options:
  - Any that the underlying assembler supports. So, anything that appears to be an instruction is just passed through to the assembler after name-to-register resolution

How to specify SIMD operations?
- Options:
  - Use the language's support for assembly instruction pass-through and use SIMD instructions directly

What is the simplest type system that will work?

Language features

bit slicing

   index index
       a = n.[10..2]
   index lsb
       a = n.[10..]
   msb index
       a = n.[..2]
   index count
       a = n.[10,2]
   msb-index index
       a = n.[-2..]  strip the two most significant bits    
   index    
       bit = n.[index]
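
The slice forms above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: a 64-bit register width (the default size is still an open question), and a slice that is shifted down to bit 0, as UBFX does. The function names are hypothetical. The `n.[10,2]` form is left out because the direction of the count is not specified.

```python
WIDTH = 64                  # assumed register width
MASK = (1 << WIDTH) - 1


def bits(n, hi, lo):
    """Model of n.[hi..lo]: bits hi down to lo, shifted down to bit 0."""
    if not 0 <= lo <= hi < WIDTH:
        raise ValueError("slice out of range")
    return ((n & MASK) >> lo) & ((1 << (hi - lo + 1)) - 1)


def bits_down_to_lsb(n, hi):
    """Model of n.[hi..]: bit hi down to the least significant bit."""
    return bits(n, hi, 0)


def bits_from_msb(n, lo):
    """Model of n.[..lo]: the most significant bit down to bit lo."""
    return bits(n, WIDTH - 1, lo)


def strip_msbs(n, count):
    """Model of n.[-count..]: drop the count most significant bits."""
    return bits(n, WIDTH - 1 - count, 0)


def bit(n, index):
    """Model of n.[index]: a single bit as 0 or 1."""
    return bits(n, index, index)
```

For example, `bits(0xABCD, 7, 4)` extracts the second hex digit from the right.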

Common operators

Using Python operator precedence with assembly-style additions
( ) Custom order of operations, a = (a + b) * c
Arithmetic
  +   addition
  <+> addition and set condition codes on result
  -   subtraction
  *   multiplication
  /   division

Comparisons
  =   equal
  <   less than
  >   greater than
  <=  less than or equal
  >=  greater than or equal
  <>  not equal

Bitwise logical
  Normal  Alternative?
  &       and
  |       or
  ^       eor
  &~      bic
  |~      orn
  ^~      eon
  ~       not

Shifts
  Normal  High precedence
  <<      lsl
  >>      lsr
  >>>     asr
  >|>     ror

Built in functions

clz    count leading zeros
cls    count leading signs
rbit   mirror bits
rev    swap endian
cnt    population count
max
min
swap
adr    returns the address of a label

Conditional functions                 true false      
  CSEL  cond x y   select               x | y
  CSINC cond x y   select increment     x | y+1
  CSINV cond x y   select inversion     x | ~y
  CSNEG cond x y   select negation      x | -y
  CINC  cond x     increment          x+1 | x
  CINV  cond x     invert              ~x | x 
  CNEG  cond x     negate              -x | x
  CSET  cond       set                  1 | 0
  CSETM cond       set mask            -1 | 0
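
The conditional functions can be written out as plain Python for comparison. This sketch assumes the functions keep the meaning of the AArch64 instructions they are named after, where CSINC, CSINV and CSNEG modify the second operand and only when the condition is false. It also assumes 64-bit wrap-around; `cond` is an ordinary boolean here.

```python
WIDTH = 64                  # assumed register width
MASK = (1 << WIDTH) - 1


def csel(cond, x, y):
    return x if cond else y


def csinc(cond, x, y):
    return x if cond else (y + 1) & MASK


def csinv(cond, x, y):
    return x if cond else ~y & MASK


def csneg(cond, x, y):
    return x if cond else -y & MASK


def cinc(cond, x):
    return (x + 1) & MASK if cond else x


def cinv(cond, x):
    return ~x & MASK if cond else x


def cneg(cond, x):
    return -x & MASK if cond else x


def cset(cond):
    return 1 if cond else 0


def csetm(cond):
    return MASK if cond else 0   # all ones, i.e. -1 as a signed value
```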

Program flow

< reg-list = > label <reg-list>
    Function call

return <reg-list>
    Return from function with zero or more variables

again
    Jump to the top of the loop

break
    Jump to after the end of the loop

bcc
    Conditional branch to a label

while cond
    Execute the loop zero or more times while cond = true

dowhile cond
    Execute the loop one or more times while cond = true

match
      
for x

for x = 1 to y

if then else

Conditional

<op> sets the condition flags from the result of op, for example a = b <+> c

a = b > c ? 10 : 20
a = 10 if b > c else 20
a = if b > c then 10 else 20

<le> returns -1 if the condition flags match the condition for Less than or equal, otherwise 0.
<c> returns the Carry flag as 0 or 1


Condition flags

N Negative - The most significant bit of the result, 1 if the result is negative otherwise 0.
Z Zero     - 1 if the result of the instruction is zero, 0 otherwise.
C Carry    - 1 if the instruction results in a carry condition, for example an unsigned overflow that is the result of an addition.
V Overflow - 1 if the instruction results in an overflow condition, for example a signed overflow that is the result of an addition.

Mnemonic                        Condition     Exactly opposite
   AL    Always                    Any           NOP
   CC    Carry clear               C=0           CS
   CS    Carry set                 C=1           CC
   EQ    Equal                     Z=1           NE
   GE    Greater than or equal     N=V           LT
   GT    Greater than              N=V and Z=0   LE
   HI    Higher (unsigned)         C=1 and Z=0   LS
   LE    Less than or equal        N<>V or Z=1   GT
   LS    Lower or same (unsigned)  C=0  or Z=1   HI
   LT    Less than                 N<>V          GE
   MI    Negative                  N=1           PL
   NE    Not equal                 Z=0           EQ
   PL    Positive or zero          N=0           MI
   VC    Overflow clear            V=0           VS
   VS    Overflow set              V=1           VC
   LO    Lower (unsigned)          Alias for CC  HS
   HS    Higher/same (unsigned)    Alias for CS  LO
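
The table can be checked mechanically. The Python sketch below evaluates each mnemonic against a set of N, Z, C, V flags. `flags_from_sub` is a hypothetical helper that produces the flags of a 64-bit subtraction, using the ARM convention that carry set after a subtraction means no borrow occurred.

```python
from collections import namedtuple

WIDTH = 64                  # assumed register width
MASK = (1 << WIDTH) - 1
Flags = namedtuple("Flags", "n z c v")


def flags_from_sub(a, b):
    """Flags after the 64-bit subtraction a - b (hypothetical helper)."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    sign_a, sign_b = a >> (WIDTH - 1), b >> (WIDTH - 1)
    n = result >> (WIDTH - 1)
    z = int(result == 0)
    c = int(a >= b)                           # carry set means no borrow
    v = int(sign_a != sign_b and n != sign_a)  # signed overflow
    return Flags(n, z, c, v)


CONDITIONS = {
    "AL": lambda f: True,
    "CC": lambda f: f.c == 0,
    "CS": lambda f: f.c == 1,
    "EQ": lambda f: f.z == 1,
    "GE": lambda f: f.n == f.v,
    "GT": lambda f: f.n == f.v and f.z == 0,
    "HI": lambda f: f.c == 1 and f.z == 0,
    "LE": lambda f: f.n != f.v or f.z == 1,
    "LS": lambda f: f.c == 0 or f.z == 1,
    "LT": lambda f: f.n != f.v,
    "MI": lambda f: f.n == 1,
    "NE": lambda f: f.z == 0,
    "PL": lambda f: f.n == 0,
    "VC": lambda f: f.v == 0,
    "VS": lambda f: f.v == 1,
}
CONDITIONS["LO"] = CONDITIONS["CC"]   # unsigned aliases
CONDITIONS["HS"] = CONDITIONS["CS"]


def holds(mnemonic, flags):
    """True if the condition named by mnemonic is satisfied by flags."""
    return CONDITIONS[mnemonic.upper()](flags)
```

Comparing -1 with 1 shows why the signed and unsigned conditions are separate: `holds("LT", flags_from_sub(MASK, 1))` and `holds("HI", flags_from_sub(MASK, 1))` are both true, because the all-ones pattern is -1 when signed and the largest value when unsigned.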

Memory access

[adr].32 = 0
[adr].64 = {xpos,ypos}    Multiple variables
b = [addr].s16
b = [addr1 + addr2].s16
arr[idx].64 = [addr2]
arr[idx]!64 = [addr2]     Increment idx by 8

push {var-list}               Access the system stack
pop {var-list}
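
A possible reading of the sized accesses, sketched in Python over a `bytearray`. It assumes that the suffix is a width in bits, that memory is little-endian and byte-addressed, and that an `s` prefix (as in `.s16`) sign-extends on load. The `store` and `load` helpers are hypothetical, and neither the multi-variable store nor the `!` post-increment form is modelled.

```python
def store(mem, addr, bits, value):
    """Model of [addr].<bits> = value: write the low bits of value."""
    size = bits // 8
    truncated = value & ((1 << bits) - 1)
    mem[addr:addr + size] = truncated.to_bytes(size, "little")


def load(mem, addr, bits, signed=False):
    """Model of b = [addr].<bits> or, with signed=True, [addr].s<bits>."""
    size = bits // 8
    return int.from_bytes(mem[addr:addr + size], "little", signed=signed)


mem = bytearray(32)
store(mem, 0, 32, 0)                  # [adr].32 = 0
store(mem, 8, 16, -2)                 # the stored pattern is 0xFFFE
b = load(mem, 8, 16, signed=True)     # b = [addr].s16 gives -2
```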

Labels

Anything that starts in the first column of a line is a label. Labels are local to the function they are defined in. To reach a label in another function, a fully qualified name is required.

label    x = 3
   notalabel
         a = adr bootloader.var2  ; Fully qualified label name

Data structures

String
Array
BitArray
List
Hashtable
Set
Tree

Memory manager

Implementation

Examples

Print value in hexadecimal

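toHexStringNLZ:                      // x0 = number, x1 = buffer address
   clz     x11,x0                    // x11 = number of leading zero bits
   subs    x11,x11,#64               // x11 = -(significant bits), Z set if x0 = 0
   sub     x1,x1,x11,asr #2          // advance x1 by the digit count, rounded up
   cinc    x1,x1,eq                  // reserve one digit for the value 0
   strb    wzr,[x1]                  // write the terminating zero byte
.loop:
   ubfx    x10,x0,#0,#4              // x10 = lowest 4 bits
   cmp     x10,#9
   add     x10,x10,#'0'              // digit as '0'..'9'
   add     x11,x10,#7                // digit as 'A'..'F'
   csel    x11,x10,x11,ls            // choose according to the comparison with 9
   strb    w11,[x1,#-1]!             // pre-decrement x1 and store the digit
   lsr     x0,x0,#4                  // move on to the next 4 bits
   cbnz    x0,.loop                  // repeat while any bits remain
   ret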

tohhex adr number
   leadingZeros = clz number
   remainingBits = 64 - leadingZeros
   remainingDigits = (remainingBits + 3) / 4  ; round up to whole hex digits
   adr = adr + remainingDigits

   if remainingDigits = 0
       adr += 1
   [adr].8 = 0

   dowhile number <> 0
       digit = number.[3..0]
       number = number >> 4

       if digit <= 9
           digit = digit + '0'
       else
           digit = digit + '0' + 7

       adr -= 1
       [adr].8 = digit

tohhex adr number
   leadingZeros = clz number
   remainingBits = 64 <-> leadingZeros
   remainingDigits = (remainingBits + 3) / 4  ; round up to whole hex digits
   adr = adr + remainingDigits
   adr = <eq> ? adr + 1 : adr

   [adr].8 = 0

   dowhile number <> 0
       digit = number.[3..0]
       number = number >> 4

       if digit <= 9
           digit = digit + '0'
       else
           digit = digit + '0' + 7

       adr -= 1
       [adr].8 = digit
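
For cross-checking the routines above, the same algorithm can be written as a small Python reference model. This is a sketch and not part of the proposed language: `to_hex_nlz` is a hypothetical name, the number is assumed to be 64 bits wide, and the result is returned as a zero-terminated byte string instead of being written through a pointer.

```python
WIDTH = 64                  # assumed register width
MASK = (1 << WIDTH) - 1


def to_hex_nlz(number):
    """Hex digits of number without leading zeros, plus a zero terminator."""
    number &= MASK
    remaining_bits = number.bit_length()      # same as 64 - clz(number)
    digits = (remaining_bits + 3) // 4        # round up to whole hex digits
    if digits == 0:
        digits = 1                            # the value 0 still prints "0"
    buf = bytearray(digits + 1)               # the last byte stays 0
    pos = digits
    while True:                               # runs at least once, like dowhile
        digit = number & 0xF
        number >>= 4
        digit += ord("0") if digit <= 9 else ord("0") + 7
        pos -= 1
        buf[pos] = digit
        if number == 0:
            break
    return bytes(buf)
```

The rounding step matters for values whose bit count is not a multiple of four: 0x10 has five significant bits but needs two digits.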

Add with carry

x = y + z + <c>
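
The typical use of add with carry is arithmetic on values wider than one register. The Python sketch below models a 128-bit addition built from two 64-bit additions, with the carry of the low half fed into the high half. The helper names are hypothetical and the 64-bit width is an assumption.

```python
WIDTH = 64                  # assumed register width
MASK = (1 << WIDTH) - 1


def add_with_carry(y, z, carry_in=0):
    """Return (sum truncated to 64 bits, carry out) of y + z + carry_in."""
    total = (y & MASK) + (z & MASK) + carry_in
    return total & MASK, total >> WIDTH


def add128(a_lo, a_hi, b_lo, b_hi):
    """128-bit addition from two 64-bit halves."""
    lo, carry = add_with_carry(a_lo, b_lo)        # lo = a_lo <+> b_lo
    hi, _ = add_with_carry(a_hi, b_hi, carry)     # hi = a_hi + b_hi + <c>
    return lo, hi
```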

@@ Line 7: / Line 7: @@
 ==Open questions==
-#What is the default data size, 32 or 64 bit?
-#How to specify SIMD and floating-point operations?
+* What is the default data size, 32 or 64 bit?
-#What is the simplest type system that will work?
+** Pros for 32-bit:
-#Which instruction sets should be supported?
+*** ints are usually 32-bit in C and C++, even on 64-bit machines so may be more in line with developer expectations
+*** may yield faster code
+** Pros for 64-bit:
+*** Address locations / pointers need to be 64-bit, so 64-bit int may make it easier / more logical to do address / pointer calculations
+*** Attempt to use a 32-bit int as a pointer should probably give a compiler error. With 64-bit ints, such errors may be more rare
+* How to specify floating-point operations?
+** Options:
+*** Number with '''.''' infers float
+**** '''myvar = 1.0'''
+*** '''.f''' forces float
+**** '''myvar.f = 1'''
+* How to handle expressions involving float?
+** Options:
+*** Handle like in C and C++, where any int that becomes involved in float calculation is "upgraded" to a float first and the result becomes float
+* Which instruction sets should be supported?
+** Options:
+*** Any that the underlying assembler support. So, anything that appears to be instructions is just passed through to the assembler after name to register resolve
+* How to specify SIMD operations?
+** Options:
+*** Use language support for assembly instruction pass through and use SIMD instructions directly
+* What is the simplest type system that will work?
 ==Language features==
 ===bit slicing===
-     index-index
+     index index
          a = n.[10..2]
-     index-lsb
+     index lsb
          a = n.[10..]
-     msb-index
+     msb index
          a = n.[..2]
-     index-count
+     index count
          a = n.[10,2]
-     msbbase
+     msb-index index
          a = n.[-2..]  strip the two most significant bits
      index
@@ Line 30: / Line 55: @@
   Using python operator precedence with assembly style additions
   ( ) Custom order of operations, '''a = (a + b) * c'''
-  +   addition
+  Arithmetic
- <+> addition and set condition codes on result
+   +   addition
- -
+   <+> addition and set condition codes on result
- *
+   -
- /
+   *
- < > <= >= <>
+   /
- &   and
- |   or
- ^   eor
- &~  bic
- |~  orn
- ^~  eon
- ~   not
-  <<  lsl
+  Comparisons
-  >>  lsr
+   =
- >>> asr
+   <
- >|> ror
+   >
+   <=
+   >=
+   <>
+  Bitwise logical
+   Normal  Alternative?
+   &       and
+   |       or
+   ^       eor
+   &~      bic
+   |~      orn
+   ^~      eon
+   ~       not
+ Shifts
+   Normal  High precedence
+   <<      lsl
+   >>      lsr
+   >>>     asr
+   >|>     ror
 === Built in functions ===
@@ Line 55: / Line 93: @@
   rev    swap endian
   cnt    population count
- cset   conditionally set 0 or 1
- csetm  conditionally set 0 or -1
   max
   min
   swap
   adr    returns the address of a label
+ Conditional functions                 true false
+   CSEL  cond x y   select               x | y
+   CSINC cond x y   select increment   x+1 | y
+   CSINV cond x y   select inversion    ~x | y
+   CSNEG cond x y   select negation     -x | y
+   CINC  cond x     increment          x+1 | x
+   CINV  cond x     invert              ~x | x
+   CNEG  cond x     negate              -x | x
+   CSET  cond       set                  1 | 0
+   CSETM cond       set mask            -1 | 0
 === Program flow ===
-     call label {reg-list}
+ < reg-list = > label <reg-list>
-      return {reg-list}
+      Function call
-      again
-      exit
+ return <reg-list>
-      bcc
+      Return from function with zero or more variables
-      while
-      dowhile
+ again
-      for
+      Jump to the top of the loop
-        for x
-        for x = 1 to y
+ break
-     if then else
+      Jump to after the end of the loop
+ bcc
+      Conditional branch to a label
+ while cond
+      Execute the loop zero or more times while cond = true
+ dowhile cond
+      Execute the loop one or more times while cond = true
+ match
+ for x
+ for x = 1 to y
+ if then else
 ===Conditional===
   <op> set condition flags on the result of op, '''a = b <+> c'''
-  a = cond ? val1 : val2
- a = <eq> ? x+1 : y
   a = b > c ? 10 : 20
+ a = 10 if b > c else 20
+ a = if b > c then 10 else 20
   <le> returns -1 if the condition flags matches the condition for Less than or equal otherwise 0.
-  <c> returns the Carry flag (0|1)
+  <c> returns the Carry flag as 0 or 1
+.
 ====Condition flags====
   N Negative - The most significant bit of the result, 1 if the result is negative otherwise 0.