Abstracted assembler
From ScienceZero
Latest revision as of 20:49, 9 October 2017
Most low-level details can be found in the ARM Architecture Reference Manual for ARMv8.
Goals
- Let the compiler allocate registers
- Make it possible to use readable variable names in a safe way
- Add a thin level of abstraction that does not require a full compiler to generate good code
Open questions
- What is the default data size, 32 or 64 bit?
  - Pros for 32-bit:
    - ints are usually 32-bit in C and C++, even on 64-bit machines, so this may be more in line with developer expectations
    - may yield faster code
  - Pros for 64-bit:
    - Addresses / pointers need to be 64-bit, so 64-bit ints may make address / pointer calculations easier and more logical
    - An attempt to use a 32-bit int as a pointer should probably give a compiler error. With 64-bit ints, such errors may be rarer
- How to specify floating-point operations?
  - Options:
    - A number containing . infers float
      - myvar = 1.0
    - .f forces float
      - myvar.f = 1
- How to handle expressions involving float?
  - Options:
    - Handle them as in C and C++, where any int involved in a float calculation is first "upgraded" to a float and the result is a float
- Which instruction sets should be supported?
  - Options:
    - Any that the underlying assembler supports. Anything that appears to be an instruction is passed through to the assembler after name-to-register resolution
- How to specify SIMD operations?
  - Options:
    - Use the language's support for assembly instruction pass-through and use SIMD instructions directly
- What is the simplest type system that will work?
Language features
bit slicing
         index  index
  a = n.[10..2]
         index  lsb
  a = n.[10..]
         msb    index
  a = n.[..2]
         index  count
  a = n.[10,2]
         msb-index  index
  a = n.[-2..]        strip the two most significant bits
           index
  bit = n.[index]
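The two-index and single-index forms can be modelled in Python as a sketch. It assumes that n.[hi..lo] extracts bits hi down to lo inclusive and right-aligns them, which matches how digit = number.[3..0] pairs with ubfx x10,x0,#0,#4 in the examples. The helper names bit_slice and bit_at are hypothetical.

```python
def bit_slice(n: int, hi: int, lo: int) -> int:
    """Model of n.[hi..lo]: bits hi..lo inclusive, right-aligned (assumed semantics)."""
    if lo < 0 or hi < lo:
        raise ValueError("expected hi >= lo >= 0")
    width = hi - lo + 1
    return (n >> lo) & ((1 << width) - 1)


def bit_at(n: int, index: int) -> int:
    """Model of n.[index]: a single bit as 0 or 1."""
    return (n >> index) & 1
```

The other forms (open-ended ranges, index plus count, negative indices) are left out because their exact bit ranges are not spelled out here.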
Common operators
 Using Python operator precedence with assembly-style additions

 ( )   Custom order of operations, a = (a + b) * c

 Arithmetic
 +     addition
 <+>   addition and set condition codes on result
 -
 *
 /

 Comparisons
 =
 <
 >
 <=
 >=
 <>

 Bitwise logical
 Normal  Alternative?
 &       and
 |       or
 ^       eor
 &~      bic
 |~      orn
 ^~      eon
 ~       not

 Shifts
 Normal  High precedence
 <<      lsl
 >>      lsr
 >>>     asr
 >|>     ror
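For readers less familiar with the ARM mnemonics in the operator table, the following Python sketch models the less common ones. It assumes 64-bit operands (the default data size is still an open question), and the function names simply reuse the mnemonics.

```python
MASK64 = (1 << 64) - 1  # assumption: operands are 64 bits wide


def bic(a: int, b: int) -> int:
    """a &~ b : clear in a every bit that is set in b."""
    return a & ~b & MASK64


def orn(a: int, b: int) -> int:
    """a |~ b : OR a with the complement of b."""
    return (a | ~b) & MASK64


def eon(a: int, b: int) -> int:
    """a ^~ b : exclusive OR of a with the complement of b."""
    return (a ^ ~b) & MASK64


def lsr(a: int, n: int) -> int:
    """a >> n : logical shift right, zero-filling from the top."""
    return (a & MASK64) >> n


def asr(a: int, n: int) -> int:
    """a >>> n : arithmetic shift right, copying the sign bit."""
    a &= MASK64
    signed = a - (1 << 64) if a >> 63 else a
    return (signed >> n) & MASK64


def ror(a: int, n: int) -> int:
    """a >|> n : rotate right within 64 bits."""
    a &= MASK64
    n %= 64
    return ((a >> n) | (a << (64 - n))) & MASK64
```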
Built in functions
 clz    count leading zeros
 cls    count leading signs
 rbit   mirror bits
 rev    swap endian
 cnt    population count
 max
 min
 swap
 adr    returns the address of a label

 Conditional functions                 true  false
 CSEL   cond x y   select              x   | y
 CSINC  cond x y   select increment    x+1 | y
 CSINV  cond x y   select inversion    ~x  | y
 CSNEG  cond x y   select negation     -x  | y
 CINC   cond x     increment           x+1 | x
 CINV   cond x     invert              ~x  | x
 CNEG   cond x     negate              -x  | x
 CSET   cond       set                 1   | 0
 CSETM  cond       set mask            -1  | 0
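The bit-manipulation built-ins follow the ARM instructions of the same name. As a rough reference model, assuming 64-bit operands, they can be written in Python as follows. cls is taken to count the bits after the sign bit that are equal to it, as the ARM instruction does.

```python
MASK64 = (1 << 64) - 1  # assumption: operands are 64 bits wide


def clz(x: int) -> int:
    """Count leading zero bits (64 for x == 0)."""
    return 64 - (x & MASK64).bit_length()


def cls(x: int) -> int:
    """Count the bits after the sign bit that are equal to the sign bit."""
    x &= MASK64
    sign_fill = MASK64 if x >> 63 else 0
    shifted = (x >> 1) | (sign_fill & (1 << 63))  # arithmetic shift right by 1
    return clz(x ^ shifted) - 1


def rbit(x: int) -> int:
    """Mirror the bit order."""
    return int(format(x & MASK64, "064b")[::-1], 2)


def rev(x: int) -> int:
    """Swap the byte order (endianness)."""
    return int.from_bytes((x & MASK64).to_bytes(8, "little"), "big")


def cnt(x: int) -> int:
    """Population count: number of set bits."""
    return bin(x & MASK64).count("1")
```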
Program flow
 < reg-list = > label <reg-list>
     Function call

 return <reg-list>
     Return from function with zero or more variables

 again
     Jump to the top of the loop

 break
     Jump to after the end of the loop

 bcc
     Conditional branch to a label

 while cond
     Execute the loop zero or more times while cond = true

 dowhile cond
     Execute the loop one or more times while cond = true

 match

 for x

 for x = 1 to y

 if then else
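The difference between while and dowhile is only where the condition is tested. A small Python model, with hypothetical helper names, makes the "zero or more" versus "one or more" distinction concrete:

```python
def while_loop(cond, body) -> int:
    """Test cond before each pass; the body may run zero times."""
    passes = 0
    while cond():
        body()
        passes += 1
    return passes


def dowhile_loop(cond, body) -> int:
    """Run the body first, then test cond; the body runs at least once."""
    passes = 0
    while True:
        body()
        passes += 1
        if not cond():
            break
    return passes
```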
Conditional
 <op>    set condition flags on the result of op, a = b <+> c

 a = b > c ? 10 : 20
 a = 10 if b > c else 20
 a = if b > c then 10 else 20

 <le>    returns -1 if the condition flags match the condition for Less than or equal, otherwise 0.
 <c>     returns the Carry flag as 0 or 1
Condition flags
 N  Negative - The most significant bit of the result, 1 if the result is negative, otherwise 0.
 Z  Zero     - 1 if the result of the instruction is zero, 0 otherwise.
 C  Carry    - 1 if the instruction results in a carry condition, for example an unsigned overflow that is the result of an addition.
 V  Overflow - 1 if the instruction results in an overflow condition, for example a signed overflow that is the result of an addition.

 Mnemonic  Condition                 Flags           Exactly opposite
 AL        Always                    Any             NOP
 CC        Carry clear               C=0             CS
 CS        Carry set                 C=1             CC
 EQ        Equal                     Z=1             NE
 GE        Greater than or equal     N=V             LT
 GT        Greater than              N=V and Z=0     LE
 HI        Higher (unsigned)         C=1 and Z=0     LS
 LE        Less than or equal        N<>V or Z=1     GT
 LS        Lower or same (unsigned)  C=0 or Z=1      HI
 LT        Less than                 N<>V            GE
 MI        Negative                  N=1             PL
 NE        Not equal                 Z=0             EQ
 PL        Positive                  N=0             MI
 VC        Overflow clear            V=0             VS
 VS        Overflow set              V=1             VC
 LO        Lower (unsigned)          Alias for CC    HS
 HS        Higher/same (unsigned)    Alias for CS    LO
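The condition table can be read as a set of predicates over the N, Z, C and V flags. The Python sketch below encodes it directly. The helper flags_after_sub is an addition for illustration: it assumes a 64-bit subtraction that sets the flags the way ARM does for a compare, where C = 1 means no borrow. flag_value models the <le> style of use, returning -1 or 0.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit operands


def flags_after_sub(a: int, b: int):
    """Return (N, Z, C, V) after a 64-bit a - b, ARM style (C = 1 means no borrow)."""
    a &= MASK64
    b &= MASK64
    result = (a - b) & MASK64
    to_signed = lambda x: x - (1 << 64) if x >> 63 else x
    exact = to_signed(a) - to_signed(b)
    n = result >> 63
    z = int(result == 0)
    c = int(a >= b)
    v = int(not -(1 << 63) <= exact < (1 << 63))
    return n, z, c, v


def condition_holds(cond: str, n: int, z: int, c: int, v: int) -> bool:
    """Evaluate a condition mnemonic against the flags, following the table."""
    checks = {
        "AL": True,
        "CC": c == 0, "LO": c == 0,
        "CS": c == 1, "HS": c == 1,
        "EQ": z == 1, "NE": z == 0,
        "GE": n == v, "LT": n != v,
        "GT": n == v and z == 0,
        "LE": n != v or z == 1,
        "HI": c == 1 and z == 0,
        "LS": c == 0 or z == 1,
        "MI": n == 1, "PL": n == 0,
        "VS": v == 1, "VC": v == 0,
    }
    return checks[cond.upper()]


def flag_value(cond: str, n: int, z: int, c: int, v: int) -> int:
    """Model of <cond> used as a value: -1 if the condition holds, otherwise 0."""
    return -1 if condition_holds(cond, n, z, c, v) else 0
```

For example, comparing 3 with 5 leaves N set and C clear, so both LT (signed) and LO (unsigned) hold.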
Memory access
 [adr].32 = 0
 [adr].64 = {xpos,ypos}      Multiple variables
 b = [addr].s16
 b = [addr1 + addr2].s16
 arr[idx].64 = [addr2]
 arr[idx]!64 = [addr2]       Increment idx by 8

 push {var-list}             Access the system stack
 pop {var-list}
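The size suffixes on memory accesses can be illustrated with a small Python model. It assumes that .32 means a 32-bit store, that .s16 means a sign-extended 16-bit load, and that memory is little-endian. The bytearray standing in for memory and the helper names store and load are hypothetical.

```python
def store(mem: bytearray, addr: int, value: int, bits: int) -> None:
    """Model of [addr].bits = value: write the low `bits` bits, little-endian."""
    nbytes = bits // 8
    mem[addr:addr + nbytes] = (value & ((1 << bits) - 1)).to_bytes(nbytes, "little")


def load(mem: bytearray, addr: int, bits: int, signed: bool = False) -> int:
    """Model of x = [addr].bits, or [addr].sbits when signed is True."""
    nbytes = bits // 8
    return int.from_bytes(mem[addr:addr + nbytes], "little", signed=signed)


memory = bytearray(16)            # stand-in for addressable memory
store(memory, 0, 0xFFFE, 16)      # [0].16 = 0xFFFE
store(memory, 4, 0x12345678, 32)  # [4].32 = 0x12345678
```

Reading address 0 back with load(memory, 0, 16, signed=True) yields -2, while the unsigned form yields 0xFFFE.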
Labels
Anything that starts in the first column of a line is a label. Labels are local to the function in which they are defined. To reach labels in other functions, a fully qualified name is required.
 label   x = 3
         notalabel
         a = adr bootloader.var2    ; Fully qualified label name
Data structures
- String
- Array
- BitArray
- List
- Hashtable
- Set
- Tree
Memory manager
Implementation
Examples
Print value in hexadecimal
 toHexStringNLZ:
         clz     x11,x0
         subs    x11,x11,#64
         sub     x1,x1,x11,asr #2
         cinc    x1,x1,eq
         strb    wzr,[x1]
 .loop:  ubfx    x10,x0,#0,#4
         cmp     x10,#9
         add     x10,x10,#'0'
         add     x11,x10,#7
         csel    x11,x10,x11,ls
         strb    w11,[x1,#-1]!
         lsr     x0,x0,#4
         cbnz    x0,.loop
         ret
 tohhex adr number
   leadingZeros = clz number
   remainingBits = 64 - leadingZeros
   remainingDigits = (remainingBits + 3) / 4    ; round up to whole hex digits
   adr = adr + remainingDigits
   if remainingDigits = 0
     adr += 1

   [adr].8 = 0

   dowhile number <> 0
     digit = number.[3..0]
     number = number >> 4

     if digit <= 9
       digit = digit + '0'
     else
       digit = digit + '0' + 7

     adr -= 1
     [adr].8 = digit
 tohhex adr number
   leadingZeros = clz number
   remainingBits = 64 <-> leadingZeros
   remainingDigits = (remainingBits + 3) / 4    ; round up to whole hex digits
   adr = adr + remainingDigits
   adr = <eq> ? adr + 1 : adr

   [adr].8 = 0

   dowhile number <> 0
     digit = number.[3..0]
     number = number >> 4

     if digit <= 9
       digit = digit + '0'
     else
       digit = digit + '0' + 7

     adr -= 1
     [adr].8 = digit
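The hexadecimal conversion above can be checked against a Python model of the same algorithm. This is a sketch, not part of the language: it assumes a 64-bit input, rounds the digit count up to whole hex digits as the asr #2 on the negated bit count does in the assembly version, and returns a string instead of writing through adr. The name to_hex_nlz is hypothetical.

```python
def to_hex_nlz(number: int) -> str:
    """Hexadecimal string with no leading zeros, built from the last digit backwards."""
    number &= (1 << 64) - 1
    leading_zeros = 64 - number.bit_length()      # clz
    remaining_bits = 64 - leading_zeros
    remaining_digits = (remaining_bits + 3) // 4  # round up to whole digits
    if remaining_digits == 0:                     # the value 0 still prints one digit
        remaining_digits = 1

    buf = bytearray(remaining_digits)
    adr = remaining_digits                        # one past the last digit
    while True:                                   # dowhile number <> 0
        digit = number & 0xF                      # number.[3..0]
        number >>= 4
        digit += ord("0") if digit <= 9 else ord("0") + 7
        adr -= 1
        buf[adr] = digit
        if number == 0:
            break
    return buf.decode("ascii")
```

The offset of 7 bridges the gap between '9' and 'A' in ASCII, so digit values 10 to 15 land on 'A' to 'F'.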
Add with carry
x = y + z + <c>
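A typical use of add with carry is arithmetic on values wider than one register. The Python sketch below models a 128-bit addition built from two 64-bit halves: the low halves are added with a flag-setting add (the <+> form), and the high halves then include the carry, as in x = y + z + <c>. The 64-bit width and the helper names are assumptions for illustration.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit registers


def add_with_carry(a: int, b: int, carry_in: int = 0):
    """Return (result, carry_out) of a 64-bit a + b + carry_in."""
    total = (a & MASK64) + (b & MASK64) + carry_in
    return total & MASK64, total >> 64


def add128(a_lo: int, a_hi: int, b_lo: int, b_hi: int):
    """Add two 128-bit values given as (low, high) 64-bit halves."""
    lo, carry = add_with_carry(a_lo, b_lo)      # lo = a_lo <+> b_lo
    hi, _ = add_with_carry(a_hi, b_hi, carry)   # hi = a_hi + b_hi + <c>
    return lo, hi
```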