ARM: Cortex-M3 Thumb-2 instruction set
Latest revision as of 23:19, 5 April 2022
The instruction set of the ARM Cortex-M3 CPU used in the STM32 Microcontroller
Contents
- 1 Hardware registers
- 2 Register names
- 3 Immediate constants
- 4 Parameters
- 5 Optional parameters
- 6 Condition flags
- 7 Shifts
- 8 Thumb-2 instruction set
- 9 The stack
- 10 C language calling convention
- 11 Thumb-2 variable instruction length
- 12 How to enumerate the legal immediate constants for <Operand2>
- 13 Example code
Hardware registers
- R0-R12 General purpose registers
- R13 Used as stack pointer, is also called SP (can be used as a general purpose register with some restrictions)
- R14 Used as link register to keep the return address for fast function calls, also called LR (can be used as a general purpose register)
- R15 This is the program counter, also called PC
Register names
- Rd Destination register
- Rn First operand register (the operation is performed on this register using the second operand, so Rd = Rn - Rm)
- Rm Second operand register
- SP Stack pointer (R13)
- LR Link register (R14)
- PC Program counter (R15)
- <reglist> means a list of registers like {R0, R3, R7-R10} (R7-R10 is the range R7, R8, R9, R10)
Immediate constants
- imm<n> means a constant of n bits (a value that is fixed at assembly time and cannot be changed during execution)
- # tells the assembler that the following is an immediate constant
Parameters
- <x> means always x
- <x|y> means either x or y
Optional parameters
- {x} means x or nothing
- {x|y} means either x or y or nothing
Condition flags
- Some instructions will update the condition flags if <S> (set condition flags) is added to the instruction name
- N Negative Bit 31 of the result
- Z Zero 1 if all bits of the result are 0
- C Carry Carry from the ALU adder, otherwise from the last bit shifted out of the barrel shifter
- V Overflow Signed overflow from the ALU adder, e.g. 0x7fffffff + 0x7fffffff adds two positive numbers, gives a negative result and therefore sets the overflow flag
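As a rough illustration of how the four flags fall out of an addition, the following Python sketch models a 32-bit ADDS. The function name `adds` and the returned tuple layout are made up for this example and are not part of any ARM tool; flags produced by shifts are not modelled.

```python
MASK32 = 0xFFFFFFFF

def adds(rn, rm):
    """Model ADDS Rd, Rn, Rm on 32-bit values; return (result, n, z, c, v)."""
    rn &= MASK32
    rm &= MASK32
    total = rn + rm                    # exact sum, may need 33 bits
    result = total & MASK32            # the 32 bits written to Rd
    n = result >> 31                   # N: bit 31 of the result
    z = int(result == 0)               # Z: 1 if all bits of the result are 0
    c = int(total > MASK32)            # C: carry out of bit 31
    # V: both operands share a sign and the result has the opposite sign
    v = ((rn ^ result) & (rm ^ result)) >> 31
    return result, n, z, c, v
```

Calling `adds(0x7FFFFFFF, 0x7FFFFFFF)` reproduces the overflow case from the list: the result is 0xFFFFFFFE with N and V set and C clear.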
  <cond>   Flag state        Integer ALU / Shifter                 Vector Floating Point coprocessor
  ---------------------------------------------------------------------------------------------------
  EQ       Z = 1             Equal (to zero)                       Equal
  NE       Z = 0             Not equal                             Not equal, or unordered
  CS / HS  C = 1             Carry Set / Unsigned higher or same   Greater than or equal, or unordered
  CC / LO  C = 0             Carry Clear / Unsigned lower          Less than
  MI       N = 1             Negative                              Less than
  PL       N = 0             Positive                              Greater than or equal, or unordered
  VS       V = 1             Overflow                              Unordered (at least one NaN operand)
  VC       V = 0             No overflow                           Not unordered
  HI       C = 1 and Z = 0   Unsigned higher                       Greater than, or unordered
  LS       C = 0 or Z = 1    Unsigned lower or same                Less than or equal
  GE       N = V             Signed greater than or equal          Greater than or equal
  LT       N <> V            Signed less than                      Less than, or unordered
  GT       Z = 0 and N = V   Signed greater than                   Greater than
  LE       Z = 1 or N <> V   Signed less than or equal             Less than or equal, or unordered
  AL       Any               Always (normally omitted)             Always (normally omitted)

  * If a two character condition code is added to the end of the instruction name,
    the assembler will generate the correct IT (If-Then) instructions

    E.g. ADDEQ r0,R0,#1 (execute the instruction if the zero flag is set)
    will be converted by the assembler to
    IT EQ
    ADD r0,R0,#1

  <Operand2> may be one of the following:
  #imm8<<imm5                    One byte shifted left by a constant to form a 32 bit value
  #(imm8 imm8 imm8 imm8)         The same byte copied 4 times to create a 32 bit value
  #(   0 imm8    0 imm8)         Same but two bytes are set to zero
  #(imm8    0 imm8    0)         Same with the other two bytes set to zero
  Rm                             Normal register operation
  Rm, <LSL|LSR|ASR|ROR> #imm5    Register operation with constant shift
  Rm, RRX                        Register operation with rotate right with extend

  <S>  Update the condition flags after the instruction has executed
       If using this together with condition codes, it is in form of: ADDSEQ R0,R1,R2
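The "Flag state" column of the <cond> table can be written down directly as a lookup. This is only a sketch for checking one's reading of the table; `cond_passed` is a hypothetical helper name, and the flags are passed in as plain 0/1 integers.

```python
def cond_passed(cond, n, z, c, v):
    """Return True if condition code `cond` passes for flag values n, z, c, v (each 0 or 1)."""
    checks = {
        "EQ": z == 1,            "NE": z == 0,
        "CS": c == 1,            "HS": c == 1,
        "CC": c == 0,            "LO": c == 0,
        "MI": n == 1,            "PL": n == 0,
        "VS": v == 1,            "VC": v == 0,
        "HI": c == 1 and z == 0, "LS": c == 0 or z == 1,
        "GE": n == v,            "LT": n != v,
        "GT": z == 0 and n == v, "LE": z == 1 or n != v,
        "AL": True,
    }
    return checks[cond.upper()]
```

For example, comparing 1 with 2 leaves N = 1, Z = 0, C = 0, V = 0, so `cond_passed("LT", 1, 0, 0, 0)` and `cond_passed("LO", 1, 0, 0, 0)` both hold while GE and HI do not.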
Shifts
  LSL   Logical shift left         0xFFFFFF00 LSL #4 = 0xFFFFF000 (shifts in zero at the bottom)
  LSR   Logical shift right        0xFFFFFF00 LSR #4 = 0x0FFFFFF0 (shifts in zero at the top)
  ASR   Arithmetic shift right     0xFFFFFF00 ASR #4 = 0xFFFFFFF0 (shifts in the original bit 31 at the top)
  ROR   Rotate right               0x12345678 ROR #4 = 0x81234567
  RRX   Rotate right with extend   Rotates the operand one bit to the right through the carry as a 33 bit value,
                                   Carry -> operand -> Carry
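The shift examples above can be checked with a small Python model. The function names mirror the mnemonics but are otherwise invented for this sketch; shift amounts are assumed to be in the range 0 to 31, and only `rrx` models the carry.

```python
MASK32 = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left: zeros enter at the bottom."""
    return (x << n) & MASK32

def lsr(x, n):
    """Logical shift right: zeros enter at the top."""
    return (x & MASK32) >> n

def asr(x, n):
    """Arithmetic shift right: copies of the original bit 31 enter at the top."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32                 # reinterpret as a negative number
    return (x >> n) & MASK32

def ror(x, n):
    """Rotate right: bits leaving at the bottom re-enter at the top."""
    x &= MASK32
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rrx(x, carry):
    """Rotate right by one through the carry; returns (result, new_carry)."""
    x &= MASK32
    return ((carry & 1) << 31) | (x >> 1), x & 1
```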
Thumb-2 instruction set
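  MOV{S} Rd, <Operand2>                 Move                          Rd = Operand2
  MVN{S} Rd, <Operand2>                 Move not                      Rd = 0xFFFFFFFF EOR Operand2
  MOV Rd, #<imm16>                      Move wide                     Rd = imm16
  MOVT Rd, #<imm16>                     Move top                      Rd[31:16] = imm16, the constant is put in the upper 16 bits of Rd,
                                                                      the lower 16 bits are unaffected

  ADD{S} Rd, Rn, <Operand2>             Add                           Rd = Rn + Operand2
  ADD Rd, Rn, #<imm12>                  Add wide                      Rd = Rn + imm12
  ADC{S} Rd, Rn, <Operand2>             Add with carry                Rd = Rn + Operand2 + Carry

  SUB{S} Rd, Rn, <Operand2>             Subtract                      Rd = Rn - Operand2
  SBC{S} Rd, Rn, <Operand2>             Subtract with carry           Rd = Rn - Operand2 - (1 - Carry)
  SUB Rd, Rn, #<imm12>                  Subtract wide                 Rd = Rn - imm12
  RSB{S} Rd, Rn, <Operand2>             Reverse subtract              Rd = Operand2 - Rn
  RSC{S} Rd, Rn, <Operand2>             Reverse subtract with carry   Rd = Operand2 - Rn - (1 - Carry)
                                        (ARM instruction set only, RSC has no Thumb-2 encoding)

  MUL{S} Rd, Rm, Rs                     Multiply                      Rd = Rm * Rs, returns the 32 least significant bits
  MLA Rd, Rm, Rs, Rn                    Multiply and accumulate       Rd = Rn + (Rm * Rs), returns the 32 least significant bits
  MLS Rd, Rm, Rs, Rn                    Multiply and subtract         Rd = Rn - (Rm * Rs), returns the 32 least significant bits
  UMULL RdLo, RdHi, Rm, Rs              Multiply unsigned long, 64 bit result
  UMLAL RdLo, RdHi, Rm, Rs              Multiply unsigned accumulate long, 64 bit result

  SDIV Rd, Rn, Rm                       Signed division               0x80000000 / 0xFFFFFFFF = 0x80000000, Rn / 0 = 0
  UDIV Rd, Rn, Rm                       Unsigned division             Rn / 0 = 0

  ASR{S} Rd, Rm, <Rs|#imm5>             Arithmetic shift right, canonical form of MOV{S} Rd, Rm, ASR <Rs|#imm5>
  LSL{S} Rd, Rm, <Rs|#imm5>             Logical shift left
  LSR{S} Rd, Rm, <Rs|#imm5>             Logical shift right
  ROR{S} Rd, Rm, <Rs|#imm5>             Rotate right
  RRX{S} Rd, Rm                         Rotate right with extend, uses Carry as a 33rd bit
  CLZ Rd, Rm                            Count leading zeros
  RBIT Rd, Rm                           Reverse bits in register, so bit 0 becomes bit 31
  REV Rd, Rm                            Byte-Reverse Word, reverses the byte order in a 32-bit register
  REV16 Rd, Rm                          Byte-Reverse Packed Halfword, reverses the byte order in each 16-bit halfword of a
                                        32-bit register
  REVSH Rd, Rm                          Byte-Reverse Signed Halfword, reverses the byte order in the lower 16 bits of a
                                        32-bit register, and sign extends to 32 bit
  UXTB Rd, Rm{, <ROR #><0|8|16|24>}     Unsigned Extend Byte, extracts an 8-bit value from a register, zero extends it to 32 bit
  UXTH Rd, Rm{, <ROR #><0|8|16|24>}     Unsigned Extend Halfword, extracts a 16-bit value from a register, zero extends it to 32 bit

  CMP Rn, <Operand2>                    Does the same as SUBS Rd, Rn, <Operand2> but the result is not written to Rd,
                                        only the condition flags are updated
  CMN Rn, <Operand2>                    Rn + <Operand2>
  TST Rn, <Operand2>                    Rn AND <Operand2>
  TEQ Rn, <Operand2>                    Rn EOR <Operand2>

  AND{S} Rd, Rn, <Operand2>             Bitwise AND, Rd = Rn AND <Operand2>
  ORR{S} Rd, Rn, <Operand2>             Bitwise OR, Rd = Rn OR <Operand2>
  EOR{S} Rd, Rn, <Operand2>             Bitwise Exclusive-OR, Rd = Rn EOR <Operand2>
  ORN{S} Rd, Rn, <Operand2>             Or not, Rd = Rn OR NOT <Operand2>
  BIC{S} Rd, Rn, <Operand2>             Bit clear, Rd = Rn AND NOT <Operand2>

  BFC Rd, #<lsb>, #<width>              Bit field clear
  BFI Rd, Rn, #<lsb>, #<width>          Bit field insert
  SBFX Rd, Rn, #<lsb>, #<width>         Signed bit field extract
  UBFX Rd, Rn, #<lsb>, #<width>         Unsigned bit field extract

  <Address> can be one of the following Example                       Action
  [Rn {, #<-imm8|+imm12>}]              LDR R0, [R1, #8]              R0 = [R1 + 8]
  [Rn {, #<+-imm8>}]!                   LDR R0, [R1, #8]!             R1 = R1 + 8, R0 = [R1]
  [Rn], #<+-imm8>                       LDR R0, [R1], #4              R0 = [R1], R1 = R1 + 4
  [Rn, Rm {, <LSL #0-3>}]               STR R0, [R1, R2, LSL #2]      [R1 + (R2 * 4)] = R0
  <label>

  LDR Rd, <Address>                     Load 32 bit word from memory
  LDRH Rd, <Address>                    Load 16 bit half-word from memory
  LDRSH Rd, <Address>                   Load signed 16 bit half-word from memory
  LDRB Rd, <Address>                    Load 8 bit byte from memory
  LDRSB Rd, <Address>                   Load signed 8 bit byte from memory

  STR Rd, <Address>                     Store 32 bit word to memory
  STRH Rd, <Address>                    Store 16 bit half-word to memory
  STRB Rd, <Address>                    Store 8 bit byte to memory

  <AddressDual> can be one of the following
  [<Rn>{, #+/-<imm8>}]
  [<Rn>], #+/-<imm8>
  [<Rn>, #+/-<imm8>]!

  LDRD<c> <Rt>, <Rt2>, <label>          Load register dual, literal (range -1020 to 1020)
  LDRD<c> <Rt>, <Rt2>, <AddressDual>    Load register dual
  STRD<c> <Rt>, <Rt2>, <AddressDual>    Store register dual

  LDM{IA|IB|DA|DB} Rn{!}, <reglist>     Load/store multiple, can transfer any list of registers,
                                        ! will update Rn to point to the address after/before the last register
  STM{IA|IB|DA|DB} Rn{!}, <reglist>     IA = increment after (default), IB = increment before,
                                        DA = decrement after, DB = decrement before (Action on address)
                                        (Thumb-2 encodes only IA and DB, IB and DA exist only in the ARM instruction set)

  IT{pattern} {cond}                    If-then, sets the execution conditions for up to 4 following instructions
                                        <pattern> can be any combination of up to three T(then) and E(else) letters,
                                        the first instruction following IT is always cond (T)
                                        Instructions that can modify the program counter must be last in an IT block

  B <label>                             Unconditional jump
  BL <label>                            R14 = address of next instruction, then jump to label
  BX Rm                                 Branch and exchange (instruction sets), normal branch on Thumb-2,
                                        use it to return from a function like BX LR
  BLX Rm                                R14 = address of next instruction, then jump to Rm
  CB{N}Z Rn,<label>                     Compare branch, branch forward if a register is (not) zero
  TBB [Rn, Rm]                          Table branch, loads a byte from (Rn + Rm) and adds twice its value to the program counter
  TBH [Rn, Rm, LSL #1]                  Loads a half word (16 bit) from (Rn + (Rm << 1)) and adds twice its value to the PC
  PUSH <reglist>                        Push registers on the stack pointed to by SP, decrement address before each store,
                                        lowest-numbered register to the lowest memory address
  POP <reglist>                         Restore them again, increment address after each load

  MRS Rd, <PSR>                         Rd = PSR (processor status register)
  MSR <PSR>_<fields>, Rm                PSR = Rm (selected bytes only)
  MSR <PSR>_<fields>, #<imm8m>          PSR = immed_8r (selected bytes only)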
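A few of the less obvious rows in the table (MOVT, SDIV and the bit field instructions) may be easier to follow as a Python model. This is a sketch under the semantics stated in the table, with helper names invented for the purpose; division by zero is assumed to return 0 as the table says, rather than trapping.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def movt(rd, imm16):
    """MOVT: replace the upper 16 bits of Rd, keep the lower 16 bits."""
    return ((imm16 & 0xFFFF) << 16) | (rd & 0xFFFF)

def sdiv(rn, rm):
    """SDIV: signed division rounding towards zero; Rn / 0 = 0."""
    a, b = to_signed(rn), to_signed(rm)
    if b == 0:
        return 0
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q & MASK32                  # wraps 0x80000000 / -1 back to 0x80000000

def ubfx(rn, lsb, width):
    """UBFX: extract `width` bits starting at bit `lsb`, zero extended."""
    return ((rn & MASK32) >> lsb) & ((1 << width) - 1)

def sbfx(rn, lsb, width):
    """SBFX: as UBFX, but sign extended from the top bit of the field."""
    field = ubfx(rn, lsb, width)
    if field & (1 << (width - 1)):
        field -= 1 << width
    return field & MASK32

def bfi(rd, rn, lsb, width):
    """BFI: copy the low `width` bits of Rn into Rd starting at bit `lsb`."""
    mask = ((1 << width) - 1) << lsb
    return (rd & ~mask & MASK32) | ((rn << lsb) & mask)
```

A typical use of `movt` is building a 32-bit constant in two steps: a move wide of the low half followed by a move top of the high half.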
The stack
A stack is a last in, first out data structure used to store temporary variables and data. It grows from high to low memory addresses, and SP (R13) points to the last piece of data written. A set of registers is transferred with the lowest-numbered register at the lowest address. Use the PUSH and POP instructions to transfer any set of registers chosen from R0-R12, LR and PC.
If SP contains 0x8000 and we execute the instruction PUSH {R0,R1,R7} the result will be
  0x8000  ..   <- Original address in SP
  0x7ffc  R7
  0x7ff8  R1
  0x7ff4  R0   <- SP points here now
If we now execute POP {R10-R12}
  0x8000  ..          <- SP points here now
  0x7ffc  R7 -> R12
  0x7ff8  R1 -> R11
  0x7ff4  R0 -> R10   <- Original address in SP
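The PUSH/POP walk-through above can be replayed with a toy model in Python. The register file and memory are plain dictionaries here, and the helper names `push`, `pop` and `reg_number` are assumptions of this sketch; only R-numbered registers are handled.

```python
def reg_number(name):
    """Turn a register name such as 'R10' into its number."""
    return int(name[1:])

def push(regs, mem, reglist):
    """PUSH: full descending stack, lowest-numbered register at the lowest address."""
    sp = regs["SP"] - 4 * len(reglist)
    for i, name in enumerate(sorted(reglist, key=reg_number)):
        mem[sp + 4 * i] = regs[name]
    regs["SP"] = sp                    # SP points at the last word written

def pop(regs, mem, reglist):
    """POP: load in the same order, then move SP back up."""
    sp = regs["SP"]
    for i, name in enumerate(sorted(reglist, key=reg_number)):
        regs[name] = mem[sp + 4 * i]
    regs["SP"] = sp + 4 * len(reglist)

regs = {"SP": 0x8000, "R0": 0xA0, "R1": 0xA1, "R7": 0xA7,
        "R10": 0, "R11": 0, "R12": 0}
mem = {}
push(regs, mem, ["R0", "R1", "R7"])    # SP drops to 0x7ff4
pop(regs, mem, ["R10", "R11", "R12"])  # SP returns to 0x8000
```

After the two calls, R10, R11 and R12 hold the old values of R0, R1 and R7, matching the diagrams.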
C language calling convention
  Parameters are passed and returned in R0-R3
  A double-word sized type is passed in two consecutive registers.
  A 128-bit containerized vector is passed in four consecutive registers.
  The content of the registers is as if the value had been loaded from memory with a single LDM instruction
  A subroutine must preserve the contents of the registers r4-r8, r10, r11 and SP (and r9 in PCS variants that designate r9 as v6).
  Return by doing BX LR
Thumb-2 variable instruction length
It is important to have at least half the instructions encoded as 16 bit to get maximum performance from flash memory. IT instructions can also be paired for free with 16 bit instructions.
The general rules for generating the 16 bit form of the instructions
- Use registers in the range R0-R7
- Wherever possible, set the condition flags (use the S form of the instruction) unless the instruction is conditional
- Use immediate constants in the range 0-7 or 0-255
Instructions encoded in 16 bit when using registers R0-R7
  ADR Rd, <label> (range 0 to 1020)
  <ADDS|SUBS|MOVS> Rd, #imm8
  <ADDS|SUBS> Rd, Rn, #imm3
  <ADDS|SUBS> Rd, Rn, Rm
  <ADCS|ANDS|EORS|BICS|MVNS|ORRS|SBCS|UXTB|UXTH|MULS> Rd, Rm (MULS may be slower than MUL on some CPUs)
  RSBS Rd, Rn, #0
  <REV|REV16|REVSH> Rd, Rm
  <ASRS|LSRS|LSLS> Rn, Rm, #imm5
  CMP Rn, #imm8
  <CMP|CMN|TST> Rn, Rm (Rm can be any register for CMP)
  <LDM|STM> Rn!, <registers>
  <LDR|STR>{H|B} Rt, [Rn{, #imm5}]
  <LDR|STR>{H|B} Rt, [Rn, Rm]
  LDRS<H|B> Rt, [Rn, Rm]
  LDR Rt, <label> (0-1020)
  <PUSH|POP> <registers>
  IT{x{y{z}}} <cond>
  CB{N}Z Rn, <label> (range 0 to 126)
  B<cond> <label> (range -256 to 254)
  B <label> (range -2048 to 2046)
Instructions encoded in 16 bit using registers R0-R15
  MOV Rd, Rm
  ADD Rd, Rm
  BLX Rm
  BX Rm
How to enumerate the legal immediate constants for <Operand2>
'abcdnnnnnnnn' is a 12 bit bitfield to be expanded
ThumbExpandImm()
  if 'ab' = '00'
      case 'cd'
          when '00'  imm32 = 'nnnnnnnn'   ( Always encode 0 like this )
          when '01'  imm32 = '00000000 nnnnnnnn 00000000 nnnnnnnn'
          when '10'  imm32 = 'nnnnnnnn 00000000 nnnnnnnn 00000000'
          when '11'  imm32 = 'nnnnnnnn nnnnnnnn nnnnnnnn nnnnnnnn'
  else
      imm32 = ROR('1nnnnnnn', 'abcdn')
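To actually enumerate the legal constants, the pseudocode above can be transcribed into Python and run over all 4096 values of the 12 bit field. The names `thumb_expand_imm` and `legal_constants` are chosen for this sketch; it follows the pseudocode as written here and has not been checked against an assembler.

```python
MASK32 = 0xFFFFFFFF

def thumb_expand_imm(imm12):
    """Expand the 12 bit field 'abcdnnnnnnnn' into a 32 bit constant."""
    imm8 = imm12 & 0xFF
    if (imm12 >> 10) == 0:                      # 'ab' = '00'
        cd = (imm12 >> 8) & 3
        if cd == 0:
            return imm8                         # 0x000000nn
        if cd == 1:
            return (imm8 << 16) | imm8          # 0x00nn00nn
        if cd == 2:
            return (imm8 << 24) | (imm8 << 8)   # 0xnn00nn00
        return imm8 * 0x01010101                # 0xnnnnnnnn
    rotation = imm12 >> 7                       # 'abcdn', always 8..31 here
    value = 0x80 | (imm12 & 0x7F)               # '1nnnnnnn'
    return ((value >> rotation) | (value << (32 - rotation))) & MASK32

def legal_constants():
    """All distinct constants that <Operand2> can encode, in ascending order."""
    return sorted({thumb_expand_imm(i) for i in range(4096)})
```

A quick membership test such as `0xFF000000 in legal_constants()` then tells whether a constant fits in <Operand2> or has to be built another way, for example with the move wide and move top instructions.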
Example code
Condition flags
It is important to make full use of the condition flags to write efficient code. This code will set R0 to 0 or -1 depending on whether R1 + R2 is 0 or not.
        ADD R0,R1,R2
        CMP R0,#0
        BEQ zero
        MOV R0,#-1
  zero  ...
The optimised code using the condition flags becomes easier to read, more compact and faster.
        ADDS R0,R1,R2
        MOVNE R0,#-1
        ...
A branch is better if more than a few lines of code are to be skipped.
        ADDS R0,R1,R2
        BEQ zero
        ...             Block of code to be skipped
        ...
  zero  ...
If-then
The IT instruction will make 1 to 4 following instructions conditional. The letter T specifies <cond> and E specifies the inverse of <cond>. The first letter of the pattern is always T, so the first conditional instruction will always have the condition <cond>.
  IT EQ            Read this as If EQual Then ADD R0,R0,#1
  ADD R0,R0,#1     <- This will only be executed if the Z condition flag is 1

  ITE EQ           Read this as If EQual Then ADD R0,R0,#1 Else ADD R1,R1,#1
  ADD R0,R0,#1     <- This will only be executed if the Z condition flag is 1
  ADD R1,R1,#1     <- This will only be executed if the Z condition flag is 0
It is easier to let the assembler generate IT instructions automatically: just append the condition to the end of the instruction name. The assembler will enforce this form for the code affected by the IT instruction anyway.
  ADDEQ R0,R0,#1
  ADDNE R1,R1,#1
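How the T and E letters map onto the following instructions can be spelled out with a short Python helper. `it_conditions` and the `INVERSE` table are names invented for this sketch; AL is left out on purpose because it has no inverse.

```python
INVERSE = {
    "EQ": "NE", "NE": "EQ",
    "CS": "CC", "CC": "CS",
    "HS": "LO", "LO": "HS",
    "MI": "PL", "PL": "MI",
    "VS": "VC", "VC": "VS",
    "HI": "LS", "LS": "HI",
    "GE": "LT", "LT": "GE",
    "GT": "LE", "LE": "GT",
}

def it_conditions(mnemonic, cond):
    """Expand e.g. ('ITTE', 'EQ') into the condition of each instruction in the block."""
    mnemonic, cond = mnemonic.upper(), cond.upper()
    pattern = mnemonic[2:]                      # the letters after the leading 'IT'
    if not mnemonic.startswith("IT") or len(pattern) > 3:
        raise ValueError("expected IT followed by up to three T/E letters")
    if any(letter not in "TE" for letter in pattern):
        raise ValueError("pattern may only contain T and E")
    if cond not in INVERSE:
        raise ValueError("unknown condition code: " + cond)
    # The first instruction always gets <cond>; T repeats it, E inverts it
    return [cond] + [cond if letter == "T" else INVERSE[cond] for letter in pattern]
```

For instance, `it_conditions("ITE", "EQ")` gives the EQ and NE pair used in the second example above.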
Table branch
The table branch byte instruction loads a byte from (Rn + Rm) and adds twice its value to the program counter.
         TBB [PC,R0]
  table  dcb (case0 - table) >> 1    We divide by 2 here because the instruction will multiply by 2
         dcb (case1 - table) >> 1
         dcb (case2 - table) >> 1
         align                       Align here because instructions must start at an even address
  case0  nop                         If R0 = 0 we arrive here
  case1  nop                         If R0 = 1 we arrive here
  case2  nop                         If R0 = 2 we arrive here
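The arithmetic behind the table can be sketched in Python. The sketch assumes that with TBB [PC,R0] the PC value read by the instruction is the address of the table that immediately follows it, so offsets are measured from `table`; the addresses and the helper names `tbb_table` and `tbb_target` are made up for illustration.

```python
def tbb_table(table_addr, case_addrs):
    """Build the byte table: each entry is (case - table) >> 1."""
    entries = []
    for addr in case_addrs:
        offset = addr - table_addr
        if offset < 0 or offset % 2 or offset // 2 > 255:
            raise ValueError("case label not reachable from a byte table")
        entries.append(offset >> 1)
    return bytes(entries)

def tbb_target(table_addr, table, index):
    """Address reached by TBB: table base plus twice the selected byte."""
    return table_addr + 2 * table[index]

# Three table bytes plus one byte of alignment padding, then three 16 bit NOPs
table_addr = 0x100
cases = [0x104, 0x106, 0x108]
table = tbb_table(table_addr, cases)
```

With these numbers the table bytes are 2, 3 and 4, and `tbb_target(table_addr, table, 1)` lands on 0x106, the address of case1.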
Finding the span of the leftmost and rightmost ones
  CLZ  R1,R0        R1 now contains the number of zeros to the left of the leftmost 1 in R0
  RBIT R0,R0        R0 is now mirrored
  CLZ  R0,R0        R0 now contains the number of zeros to the right of the rightmost 1 in the original value
  ADD  R0,R1        R0 now contains the number of bits that are not part of the span
  RSB  R0,R0,#32    R0 now contains the span (R0 = 32 - R0)
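The same five steps can be traced in Python to check the result for a given input. `clz`, `rbit` and `span` are stand-ins written for this sketch, with CLZ of zero taken to be 32.

```python
MASK32 = 0xFFFFFFFF

def clz(x):
    """Count leading zeros of a 32-bit value (32 for an input of 0)."""
    return 32 - (x & MASK32).bit_length()

def rbit(x):
    """Reverse the bit order of a 32-bit value."""
    return int(format(x & MASK32, "032b")[::-1], 2)

def span(value):
    """Number of bits from the leftmost 1 to the rightmost 1, inclusive."""
    r1 = clz(value)        # CLZ  R1,R0
    r0 = rbit(value)       # RBIT R0,R0
    r0 = clz(r0)           # CLZ  R0,R0
    r0 = r0 + r1           # ADD  R0,R1
    return 32 - r0         # RSB  R0,R0,#32
```

For 0x00010010 the leftmost 1 is bit 16 and the rightmost is bit 4, so the span is 13. An input of 0 has no ones at all: both counts are 32 and the sequence yields -32, so that case needs a separate check if it can occur.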