# ARM Assembly Instruction Details For Project 1

## Introduction

There are many ARM instructions, and we will introduce them over time as we need them for programming projects. For this first project, we need instructions that can load data from main memory into a register, store information from a register to main memory, move data between registers, add data stored in registers, shift data stored in a register, and finally, branch to a “loop” label. These instructions are examined in more detail here. In later projects, we will introduce more instructions. Once you get a feel for ARM instruction format and use, you can refer to the ARM Architecture Reference Manual for complete information on all instructions.

### Load and store instructions

Data used by a running program must come from an external source (i.e., main memory or a port). Data can be placed into main memory during programming, or it can arrive during run-time via a port. Either way, all data consumed by the ARM processor must be accessed using the external memory bus. Since the ALU can only operate on data stored in a register, we must first “load” data into a register before we can use it. Likewise, ALU results can only be written to a register, so we must “store” results back into main memory in order to free up the register for use by later instructions.

There are several load and store instructions that load data from a memory location into a register or store data from a register into memory. Some deal with single bytes, some with two consecutive bytes (a halfword), and some with four consecutive bytes (a 32-bit word); some are “unprivileged” variants (they perform the access with user-mode permissions even when executed by privileged code; more on that later); and some provide shared memory synchronization. The table below shows all available load and store instructions; we will only use the instructions in the left two columns to load or store words (LDR, STR), halfwords (LDRH, STRH), and/or bytes (LDRB, STRB). You can find more information on the unprivileged and exclusive load and store instructions in the ARM Architecture Reference Manual.

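Table 1. Load and Store instructions (from ARM® Architecture Reference Manual, page A4-175)

| Data type | Load | Store | Load unprivileged | Store unprivileged | Load-Exclusive | Store-Exclusive |
|---|---|---|---|---|---|---|
| 32-bit word | LDR | STR | LDRT | STRT | LDREX | STREX |
| 16-bit halfword | - | STRH | - | STRHT | - | STREXH |
| 16-bit unsigned halfword | LDRH | - | LDRHT | - | LDREXH | - |
| 16-bit signed halfword | LDRSH | - | LDRSHT | - | - | - |
| 8-bit byte | - | STRB | - | STRBT | - | STREXB |
| 8-bit unsigned byte | LDRB | - | LDRBT | - | LDREXB | - |
| 8-bit signed byte | LDRSB | - | LDRSBT | - | - | - |
| Two 32-bit words | LDRD | STRD | - | - | - | - |
| 64-bit doubleword | - | - | - | - | LDREXD | STREXD |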

Load and store instructions can only use memory addresses stored in another register. The basic forms are as follows.

LDR   Rn, [Rm]    @ Load Rn with the 32-bit memory contents pointed at by Rm
STR   Rn, [Rm]    @ Store the contents of Rn at the memory location pointed at by Rm


The square brackets around the second register indicate that it contains an address, and that it is acting as a pointer to a memory location. This is the fundamental addressing mode: it uses only the value stored in the “base register” as the address. An immediate offset can also be included (up to 12 bits for word and unsigned byte accesses in the ARM instruction set, and 8 bits for halfword and signed accesses); the offset is added to or subtracted from the base address to form the access address. The value in the base register is unchanged.

LDR   Rn, [Rm, #0x04]    @ Load Rn with the memory contents at Rm + 4
STR   Rn, [Rm, #-0x08]   @ Store the contents of Rn at the memory location pointed at by Rm - 8


An offset stored in another register can also be added to or subtracted from the base address.

LDR   Rn, [Rm, -Rt]   @ Load Rn with the memory contents at Rm - Rt
STR   Rn, [Rm, Rt]    @ Store the contents of Rn at the memory location pointed at by Rm + Rt.


An offset stored in a register can also be shifted prior to adding to or subtracting from the base register. This addressing mode is useful for accessing arrays – the index can be scaled to the size of the array element.

LDR   Rn, [Rm, Rt, LSL #16]   @ Load Rn with the memory contents at Rm + (Rt left-shifted 16 bits)
STR   Rn, [Rm, Rt, LSR #4]    @ Store Rn at the location pointed at by Rm + (Rt right-shifted 4 bits)
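
The address arithmetic behind these addressing modes can be sketched with a small Python model. This is only an illustration, not how the processor is implemented: the function name `effective_address` and the register values are hypothetical, and the scaled example uses `LSL #2` because that is the shift that turns an index into a byte offset for 4-byte array elements.

```python
MASK32 = 0xFFFFFFFF  # registers and addresses are 32 bits wide


def effective_address(base, offset=0, shift_type=None, shift_amount=0, subtract=False):
    """Return the 32-bit address formed from a base and an optionally shifted offset."""
    if shift_type == "LSL":
        offset = (offset << shift_amount) & MASK32
    elif shift_type == "LSR":
        offset = (offset & MASK32) >> shift_amount
    address = base - offset if subtract else base + offset
    return address & MASK32


# Hypothetical register contents, chosen only for illustration.
rm = 0x20000000  # base address of an array of 32-bit words
rt = 3           # array index

print(hex(effective_address(rm, 4)))                 # models [Rm, #4]
print(hex(effective_address(rm, 8, subtract=True)))  # models [Rm, #-8]
print(hex(effective_address(rm, rt, "LSL", 2)))      # models [Rm, Rt, LSL #2]
```

With these values, the scaled form addresses the fourth word of the array (index 3), 12 bytes past the base.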


The load and store instructions presented so far do not change the base register contents. If desired, the base address register can also be automatically updated. Adding a “!” after the closing bracket causes the computed address (base plus offset) to be written back to the base register. This is called a “pre-indexed” access, because the offset is applied to the base before memory is accessed.

LDR   Rn, [Rm, Rt, LSL #16]!  @ Load Rn with the memory contents at Rm + (Rt left-shifted 16 bits), and
                              @ update Rm with the address that was used.


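Post-indexed accesses are also available: the base register alone supplies the address, and the offset is added to the base register after the access.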

LDR Rn, [Rm], #0x04   @ Load Rn from Rm, and then add 4 to Rm afterwards.
STR Rn, [Rm], Rt, LSR #0x4   @ Store Rn at the location pointed at by Rm, then add the shifted Rt.
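
The difference between plain offset, pre-indexed, and post-indexed accesses is easiest to see by tracking both the loaded value and the base register. The following Python sketch models this under simplifying assumptions: memory is a dictionary keyed by address, and the names `ldr`, `demo`, `memory`, and `regs` are hypothetical and exist only for this illustration.

```python
def ldr(memory, regs, base, offset, mode):
    """Model a word load for three addressing modes and return the loaded value."""
    if mode == "offset":    # LDR Rn, [Rm, #offset]   base register unchanged
        address = regs[base] + offset
    elif mode == "pre":     # LDR Rn, [Rm, #offset]!  base updated to the address used
        address = regs[base] + offset
        regs[base] = address
    elif mode == "post":    # LDR Rn, [Rm], #offset   base used as-is, then updated
        address = regs[base]
        regs[base] = address + offset
    else:
        raise ValueError("unknown addressing mode: " + mode)
    return memory[address]


# Hypothetical memory contents: three consecutive words.
memory = {0x100: 11, 0x104: 22, 0x108: 33}


def demo(mode):
    """Run one load with Rm = 0x100 and offset 4; return (loaded value, final Rm)."""
    regs = {"Rm": 0x100}
    value = ldr(memory, regs, "Rm", 4, mode)
    return value, regs["Rm"]


for mode in ("offset", "pre", "post"):
    value, rm = demo(mode)
    print(mode, value, hex(rm))
```

Only the post-indexed form loads from the original base address (value 11); the other two load from base + 4 (value 22), and only the pre- and post-indexed forms leave Rm changed.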


There are also load and store instructions that operate on multiple memory locations. We will examine these later.

### Move instructions

A MOV instruction can copy data between registers, or load an immediate value into a register. An MVN instruction also moves information, but performs a bitwise inversion (NOT) in the process.

MOV   Rd, Rs     @ Load Rd with the contents of Rs; leave Rs unchanged
MOV   Rd, #0xFF  @ Store 0xFF in Rd (the 12-bit immediate field holds an 8-bit value plus a 4-bit rotation)
MVN   Rd, Rs     @ Load Rd with the bitwise inverted contents of Rs; leave Rs unchanged
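
Not every constant can be used as a MOV immediate. Assuming the classic 32-bit ARM (A32) encoding, the 12-bit immediate field holds an 8-bit value and a 4-bit rotation, and the constant is the 8-bit value rotated right by twice the rotation field. The Python sketch below checks whether a constant fits that form; the function names are hypothetical, and a real assembler may instead substitute another instruction (such as MOVW) when the constant does not fit.

```python
MASK32 = 0xFFFFFFFF


def rotate_left32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def encode_modified_immediate(constant):
    """Return (rotation_field, imm8) if the constant is encodable, else None."""
    constant &= MASK32
    for rotation_field in range(16):
        # The constant equals imm8 rotated RIGHT by 2 * rotation_field, so
        # rotating the constant LEFT by the same amount must recover imm8.
        imm8 = rotate_left32(constant, 2 * rotation_field)
        if imm8 <= 0xFF:
            return rotation_field, imm8
    return None


for constant in (0xFF, 0xFF000000, 0xFFF, 0xABC):
    print(hex(constant), encode_modified_immediate(constant))
```

Under this assumption, 0xFF and 0xFF000000 are encodable, while 0xFFF and 0xABC are not, because their set bits span more than eight bit positions.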


Move Wide (MOVW) and Move Top (MOVT) instructions each move a 16-bit immediate value into one half of a register.

MOVW Rd, #N	 @ Move a 16-bit immediate value into the bottom half of Rd, zeroing the top 16 bits of Rd.
MOVT Rd, #N	 @ Move a 16-bit immediate value into the top half of Rd without changing the bottom 16 bits.
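
Used as a pair, these two instructions can build any 32-bit constant: the low half is written first (which clears the top half), and the high half is written second. A minimal Python model of that sequence follows; the function names `movw` and `movt` simply mirror the mnemonics, and the constant 0x12345678 is an arbitrary example.

```python
def movw(rd, imm16):
    """Model MOVW: write the low 16 bits and zero the top 16 bits of Rd."""
    return imm16 & 0xFFFF


def movt(rd, imm16):
    """Model MOVT: write the top 16 bits and keep the low 16 bits of Rd."""
    return ((imm16 & 0xFFFF) << 16) | (rd & 0xFFFF)


rd = 0xFFFFFFFF            # arbitrary previous register contents
rd = movw(rd, 0x5678)      # Rd = 0x00005678
rd = movt(rd, 0x1234)      # Rd = 0x12345678
print(hex(rd))
```

The order matters: issuing the Move Top first and the Move Wide second would zero the top half again.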


There are many other flavors of MOV instructions: to move half words or bytes, to move data into special registers, and to move data to coprocessors. You can get more information about these move instructions from the ARM Architecture Reference Manual, starting on page A8-484.

### Add instructions

The ADD and ADC (add with carry) instructions add the contents of two 32-bit “source” operands (two registers, or a register and an immediate) and place the result in a 32-bit “destination” register (the destination register can be the same as one of the sources).

ADD    R0,R1,R2        @ R0 <= R1 + R2; R1 and R2 unchanged; status bits not updated
ADC    R0,R1,R2        @ R0 <= R1 + R2 + C; R1 and R2 unchanged; status bits not updated
ADC    R3,R2,#0xAB     @ R3 <= R2 + 0xAB + C; R2 unchanged; status bits not updated
ADDS   R0,R1,R2        @ R0 <= R1 + R2; R1 and R2 unchanged; status bits updated
ADCNE  R0,R1,R2        @ R0 <= R1 + R2 + C *if* Z bit = 0; R1 and R2 unchanged; status bits not updated
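
A common reason to use ADC is multi-word arithmetic: an ADDS on the low words sets the carry flag, and an ADC on the high words consumes it. The Python sketch below models a 64-bit addition built from two 32-bit additions; the function names and the operand values are hypothetical and chosen only to force a carry.

```python
MASK32 = 0xFFFFFFFF


def adds(a, b):
    """Model ADDS: return the 32-bit sum and the carry-out flag (0 or 1)."""
    total = (a & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32


def adc(a, b, carry):
    """Model ADC: return the 32-bit sum of both operands plus the carry flag."""
    return ((a & MASK32) + (b & MASK32) + carry) & MASK32


def add64(a_hi, a_lo, b_hi, b_lo):
    """Add two 64-bit values held as (high word, low word) register pairs."""
    lo, carry = adds(a_lo, b_lo)   # ADDS lo, a_lo, b_lo
    hi = adc(a_hi, b_hi, carry)    # ADC  hi, a_hi, b_hi
    return hi, lo


hi, lo = add64(0x00000001, 0xFFFFFFFF, 0x00000000, 0x00000001)
print(hex(hi), hex(lo))
```

Here the low-word addition overflows, so the carry propagates into the high word and the result is 0x00000002_00000000.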


The ARM processor has no increment instruction. Instead, an immediate ‘1’ is added to a register, and the result is stored back into the same register.

ADD    R0,R0,#1	       @ R0 <= R0 + 1; status bits not updated
ADDS   R0,R0,#1        @ R0 <= R0 + 1; status bits updated


There are also many other related ADD instructions. One source operand can be shifted prior to being added, and/or add instructions can use the SP or PC as the destination. You can get more information about these instructions from the ARM Architecture Reference Manual, starting on page A8-300.

### Shift instructions

The ARM has two kinds of shift instructions: arithmetic shift and logical shift. An arithmetic shift right (ASR) can shift a number by 1 to 32 bit positions, and the sign bit is copied into all vacated bits. A logical shift can shift right (LSR) by 1 to 32 bits or left (LSL) by 0 to 31 bits, and a ‘0’ is placed in all vacated bits. The number of bits to shift can be an immediate operand, or it can be placed in a register (the bottom byte of the register is used). The source register contents are not changed.

ASR   r0,r1,#0x04     @ Arithmetic Shift Right R1 four bits, sign extend, and place result in R0
ASR   r0,r1,r2        @ Arithmetic Shift Right R1 by the number of bits in the bottom byte of R2, sign extend, and place result in R0
LSL   r0,r1,#0x06     @ Logical Shift Left R1 6 bits, 0 fill, place result in R0
LSR   r0,r1,r2        @ Logical Shift Right R1 by the number of bits in the bottom byte of R2, 0 fill, place result in R0
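
The difference between arithmetic and logical right shifts only shows up for values whose top bit is set. The following Python sketch models the three shifts on 32-bit values for shift amounts below 32; the function names mirror the mnemonics, and the operand 0xFFFFFFF0 (which is -16 when read as a signed number) is an arbitrary example.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left: vacated low bits are filled with 0."""
    return (value << amount) & MASK32


def lsr(value, amount):
    """Logical shift right: vacated high bits are filled with 0."""
    return (value & MASK32) >> amount


def asr(value, amount):
    """Arithmetic shift right: vacated high bits are copies of the sign bit."""
    value &= MASK32
    if value & 0x80000000:       # reinterpret the 32-bit pattern as a negative number
        value -= 1 << 32
    return (value >> amount) & MASK32


r1 = 0xFFFFFFF0                  # -16 as a signed 32-bit value
print(hex(asr(r1, 4)))           # sign bits shifted in
print(hex(lsr(r1, 4)))           # zeros shifted in
print(hex(lsl(0x1, 6)))          # 1 shifted left six places
```

For this operand, ASR keeps the value negative (0xFFFFFFFF, which is -1), while LSR produces the large positive value 0x0FFFFFFF.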


### Branch

A branch instruction (mnemonic B) loads a new value into the PC. It can be used for conditional or unconditional branches (based on the conditional “extended mnemonics”, just like any other instruction). Branch instructions are used for creating loops and if-then-else constructs. Branch and Link instructions (mnemonic BL) are used to call subroutines. BL is identical to B, except that it copies the address of the next instruction (the return address) to the Link register (R14) before performing the branch. This allows the programmer to copy the link register back into the PC at the end of the subroutine, so that on return, the program resumes executing at the instruction immediately following the BL that called it.

B       label1	@ Unconditional branch to “label1” – good for “always” loops
BEQ     label1	@ Conditional branch to label1 – branch taken only if the Z flag is set (by the most recent flag-setting instruction)


In the ARM instruction encoding, the condition field and the branch opcode together require only 8 bits, leaving 24 bits of the 32-bit instruction word for an immediate operand. In this case, the immediate operand is a signed number that is added to the current PC; it counts 4-byte instructions rather than bytes, because every ARM instruction is 4 bytes long. As soon as the new address is calculated and loaded into the PC, the next executed instruction will be at that new address. This means two things: first, branches can only redirect program flow about 2^23 instructions (roughly 32 MB) away from the current PC (not 2^24, because the most significant bit is used as a sign bit so branches can go both forward and backward); and second, we need to know how far away from the current PC to branch.

One of the Assembler’s main duties is to calculate branch addresses from labels. Labels are text strings in assembly source files that are not mnemonics, but are instead included to provide information to the assembler itself. Label text strings start in the leftmost column, and end with a colon. When the Assembler encounters a label, it simply records its location in the source file in a “symbol table”. Any given line in your source file can have a label - they do not change how the program functions in any way (if you wanted to, you could put a label on every line in your source file). The Assembler uses label location information to (among other things) calculate branch offsets.

In the example code below, before the Assembler can create the opcode for the BNE mnemonic, it needs to know what immediate operand to load in the lower 24 bits. It calculates the immediate operand value by finding the difference between the address of the referenced label and the PC value seen by the BNE instruction (in ARM state, the PC reads as the address of the executing instruction plus 8). As discussed earlier, the address of the label is in the symbol table created by the Assembler.

MOV	R2,#10		@ R2 <- 10; R2 is loop index
myloop:
ADD 	R1,R1,#1	@ Increment R1
SUBS 	R2,R2,#1	@ Decrement R2 and update the status bits
BNE	myloop		@ If R2 != 0, branch to myloop
MOV	R4,#0xAB	@ Continue on with program...
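
The offset calculation for the BNE in this loop can be worked through in a few lines of Python. This is a sketch under stated assumptions: the code is assembled for ARM (A32) state, where every instruction is 4 bytes and the PC reads as the branch address plus 8, and the program is assumed to start at address 0 purely for illustration. The function name `branch_imm24` and the addresses are hypothetical.

```python
def branch_imm24(branch_address, target_address):
    """Return the 24-bit immediate field for an A32 branch to target_address."""
    pc = branch_address + 8                 # PC as seen by the executing branch
    byte_offset = target_address - pc
    if byte_offset % 4 != 0:
        raise ValueError("target is not word aligned")
    word_offset = byte_offset // 4          # the field counts 4-byte instructions
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("target is out of branch range")
    return word_offset & 0xFFFFFF           # two's complement in 24 bits


# Assumed layout: MOV at 0x0, myloop (ADD) at 0x4, SUBS at 0x8, BNE at 0xC.
myloop_address = 0x4
bne_address = 0xC

imm24 = branch_imm24(bne_address, myloop_address)
# Condition NE is 0b0001 and a plain branch is 0b1010, giving 0x1A in the top byte.
bne_opcode = (0x1A << 24) | imm24
print(hex(imm24), hex(bne_opcode))
```

The label is 16 bytes behind the PC value (0x14), so the field holds -4 as a 24-bit two's complement number, 0xFFFFFC.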