Data segments in Assembly language
The assembler and linker can be used for more than just code: they can also place data alongside the code to be loaded. A programmer might want to include a lookup table or character strings in a program, and can do so with some of the assembler’s directives. The programmer can assign these segments addresses, or leave address calculation to the assembler/linker. A data segment can be associated with a symbol, which can then be used in assembly code to make it more readable.
Using data-associated directives
There are a number of assembler directives that can be used to set up data sections in your source files.
The .data directive indicates that the following statements in the source file are to be placed in the data segment of memory. Before using other ‘data’ directives, place .data first to ensure the data ends up in the correct location in the resulting binary.
The .byte directive tells the assembler to place the data that follows in sequential addresses, one byte per value. The values typed after the directive, separated by commas, are placed in memory in order. A newline ends the statement; to define a multi-line byte structure, use multiple .byte statements.
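As a minimal sketch of a multi-line byte structure, the fragment below splits one table across two .byte statements. The label squares and its contents are hypothetical and chosen only for illustration.

```asm
        .data
@ hypothetical 8-entry lookup table, split across two .byte statements
squares:
        .byte 0, 1, 4, 9        @ entries 0-3
        .byte 16, 25, 36, 49    @ entries 4-7, placed directly after entry 3
```

Because the second statement continues at the next sequential address, the eight values form a single contiguous table starting at squares.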
Remember that ARM data words are aligned to addresses that are multiples of 4, so consecutive words are 4 byte addresses apart. Don’t use ldr and str to access non-word-aligned addresses (addresses that are not a multiple of 4). When accessing individual bytes, use the strb and ldrb instructions, which access only the byte at the given address.
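The difference between word and byte accesses can be sketched as follows. This is an illustrative fragment, not part of the original tutorial: the label bytes and its values are hypothetical, and it assumes the data segment is loaded into writable memory. The .balign 4 directive is used here to make sure the table starts on a word boundary.

```asm
        .data
        .balign 4               @ start the data on a word boundary
bytes:
        .byte 0x11, 0x22, 0x33, 0x44, 0x55

        .text
        .global main
main:
        ldr  r0, =bytes         @ r0 = address of bytes (a multiple of 4)
        ldr  r1, [r0]           @ word load from an aligned address
        ldrb r2, [r0, #1]       @ byte load from bytes+1: r2 = 0x22
        ldrb r3, [r0, #4]       @ byte load from bytes+4: r3 = 0x55
        strb r3, [r0, #1]       @ byte store: bytes+1 now holds 0x55
        b .                     @ hold program here
```

Only the first load uses ldr, because bytes itself is the only word-aligned address used here that needs a full word; every other access touches a single byte and so uses ldrb or strb.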
The .byte directive can be preceded by a label to associate a symbol with the base address. You can use this symbol the same way you would a symbol defined by .equ, .set, or any other label in your code.
In addition to .byte, the .hword and .word directives can be used to define 16-bit and 32-bit wide data entries.
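A short sketch of labelled .word and .hword tables is shown below. The labels word_tab and half_tab and the values in them are hypothetical; ldrh is the halfword counterpart of ldrb.

```asm
        .data
        .balign 4               @ keep the word table word-aligned
word_tab:
        .word  0x12345678, 0xCAFEF00D   @ two 32-bit entries
half_tab:
        .hword 0x1234, 0xBEEF           @ two 16-bit entries

        .text
        .global main
main:
        ldr  r0, =word_tab      @ r0 = base address of the word table
        ldr  r1, [r0, #4]       @ r1 = 0xCAFEF00D (second word)
        ldr  r0, =half_tab      @ r0 = base address of the halfword table
        ldrh r2, [r0, #2]       @ r2 = 0xBEEF (second halfword, zero-extended)
        b .                     @ hold program here
```

Note that the offsets are in bytes: the second word sits 4 bytes past word_tab, and the second halfword sits 2 bytes past half_tab.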
Example using ‘.byte’
Below, a data segment is defined ahead of the program code. The program then reads the data sequentially, one byte at a time.
@ indicate main is in this file
.global main
@ indicate data follows
.data
@ label start of array as 'data_start'
data_start:
    .byte 0x8, 0xBA, 0xAD, 0xF0, 0x00, 0xD, 0x22, 0x56
@ program is below
.text
main:
    @ load r0 with the base address of our data
    ldr r0, =data_start
    @ load r1 with limit of the defined data segment
    ldr r1, =(data_start+8)
read_loop:
    ldrb r2,[r0],#1     @ load r2 with current byte from data segment
    cmp r0,r1           @ check if r0 at limit
    bne read_loop
    b .                 @ hold program here
The ‘.comm’ directive
In addition to placing data into a binary file, directives can be used to associate a memory region with a symbol without placing data in it. This can be used to set up memory regions for data arrays that will be filled at runtime.
The .comm directive is given a symbol and a length (in bytes) to declare the memory region. For example, ‘.comm dat_arr, 32’ tells the assembler/linker that the symbol dat_arr refers to a 32-byte region of memory.
Code example
The following program defines a region of memory holding initialized data, as well as an uninitialized region. The program copies the data from the initialized array into the uninitialized one.
.global main
.data
@ set 16 bytes of data associated with 'store_dat'
store_dat:
    .byte 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
@ set aside 64 bytes for dat_arr
.comm dat_arr, 64
.text
main:
    ldr r0, =store_dat  @ r0 is store_dat pointer
    ldr r1, =dat_arr    @ r1 is dat_arr pointer
    mov r2, #16         @ r2 counts the 16 bytes to copy
copy_loop:
    ldrb r3,[r0],#1     @ copy from store_dat pointer
    strb r3,[r1],#1     @ copy to dat_arr pointer
    subs r2,r2,#1       @ one fewer byte left; sets flags for bne
    bne copy_loop
    b .                 @ hold program here
Other Directives
The GNU AS (GAS) assembler used by Xilinx SDK has many other directives not covered here. Take some time to look over them; they might be handy in your assembly programming endeavors. A reference for GNU AS directives can be found in the GNU AS documentation.
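As a small taste of those other directives, the sketch below uses three that GNU AS provides and that this tutorial does not otherwise cover: .asciz for zero-terminated strings, .balign for alignment, and .space for reserving zero-filled bytes. The labels msg and scratch are hypothetical.

```asm
        .data
msg:
        .asciz "hello"          @ 5 characters plus a terminating zero byte
        .balign 4               @ pad so the next label is word-aligned
scratch:
        .space 16               @ reserve 16 zero-filled bytes
```

Here msg occupies 6 bytes, so .balign 4 inserts padding before scratch to bring it to the next multiple of 4.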