Assembly Data Segments

Linking Data into ARM Binaries alongside Code


Data segments in Assembly language

The assembler and linker handle more than just code: they can also place data alongside code in the binary to be loaded. A programmer who wants to include a lookup table or character strings in a program can do so with the assembler’s directives. The programmer can assign these segments addresses, or leave address calculation to the assembler and linker. Each data segment can be associated with a symbol, which makes the assembly code more readable.

Using data associated directives

There are a number of assembler directives that can be used to set up data sections in your source files.

The .data directive indicates the following statements in the source file are to be put into data segments of memory. Before using other ‘data’ directives, place .data to ensure the data gets placed in the correct location in the resulting binary.

The .byte directive tells the assembler to place the values that follow it at sequential addresses. Values listed after the directive, separated by commas, are stored in memory one byte each. A newline ends the list; for a multiline byte structure, use multiple .byte statements.

Remember that ARM data words are 4 bytes wide and are aligned to addresses that are multiples of 4. Don’t use ldr and str to access non-word-aligned addresses (addresses that are not a multiple of 4). To access an individual byte, use the ldrb and strb instructions, which read or write only the byte at the given address. The .byte directive can be preceded by a label to associate a symbol with the base address; you can use this symbol just as you would a symbol defined by .equ, .set, or any other label in your code.
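As a minimal sketch of the points above, the listing below spreads a byte table over two .byte statements, labels its base address, and reads one entry with ldrb. The label lut and the table contents (squares of 0 to 7) are hypothetical, chosen only for illustration.

```asm
.global main

.data
@ 'lut' is a hypothetical label marking the base address of the table
lut:
.byte 0x00, 0x01, 0x04, 0x09	@ squares of 0..3
.byte 0x10, 0x19, 0x24, 0x31	@ squares of 4..7, placed at the next addresses

.text
main:
	ldr r0, =lut		@ r0 = base address of the table
	mov r1, #5		@ r1 = index of the entry to read
	ldrb r2, [r0, r1]	@ r2 = byte at lut+5, which is 0x19 (25)
	b .			@ hold program here
```

Because ldrb reads a single byte, the access works at any address, including lut+5, which is not a multiple of 4.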

In addition to .byte, the .hword and .word directives can be used to define 16-bit and 32-bit wide data entries.
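A short sketch of the wider directives, assuming the labels word_tab and half_tab (both hypothetical) and using .balign to start the words on a 4-byte boundary so that ldr can be used safely:

```asm
.global main

.data
.balign 4			@ start the following data on a 4-byte boundary
word_tab:			@ hypothetical label
.word 0x12345678, 0xCAFEF00D	@ two 32-bit entries
half_tab:			@ hypothetical label
.hword 0x1234, 0xABCD		@ two 16-bit entries

.text
main:
	ldr r0, =word_tab
	ldr r1, [r0, #4]	@ r1 = second word, 0xCAFEF00D
	ldr r0, =half_tab
	ldrh r2, [r0, #2]	@ r2 = second halfword, 0xABCD (zero-extended)
	b .			@ hold program here
```

Each entry size has a matching load: ldr for words, ldrh for halfwords, and ldrb for bytes.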

Example using ‘.byte’

Below, a data segment is defined ahead of the program, which then reads the data sequentially.

#indicate main is in this file
.global main


#Indicate data follows
.data

#label start of array as 'data_start'
data_start:
.byte 0x8, 0xBA, 0xAD, 0xF0, 0x00, 0xD, 0x22, 0x56



#program is below
.text

main:
	@load r0 with the base address of our data
	ldr r0, =data_start
	
	@load r1 with limit of the defined data segment
	ldr r1, =(data_start+8)
	
	read_loop:
		ldrb r2,[r0],#1 @load r2 with current byte from data segment
		cmp r0,r1 @check if r0 at limit
	bne read_loop


b . @hold program here
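The limit in the example above is written as data_start+8, which has to be updated by hand whenever the array changes size. One possible alternative, sketched here with a hypothetical data_end label, is to place a second label directly after the data and let the assembler and linker work out the limit address:

```asm
.global main

.data
data_start:
.byte 0x8, 0xBA, 0xAD, 0xF0, 0x00, 0xD, 0x22, 0x56
data_end:			@ hypothetical label: first address past the data

.text
main:
	ldr r0, =data_start	@ r0 = base address of the data
	ldr r1, =data_end	@ r1 = limit, no hard-coded length needed
read_loop:
	ldrb r2, [r0], #1	@ load current byte, then advance r0 by 1
	cmp r0, r1		@ check if r0 is at the limit
	bne read_loop
	b .			@ hold program here
```

Adding or removing values in the .byte list moves data_end automatically, so the loop bounds stay correct.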

The ‘.comm’ directive

In addition to placing data into a binary file, directives can be used to associate a memory region with a symbol without placing data in it. This can be used to set up memory regions for data arrays that will be filled at runtime.

The .comm directive takes a symbol and a length (in bytes) to declare the memory region. For example, ‘.comm dat_arr, 32’ tells the assembler/linker that the symbol dat_arr refers to a 32-byte long data section.
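On ELF targets, GNU AS also accepts an optional third argument to .comm that gives the alignment in bytes. The sketch below assumes such a toolchain and a hypothetical symbol word_buf; it reserves a word-aligned region and fills it at runtime with str, which is safe because every address written is a multiple of 4.

```asm
.global main

@ reserve 32 bytes for 'word_buf' (hypothetical name), aligned to 4 bytes
@ the third argument is the optional alignment accepted on ELF targets
.comm word_buf, 32, 4

.text
main:
	ldr r0, =word_buf	@ r0 = pointer into the reserved region
	mov r1, #0		@ r1 = value to store
	mov r2, #8		@ r2 = number of words to write (32 bytes / 4)
fill_loop:
	str r1, [r0], #4	@ store word, then advance pointer by 4
	add r1, r1, #1		@ next value
	subs r2, r2, #1		@ count down
	bne fill_loop
	b .			@ hold program here
```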

Code example

The following program defines one region of memory holding initialized data and a second region that is uninitialized. The program copies the data from the initialized array to the second array.

.global main


.data

#set 16 bytes of data associated with 'store_dat'
store_dat:
.byte 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15

#set aside 64 bytes for dat_arr
.comm dat_arr, 64


.text
main:

	ldr r0, =store_dat 	@r0 is store_dat pointer
	ldr r1, =dat_arr	@r1 is dat_arr pointer
	mov r2, #16		@r2 counts the 16 bytes to copy
	
	copy_loop:
		ldrb r3,[r0],#1	@copy from store_dat pointer
		strb r3,[r1],#1	@copy to dat_arr pointer
		subs r2,r2,#1
	bne copy_loop
b .

Other Directives

The GNU AS (GAS) assembler used by Xilinx SDK has many other directives not covered here. Take some time to go over the others; they might be handy in your assembly programming endeavors. A reference for GNU AS directives can be found in the GNU AS manual.