01

#0: Begin programming assembler with gcc

avr

This post is a translation of « #0: Débuter la programmation Assembleur avec gcc »

I have been asked several times how to get started with assembler programming on ARM, and where to find documentation or tutorials.
It is true that information is not easy to find when you start out with ARM assembler.
On the one hand, the ARM is a specialized processor; on the other, assembler is not very fashionable these days (which is a shame).

So I told myself « go! », if it can help new adventurers set out on the journey.

Assembler is still, in my opinion, the best way to understand how a processor works and how other languages work.
Of course, it is also the best way to optimize a program, even if compilers have made significant progress in recent years.

I do not have RiscOS (incidentally, if anyone wants to install the system on an SD card for me, please contact me), so the examples provided in this section work with gcc under Linux.

This first tutorial will simply show how to use gcc to program in assembler.

gcc and assembler

The gcc assembler is called gas. I searched for a long time before understanding that the gcc command itself can be used to assemble assembler code.
So to work with assembler under Linux, just make sure you have gcc installed.

It is possible to program in assembler « inline », i.e. to include your assembler code in a C source file. Personally I tried it once and dropped this method, which I find of little interest.
The syntax is complicated and the readability is horrific. So I can only advise anyone who wants to code in assembler to keep assembler files (.s) separate from C files (.c or .cpp).
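
To give an idea of what the inline syntax looks like, here is a minimal sketch using gcc's extended asm. The function name add_one_inline is made up for this illustration, and the plain C branch is only an assumption added so that the file still builds on a non-ARM machine.

```c
/* Hypothetical example: add_one_inline is not part of the tutorial's code. */
int add_one_inline(int a)
{
#if defined(__arm__)
    int result;
    /* %0 is the output operand (result), %1 is the input operand (a).
       gcc chooses the actual registers itself. */
    __asm__ ("add %0, %1, #1"
             : "=r" (result)    /* output: any general register */
             : "r" (a));        /* input: any general register */
    return result;
#else
    /* Fallback for non-ARM builds (assumption: used for testing only). */
    return a + 1;
#endif
}
```

Calling add_one_inline(10) gives 11 on either path. Even for a single instruction, the operand lists and constraint strings already take more room than the instruction itself.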

A simplified documentation of the gcc assembler syntax can be obtained here.
I do not know half of the directives. I discover them when I need them.

A small example

The best thing is to run a small test. So let’s go!

First of all, we need a C file that will call an assembler function. It is possible to code everything in assembler, but I do not see the point.
So here is the C source code (not far from the minimum): first-test.c

#include <stdio.h>
#include <stdlib.h>

/* Implemented in add-one-in-asm.s */
int add_one_in_asm(int a);

int main(void)
{
   int r;
   r = add_one_in_asm(10);
   printf("Result : %d\n", r);
   return 0;
}

With older C compilers, the declaration of the assembler function is optional as long as your function returns an integer; recent versions of gcc warn about a missing declaration or reject it. If you want to return another type, you have to declare your function in any case. Make a habit of always declaring it, because you can spend hours trying to find out why the float you returned is not displayed properly!

Now, let’s see the assembler code of our function: add-one-in-asm.s

   .global     add_one_in_asm
   .type add_one_in_asm, %function
add_one_in_asm:
   add         r0, r0, #1
   mov         pc, lr

.global add_one_in_asm
This first line just makes our function visible to the C program.

.type add_one_in_asm, %function
This line indicates that the label « add_one_in_asm » is a function.
This statement is optional.

add_one_in_asm:
A string followed by the character : is a « label ».
A label gives a textual name to a memory address. It can be used to call a function, as is the case here, but also as the target of a jump or to reference a memory location where data is stored.
When we call the function « add_one_in_asm », we tell the program to continue its execution at the label « add_one_in_asm ».

add r0, r0, #1
I’ll come back to this a little further on, but when calling our function, we pass the value 10 as a parameter to « add_one_in_asm ». This value (10) has been copied into the register r0.
Our function only adds 1 to the value passed to it. That’s exactly what this assembler instruction does: r0 = r0 + 1

mov pc, lr
Indicates that the function is over and the program must now continue just after the function call.

If you are a true beginner, all this must seem a little obscure … But do not worry, things will get better soon.

Compile and link!

Let us first see if our program works. For that we must first compile the files and then link them.

To compile the C program, use the following command:

gcc -c -O9 -marm first-test.c -o first-test.o

-c indicates that we only want to compile; we’ll link the files later.
-O9 asks for maximum optimization (not very useful here); gcc treats any level above 3 as -O3.
-marm says that we want ARM instructions, not Thumb instructions.

To assemble the assembler file, use the following command:

gcc -c -marm add-one-in-asm.s -o add-one-in-asm.o

Finally, to link all this, use the command:

gcc first-test.o add-one-in-asm.o -o first-test

You’ll find more information on gcc and its compiler options here.

Run the resulting program (./first-test). If all goes well, we expect to see:
Result : 11

Registers and memory

A processor such as the Cortex (like all ARM processors, as far as I know) is unable to perform operations directly on data stored in memory. Operations are performed only on registers, which are internal variables of the processor, available in limited number.
The ARM has 16 registers named r0, r1, r2, … r15.

Some of these registers also have another name.
r15 = pc (program counter)
r14 = lr (link register)
r13 = sp (stack pointer)
r12 = ip (intra-procedure-call scratch register)
r11 = fp (frame pointer)
r10 = sl (stack limit)
r9 = sb (static base)

Others have a particular use.
r0, r1, r2, r3 receive the values of the function’s parameters.
r0 (and in some cases the pair r0-r1) contains the return value of a function.
r12 lets you work with the stack without having to change the stack pointer (we’ll see this in the next tutorial).
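
The r0-r1 pair is used when the return value does not fit in one 32-bit register, for example a 64-bit integer. The following is a small sketch of how such a value splits into two words. The helper names are hypothetical, and it assumes the usual little-endian configuration, where r0 carries the low word and r1 the high word.

```c
#include <stdint.h>

/* Hypothetical helpers: they only model how a 64-bit return value
   is split across two 32-bit registers. */

/* Low 32 bits: the part that travels in r0 (little-endian ARM assumed). */
uint32_t low_word(uint64_t value)
{
    return (uint32_t)(value & 0xFFFFFFFFu);
}

/* High 32 bits: the part that travels in r1. */
uint32_t high_word(uint64_t value)
{
    return (uint32_t)(value >> 32);
}

/* Rebuild the 64-bit value the way the caller sees it. */
uint64_t join_words(uint32_t r0, uint32_t r1)
{
    return ((uint64_t)r1 << 32) | r0;
}
```

For the value 0x0000000100000002, low_word gives 2 and high_word gives 1, and join_words(2, 1) rebuilds the original value.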

When you write a function in assembler, r0, r1, r2, r3 and r12 can be freely modified.
The program that calls your function does not expect to find these registers in the same state on return as they were at the time of the call.
On the other hand, the registers r4 to r11 and r13 must be returned with the same values as at the time of the call. This means that if you intend to use these registers, you have to save their contents before modifying them, so that you can restore the correct values before the end of the function.
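
The save-and-restore rule can be pictured with a small C model. This is only a sketch: the Registers structure and the function double_and_add are invented for the illustration, and the C local variable stands in for the place where real assembler code would keep the saved value.

```c
#include <stdint.h>

/* Hypothetical model of the register file, for illustration only. */
typedef struct {
    uint32_t r[16];   /* r[0]..r[15] stand for r0..r15 */
} Registers;

/* Models a function that computes r0 = r0 * 2 + r1 and needs r4
   as a work register. r4 must come back unchanged, so it is saved
   first and restored at the end. */
void double_and_add(Registers *regs)
{
    uint32_t saved_r4 = regs->r[4];         /* save r4 before touching it */

    regs->r[4] = regs->r[0] + regs->r[0];   /* r4 = r0 * 2 (work register) */
    regs->r[0] = regs->r[4] + regs->r[1];   /* r0 = r4 + r1 (return value) */

    regs->r[4] = saved_r4;                  /* restore r4 before returning */
}
```

With r0 = 5, r1 = 3 and r4 = 1234 on entry, the model leaves 13 in r0 and r4 still holds 1234. r0 is overwritten without any precaution, since the caller does not expect it to be preserved.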

Finally, the r14 (lr) and r15 (pc) registers are a bit peculiar.
r15 always contains the address of the next instruction. This means that changing r15 performs a jump or a function call. I will not go into detail about the value of r15. You can forget this register for the moment and decide never to use it. The day you have to use it, you will not need tutorials :)
r14 contains the return address. When a function call is executed, the ARM saves the return address in this register, i.e. the address of the instruction following the call.

At first, you can also put this register aside.
You just need to understand that when you call our assembler function, the ARM does both of the following:
Puts the value of r15 into r14.
Puts into r15 the memory address referenced by the label of the function.

This amounts to saying that the ARM (in our case):
Stores the return address of the function into r14
Continues its execution in the function « add_one_in_asm »

Let’s now go back to our 2-line assembler code (the first 3 lines are just declarations):

   add         r0, r0, #1
   mov         pc, lr

Upon entering the function, r0 holds the first parameter of the function, which is 10.
We add 1 to r0, because we can change r0 without worrying about saving its contents.
We must put the return value into r0. Luckily, it is already there.
We copy the value of lr into the pc register. This has the effect of returning just after the call to our function (it’s a kind of C return).

On the C side, the return value (contained in r0) is copied into the variable r of the C program and then displayed.
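
The whole sequence, from the call to the return, can be replayed with a small C model of the three registers involved. This is only a sketch of the mechanism described above: the Cpu structure, the two functions and both addresses are made up for the demonstration.

```c
#include <stdint.h>

/* Hypothetical model of the three registers involved in the call. */
typedef struct {
    uint32_t r0;   /* first parameter, then return value */
    uint32_t lr;   /* r14: return address */
    uint32_t pc;   /* r15: address of the next instruction */
} Cpu;

/* Made-up addresses, chosen only for the demonstration. */
enum { RETURN_ADDRESS = 0x8010, FUNCTION_ADDRESS = 0x9000 };

/* Models the call: the parameter goes into r0, the return address
   into lr, and pc jumps to the label of the function. */
void call_add_one(Cpu *cpu, uint32_t parameter)
{
    cpu->r0 = parameter;
    cpu->lr = RETURN_ADDRESS;
    cpu->pc = FUNCTION_ADDRESS;
}

/* Models the two instructions of the function body. */
void run_add_one(Cpu *cpu)
{
    cpu->r0 = cpu->r0 + 1;   /* add r0, r0, #1 */
    cpu->pc = cpu->lr;       /* mov pc, lr */
}
```

After call_add_one(&cpu, 10) followed by run_add_one(&cpu), r0 holds 11 and pc is back at the return address, which is where the C program picks up the result.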

And that’s it …
It’s the end of this first tutorial. You now know how to create an assembler function called from a C program.

I’m not the only one who has to work!

If you have digested all this, I suggest you practice a little. Here’s a small exercise (I should have been a teacher).

Code a function that adds two integers and returns the result.
Here are two small tips that can help beginners:
The values you pass to a function are stored in registers r0, r1, r2 and r3 (obviously only if you have 4 parameters; with only 2, r0 and r1 are used).
The opcode (instruction) ADD performs an addition.
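
To check your answer, a small C harness along these lines may help. It is only a sketch: the function name add_two_in_asm and the USE_ASM macro are hypothetical choices, and the C stand-in is there so that the harness can be tried before the assembler version exists.

```c
/* Hypothetical name for the exercise function. */
int add_two_in_asm(int a, int b);

#ifndef USE_ASM
/* C stand-in, used when USE_ASM is not defined.
   Build with -DUSE_ASM and link your own object file
   to test the assembler version instead. */
int add_two_in_asm(int a, int b)
{
    return a + b;
}
#endif

/* Returns the number of failed checks (0 means everything passed). */
int check_add_two(void)
{
    int failures = 0;

    if (add_two_in_asm(10, 5) != 15) failures++;
    if (add_two_in_asm(-3, 3) != 0)  failures++;
    if (add_two_in_asm(0, 0) != 0)   failures++;

    return failures;
}
```

Call check_add_two() from your main function and print the result; a value of 0 means the three additions came back correct.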

Going further
For those who have passed this stage and want to go further, here is a list of basic instructions that are useful to know.
AND, EOR, SUB, RSB, ADD, ORR, BIC, MOV, MVN
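
As a rough guide, here is what each of these instructions computes, written as C on 32-bit values. The function names are made up for this sketch, and it ignores the condition flags and the shift options that the real instructions also offer.

```c
#include <stdint.h>

/* C equivalents of "OP rd, rn, op2": each function returns rd. */
uint32_t op_and(uint32_t rn, uint32_t op2) { return rn & op2; }    /* AND: bitwise and */
uint32_t op_eor(uint32_t rn, uint32_t op2) { return rn ^ op2; }    /* EOR: exclusive or */
uint32_t op_sub(uint32_t rn, uint32_t op2) { return rn - op2; }    /* SUB: rn - op2 */
uint32_t op_rsb(uint32_t rn, uint32_t op2) { return op2 - rn; }    /* RSB: reverse subtract */
uint32_t op_add(uint32_t rn, uint32_t op2) { return rn + op2; }    /* ADD: rn + op2 */
uint32_t op_orr(uint32_t rn, uint32_t op2) { return rn | op2; }    /* ORR: bitwise or */
uint32_t op_bic(uint32_t rn, uint32_t op2) { return rn & ~op2; }   /* BIC: clear the bits set in op2 */

/* MOV and MVN take a single operand: "OP rd, op2". */
uint32_t op_mov(uint32_t op2) { return op2; }                      /* MOV: copy */
uint32_t op_mvn(uint32_t op2) { return ~op2; }                     /* MVN: copy the bitwise inverse */
```

The unsigned 32-bit type is used so that results wrap around the way a 32-bit register does.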

In the next tutorial
We’ll discuss the stack, memory variables and loops.

Note: This tutorial is a repost of this post I wrote a few months ago …

Feel free to send me a better translation of this post at pulsar[at]webshaker.net
You can also send me a translation to another language if you want.


2 Responses to “#0: Begin programming assembler with gcc”

  1. Tamera Rippe says:

    woah thats a lot to take in at once

  2. [...] brief introduction about writing ARM Assembly code with GCC can be found at Begin Programming Assembler with GCC. Same resource also has a very good inline assembler cookbook at ARM GCC Inline Assembler [...]
