## Stacks

### Hardware stacks

Area of memory that grows and shrinks according to the LIFO (last in, first out) principle

### Stack models

There are four models of different position *sp* register points.

Full Ascending

Empty Ascending

Full Descending

Empty Descending

**AAPCS specifies Full Descending stack, 8-byte width.**

Stack pointer points to an address divisible by 8, say, 0x400.

In the block of 0x400, there are 8 bytes. The next block is 0x408.

### Store multiple decrement before (overwrite)

STMDB sp!, {r0, r1}

SUB sp, sp, #8 STR r0, [sp] STR r1, [sp, #4]

push {r0, r1}

STM can operate on other registers, whilst push only operates on the stack pointer.

STMDB sp!, {r0, r1} ADD sp, sp, #8 @ equivalent to pop, but pop doesn't work here.

LDMIA sp!, {r0, r1} @ increment after (pop)Published on November 20, 2015

## Reference

### Basic Char syntax

?\a ⇒ 7 ; control-g,Published on November 20, 2015C-g?\b ⇒ 8 ; backspace,BS,C-h?\t ⇒ 9 ; tab,TAB,C-i?\n ⇒ 10 ; newline,C-j?\v ⇒ 11 ; vertical tab,C-k?\f ⇒ 12 ; formfeed character,C-l?\r ⇒ 13 ; carriage return,RET,C-m?\e ⇒ 27 ; escape character,ESC,C-[?\s ⇒ 32 ; space character,SPC?\\ ⇒ 92 ; backslash character,\?\d ⇒ 127 ; delete character,DEL

## Bitwise Instructions

A | B | A AND B | A OR B | NOT A | A XOR B | A BIC B |
---|---|---|---|---|---|---|

0 | 0 | 0 | 0 | 1 | 0 | 0 |

0 | 1 | 0 | 1 | 1 | 1 | 0 |

1 | 0 | 0 | 1 | 0 | 1 | 1 |

1 | 1 | 1 | 1 | 0 | 0 | 0 |

### XOR

[A AND (NOT B)] OR [(NOT A) AND B]

A XOR 0 = A

A XOR 1 = NOT A

In ARM | In C | |
---|---|---|

AND | AND r0, r1,r2 | r0 = r1 & r2; |

OR | ORR r0, r1, r2 | r0 = r1 || r2; |

XOR | EOR r0, r1, r2 | r0 = r1 ^ r2; |

BIC | BIC r0, r1, r2 | r0 = r1 & (~r2) |

NOT | MVN r0, r1 | r0 = ~r1 |

### ARM

MVN r0, #0

-1 is stored in r0.

#0 is automatically extended to 32 bits of 0’s. Flipping 32 0’s to 32 1’s gets -1 in two’s complement.

ADD r3, r0, r1

r1 is the mask.

### Bit shifts and multiplication

##### LSL

Logical shift by n bits – unsigned multiplication by 2^{n}

LSL r0, r0, #4

The above code multiplies r0 with 2^{4}, i.e. 16.

##### LSR

Logical shift by n bits – unsigned division by 2^{n}

##### ASR

Arithmetic shift by n bits – signed division by 2^{n}

##### ROR

Logical rotate by n bits – 32 bit rotate

ror r1, r0, #1

Rotate amount is non-negative.

Shift 4 bits: r0 = 0x12345678, r1 would be 0x81234567.

Shift 8 bits: r0 = 0x12345678, r1 would be 0x78123456.

ARM | C | |
---|---|---|

LSL | LSL r0, r1, #n | r0 = r1 << n; |

LSR | LSR r0, r1, #n | r0 = (unsigned int) r1 >> n; |

ASR | ASR r0, r1, #n | r0 = r1 >> n; |

ROR | ROR r0, r1, #n | UNFINISHED |

##### RSB

Reverse subtract, used for easier multiplication

RSB r0, r1, r2 @ r0 = r2 - r1

RSB r10, r9, r9, LSL #3 @ r10 = r9 * (8 - 1)

### Addressing modes

ADD r0, r1, r2, LSL #2 @ r0 = r1 + (r2 << 2)

LSL r2, r2, #r2 ADD r0, r1, r2

LDR r0, [r1, r2] @ r0 = [r1 + r2]

LDR r0, [r1, r2, LSL #n] @ LSL applied to r2

ADD r0, r1, #0xff, #8 @ r0 = r1 + 0xff000000 @ performs ROR, #8 is the rot

ADD r2, r2, LSL, #2 @ r2 = 5 * r2Published on November 6, 2015

## ARM Nested Procedures

### Nested Procedures

int sumSquare(int x, int y){ return (mult(x, x) + y); }

sumSquare: push {r4-r11, lr} MOV r4, r1 MOV r1, r0BL multADD r0, r0, r4 pop {r4-r11, lr} BX lr @ then?

main: MOV r0, #2 MOV r1, #3BL sumSquareMOV r3, r0

Without pushing **lr**, calling another function overwrites it.

### String constants in ARM

x = 0xaabbccdd; printf("%d\n", x);

// how printf works %int printf(const char *format, ...)

##### Where to store

- String “%d\n” — in data segment
- Value 0xaabbccdd —

##### String formats (unfinished)

.ascii

.asciz

.data mystr: .asciz "%d\n" mynum: .word 0xaabbccdd print_fun: push {lr} LDR r0, =mystr @ location with label mystr is store in r0 LDR r1, =mynum LDR r1, [r1] BL printf pop {lr} BX lr

### Alignment

.text .align 2

mystr is placed in a location that is a multiple of 2^2 = 4.

Data objects should start at memory addresses that are divisible by the size of the data object

- short (2-byte) at address divisible by 2
- int (4-bytes) at address divisible by 4
- double (8-byte) at address divisible by 8

struct p { char x; int y; }; struct p sp;

In this case, sizeof(p) is 8. int should be aligned.

Published on November 2, 2015

## ARM Loops

### While loops

while (a < 0){ a++; }

loop: CMP r0, #0 BGE end ADD r0, r0, #1 @ Body of loop B loop end:

@ Equivalent to above loop: CMP r0, #0 ADDLT r0, r0, #1 BLT loop

##### Do while

do { a++; } while (a < 0);

@ loop part different to while loop loop: ADD r0, r0, #1 @ body of 'do' CMP r0, #0 BLT loop

##### For loops

for (i = 0; i < 10; i++){ a++; b--; }

MOV r0, #0 loop: CMP r0, #10 BGE end ADD r1, r1, #1 SUB r2, r2, #1 B loop end:

@ Equivalent to above MOV r0, #0 loop: CMP r0, #10 ADDLT r1, r1, #1 SUBLT r2 ,r2, #1 ADDLT r0, r0, #1 BLT loop

### C functions

Caller and callee functions:

main() { // main in this case is a caller int a = 10, b = 20; int c = sum(a, b); } int sum(int x, y){ // sum is a callee return x + y; }

##### Steps in function call

- Pass parameters (using registers)
- Call functions (sum)
- Compute return value, put it in the right register
- Transfer control back to caller (return)

1 and 2 are in main();

##### Making the function call

- Simple branch instruction

BL sum @ function call BX @ return

... B sum return ... ... sum: ADD r0, r0, r1 B return_loc ...

Problems:

- Cannot call a function twice
- Always go to return_loc

Reason:

- sum() might be called by many functions, so we cannot return to a fixed place.
- The calling proc to sum() must be able to say “return back here” somehow.
**BX**instruction returns to the location specified in lr.**BL**instruction stores the location of next line so that branch could return to where it was called.

**lr (link register): register reserved to store the return address**

... BL sum ... ... sum: ADD r0, r0, r1 BX lr ; MOV pc,lr i.e., return ...

##### Passing arguments

main() { int a = 10, b = 20, c; c = sum(a, b); c++; }

main: MOV r0, #20 MOV r1, #20 BL sum sum: ADD r0, r0, r1 @ return value in r0 BX lr

##### Register conventions

A set of generally accepted rules as to which registers are guaranteed to be unchanged after a procedure call and which may be changed.

###### r0 – r3

Arguments and return values, otherwise corruptible.

###### r4 – r11

Callee has to make sure that the value in r4-r11 are **unchanged** by it.

###### r12 scratch register

##### Push and pop

Push registers onto, and pop registers off a full descending stack (usually within function call).

push {r4-r11} ... pop {r4-r11}Published on October 30, 2015

## Directed Acyclic Graphs

Def Directed graph with no cycles are called directed acyclic graphs.

Lemma 1: every DAG has at least one source (proof by contradiction)

Lemma 2: If v is a souce vertex of G, then G is a DAG if and only if G-v is a DAG

### Find topological ordering

While G has at least one vertex

If G has some source,

Choose one source and output it

Delete the source and all its outgoing edges from G

Else

Return that G is not a DAG

## Compile By Hand

### Basic arithmetic

g = h + A[8]; // r0 stores A array; r1 stores h; r2 to store g.

ADD r0, r0, #32 LDR r3, [r0] @ r0 = &A + 8 * 4, since int is of 4 bytes.

LDR r3, [r0, #32]

### Accessing arrays

void foo (int *p, int size){ // p stored in r2, size stored in r1 *p = 0; *(p + 1) = 0; *(p + size - 1) = 0; }

foo: MOV r2, #0; @ store 0 in *p STR r2, [r0] @ *p = 0 MOV r3, #4 ADD r4, r0, r3 @ p + 1 in C STR r2, [r4] @ store 0 in *(p + 1) SUB r1, r1, #1 @ r1 = size - 1 MUL r4, r1, r3 @ r4 = r1 * r3 STR r2, [r0, r4] @ store 0 in *(p + size - 1)

### Data transfer instructions

void swap (int *x, int *y){ // x stored in r0, y stored in r1 int tmp = *x; *x = *y; *y = tmp; }

swap: LDR r2, [r0] @ Load *x from memory and store in r2 LDR r3, [r1] @ Load *y from memory and store in r3 STR r3, [r0] @ Store r3 in register to *x in memory MOV r2, [r1] @ Store r2 (tmp) in register to *y in memory

- For short, LDR -> LDRSH, STR -> STRSH
- For unsigned short, LDR -> LDRH, STR -> STRH

### Linked lists

- head -> *next -> *next -> … -> value

head -> value = 100; head -> *next -> value = 100; // r0 stores the first value

MOV r1, #100 STR r1, [r0] @ store to memory ... LDR r3, [r0, #4] @ store 100 to next value STR r1, [r3]Published on October 26, 2015

## ARM Instruction

### ARM *goto* Instruction

goto label (C) is the same as B label in ARM.

### Conditional Branch

- Set the condition bits (N, Z, V, C) in the program status register.
- Then check on these condition bits to branch conditionally.

Conditional bits

N = 1 if result is Negative

Z = 1 if result is Zero

V = 1 if there’s overflow

C = 1 if there’s carry out

#### SUBS (with the ending S) sets conditional bits in cpsr register.

SUBS r1, r0, #0

r0 = r1 – #0, and set the condition bits NZVC based on the result.

If r1==r2, then N = 0, Z= 1, V = 0, C =1

If r < r2 (no overflow), then N =1, Z = 0, V=0

#### Another way to do that is to use a ‘Comparison Instruction’

CMP r1, r2 BEQ label @if Z = 1, then go to label

- Compare and set condition bits
- Subtract a register or an immediate value from a register value and updates status register
- Then check on these condition bits to branch conditionally

COM r0, #4 BEQ this_instr ADD r4, r5, r6 B end this_instr: SUB r0, r0, #1 end:

if(r0 == 4) r0 --; r4 = r5 + r6;

if(r0 != 4) r4 = r5 + r6; r0 --;

### ARM Decision Instructions

BEQ label | == | BRANCH EQUAL |

BNE label | != | BRANCH NOT EQUAL |

BLE label | <= | BRANCH LESS THAN EQUAL |

BLT label | < | BRANCH LESS THAN |

BGE label | >= | BRANCH GREATER THAN EQUAL |

BGT label | > | BRANCH GREATER THAN |

The condition is T/F based upon the fields in the Program Status Register.

### If…else

if (x==y) x += y; else x -= y;

CMP r01, r1 BNE else_aprt ADD r0, r0, r1 B end else_part: SUB r0, r0,r1 end:Published on October 23, 2015

## ARM Memory Map / Basic Instructions

### ARM Memory Map

Section (high address to low address) | Function |
---|---|

OS and Memory-Mapped IO | |

Dynamic Data | Stack and heap |

BSS | Uninitialized global data Zero-initialized global data Static data |

Data | Initialized global data |

Text | Machine code (read only) |

Exception Handlers |

- “Text” (instructions in machine language)
- “Data” contains
**any global or static variables which have a pre-defined value**and can be modified. That is any variables that are not defined within a function (and thus can be accessed from anywhere) or are defined in a function but are defined as static so they retain their value across subsequent calls. - “BSS” also known as uninitialized data, is usually adjacent to the data segment. The BSS segment contains
**all global variables and static variables that are initialized to zero or do not have explicit initialization**in source code. - “Heap” (for dynamically allocated data)
- “Stack” (for function local variables)
- Heap and stack change in size as the program executes.

### View the memory map using “size”

```
$ gcc -O foo.c -c
$ size foo.o
text data bss dec hex filename
52 0 0 52 34 foo.o
```

### Program Execution

Interactions between CPU, registers, and memory. Each instruction (4 bytes) handles the data from registers (more frequently) and memory.

#### Registers

Each ARM register is 32 bits wide. There are 30 general purpose registers (6 status registers, 1 program counter).

r0 to r12, general

r13, stack pointer,

r14, subroutine *Link Register*

r15, program counter

### Basic Type of Instructions

- Arithmetic (involves only CPU and registers)

Compute the sum (or difference) of two registers, and store the value in a register.

Move the contents of one register to another

- Memory Instructions (transfer data between memory and registers)

- Control Transfer Instructions: Change flow of execution

In C: a = b + 10; In ARM: ADD r0, r1, #10

#10 is an immediate (constant), which has fixed length of 4 bytes.

### Assignment Instructions

In C: a = b; // a and b are integers a = 10; In ARM: mov r0, r1 @r0 = r1

### Data Transfer

Separate instructions to transfer data between registers and memory:

- Memory to register (LOAD)
- Register to memory (STORE)

#### Load/store Syntax

LDR r0, [r1]

- r1 must be a register that contains a valid memory location (as a pointer).
- Copy 4 bytes from the location pointed by r1 into r0
- Equivalent to r0 = * r1; (in C)

STR r0, [r1]

- r1 stores the address of destination memory.
- Store contents of r0 to location pointed by r1.
- *r1 = r0; (in C)

**In Little Endian machines (ARM and Intel x86), the least significant byte (like 5 in 0x00000005) goes into the lowest memory address (like 0x200 of 0x200 to 0x0203).**

### Variations on LDR/STR

##### LOAD

LDRH r0, [r1] @ r1 points to 2 bytes (unsigned)

- Fill the significant bits with zeros.
- Treat content pointed by r1 as
**unsigned**.

LDRSH r0, [r1] @ r1 points to 2 bytes (signed)

- Treat content pointed by r1 as
**signed**. - Identify if the most significant bit is 0 or 1 (positive or negative) and do sign expansion.
- Fill F’s in the spare bits.

LDRB r0, [r1] @ r1 points to 1 byte (unsigned)

LDRSB r0, [r1] @ r1 points to 1 byte (signed)

##### STORE

STR r0, [r1] @ store 4 bytes STRH r0, [r1] @ store least significant 2 bytes STRB r0, [r1] @ store least significant byte

There’s no STRSH or STRSB, since the two bytes or one byte can be signed or unsigned. Raw data in r0 ought not be changed.

Published on October 19, 2015## Struct

A struct is a data structure composed of simpler data types.

There is no methods or inheritance.

struct point { int x; int y; }

main(){ struct point p1; // get memory allocated for two int p1.x = 5; p2.y = 10; }

void printPoint(struct point p){ printf("(%d,%d)", p.x, p.y); }

struct point *p; p = malloc(sizeof(struct point)); printf("x is %d\n",(*p).x); printf("x is %d\n", p->x);Published on October 16, 2015