We'll be using only an editor capable of modifying binary files (i.e. a hex editor) and 'chmod' to make the file executable. If that doesn't turn you on, I don't know what will.
— Robin Hoksbergen

Machine code is rarely seen except in movies and, in real life, by a few brave souls. It's the lowest level of code that we'll touch. While punching out 1s and 0s will be painful, the experience should be educational and a badge of pride.

Workflow

To write our initial machine code, I recommend the following workflow:

  1. Write the 1s and 0s in a text file with lots of comments and whitespace
  2. Duplicate the file
  3. Manually remove all comments and whitespace from the file copy
  4. Run `cat ones-and-zeros.txt | binify > kernel8.img` to convert those ASCII 1s and 0s into a real binary file
  5. Emulate the raspi4 by running `emulate kernel8.img`
  6. Optionally run the kernel8.img on real hardware
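
As a rough idea of what step 4 does, here is a minimal Python sketch of a binify-style converter. The real `binify` tool is not shown in this guide, so the function name `binify_text` and its behaviour (tolerating stray whitespace, rejecting any other character, requiring a multiple of 8 bits) are assumptions made for illustration only.

```python
def binify_text(text: str) -> bytes:
    """Convert a string of ASCII '1' and '0' characters into raw bytes."""
    # Assumption: stray whitespace and newlines are tolerated and dropped.
    bits = "".join(text.split())
    if set(bits) - {"0", "1"}:
        raise ValueError("input may only contain the characters 0 and 1")
    if len(bits) % 8 != 0:
        raise ValueError("number of bits must be a multiple of 8")
    # Every group of 8 characters becomes one byte, most significant bit first.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    # Four bytes written as 1s and 0s; prints 47400091.
    print(binify_text("01000111 01000000 00000000 10010001").hex())
```

Each character is one bit, so a 4-byte instruction takes 32 characters of input.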

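Debugging

To check that the bytes we wrote decode to the instructions we intended, disassemble the raw binary with objdump (`-D` disassembles everything, `-b binary` treats the file as a raw image, and `-maarch64` selects the architecture):

aarch64-none-elf-objdump -D -b binary -maarch64 /tmp/x.img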

Note that, for now, we're manually removing comments and whitespace (instead of using `sed`/`grep` to do so). That's intentional. One of the first compiler features we'll implement is automatically removing comments and whitespace from a file and then parsing the 1s and 0s.

The Plan

One of the goals of this guide is to make our OS development process minimally frustrating. For that reason, I recommend the following register usage, based on what we'll later need to do:

Register      Usage
r0 to r7      For now, keep these unused. The code we're writing right now will be the boilerplate for all future generated code, so we don't want to stomp on easily-memorable registers.
r8            0xFE200000 (GPIO offset)
r9            0xFE201000 (UART offset)
r10 to r15    Miscellaneous

Let's Begin

Writing Machine Code

Very few people venture below assembly to machine code, so most machine code is described in terms of the equivalent assembly.

For example, let's say we wanted to execute r7 <= r2 + 16.

Once we find the add instruction in the ARMv8 Manual (pg 531) or via the documentation below, we see that the encoding for 64-bit add is as follows:

sf 0 0 1 0 0 0 1 shift2 imm12 Rn5 Rd5

Note the numbers after some variable names; they indicate how many bits wide their encodings are.

In our case, to do r7 <= r2 + 16, we calculate the following:

  sf = 1
  shift = 00
  imm = 000000010000
  Rn = r2 = 00010
  Rd = r7 = 00111

Therefore, we encode to the following:

1 0 0 1 0 0 0 1 00 000000010000 00010 00111
  = 10010001 00000000 01000000 01000111
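
The same field packing can be double-checked with a few lines of Python. This is only a sketch: the helper name `encode_add_imm` is made up for this example, and it follows the field layout given above (sf, the fixed bits 0 0 1 0 0 0 1, a 2-bit shift, a 12-bit immediate, then two 5-bit register numbers).

```python
def encode_add_imm(rd: int, rn: int, imm12: int, shift: int = 0, sf: int = 1) -> int:
    """Pack the fields of an add-immediate instruction into one 32-bit word."""
    if not (0 <= rd < 32 and 0 <= rn < 32):
        raise ValueError("register numbers must fit in 5 bits")
    if not (0 <= imm12 < 4096 and 0 <= shift < 4 and sf in (0, 1)):
        raise ValueError("field value out of range")
    return (
        (sf << 31)             # sf = 1 selects the 64-bit form
        | (0b0010001 << 24)    # the fixed bits 0 0 1 0 0 0 1
        | (shift << 22)        # 2-bit shift field
        | (imm12 << 10)        # 12-bit immediate
        | (rn << 5)            # source register
        | rd                   # destination register
    )


if __name__ == "__main__":
    word = encode_add_imm(rd=7, rn=2, imm12=16)   # r7 <= r2 + 16
    print(f"{word:032b}")   # 10010001000000000100000001000111
    print(hex(word))        # 0x91004047
```

Comparing the printed bit string with a hand-encoded one is a cheap way to catch an off-by-one-bit mistake before it reaches the binary file.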

However, because our chip is little-endian, the least significant byte must come first in the file, so we reverse the order of the four bytes (whitespace added for clarity):

01000111 01000000 00000000 10010001
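
For checking the byte order, Python's standard `struct` module can do the reversal for us. This is a sketch for verification only; the helper name `to_little_endian_bits` is invented for this example.

```python
import struct


def to_little_endian_bits(word: int) -> str:
    """Return a 32-bit word as four space-separated bytes, lowest byte first."""
    le_bytes = struct.pack("<I", word)   # "<I" = little-endian unsigned 32-bit
    return " ".join(f"{b:08b}" for b in le_bytes)


if __name__ == "__main__":
    # 0x91004047 is 10010001 00000000 01000000 01000111 in binary.
    print(to_little_endian_bits(0x91004047))
    # prints: 01000111 01000000 00000000 10010001
```

Note that only the order of the bytes changes; the bits inside each byte stay in the same order.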

Before we get started building our self-eating-snake of a compiler, we need to implement a way to get information in & out of our processor. We'll do this via UART.

QEMU Hello World

To begin, we'll write machine code to print x (ASCII code 0x78) to UART.

All the information we need is in the provided documentation. For now, only worry about directly printing to UART. Don't worry about proper setup or waiting for UART to be ready.

If you do this correctly, you should see x printed by the emulator.

Hardware Hello World

Our first Hello World should be exciting. However, if you ever plan on running this on real hardware, now is the time to invest in the required setup work.

After implementing the documented UART setup procedure and writing code that waits until UART is ready before printing, we can set up our Raspberry Pi 4 SD card by doing the following:

  1. Download the Raspberry Pi boot image from the Raspberry Pi website
  2. Flash the image onto an SD card
  3. Delete everything on the SD card except bootcode.bin, fixup.dat, and start.elf
  4. Copy the binary file we wrote onto the SD card, naming it kernel8.img

Stopping Cores

The Raspberry Pi 4 has four cores. Right now, our code is running on all four of them. Using the provided documentation, figure out which cores are not the one you want to keep running (e.g. core 0), then put those extra cores to sleep.

Echo

For each character sent to UART, send it back.

Cat

Store a bunch of characters to memory, then send them all back at once when we receive a null byte (ASCII code 0x00).
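
Before committing this to 1s and 0s, it can help to model the buffering logic in a higher-level language. The Python sketch below is only a model: `uart_in` and `uart_out` are ordinary file-like objects standing in for the UART, and the function name `cat` and the choice to clear the buffer after each flush are assumptions.

```python
import io


def cat(uart_in, uart_out) -> None:
    """Buffer incoming bytes and send them back whenever a null byte arrives."""
    buffer = bytearray()
    while True:
        received = uart_in.read(1)
        if not received:
            break                       # the simulated input ran out
        if received == b"\x00":
            uart_out.write(bytes(buffer))   # flush everything stored so far
            buffer.clear()
        else:
            buffer.extend(received)     # store the character in "memory"


if __name__ == "__main__":
    simulated_in = io.BytesIO(b"hello\x00")
    simulated_out = io.BytesIO()
    cat(simulated_in, simulated_out)
    print(simulated_out.getvalue())     # b'hello'
```

On real hardware there is no end of input, so the machine-code version would simply loop forever waiting for the next byte.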

Parsing Machine Code

---

var len = 700
var baseInAddr = BASE_MEM_ADDR+100

fn removeComments {
        var baseOutAddr = baseInAddr + len
        var inIdx = 0
        var outIdx = 0
        // A ';' at the start of a line marks the whole line as a comment
        var commentMode = baseInAddr[0] == ';'

        var currentNum = 0      // bits accumulated so far for the current byte
        var bitNum = 0          // how many bits currentNum currently holds

        while {
                var char = baseInAddr[inIdx]
                if char == 0 {
                        break
                }
                inIdx += 1      // advance first so that `continue` cannot loop forever
                if char == '\n' {
                        // A newline ends any comment; the next line may start a new one
                        commentMode = baseInAddr[inIdx] == ';'
                        continue
                }
                if commentMode {
                        continue
                }
                if char != '0' && char != '1' {
                        continue
                }
                currentNum = currentNum << 1
                currentNum += char - '0'
                bitNum += 1

                if bitNum == 8 {
                        baseOutAddr[outIdx] = currentNum
                        outIdx += 1
                        currentNum = 0
                        bitNum = 0
                }
        }
}
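
To sanity-check the intended behaviour of the routine above, here is a runnable Python sketch of the same idea. It is not the guide's compiler language, and it makes a few assumptions: a comment is any line whose first character is `;`, input ends at the first null character, every non-binary character outside a comment is ignored, and a trailing group of fewer than 8 bits is silently dropped.

```python
def remove_comments(text: str) -> bytes:
    """Strip ';' comment lines and whitespace, then pack the 1s and 0s into bytes."""
    text = text.split("\0", 1)[0]       # stop at the first null character
    out = bytearray()
    current = 0                         # bits accumulated for the current byte
    bit_count = 0                       # how many bits `current` holds
    for line in text.split("\n"):
        if line.startswith(";"):
            continue                    # whole line is a comment
        for ch in line:
            if ch not in "01":
                continue                # skip whitespace and anything else
            current = (current << 1) | int(ch)
            bit_count += 1
            if bit_count == 8:
                out.append(current)
                current = 0
                bit_count = 0
    return bytes(out)


if __name__ == "__main__":
    source = (
        "; r7 <= r2 + 16\n"
        "01000111 01000000\n"
        "; second half of the instruction\n"
        "00000000 10010001\n"
    )
    print(remove_comments(source).hex())    # 47400091
```

Note that the digits inside the comment lines never reach the output, because the whole line is skipped before any character is examined.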