User Guide
So, you want to write programs for that new microprocessor you've built, or that custom virtual machine you've created? We'll take a look at how to define your instruction set and then use it to assemble programs.
Let's start off by creating the following file, which we can name main.asm:
#cpudef
{
    #bits 8

    nop -> 0x00
    hlt -> 0xff
}

nop
nop
hlt
You can assemble this by running:
$ customasm main.asm
You can also print the resulting binary to the screen, which may make it easier to see what is being produced, by running:
$ customasm main.asm -f hexdump
Yet another option is to use the online version, which doesn't require any downloads: copy and paste the code above into the page, then hit the "Assemble" button.
After it is assembled, you should see three bytes' worth of output, 00 00 ff, corresponding to the three-instruction program given.
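To make the mapping from source to bytes concrete, here is a rough Python model of what happens to this particular program. It is only a sketch: the `CPUDEF` table and the `assemble` function are hypothetical names standing in for the #cpudef block and the assembler, and this is not how customasm itself is implemented.

```python
# Hypothetical stand-in for the #cpudef block: mnemonic -> one-byte encoding.
CPUDEF = {"nop": 0x00, "hlt": 0xFF}


def assemble(lines):
    """Translate each mnemonic into its byte, in program order."""
    return bytes(CPUDEF[line.strip()] for line in lines if line.strip())


program = ["nop", "nop", "hlt"]
print(assemble(program).hex(" "))  # 00 00 ff
```

Each of the three instructions contributes exactly one byte, which is why the output is three bytes long.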
We'll now take a closer look at the contents of our main.asm file.
There are two parts to this file: a #cpudef structure defining your instruction set, and a list of actual instructions that form your assembly program.
You can see that the #cpudef lives in the same file as the rest of your program, but you can also split them up into multiple files and use #include directives.
The #cpudef structure starts with a #bits declaration.
This may be thought of as the number of bits in a byte for your particular CPU.
So, while this is usually 8 for modern CPUs, you can really use any value if you have some kind of esoteric machine.
The size of a byte impacts the address space of your machine: you won't be able to reference anything at a finer grain than a single byte.
customasm will also use this value to verify the size of the binary representation of all instructions you define: every instruction must have a size that is a multiple of a byte. So, for a CPU with 8-bit bytes, valid instruction sizes are 8, 16, 24 bits, and so on.
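The multiple-of-a-byte rule can be sketched in a few lines of Python. This is an illustrative model only: `is_valid_instruction_size` is a hypothetical helper, and treating a zero-bit instruction as invalid is an assumption of this sketch rather than something the guide states.

```python
def is_valid_instruction_size(size_in_bits, bits_per_byte=8):
    """Check that an encoding fills a whole number of bytes.

    bits_per_byte plays the role of the #bits declaration.
    Rejecting a size of zero is an assumption made for this sketch.
    """
    return size_in_bits > 0 and size_in_bits % bits_per_byte == 0


# With 8-bit bytes, only 8, 16 and 24 pass out of these candidates.
print([n for n in (4, 8, 12, 16, 24) if is_valid_instruction_size(n)])

# With a hypothetical 4-bit byte, a 12-bit instruction becomes valid.
print(is_valid_instruction_size(12, bits_per_byte=4))
```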
Next, the #cpudef structure defines the instruction set by listing mnemonics together with their binary representations. You can combine any number of letters, words, and punctuation for a given mnemonic. For example:
#cpudef
{
    #bits 8

    mov a, #b -> 0x35
    sub x, [hl] -> 0b11010001
    add.gt r0, r3, r4, LSL #6 -> 0x46
}
For the binary representation, the way you write out values matters.
Their size is derived from the number of digits given.
So, for example, 0x0 is four bits long, since it's a single hexadecimal digit, and 0x001 is 12 bits long (which is to say that leading zeroes matter).
In the example below, you can also see single-line comments, which start with a semicolon (;).
#cpudef
{
    #bits 8

    ; a single-byte instruction
    mov a, b -> 0x35

    ; double-byte instructions
    add a, b -> 0x6834
    sub a, b -> 0x0002
}
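The digit-counting rule described above can be modelled in Python as follows. This is a sketch under stated assumptions: `literal_bits` is a hypothetical helper, and it handles only the `0x` and `0b` prefixes that appear in this guide.

```python
# Bits contributed by each digit, keyed by literal prefix.
BITS_PER_DIGIT = {"0x": 4, "0b": 1}


def literal_bits(literal):
    """Return the size in bits implied by the number of digits written."""
    prefix, digits = literal[:2], literal[2:]
    return BITS_PER_DIGIT[prefix] * len(digits)


for lit in ("0x0", "0x001", "0x35", "0x6834", "0x0002", "0b11010001"):
    print(lit, literal_bits(lit))
```

Note that 0x0002 counts as 16 bits even though its numeric value would fit in 2: the leading zeroes are what make it a double-byte instruction.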
You can also split up these values for readability by using the concatenation operator @:
#cpudef
{
    #bits 8

    ; a single-byte instruction
    ; 3 bits + 2 bits + 3 bits = 8 bits
    mov a, b -> 0b101 @ 0b11 @ 0b001

    ; a double-byte instruction
    ; 8 bits + 4 bits + 4 bits = 16 bits
    add a, b -> 0x08 @ 0x3 @ 0b1001
}
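To check what the two concatenations above add up to, the @ operator can be modelled in Python as shifting each piece in from the right while summing the widths. The `parse` and `concat` helpers are hypothetical and cover only `0x` and `0b` literals; this is an illustration of the arithmetic, not customasm's implementation.

```python
def parse(literal):
    """Return (value, width_in_bits) for a 0x... or 0b... literal."""
    prefix, digits = literal[:2], literal[2:]
    base, bits_per_digit = {"0x": (16, 4), "0b": (2, 1)}[prefix]
    return int(digits, base), bits_per_digit * len(digits)


def concat(*literals):
    """Join literals left to right, like chaining them with @."""
    value, width = 0, 0
    for lit in literals:
        v, w = parse(lit)
        value = (value << w) | v  # make room on the right, then append
        width += w
    return value, width


value, width = concat("0b101", "0b11", "0b001")
print(format(value, "0{}b".format(width)), width)  # 10111001 8

value, width = concat("0x08", "0x3", "0b1001")
print(format(value, "0{}x".format(width // 4)), width)  # 0839 16
```

So the mov encoding is the single byte 0xb9 and the add encoding is the two bytes 0x08 0x39, matching the bit counts given in the comments.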
[to be finished...]