User Guide
So, you want to write programs for that new microprocessor you've built, or that custom virtual machine you've created? We'll take a look at how to define your instruction set and then use it to assemble programs.
Let's start off by creating the following file, which we can name main.asm:
#bits 8
#ruledef
{
nop => 0x00
hlt => 0xff
}
nop
nop
hlt
You can assemble this by running:
$ customasm main.asm
The command above will write a binary file to disk. You can also print the output to screen, which might make it easier to understand what is being produced, by running:
$ customasm main.asm -p
Yet another option is to use the online version, which doesn't require any downloads. You can copy and paste the code above into the page, and hit the "Assemble" button.
After it is assembled, you should see three bytes' worth of code being output: 00 00 ff, corresponding to the three-instruction program given.
We'll now take a closer look at the contents of our main.asm file.
There are two parts to this file: a #ruledef block defining your instruction set, and a list of the instructions that form your program.
You can see that the #ruledef lives in the same file as the rest of your program, but you can also split it up into multiple files using #include directives.
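As a rough sketch of such a split, the first example could be divided into two files. The file name cpu.asm is a hypothetical choice for illustration, and the sketch assumes both files sit in the same folder.

```asm
; cpu.asm (hypothetical file name): holds only the instruction set
#ruledef
{
    nop => 0x00
    hlt => 0xff
}
```

```asm
; main.asm: pulls in the rules, then lists the program
#bits 8
#include "cpu.asm"

nop
nop
hlt
```

Assembling main.asm this way should produce the same three bytes as the single-file version.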
You can start the file with a #bits directive, which defines the smallest addressable unit for your machine. If you don't specify it, the default is 8, which is the most common value for modern CPUs.
You should use a #ruledef block to list mnemonics and their binary representations. You can have as many #ruledef blocks as you need, so you can easily split up your declarations.
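For instance, the rules from the first example could be regrouped into two blocks. This is only a sketch, and the grouping is arbitrary. The program underneath uses rules from both blocks.

```asm
; first block
#ruledef
{
    nop => 0x00
}

; second block, declared separately
#ruledef
{
    hlt => 0xff
}

nop
hlt
```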
You can combine any number of letters, words, and punctuation for a given mnemonic.
For example, these are valid patterns:
#ruledef
{
mov a, #b => 0x35
sub x, [hl] => 0b11010001
add.gt r0, r3, r4, LSL #6 => 0x46
}
For the binary representation, the way you write out values matters. Their size is derived from the number of digits given.
So, for example, 0x0 is four bits long, since it's a single hexadecimal digit, and 0x001 is 12 bits long (which is to say: leading zeroes do matter).
In the example below, you can also see single-line comments, which start with a semicolon (;).
#ruledef
{
; a single-byte instruction
mov a, b => 0x35
; double-byte instructions
add a, b => 0x6834
sub a, b => 0x0002
}
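As a quick check of the sizing rule, the sketch below pairs the rules above with a short program. The three-line program is an illustrative assumption, not part of the original example. The bytes in the trailing comments are worked out by hand from the digit counts.

```asm
#ruledef
{
    ; two hex digits: 8 bits
    mov a, b => 0x35
    ; four hex digits each: 16 bits
    add a, b => 0x6834
    sub a, b => 0x0002
}

mov a, b    ; 35
add a, b    ; 68 34
sub a, b    ; 00 02
```

Because the leading zeroes in 0x0002 count towards the size, the last instruction takes two bytes rather than one, for five bytes in total.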
You can also split up these values for visual aid by using the concatenation operator @:
#ruledef
{
; a single-byte instruction
; 3 bits + 2 bits + 3 bits = 8 bits
mov a, b => 0b101 @ 0b11 @ 0b001
; a double-byte instruction
; 8 bits + 4 bits + 4 bits = 16 bits
add a, b => 0x08 @ 0x3 @ 0b1001
}
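To make the arithmetic concrete, the sketch below rewrites the same two rules as single literals. The intermediate bit strings in the comments are worked out by hand, so treat them as a check on the concatenation rather than as assembler output.

```asm
#ruledef
{
    ; 0b101 @ 0b11 @ 0b001 joins the bits 101, 11 and 001
    ; into 0b10111001, which is 0xb9
    mov a, b => 0xb9
    ; 0x08 @ 0x3 @ 0b1001 joins 0x08, 0x3 and 0x9 (0b1001)
    ; into 0x0839
    add a, b => 0x0839
}
```

Each piece keeps its own width when joined, so the total size is simply the sum of the parts.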
[to be finished...]
- Getting started
- Defining mnemonics: #ruledef, #subruledef
- Declaring labels and constants
- Setting the minimum addressable unit: #bits
- Outputting data blocks: #d
- Working with banks: #bankdef, #bank
- Address manipulation directives: #addr, #align, #res
- Splitting your code into multiple files: #include, #once
- Advanced mnemonics, cascading, and deferred resolution: assert()
- Available expression operators and functions: incbin(), incbinstr(), inchexstr()
- Functions: #fn
- Conditional Compilation: #if, #elif, #else