* *__* _* * * _* *__*_ *__* *
* | \(_)_ _ __\ \ / / \/ | *
* | |) | | ' \/ _ \ V /| |\/| | *
* |___/|_|_||_\___/\_/ |_| |_| *
* * * * * * * * * * * *
An LDPL VM
Written in
LDPL
🦖
=== INTRODUCTION =====================================================
Dino is an interpreter for the LDPL programming language, written in
LDPL. Because LDPL is a compiled language, Dino's goal is to provide a
lightweight, scriptable version of the language that can be used to
quickly prototype ideas or perform system tasks. Dino can also be used
to run basic LDPL programs on systems which lack a C++ compiler, or to
experiment with new LDPL language features and syntax. Mostly, though,
it's a prehistoric toy.
=== EXAMPLES =========================================================
HELLO:
$ cat hi.ldpl
PROCEDURE:
display "Hey pardner" crlf
$ dino hi.ldpl
Hey pardner
LDPL-SPARK:
$ git clone https://github.com/photogabble/ldpl-spark
$ dino ldpl-spark/spark.ldpl 9 13 5 17 1
▄▆▂█▁
$ dino ldpl-spark/spark.ldpl 0 30 55 80 33 150
▁▂▃▄▂█
LDPL-SPACE-MINES:
$ git clone https://github.com/photogabble/ldpl-space-mines
$ dino ldpl-space-mines/spacemines.ldpl
==================================================
YEAR 1:
There are 55 people in the colony...
LBI:
$ git clone https://github.com/Lartu/LBI
$ dino LBI/src/LBI.ldpl LBI/examples/fib.b
0
1
1
2...
$ dino LBI/src/LBI.ldpl LBI/examples/squares.b
0
1
4
9...
LDPL Examples:
$ git clone https://github.com/lartu/ldpl
$ dino ldpl/examples/explode.ldpl
Enter a sentence: That's all folks!
That's
all
folks!
$ dino ldpl/examples/sqrt.ldpl
Enter a number: 50
sqrt(50) = 7.07106781186548
=== GETTING STARTED ==================================================
You must have version 3.0.5 of the official LDPL compiler installed
in your $PATH:
https://www.ldpl-lang.org/
Once that's done, clone Dino:
git clone https://github.com/xvxx/dino
And build it:
cd dino
make dino
You should see a "File(s) compiled successfully." message if
everything worked. You now have a `dino` command line program sitting
in the current directory. Run it directly, or add it to your $PATH and
enjoy the fruits of this installation process:
./dino -h
To test Dino, run it against the official LDPL Test Battery[1]:
make test
You should see another "success" message if everything is working
properly. If not, kindly report an issue at this address:
https://github.com/xvxx/dino/issues
[1] We actually use a slightly modified version of the official LDPL
Test Battery, since Dino doesn't have a compilation step.
=== BASIC USAGE ======================================================
Let's look at a simple LDPL program:
$ cat math.ldpl
DATA:
x is number
y is number
z is number
PROCEDURE:
store 1 in x
store 2 in y
add x and y in z
display x "+" y "=" z crlf
First we'll run it using LDPL 3.0.5 as a sanity check:
$ ldpl math.ldpl
LDPL: Compiling...
* File(s) compiled successfully.
* Saved as math-bin
$ ./math-bin
1+2=3
Okay, that seems right. Next we'll run it using Dino:
$ dino math.ldpl
1+2=3
Great! We can stop here. But if you want to look under the hood a
bit, you can see the tokens produced by Dino's lexer for this file:
$ dino lex math.ldpl
tokens (41):
<DATA:>, <:NL:>
<X>, <IS>, <NUMBER>, <:NL:>
<Y>, <IS>, <NUMBER>, <:NL:>
<Z>, <IS>, <NUMBER>, <:NL:>
<PROCEDURE:>, <:NL:>
<STORE>, <1>, <IN>, <X>, <:NL:>
<STORE>, <2>, <IN>, <Y>, <:NL:>
<ADD>, <X>, <AND>, <Y>, <IN>, <Z>, <:NL:>
<DISPLAY>, <X>, <"+">, <Y>, <"=">, <Z>, <"\r\n">, <:NL:>
Pretty fun. The next step is turning those tokens into the parse
tree, which you can see using `dino parse`:
$ dino parse math.ldpl
vars (3):
0. NUM: X
1. NUM: Y
2. NUM: Z
nodes (4):
STORE
0. 1
1. <NUM> X
STORE
0. 2
1. <NUM> Y
ADD
0. <NUM> X
1. <NUM> Y
2. <NUM> Z
DISPLAY
0. <NUM> X
1. "+"
2. <NUM> Y
3. "="
4. <NUM> Z
5. "\r\n"
These nodes are used by the generator to emit dino assembly, our VM's
imaginary syntax and instruction set:
$ dino asm math.ldpl
SET %var0, 1
STORE %X, %var0
SET %var1, 2
STORE %Y, %var1
ADD %X, %Y, %Z
PRINT %X
PRINT "+"
PRINT %Y
PRINT "="
PRINT %Z
PRINT "\r\n"
EXIT
If we want, we can save this output to a .dinoasm file and run it:
$ dino math.dinoasm
1+2=3
Still looks right! Running dinoasm directly can be helpful in
debugging or development of Dino itself.
If you want to explore further, there are a few files in `examples/`
with hand written dinoasm you can examine or run, too:
$ dino examples/99.dinoasm
99 bottles of beer on the wall...
Finally, we can see the bytecode produced by the assembler for our
LDPL computer program:
$ dino bytes math.ldpl
76 68 80 76 2 09 17 01 08 18 17 09 19 02 08 20 19 20 18 20 21 31
18 31 16384 31 20 31 16385 31 21 31 16386 06 "+" "=" "\r\n"
While internally the bytecode is stored as a vector of numbers, when
it's printed to the screen or loaded from a file we separate each
number with a space and display strings literally.
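The text format described above is simple enough to tokenize by hand.
Here's a rough Python sketch of a loader, not Dino's actual code:
`parse_dinocode` is a made-up name, words are assumed to be whole
numbers, and decoding `\r` and `\n` escapes on load is a guess.

```python
import re

# Hypothetical loader, not Dino's own: a token is either a double-quoted
# string (backslash escapes allowed) or an optionally negative integer.
TOKEN = re.compile(r'"((?:[^"\\]|\\.)*)"|(-?\d+)')

def parse_dinocode(text):
    """Split .dinocode text into a list of ints and strings."""
    words = []
    for match in TOKEN.finditer(text):
        literal, number = match.groups()
        if number is not None:
            words.append(int(number))  # "06" and "6" both become 6
        else:
            # Assumption: only \r and \n escapes need decoding.
            words.append(literal.replace("\\r", "\r").replace("\\n", "\n"))
    return words

sample = '76 68 80 76 2 31 16384 06 "hi\\r\\n"'
print(parse_dinocode(sample))
```

Spaces inside a quoted string survive because the string pattern is
tried before the number pattern at each position.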
This means we can save `dino bytes`'s output to a .dinocode file and
run it directly. Or even modify it before running it:
$ dino bytes math.ldpl | sed 's/17 01/17 13/g' > math.dinocode
$ dino math.dinocode
13+2=15
Some prefer to write all their code this way:
$ echo "76 68 80 76 02 31 16384 01 -4 06 \"hax!\n\"" > hi.dinocode
$ dino hi.dinocode
hax!
There's also `dino dis` which turns dinocode back into dinoasm, kinda.
It's useful when debugging and checking or challenging assumptions.
=== HOW IT WORKS =====================================================
Internally, Dino is organized into three parts: compiler, virtual
machine, and tooling, with the `dino` command line program serving as
the primary means of interacting with the suite.
The architecture is pretty standard: Dino's compiler converts LDPL
source code into bytecode using a lexer, a parser, a code generator,
and an assembler. The virtual machine then loads that bytecode into
its memory and performs each instruction one by one, just like your
old Nintendo. The tooling is just the `dino` command line program that
drives the compiler suite.
The traditional bytecode/VM architecture means Dino could (with a few
changes) support languages other than LDPL in the future, but for now
it's focused on supporting the full LDPL 3.0.5 specification on Linux,
macOS, Windows, WebAssembly, and Raspberry Pi.
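To make "performs each instruction one by one" concrete, here's a toy
fetch-decode-execute loop in Python. It's a sketch, not Dino's VM. The
opcode numbers (8, 9, 20, 31, 6) and the literal addresses starting at
16384 are read off the `dino bytes math.ldpl` output shown earlier;
the operand layout is inferred from the matching `dino asm` output and
the five-word header is assumed to be stripped already.

```python
# Opcode values as they appear in the sample `dino bytes` output.
STORE, SET, ADD, PRINT, EXIT = 8, 9, 20, 31, 6

def run(code, literals):
    """Execute a list of words and return everything it printed."""
    memory = dict(literals)  # one flat address space: address -> value
    out = []
    pc = 0                   # program counter
    while True:
        op = code[pc]
        if op == SET:        # SET dest, literal number
            memory[code[pc + 1]] = code[pc + 2]
            pc += 3
        elif op == STORE:    # STORE dest, src
            memory[code[pc + 1]] = memory[code[pc + 2]]
            pc += 3
        elif op == ADD:      # ADD a, b, dest
            memory[code[pc + 3]] = memory[code[pc + 1]] + memory[code[pc + 2]]
            pc += 4
        elif op == PRINT:    # PRINT addr
            out.append(str(memory[code[pc + 1]]))
            pc += 2
        elif op == EXIT:
            return "".join(out)
        else:
            raise ValueError(f"unknown opcode {op} at word {pc}")

# The math.ldpl bytecode from BASIC USAGE, minus the header.
program = [9, 17, 1, 8, 18, 17, 9, 19, 2, 8, 20, 19, 20, 18, 20, 21,
           31, 18, 31, 16384, 31, 20, 31, 16385, 31, 21, 31, 16386, 6]
literals = {16384: "+", 16385: "=", 16386: "\r\n"}
print(run(program, literals), end="")  # prints 1+2=3
```

Each branch advances the program counter by the instruction's width,
which is why instructions can be anywhere from one to four words.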
=== TECHNICAL SPECIFICATION ==========================================
* "Words" are LDPL numbers.
* Instructions are 1-4 words: opcode and then operands.
* Two native types are number and text.
* 11 number registers: $a, $x, $y, $z, $e, $c, $i, $t, $sp, $pc, $ac
* $sp is stack pointer, $pc is program counter, $ac is argc, $e is error code
* 5 text registers: @a, @x, @y, @t, @e
* One address space for number registers, number variables, text
registers, text variables, and text literals.
* Parallel address space for number vectors and text vectors.
=== REFERENCE ========================================================
# --- ADDRESS SYNTAX -------------------------------------------------
| NAME | SYNTAX
+-----------------+---------------------------------------------------
| Number Register | $a, $pc
| Number Variable | %bufsize, %Users
| Text Variable | @name, @City
| Text Literal | "heya", "LDPL rox!"
| Label | print-fn, DISPLAY
# ----- MEMORY ADDRESSES ---------------------------------------------
| 1ST | LAST | TYPE | DESCRIPTION
+------+------+------+------------------------------------------------
| 0000 | 000F | NUM | Registers ($x, $y, $a, $pc)
| 0010 | 2FFF | NUM | Variables (%count, %item-size)
| 3000 | 300F | TEXT | Registers (@A, @X, @E)
| 3010 | 3010 | TVEC | Command line arguments @argv
| 3020 | 3FFF | TEXT | Variables (@beer, @name, @label)
| 4000 | FFFF | TEXT | Literals ("Hiya", "SCORE", "????")
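Since everything lives in one address space, the type of a value
follows from its address. A small Python sketch of that lookup, using
the ranges from the table above (`region_of` and the region names are
made up for illustration):

```python
# Ranges copied from the MEMORY ADDRESSES table; values are hexadecimal.
REGIONS = [
    (0x0000, 0x000F, "number register"),
    (0x0010, 0x2FFF, "number variable"),
    (0x3000, 0x300F, "text register"),
    (0x3010, 0x3010, "argv vector"),
    (0x3020, 0x3FFF, "text variable"),
    (0x4000, 0xFFFF, "text literal"),
]

def region_of(address):
    """Return the name of the region an address falls in."""
    for first, last, name in REGIONS:
        if first <= address <= last:
            return name
    return "unmapped"  # e.g. the gap between 3011 and 301F

print(region_of(18))     # number variable
print(region_of(16384))  # text literal
```

This matches the sample bytecode in BASIC USAGE, where variables sit
at small addresses like 18 and the first text literal is at 16384.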
# --- REGISTERS ------------------------------------------------------
| NUM | NAME | DESCRIPTION
+------+------+-------------------------------------------------------
| 0000 | $A | Accumulator
| 0001 | $X | Parameter
| 0002 | $Y | Parameter
| 0003 | $Z | Parameter
| 0004 | $E | Non-zero error code
| 0005 | $C | Carry
| 0006 | $I | Incrementor
| 0007 | $T | Temporary value
| 0008 | $SP | Stack pointer
| 0009 | $PC | Program counter
| 000A | $AC | Num of command line arguments given aka ARGC. 8 max.
| 0010 | | Number variables
| .... | |
| 3000 | @A | Text accumulator
| 3001 | @X | Text register
| 3002 | @Y | Text register
| 3003 | @T | Text register
| 3004 | @E | Error message
| .... | |
| 3010 | @argv| Command line arguments vector
| .... | |
| 3020 | | Text variables
| .... | |
| 4000 | | Text literals
| .... | |
| FFFF | | Final address
# --- BYTECODE FORMAT ------------------------------------------------
| BYTE | DATA | DESCRIPTION
+------+------+-------------------------------------------------------
| 0000 | 76 | First four bytes are char codes for "LDPL"
| 0001 | 68 |
| 0002 | 80 |
| 0003 | 76 |
| 0004 | 01 | Bytecode version number
| 0005 | | First instruction
| 0006+| | Program instructions
| 00XX | 06 | Final EXIT
| 00XX | | Sub-procedure definitions
| 00XX | | Text literals
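A loader can check the header before running anything. This Python
sketch assumes the bytecode has already been read into a list of
numbers; `split_header` is a hypothetical name and the error messages
are invented.

```python
MAGIC = [ord(c) for c in "LDPL"]  # [76, 68, 80, 76]

def split_header(words):
    """Validate the magic number and return (version, remaining words)."""
    if len(words) < 5:
        raise ValueError("bytecode is too short to hold a header")
    if words[:4] != MAGIC:
        raise ValueError("missing LDPL magic number")
    return words[4], words[5:]

version, body = split_header([76, 68, 80, 76, 2, 6])
print(version, body)  # prints 2 [6]
```

Keeping the version in its own word lets a loader reject files written
for a different instruction set before it starts executing them.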
# --- INSTRUCTIONS ---------------------------------------------------
| CODE | NAME | DESCRIPTION
+------+-------------------+------------------------------------------
| 00 | n/a | n/a
| ==== | ================= | CONTROL FLOW ============================
| 01 | JUMP label | Jump to location of label
| 02 | JIF label | Jump to label if $a is 0 (false)
| 03 | JIT label | Jump to label if $a is 1 (true)
| 04 | CALL label | Push location on stack and jump to label
| 05 | RETURN | Pop loc off top of stack and jump to it
| 06 | EXIT | Exit program
| 07 | WAIT $r | Pause for milliseconds in register.
| ==== | ================= | MEMORY COMMANDS =========================
| 10 | STORE %var $r | %var = value at address $r
| 11 | SET $r 314 | Set $r to a literal number value
| 12 | FETCH $r $x | Set $r to the value at address in $x. Like a pointer.
| 13 | PUSH $x | Push $x onto the stack.
| 14 | POP $a | Pop off the stack into $a.
| 15 | STOREV %vec $r %v | Set %vec:$r to value of %v. %vec:@t and @v work too.
| 16 | PUTV %vec $r %a | Put %vec:$r into %a. %vec:@t and @v work too.
| ==== | ================= | ARITHMETIC ==============================
| 20 | EQ $x $y $a | Set $a=1 if $x == $y
| 21 | GT $x $y $a | Set $a=1 if $x > $y
| 22 | GTE $x $y $a | Set $a=1 if $x >= $y
| 23 | LT $x $y $a | Set $a=1 if $x < $y
| 24 | LTE $x $y $a | Set $a=1 if $x <= $y
| 25 | ADD $x $y $a | Set $a to $x + $y
| 26 | SUB $x $y $a | Set $a to $x - $y
| 27 | MUL $x $y $a | Set $a to $x * $y
| 28 | DIV $x $y $a | Set $a to $x / $y. $e is set to 1 if $y is 0.
| 29 | MOD $x $y $a | Set $a to $x % $y
| 2A | ABS $x | Convert $x to its absolute value.
| 2B | CEIL $x | Round $x to next whole number.
| 2C | FLOOR $x | Round $x to previous whole number.
| 2D | RANDOM $a | Put random number in $a.
| 2E | INCR $x | Add 1 to $x.
| 2F | DECR $x | Subtract 1 from $x.
| ==== | ================= | I/O COMMANDS ============================
| 30 | PRINT $x | Print content of register $x
| 31 | PRINL $x | Print content of register $x and newline.
| 32 | ACCEPT $x | Accept user input into num or text var.
| 33 | ACCEOF $x | Accept user input until EOF.
| 34 | EXEC @x | Run @x.
| 35 | EXECO @x @a | Run @x and put output in @a.
| 36 | EXECC @x $a | Run @x and put exit code in $a.
| 37 | READ @x @a | Read file at path @x into @a. Sets $e, @e
| 38 | WRITE @x @y | Write @x to file at path @y.
| 39 | APPEND @x @y | Append @x to file at path @y.
| ==== | ================= | TEXT OPERATIONS =========================
| 40 | LEN @x $a | Get length of string in @x.
| 41 | JOIN @x @y @a | Concatenate text in registers into @a.
| 42 | GETC $x @str @a | Get character in @str at $x and put into @a.
| 43 | GETCC @str $a | Get character code of @str and put into $a.
| 44 | GETIDX @x @y $a | Get index of @x in @y, put in $a.
| 45 | PUTCC $x @a | Put ASCII character with code $x into @a.
| 46 | COUNT @x @y $a | Count occurrences of @x in @y, put in $a.
| 47 | SUBSTR @x $x $y @a| Put @x[$x..$y] into @a.
| 48 | SPLIT @x @y @a | Split @x by @y and put in vector @a
| 49 | REPLCE @x @y @z @a| Replace @x from @y with @z in @a
| 4A | TRIM @x @a | Strip L/R whitespace from @x, put in @a.
| ==== | ================ | VECTOR OPERATIONS =======================
| 50 | CLEAR %v | Clears vector %v.
| 51 | COPY %x %y | Copies contents of vector %x to vector %y
| 52 | INDEXC %v %a | Store index count of vector %v in %a
| 53 | INDEXS %v @v | Store indices of vector %v in vector @v
=== ISSUES ===========================================================
1. This first iteration plays fast and loose with the "byte" in
bytecode. The .dinocode files aren't really binary and we're not
doing any bit shifting or fun stuff like that. Once LDPL supports
bitwise operations we'll revisit the core design so it's more bit-
tastic. For now, we're just using numbers.
2. Dino is super slow. Performance may never be a priority.
3. The bytecode format, version number, and set of CPU instructions
are going to change a lot while this is still in development.
4. Extensions are not, and probably won't ever be, supported.
5. Nothing is optimized at all, not even number constants. There are
way too many instructions generated in most cases.
6. There is hardly any error checking yet, so you might end up
generating bytecode that can't be run without knowing why.
7. Nested vectors don't work yet, like: `vec1:vec2:2`
8. The `IN - SOLVE` instruction doesn't work yet.
9. You can't use `-i=` to include files yet. For now, `cat` them all
together and use `dino -` to run a program from stdin.
So this:
$ ldpl -i=lib.ldpl main.ldpl
becomes:
$ cat lib.ldpl main.ldpl | dino -