The Long Overdue (Pre-) Release

Pre-release
@github-actions github-actions released this 19 Feb 19:11
· 63 commits to master since this release
f68e3c2

About the Release

This pre-release is long past due. In the meantime Travis-CI.org services were terminated, and other things in the maintainer's life insisted on being more important than learning about GitHub Actions. For a long time many improvements didn't make it into a binary release.

Now the transition to GitHub Actions is complete. In addition to the old features the volatile release tag will contain the binaries from the latest successful build-test workflow.

New Features

Improved CREATE ... DOES>

Issue #427 provides a much better implementation of DOES> - and better means both faster and leaner.

The execution speed of the new solution is on par with an ordinary CREATE, VARIABLE or CONSTANT as can be shown in the following example:

: EMPTY CREATE DOES> ;      \ just return the address   
: CONS  CREATE , DOES> @ ;  \ return data

      CREATE   tcreate 
      VARIABLE tvar
      EMPTY    tempty
  0  CONS      tcons
  0  CONSTANT  tconstant

The following measurements were done with PulseView on an STM8S001J3M3 with 16MHz HSI and code compiled to Flash ROM.

Test        runtime   cycles
tcreate     1.3µs     21
tempty      1.82µs    29
tvar        1.89µs    30
tconstant   3.2µs     51
tcons       3.3µs     53

This means that the runtime of an "empty" DOES>, which returns the address of any data stored by a word definition, is 1.82µs. That's marginally faster than VARIABLE and just a bit slower than CREATE.
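The cycle counts in the table appear to be the measured runtime multiplied by the 16MHz clock, rounded to whole cycles (this is an assumption, the release notes do not state how they were derived). Worked for two rows:

```latex
\[
n_{\text{tempty}} \approx 1.82\,\mu\text{s} \times 16\,\text{MHz} = 29.12 \approx 29,
\qquad
n_{\text{tcreate}} \approx 1.3\,\mu\text{s} \times 16\,\text{MHz} = 20.8 \approx 21
\]
```

By that reading, the "empty" DOES> costs about 8 cycles (0.5µs) more than a plain CREATE.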

The simple constant value implementation : CONS CREATE , DOES> @ ; is also just a bit slower than the literal stored by CONSTANT. The latter uses the STM8 instruction TRAP and requires just 3 bytes, just like the CALL to a word defined through CONS.
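As a brief illustration of how a defining word built with CREATE ... DOES> is used, the following sketch reuses the CONS definition from the example. The name answer is made up for this illustration, and the printed values assume DECIMAL number output.

```forth
\ defining word from the example: stores a number at definition time,
\ and the DOES> part fetches it at run time
: CONS  CREATE , DOES> @ ;

\ "answer" is a hypothetical name used only for illustration
42 CONS answer

answer .        \ prints 42
answer 1+ .     \ prints 43
```

Each word defined by CONS behaves like a constant, while the run-time behaviour after DOES> is shared by all of them.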

The memory requirements compare as follows:

[bytes]                   old    new    diff
: empty CREATE DOES> ;    22     18     4
empty a                   13     7      6
STM8S001J3 binary         4697   4662   35

The old implementation needs 4 bytes more for a "defining word" and 6 bytes more for a "defined word" (the new DOES> has the same memory needs as a word defined by CREATE, CONSTANT or VARIABLE).

Also the STM8 eForth binary is 35 bytes smaller than before.

Improved >REL

>REL is an implementation of IF ... ELSE ... THEN using relative addressing modes. It's meant to be used as a compiler extension loaded into RAM as a scaffold for, e.g., compiling fast and extra compact ISRs (interrupt service routines) into Flash ROM.

@Eelkhoorn noticed that RAM space for the scaffolding code can be reduced and provided an improved implementation.

Words for Forth Standard compatibility

Issues #430 and #438 added library words for making STM8 eForth a bit more compatible with the Forth Standard. Some of the words are just "No Operation" dummy words (e.g. ALIGN), some are aliases (e.g., INVERT), some are simple definitions (e.g. >BODY) and some are genuine extensions (e.g., VALUE ... TO).

Please be aware that not all of these Forth Standard words will always do what you expect, e.g.:

  • VALUE ... TO (like DEFER ... IS) assumes a writable dictionary
  • some words like STATE emulate just some of the standard semantics

Forth Standard   STM8 eForth implementation
>BODY            : >BODY ( xt -- a-addr ) 3 + ;
ALIGN            no op
ALIGNED          no op
C"               ' $" ALIAS C"
CHAR+            ' 1+ ALIAS CHAR+ ( c-addr1 -- c-addr2 )
CHAR             : CHAR ( "char" -- c ) BL WORD CHAR+ C@ ;
CHARS            no op
[CHAR]           : [CHAR] ( "name"<spaces -- ) CHAR POSTPONE LITERAL ; IMMEDIATE
COMPILE,         ' CALL, ALIAS COMPILE, ( xt -- )
ENVIRONMENT?     : ENVIRONMENT? ( c-addr u -- false ) 2DROP 0 ;
INVERT           ' NOT ALIAS INVERT ( x1 -- x2 )
J                like I (only for DO ... LOOP, not FOR ... NEXT)
STATE            "kludge" using STATE? and a variable stateflag
TO               see VALUE
VALUE            limited to writable dictionary (RAM or NVM when writable) see lib/VALUE

Issue #430 refactored CREATE and VARIABLE in order to facilitate implementing the Forth Standard words VALUE and TO.
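To illustrate the VALUE ... TO semantics, here is a minimal sketch in Forth Standard terms. The name threshold is hypothetical, and it is assumed that the library words VALUE and TO have been loaded and that the dictionary is writable, as noted in the limitations above.

```forth
\ assumes lib/VALUE has been loaded and the dictionary is writable
10 VALUE threshold     \ define a value initialised to 10 (hypothetical name)
threshold .            \ prints 10
25 TO threshold        \ store a new number in the value
threshold .            \ prints 25
```

Unlike a VARIABLE, a VALUE returns its content directly, so no @ is needed when reading it.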

The following additional words are already available in volatile and they will be available in the next release (2.2.29):

Forth Standard   STM8 eForth implementation
CELL+            ' 2+ ALIAS CELL+ ( c-addr1 -- c-addr2 )
CELLS            ' 2* ALIAS CELLS ( n1 -- n2 )
FALSE            ' 0 ALIAS FALSE ( -- false )
RSHIFT           like LSHIFT ( n1 u -- n2 )
TRUE             ' -1 ALIAS TRUE ( -- true )

Improved "pictured number" words

While working on optional words for Forth Standard compatibility it became clear that Forth Standard compliant "pictured number output" with # ( ud -- ud ) instead of # ( u -- u ) (double instead of single math) would increase the code size only marginally, but the math would make printing numbers in a background process slower. This might break applications that print numbers in a background task because the limit of 1ms task run-time would be exceeded (unless a fast 32bit/8bit division or buffered I/O is used).

Issue #433 explored options for improving the code. It turned out that # can be made faster by using the instruction DIV X,A (with the DIV/DIVW erratum work-around). The code could also be made leaner by in-lining the code of DIGIT and EXTRACT (these are eForth words which are not available in other 16bit Forth implementations, e.g., the well known F83 - they also don't appear in the Forth Standard).
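For readers less familiar with the single-cell "pictured number" words, the following sketch shows how # is typically used in eForth. It assumes the eForth stack effects # ( u -- u ) and #> ( u -- b u ) rather than the Forth Standard double-number variants, and the word .4 is a hypothetical name made up for this example.

```forth
\ assumes eForth single-cell pictured number words:
\   <# ( -- )   # ( u -- u )   #> ( u -- b u )
\ ".4" is a hypothetical word: print u as 4 digits with leading zeros
: .4 ( u -- )  <# # # # # #> TYPE ;

DECIMAL 42 .4      \ prints 0042
```

Every # extracts one digit with a division by BASE, which is why a faster division pays off for each digit printed.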

PulseView and the word .. (which toggles a GPIO with PLo and PHi) were used for testing:

: .. ( u -- u ) PLo <# PHi #S PLo #> PHi TYPE ;

For example, here is the timing for DECIMAL 65535 ..:

[image: PulseView trace of the timing for DECIMAL 65535 ..]

The following table shows that # and #S are much faster now:

..      Base   <# #S #> old [µs]   <# #S #> improved [µs]
65535   10     155                 31
6       10     53                  22
65535   16     131                 29
65535   2      446                 60

The toggles around <# and #> revealed that about 4µs can be saved by coding the 16bit <literal> + in PAD in assembler (13µs to 9µs - the numbers in the table contain this optimization). In a BG task PAD is slightly faster as it returns a constant address. When using numeric output in a background task, e.g. for presenting measurements on a LED display with CR ., the more efficient "pictured number words" make a real difference.

Note: Forth Standard compatible "pictured number" words with double number output (e.g. D.) can be provided later through library words. In a 16bit Forth it's important to keep in mind that, due to a limitation of UM/MOD ( ud un -- ur uq ), the 16bit result, correct output for double numbers is limited to "65536 x BASE - 1" (e.g., 655359 for base 10). For larger numbers a 32bit division with 32bit result is required (with 8bit divisor).
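The limit quoted in the note follows from the requirement that the quotient of the first division by BASE fits into 16 bits. As a short worked derivation (d for the double number to print and B for BASE are symbols introduced here for illustration):

```latex
\[
\left\lfloor \frac{d}{B} \right\rfloor \le 2^{16} - 1
\;\Longleftrightarrow\;
d \le 2^{16} \cdot B - 1,
\qquad
\text{e.g. } B = 10:\; d \le 65536 \cdot 10 - 1 = 655359 .
\]
```

All later division steps operate on smaller quotients, so only the first one determines the limit.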

Bug fixes and other improvements

Improved .0 (3-digit signed number print)

Issue #432 fixes a few edge cases of .0, the signed number output for 3 digit (LED) displays: numbers smaller than -994 or larger than 9994 had digit overruns - and thus potentially wrong display values.

The updated version was shown to work for the following values:

-999 .0 DEF. ok
-995  .0 -99 ok
-99  .0 -9.9 ok
0    .0 0.0 ok
999  .0 99.9 ok
1000 .0 100 ok
7876 .0 788 ok
9995 .0 DEF. ok

Leaner console text input words

Issue #435 saved some ROM space in the input words ACCEPT, KTAP, and QUERY.

CREATE and VARIABLE refactored

Common functionality from CREATE and VARIABLE was refactored into the new word ENTRY (used by VALUE).

Set INT_TLI to COLD

@Eelkhoorn ran into a problem when changing ISR code in a development cycle:

Uploading the I2C interrupt service routine to STM8L (both 051F3 and 151K4) can lead to corrupted ITC_SPR registers, persistent even after power cycle. Writing xt of COLD to INT_TLI (reset vector) solved the issue.
The last four entries of the interrupt vector table (0x8070 to 0x8080) seem to be corrupted after boot for STM8L.

Pull request #440 appears to solve the issue. The problem needs further analysis.