.ifdef 0 This file is an attempt to port a byte-code forth system from the x86 architecture to the atmega328 chip (usually running on some kind of arduino system). The x86 system is at bumble.sf.net/books/osdev/os.asm The system uses ijmp to call the opcodes and caches the top-of-stack in r25:r24 CHALLENGES Dealing with the Harvard architecture of the avr chip is going to be tricky. However Flashforth contains some solutions to this. However, in forth, data words (defined with "var" or "constant" or something else) are kept in the link-list dictionary right next to functions or procedures. On an x86 system we allocate the actually data space right next to, or after the definition of the data word in the dictionary, but in a Harvard archicture system we cannot do this. The reason, is that a program may write to the data space thousands or millions of times during a single program execution and this would soon exhaust the flash-memory technology which is used to implement to the code space on an avr chip (and on many other microcontrollers). So the data-space for actually storing (for example) a "string" will not be contiguous with the definition of the data word. The data space will be stored in "data memory" (such as eeprom or other) and this data memory is usually extremely limited on microcontrollers (2K in size on a atmega328p "arduino"). So "var" will push onto the data-stack a pointer to this data in data-memory. The system will then "translate" that address into The an "arduino" atmega328p contains a bootloader which is not necessary for a forth system. So I need to learn how to burn the forth system into the arduino board (and override the bootloader). Also, I need to learn how to use the SPM instruction to actually write to flash (program) memory. This may involve using a ram buffer, because some flash memory can only be written one page at a time. Flash memory should not be written to overly frequently (unlike data memory), so it may be better not to recompile code that is already on the atmega/avr chip. So "create" could check for a word of the same name and abort if it already exists in the dictionary. COMPILE UPLOAD AND RUN Use some aliases, bash functions to aid development. In the "vimrc" file the commands will not have any leading ">>" * compile an assembly arduino/atmega/avr program >> map ,A :! avra %; * compile whole file and upload to the arduino 'uno' board >> map ,V :! avra %; sudo avrdude -p atmega328p -carduino -P/dev/ttyACM0 -b115200 -D -Uflash:w:%:r.hex:i * compile and upload to the arduino 'uno' board >> avra os.avr.asm; sudo avrdude -p atmega328p -carduino -P/dev/ttyACM0 -b115200 -D -Uflash:w:os.avr.hex:i * compile and upload to the arduino 'duemilenove' board >> avra os.avr.asm; sudo avrdude -p atmega328p -carduino -P/dev/ttyUSB0 -b57600 -D -Uflash:w:os.avr.hex:i * start a serial terminal to communicate with the arduino >> sudo picocom -b 9600 --echo /dev/ttyACM0 * a bash function to compile and upload to uno board ----- aos() { # cat os.avr.asm | sed -f os.sed > os.avr.pre.asm avra os.avr.asm sudo avrdude -p atmega328p -carduino -P/dev/ttyACM0 \ -b115200 -D -Uflash:w:test.hex:i } ,,, GOTCHAS Uneven numbers of bytes in .db sections get padded with an extra zero, which may cause absolute havoc in many situations...!!! "LIT, 132" doesnt do what is expected, because the 132 will be sign extended into 2 bytes (and will become a negative number). Asci characters seem to work because the "emit" opcode ignores the high byte of the stack. If there is a serial connection to the micro then avrdude will report avrdude: stk500_getsync() attempt 1 of 10: not in sync: resp=0x00 When using fcall we need to use the label<<1 idiom because rom code memory is word addressable. eg .db STAR, FCALL, low(proc<<1), high(proc<<1), KEY, EMIT HISTORY * 29 Dec 2018 The bug in fcall maybe that the "Y" double register is NOT the return stack but "fcall" thinks that it is!! Compiling with no errors today with "avra". I will remove domain prefixes because I will use an interleaved link-list dictionary to implement namespaces. When I left this in July, I had not solved the Return-stack and Pstack problem. Perhaps I can solve this by writing a ".s" stack display word in avr assembler so as to glimpse what is going on in the dark recesses of the displayless mini-computer. Once that is done, then incorporate the ideas of the interleaved link-list dictionary to allow namespaces and universal unique forth names. Also, need to think about how to incorporate the avr (arduinos) timers into the machine. Then deal with the Harvard archicture tangle for separating code and data. (Both code and data spaces can be written but with different avr opcodes). One idea is the "memory map" from flash forth. In August 2018 I did work on the arduino book bumble.sf.net/books/arduino which contains useful but incomplete timer and pwm examples. I realised that the Mazidi avr book is less useful than I hoped because the instructions need to be rewritten for atmega328p and the Mazidi coding style is sometimes not good (He uses hardcoded register numbers rather than defined mnemonics which would make the code more portable between different avr chips as well as more readable). Time to print out this code and look at it closely to sort out the proceedure calling problems. * 28 july 2018 Wrote fcall, exit, pcall. But I think that Rstack and Pstack are interfering with each other (since they occupy the same buffer). This means that the initialisation of the Rstack is not working as it should. gotcha! avra adds a zero byte to odd numbered bytes in .db sections (because atmega flash is word addressable). This makes the "code:" section abort if uneven number of bytes! would be good to write a .s(hex) in machine code to make debugging easier. Also, started to write a series of tests to make sure each opcode is working properly. * 27 july 2018 System is currently 1684 bytes. trying to debug jumpnz logic. sbiw and adiw are equivalent to sbi + sbci but occupy less space (16bits vs 32 bits). Maybe can just use the Y index for the return stack pointer. The return stack could grow up in the same buffer as the datastack, but not sure of the advantage. * 26 july 2018 Code is currently 1212 bytes (only opcodes). Need to look at the 'neg' code, dont think it is correct. Ported more opcodes up to "litw". Found the following way to sign extend 8bits to 16. ;signextend: ldi r25, 127 ; can be hoisted out of loops, and any reg is fine. cp r25, r24 ; C = (r24 < 0) sbc r25, r25 ; r25 = (r24 < 0) ? -1 : 0 ; result in r25:r24 * 25 july 2018 Ported more opcodes from x86. Seems fairly straight forward. Maybe tosh and tosl should be kept in Z which would make "@", "fetch" and "!" easier and faster. No! because the exec routine needs zh:zl to get opcodes. Also, the idea of an opcode which pushes the memory map onto the stack (eg eeprom, rom, ram). In a von-neumann machine the rom and ram will be the same but in a harvard architecture they will not. Had the idea of programming the bytecode onto the arduino using "key" and ! or c, * 24 july 2018 continuing to port opcodes from the x86 version. Grappling with a .db problem in the code: section. When I write two .db sections on separate lines, the assembler (avra) seems to put a zero at the end of the first .db line. Added some vim macros to compile and upload whole file to arduino uno. * 22 july 2018 Need to move the exec routine into a word. Then deal with jumps. Then recode all of the x86 system in avr assembler. Got the offset working but not propagating the carry to pch (program counter high byte). Struggling with jump to get the offset correct in the byte code section. Separated this into its own file to continue development there. This is a direct parallel to bumble.sf.net/books/osdev/os.asm (which is x86 code) * 20 july 2018 The opcode retrieval and indirect procedure call appear to be working. These are the essential elements for a bytecode system. using LPM and ICALL. Will convert to IJMP so that the data stack is easier to use; that is, by using ijmp instead of ICALL no return code address is placed on the avr data-stack by the calling procedure. So we do not need to "juggle" this return address when trying to get access to the data items underneath it on the stack (which we are constantly doing because forth is a stack-machine). * 19 July 2018 Started writing a bytecode system. Revisiting this to try to create a forth style bytecode system using avr assembly. I will use ideas from the x86 forth bytecode system at bumble.sf.net/books/osdev/os.asm * 8 November 2016 I combined the arduino book with the avr book, since my main focus now with the arduino will be programming in assembler. First I need to learn the avr architecture. .endif .include "m328Pdef.inc" ; or use Y for the return stack .def rph = r1 ; return stack pointer, high byte .def rpl = r0 ; return stack pointer, low byte .def pch = r23 ; (bytecode program counter high-byte) .def pcl = r22 .def tosh = r25 ; (top-of-data-stack pointer high byte) .def tosl = r24 ; also here define the datastack start location and the ; return stack location ; eg, This allows recursive procedures. .equ DSTACKH = high(ramend) .equ DSTACKL = low(ramend) .equ RSTACKH = high(ramend)-512 .equ RSTACKL = low(ramend)-512 ; Opcode constants for the bytecode machine. These are (naturally) ; the same as for the x86 machine (hopefully to allow "write once, ; run anywhere"). Some mnemonics are just aliases for other bytecodes ; such as "count" and "cfetchplus" which is "c@+". Count is also ; a standardish forth word and character-fetch-plus is analogous to ; real post increment machine fetch opcodes. ; Some opcodes may not be necessary or useful on a simple microcontroller ; but they should be "installable" depending on what hardware is ; connected and available to the micro. This is an extended idea of ; what a virtual machine is or should be (it seems that normally VMs ; do not include periferal hardward, sensor, transducers, etc in the ; semantics of the VM). .equ DUP = 1 .equ DROP = DUP+1 .equ SWAP = DROP+1 .equ OVER = SWAP+1 .equ TWODUP = OVER+1 .equ FLAGS = TWODUP+1 .equ DEPTH = FLAGS+1 .equ RDEPTH = DEPTH+1 .equ RON = RDEPTH+1 .equ ROFF = RON+1 .equ STORE = ROFF+1 .equ STOREPLUS = STORE+1 .equ FETCH = STOREPLUS+1 .equ FETCHPLUS = FETCH+1 .equ CSTORE = FETCHPLUS+1 .equ CSTOREPLUS = CSTORE+1 .equ CFETCH = CSTOREPLUS+1 .equ CFETCHPLUS = CFETCH+1 .equ COUNT = CFETCHPLUS ; count is an alias for c@+ .equ EQUALS = CFETCHPLUS+1 .equ NOTEQUALS = EQUALS+1 .equ LESSTHAN = NOTEQUALS+1 .equ ULESSTHAN = LESSTHAN+1 .equ LIT = ULESSTHAN+1 .equ LITW = LIT+1 .equ EMIT = LITW+1 .equ KEY = EMIT+1 .equ EKEY = KEY+1 .equ GETXY = EKEY+1 .equ ATXY = GETXY+1 .equ PLUS = ATXY+1 .equ MINUS = PLUS+1 .equ INCR = MINUS+1 .equ DECR = INCR+1 .equ NEGATE = DECR+1 .equ DPLUS = NEGATE+1 .equ LOGOR = DPLUS+1 ; logical or .equ LOGXOR = LOGOR+1 .equ LOGAND = LOGXOR+1 .equ DIVMOD = LOGAND+1 .equ TIMESTWO = DIVMOD+1 .equ MULT = TIMESTWO+1 .equ UMULT = MULT+1 .equ FCALL = UMULT+1 .equ PCALL = FCALL+1 .equ EXIT = PCALL+1 .equ LJUMP = EXIT+1 .equ JUMP = LJUMP+1 .equ JUMPZ = JUMP+1 .equ JUMPF = JUMPZ ; "jumpf" jump-false is an alias for jumpz (zero) .equ JUMPNZ = JUMPZ+1 .equ JUMPT = JUMPNZ ; "jump-true" is an alias for jump-not-zero .equ RLOOP = JUMPNZ+1 .equ DIVTWO = RLOOP+1 .equ READ = DIVTWO+1 ; loads sectors from disk .equ WRITE = READ+1 ; writes sectors to disk (usb etc) .equ FG = WRITE+1 ; foreground colour .equ BG = FG+1 ; background colour .equ CLS = BG+1 ; clear screen .equ VID = CLS+1 ; video mode .equ PIX = VID+1 ; one pixel on screen at xy .equ GLYPH = PIX+1 ; display glyph on screen .equ RTC = GLYPH+1 ; real time clock .equ CLOCK = RTC+1 ; number of clock ticks since midnight .equ NOOP = CLOCK+1 ; no operation & end marker .equ STAR = NOOP+1 .equ DIGIT = STAR+1 ; A list of symbolic constants for domain name prefixes .equ LANG = 1 ; language name .equ CORE = 2 ; core ops or primitives .equ MON = 3 ; monitor .equ TIME = 4 ; time and date .equ DISK = 5 ; storage .equ INOUT = 6 ; input output .dseg ; ram data segment .cseg ; start of flash "rom" code segment .org 0x0000 jmp wakeup ; a table of code pointers. The pointers have the ; same offset (*2) in the table as the value of the opcode optable: .dw 0, dupx, dropx, swapx, overx, twodupx .dw flagsx, depthx, rdepthx, ronx, roffx .dw storex, storeplusx, fetchx, fetchplusx .dw cstorex, cstoreplusx, cfetchx, cfetchplusx .dw equalsx, notequalsx .dw lessthanx, ulessthanx, litx, litwx .dw emitx, keyx, ekeyx, getxyx, atxyx .dw plusx, minusx, incrx, decrx, negx .dw dplusx, logorx, logxorx, logandx .dw divmodx, timestwox, multx, umultx .dw fcallx, pcallx, exitx .dw ljumpx, jumpx, jumpzx, jumpnzx, rloopx .dw divtwox, readx, writex .dw fgx, bgx, clsx, vidx, pixx .dw glyphx .dw rtcx, clockx ; could put just-in-time opcodes here. .dw noopx, .dw starx, digitx, -1 ; execx: jmp nextopcode dupdoc: ; db 'Duplicates the top item on the stack.' ; dw $-dup.doc duph: .db LANG ; domain name prefix for universal naming .dw 0 ; link to previous word .db "dup", 3 ; strings are 'counted' dupx: push tosh ; high byte push tosl ; low byte jmp nextopcode dropdoc: ; db 'removes the top item on the stack.' ; dw $-drop.doc droph: .db 0 ; domain name prefix for universal naming .dw dupx ; link to previous word .db "drop", 4 ; name with reverse count dropx: pop tosl pop tosh jmp nextopcode swapdoc: ; db 'swaps the top 2 items on the stack.' ; dw $-swap.doc swaph: .dw dropx ; link to previous word .db "swap", 4 swapx: movw r17:r16, tosh:tosl pop tosl pop tosh push r17 push r16 jmp nextopcode overdoc: ; db ' ( n1 n2 -- n1 n2 n1 ) ' ; db ' Puts a copy of 2nd stack item on top of stack. ' ; db ' dont use this, will probably remove. ' ; dw $-over.doc overh: .db LANG ; domain name prefix for universal naming .dw swapx ; back-link .db "over", 4 overx: pop r16 pop r17 push r17 push r16 push tosh push tosl movw tosh:tosl, r17:r16 jmp nextopcode twodupdoc: ; db ' ( n1 n2 -- n1 n2 n1 n2 ) ' ; db ' copies 2 stack items onto stack ' ; dw $-twodup.doc twoduph: .db LANG ; domain name prefix for universal naming .dw overx .db "2dup", 4 twodupx: pop r16 pop r17 push r17 push r16 push tosh push tosl push r17 push r16 jmp nextopcode flagsdoc: ; db ' ( -- n ) ' ; db ' push flag register onto the stack ' ; db ' execution flags such as carry overflow negate etc ' ; db ' are pushed onto the stack ' ; dw $-flags.doc flagsh: .db LANG ; domain name prefix for universal naming .dw twodupx .db "flags", 5 flagsx: ;push dx ; save NOS on stack ;pushf ;pop dx ; get flags into TOS (dx) jmp nextopcode depthdoc: ; db ' ( -- n ) ' ; db ' Puts on the stack the number of stack items ' ; db ' before this word was executed ' ; dw $-depth.doc depthh: .db LANG ; domain name prefix for universal naming .dw flagsx .db "depth", 5 depthx: ; untested code ldi r17, high(ramend) in r19, sph ldi r16, low(ramend) in r18, spl sub r16, r18 ; r17:r16 - r19:r18 sbc r17, r19 ; need to divide by 2 lsr r17 ; div r17:r16 by 2 ror r16 push tosh ; high byte push tosl ; low byte (character) movw tosh:tosl, r17:r16 jmp nextopcode rdepthdoc: ; db ' ( -- n ) ' ; db ' Puts on the stack the number of stack items ' ; db ' on the return stack before this word ' ; db ' was executed ' ; dw $-rdepth.doc rdepthh: .db LANG ; domain name prefix for universal naming .dw depthx .db "rdepth", 6 rdepthx: ;push dx ;mov dx, di ;shl dx, 1 ; where to implement the return stack? ; above the data stack? jmp nextopcode rondoc: ; db '( S: n -- )( R: -- n ) ' ; db ' put the top item of the data stack onto the return stack.' ; dw $-ron.doc ronh: .db LANG ; domain name prefix .dw rdepthx .db ">r", 2 ronx: st Y+, tosl st Y+, tosh ; store on return stack, increment pop tosl pop tosh jmp nextopcode roffdoc: ; db '( S: -- n)( R: n -- ) ' ; db ' put the top item of the return stack onto the data stack.' ; dw $-roff.doc roffh: .db LANG ; domain name prefix .dw ronx .db "r>", 2 roffx: push tosh ; tos becomes nos push tosl ld tosh, -Y ; get top of return stack and decrement ld tosl, -Y ; jmp nextopcode ; this needs to be "vectored" for eeprom/rom/ram memory types storedoc: ; db '( w adr -- ) ' ; db ' place 2 byte value w at address "adr" ' ; dw $-store.doc storeh: .db LANG ; domain name prefix .dw roffx .db "!", 1 storex: ; here we need to look at the address requested and parlay ; or bifurcate that into ST "store data memory" or SPM ; "store program memory" instructions depending where the address ; is in the pre-defined "memory map". This may be tricky ; because program-memory (flash memory on the atmega) may need ; to be buffered before writing. Also, the standard arduino bootloader ; prevents writing to program memory so it must be zapped before ; this will word. movw xh:xl, tosh:tosl ; movw xh:xl, tosh:tosl ; pointer to address pop r16 pop r17 ; value to store at address st X+, r16 st X+, r17 ; 2 bytes are stored pop tosl pop tosh jmp nextopcode storeplusdoc: ; db '( w adr -- adr+2 ) ' ; dw $-storeplus.doc storeplush: .db LANG ; domain name prefix .dw storex ; link to previous word .db "!+", 2 storeplusx: movw xh:xl, tosh:tosl ; pointer to address pop r16 pop r17 ; value to store at address st X+, r16 st X+, r17 ; 2 bytes are stored movw tosh:tosl, xh:xl jmp nextopcode ; needs to be vectored for different memory types fetchdoc: ; .db '( adr -- n ) ' ; .db ' Replace the top element of the stack with the ' ; .db ' value of the 16bites at the given memory address ' ; .dw $-fetch.doc fetchh: .db LANG ; domain name prefix .dw storeplusx ; link to previous word .db "@", 1 fetchx: ; here we need to look at the address requested and parlay ; or bifurcate that into LD "load data memory" or LPM ; "load program memory" instructions depending where the address ; is in the pre-defined "memory map". This should not be too hard ; but "store" (!) will be trickier because program-memory (flash ; memory on the atmega) may need to be buffered before writing. movw xh:xl, tosh:tosl ; ld tosl, X+ ld tosh, X+ jmp nextopcode ; needs to be vectored for different memory types fetchplusdoc: ; db '( adr -- adr+2 n ) ' ; db ' Replace the top element of the stack with the ' ; db ' value of the 16bites at the given memory address ' ; db ' and increment the address by 2 bytes. ' ; dw $-fetchplus.doc fetchplush: .db LANG ; domain name prefix .dw fetchx ; link to previous word .db "@+", 2 fetchplusx: movw xh:xl, tosh:tosl ; ld tosl, X+ ld tosh, X+ push xh push xl jmp nextopcode ; needs to be vectored for different memory types cstoredoc: ; db '( n adr -- ) store the byte value n at address adr.' ; dw $-cstore.doc cstoreh: .db LANG ; domain name prefix .dw fetchplusx ; link to previous word .db "c!", 2 cstorex: movw xh:xl, tosh:tosl ; pointer to address pop r16 pop r17 ; value to store at address r16 st X+, r16 ; only lower byte is stored pop tosl pop tosh jmp nextopcode ; needs to be vectored for different memory types cstoreplusdoc: ; db '( n adr -- adr+1 ) store the byte value n at address adr.' ; db ' And increment the address ' ; dw $-cstoreplus.doc cstoreplush: .db LANG ; domain name prefix .dw cstorex ; link to previous word .db "c!+", 3 cstoreplusx: movw xh:xl, tosh:tosl ; pointer to address pop r16 ; value to store at adr pop r17 ; st X+, r16 ; only lower byte stored movw tosh:tosl, xh:xl jmp nextopcode ; needs to be vectored for different memory types cfetchdoc: ; db '( adr -- n ) Replace the top element of the stack with the value ' ; db ' of the byte at the given memory address.' ; db ' eg: myvar @ . ' ; db ' displays the value at the address given by "myvar" ' ; dw $-cfetch.doc cfetchh: .db LANG ; domain name prefix .dw cstoreplusx ; link to previous word .db "c@", 2 cfetchx: movw xh:xl, tosh:tosl ; ld tosl, X ; sign extend tosl to tosh:tosl so -1 (8 bits) becomes -1 (16 bits) ldi tosh, 127 ; can be hoisted out of loops, and any reg is fine. cp tosh, tosl ; C = (tosl < 0) sbc tosh, tosh ; tosh = (tosl < 0) ? -1 : 0 jmp nextopcode ; needs to be vectored for different memory types cfetchplusdoc: ; db '( adr -- adr+1 n ) ' ; db ' Replace the top element of the stack with the value ' ; db ' of the byte at the given memory address and increment the ' ; db ' address . This is exactly the same as "count"' ; dw $-fetchplus.doc cfetchplush: .db LANG ; domain name prefix .dw cfetchx ; link to previous word .db "c@+", 3 cfetchplusx: movw xh:xl, tosh:tosl ; ld tosl, X+ ; sign extend tosl to tosh:tosl so -1 (8 bits) becomes -1 (16 bits) ldi tosh, 127 ; can be hoisted out of loops, and any reg is fine. cp tosh, tosl ; C = (tosl < 0) sbc tosh, tosh ; tosh = (tosl < 0) ? -1 : 0 push xh push xl jmp nextopcode equalsdoc: ; db ' ( n1 n2 -- flag ) ' ; db 'Puts -1 (true) on the stack if n1==n2 ' ; db 'otherwise puts zero (false) on the stack. ' ; dw $-equals.doc equalsh: .db LANG ; domain name prefix .dw cfetchplusx ; link to previous word .db "=", 1 equalsx: pop r16 pop r17 ; NOS high byte cp r16, tosl ; brne notequal cp r17, tosh ; brne notequal ldi tosh, 0 ldi tosl, 1 jmp nextopcode notequal: ldi tosh, 0 ldi tosl, 0 jmp nextopcode notequalsdoc: ; db ' ( n1 n2 -- flag ) ' ; db 'Puts 0 (false) on the stack if n1==n2 ' ; db 'otherwise puts -1 (true) on the data stack' ; dw $-notequals.doc notequalsh: .db LANG ; domain name prefix .dw equalsx ; link to previous word .db "<>", 2 notequalsx: pop r16 pop r17 ; NOS high byte cp r16, tosl ; brne doesnotequal cp r17, tosh ; brne doesnotequal ldi tosh, 0 ldi tosl, 0 jmp nextopcode doesnotequal: ldi tosh, 0 ldi tosl, 1 jmp nextopcode lessthandoc: ; db ' ( n1 n2 -- flag ) ' ; db 'Puts 0 (false) on the stack if not n1dx; quotient->ax ;push dx ; put remainder on stack ;mov dx, ax ; put quotient on top of stack jmp nextopcode timestwodoc: ; db '(n1 -- n1*2 ) ' ; db ' double n1. This basically performs a ' ; db ' left shift on the bits in n1 ' ; dw $-timestwo.doc timestwoh: .db LANG ; domain name prefix .dw divmodx .db "2*", 2 timestwox: lsl tosl rol tosh jmp nextopcode multdoc: ; db '(n1 n2 -- n1*n2 ) ' ; db ' signed multiplication ' ; dw $-mult.doc multh: .db LANG ; domain name prefix .dw timestwox .db "*", 1 multx: ;pop ax ;imul dx ; do dx:ax := ax*dx ;mov dx, ax ; result in top of stack, tos=dx jmp nextopcode umultdoc: ; db '(n1 n2 -- n1*n2 ) ' ; db ' unsigned multiplication ' ; dw $-umult.doc umulth: .db LANG ; domain name prefix .dw multx .db "u*", 2 umultx: ;pop ax ;mul dx ; do dx:ax := ax*dx ;mov dx, ax ; result in top of stack, tos=dx jmp nextopcode fcalldoc: ; .db 'Call a virtual proceedure on the bytecode stack machine' ; .db 'The current code pointer is saved - pushed onto the ' ; .db 'return stack at pointer Y or rph:rpl (r1:r0) starting at ' ; .db 'RAMEND-512 and the address ' ; .db 'of the virtual proc to execute is loaded into the Z register ' ; .dw $-fcall.doc fcallh: .db LANG ; remove these domain name prefixes! .dw umultx .db "fcall", 5 fcallx: movw zh:zl, pch:pcl ; get pc program counter lpm r16, z+ ; get low byte of call address lpm r17, z+ ; get high byte of call address ; Is this the problem !!! Return stack is NOT "Y" !!! st Y+, zl st Y+, zh ; save PC program counter on Rstack movw pch:pcl, r17:r16 ; load new program counter value jmp nextopcode pcalldoc: ; db ' ( xt -- ) ' ; db ' Call a procedure using the top element on ' ; db ' the data stack as the execution address. This ' ; db ' allows the implementation of function pointers' ; db ' In standard forths this is called "execute". ' ; dw $-pcall.doc pcallh: .db LANG ; domain name prefix .dw fcallx .db "pcall", 5 ; or call it exec/ex/execute pcallx: ; is Y the rstack pointer????? bug?? st Y+, pcl st Y+, pch ; save PC program counter on Rstack movw pch:pcl, tosh:tosl; load new program counter value pop tosl pop tosh jmp nextopcode exitdoc: ; db 'exit a virtual procedure by restoring the program counter ' ; db 'from the Rstack ' ; dw $-exit.doc exith: .db LANG ; domain name prefix .dw pcallx .db "exit", 4 exitx: ; Is Y really the return stack pointer?????? ld pch, -Y ; get call return address from top of Rstack ld pcl, -Y ; jmp nextopcode ljumpdoc: ; db 'jumps to a relative virtual instruction.' ; db ' The jump is given in the next 2 bytes' ; dw $-ljump.doc ljumph: .dw exitx ; link to prev .db "ljump", 5 ; reverse count ljumpx: movw zh:zl, pch:pcl ; get ip lpm r16, z+ ; get jump lpm r17, z+ ; get jump subi pcl, 2 sbci pch, 0 ; adjust offset to JUMP instruction movw pch:pcl, r17:r16 ; load new PC program counter jmp nextopcode jumpdoc: ; db 'jumps to a relative virtual instruction.' ; db ' The relative jump is given in the next byte.' ; db ' eg: JUMP, -2, jumps back 2 instructions in the bytecode' ; db ' eg: LIT, '*', EMIT, JUMP, -3, ' ; db ' prints a never-ending list of asterixes ' ; dw $-jump.doc jumph: .dw ljumpx ; .db "jump", 4 ; reverse count jumpx: movw zh:zl, pch:pcl ; get ip lpm r16, z ; get jump ; sign extend ldi r17, 127 ; cp r17, r16 ; C = (tosl < 0) sbc r17, r17 ; tosh = (tosl < 0) ? -1 : 0 ; add or subtract jump offset from PC program counter ; can only use sbiw with r24,26,28,30 subi pcl, 1 sbci pch, 0 ; adjust offset to JUMP instruction add pcl, r16 adc pch, r17 jmp nextopcode jumpzdoc: ; db ' ( n -- ) ; db 'jumps to a relative virtual instruction if top ' ; db 'stack element is zero. The flag value is removed ' ; db 'from the stack ; db ' The relative jump is given in the next byte.' ; db ' eg: JUMPZ, -2, jumps back 2 instructions in the bytecode' ; db ' eg: KEY, DUP, EMIT, LIT, '0', MINUS, JUMPNZ, -6 ' ; db ' allows the user to type until zero is pressed. ' ; dw $-jump.doc ; handle jumps by modifying virtual ip (in this case SI) jumpzh: .dw jumpx ; link to prev .db "jumpz", 5 ; reverse count jumpzx: movw zh:zl, pch:pcl ; get ip lpm r16, z+ ; get jump movw pch:pcl, zh:zl ; save ip (if no jump) cpi tosl, 0 ; check TOS for zero brne exitjumpz cpi tosh, 0 ; check TOS for zero brne exitjumpz ; sign extend ldi r17, 127 ; cp r17, r16 ; C = (tosl < 0) sbc r17, r17 ; tosh = (tosl < 0) ? -1 : 0 subi pcl, 2 sbci pch, 0 ; adjust offset to JUMP instruction ; add or subtract jump offset from PC program counter add pcl, r16 adc pch, r17 exitjumpz: pop tosl pop tosh jmp nextopcode jumpnzdoc: ; db ' ( n -- ) ; db 'jumps to a relative virtual instruction if top stack element ' ; db ' is not zero. Top stack item is removed. ; db ' The relative jump is given in the next byte.' ; db ' eg: JUMPNZ, -2, jumps back 2 instructions in the bytecode' ; db ' eg: KEY, DUP, EMIT, LIT, 'q', MINUS, JUMPNZ, -6 ' ; db ' allows the user to type until "q" is pressed. ' ; dw $-jumpnz.doc jumpnzh: .dw jumpzx ; link to prev .db "jumpnz", 6 ; reverse count jumpnzx: movw zh:zl, pch:pcl ; get ip lpm r16, Z+ ; get jump movw pch:pcl, zh:zl ; save ip (if no jump) ; logic may not be working! especially for zero "non-jumps" cpi tosl, 0 ; check TOS for zero brne notzerotosl cpi tosh, 0 ; check TOS for zero breq exitjumpnz notzerotosl: ; sign extend ldi r17, 127 ; cp r17, r16 ; C = (tosl < 0) sbc r17, r17 ; tosh = (tosl < 0) ? -1 : 0 subi pcl, 2 ; PC-2 to realign to JUMPNZ sbci pch, 0 ; add or subtract jump offset from PC program counter add pcl, r16 adc pch, r17 exitjumpnz: pop tosl pop tosh jmp nextopcode rloopdoc: ; db ' ( R: n -- n-1 ) ' ; db ' Decrements loop counter on return stack and jumps to ' ; db ' target if counter > 0 ' ; db ' like the x86 loop instruction this is a pre-decrement ' ; db ' so a loop counter of 2 will loop twice. The disadvantage ' ; db ' is that a loop counter of 0 will loop 2^16 times. ' ; dw $-rloop.doc rlooph: .dw jumpnzx ; link to prev .db "rloop", 5 ; reverse count rloopx: ; ; handle loops by modifying virtual ip (in this case SI) ; xor ax, ax ; set ax := 0 ; lodsb ; al := [si]++ get relative loop target into AL ; check return stack for zero, if so continue with next ; instruction (dont jump/loop) ; mov bx, [es:di-2] ; get top return stack item into bx ; dec bx ; decrement the loop counter on the return stack ; cmp bx, 0 ; if bx != 0 continue ; mov [es:di-2], bx ; update the counter ; je .exit ; the only difference with jumpz ! ; cbw ; convert signed byte al to signed word ax (neg offset) ; sub si, 2 ; realign si to JUMP instruction, ; add si, ax ; adjust the si code pointer by jump offset ;.exit: jmp nextopcode divtwodoc: ; db '(n1 - n1/2) ' ; db ' divide n1 by 2 ' ; dw $-divtwo.doc divtwoh: .db LANG ; domain name prefix .dw logandx .db "/2", 2 divtwox: lsr tosh ror tosl jmp nextopcode ; readdoc: ; db ' ( 1st-sector n addr -- flag=T/F ) ; db ' reads n sectors from disk starting at sector to memory addr ' ; db ' returns 0 on failure, 1 on success. But on a microcontroller ' ; db ' system, this is not really necessary ' ; dw $-readdoc readh: .db DISK ; domain name prefix .dw rloopx .db "read", 4 readx: ; just ignore stack parameters pop tosl pop tosh pop tosl pop tosh pop tosl pop tosh clr tosl clr tosh jmp nextopcode writedoc: ; db ' ( first-sector n addr -- flag=T/F/2 ) ; db ' write sectors to disk. possibly not necessary on a ' ; db ' microcontroller ' ; dw $-write.doc writeh: .db DISK ; domain name prefix .dw readx .db "write", 5 writex: ; just ignore stack parameters pop tosl pop tosh pop tosl pop tosh pop tosl pop tosh clr tosl clr tosh jmp nextopcode ; This opcode and vid.x are obviously not going to be ; available on all hardware. So we need to think about ; how to configure plugable opcodes. Eg: what if a device ; has a gyroscope. We want opcodes to read from that gyroscope fgdoc: ; db ' ( n -- ) ' ; db ' set foreground colour for emit' ; dw $-fg.doc fgh: .db MON ; domain name prefix .dw writex .db "fg", 2 fgx: pop tosl pop tosh jmp nextopcode ;fg.d: db 5 ; colour cyan bgdoc: ; db ' ( n -- ) ' ; db ' set the background colour for emit' ; dw $-bg.doc bgh: .db MON ; domain name prefix .dw fgx .db "bg", 2 bgx: pop tosl pop tosh jmp nextopcode ; data element to store current background colour ;bg.d: db 5 ; ; how to clear the serial screen ?? clsdoc: ; db ' ( -- ) ' ; db ' clear screen' ; dw $-cls.doc clsh: .db MON ; domain name prefix .dw bgx .db "cls", 3 clsx: jmp nextopcode viddoc: ; db ' ( n -- ) ' ; db ' set video mode to n' ; db ' on x86 try 13h mode' ; dw $-vid.doc vidh: .db MON ; domain name prefix .dw clsx .db "vid", 3 vidx: pop tosl pop tosh jmp nextopcode ; This opcode may not have much meaning on a microcontroller ; with no display. This raises the concern: we need "pluggable" ; opcodes which are installable/uninstallable depending on the ; configuration of the hardware. pixdoc: ; db ' ( x y -- ) ' ; db ' display one pixel at position xy' ; dw $-pix.doc pixh: .db MON ; domain name prefix .dw vidx .db "pix", 3 pixx: pop tosl pop tosh pop tosl pop tosh jmp nextopcode glyphdoc: ; db ' ( a -- ) ' ; db ' display a ppm (?) glyph at pixel position xy' ; dw $-glyph.doc glyphh: .db MON ; domain name prefix .dw pixx .db "glyph", 5 glyphx: jmp nextopcode ; this may be useful for timing code ; on x86 ... ; clock updates at 1193180/65536 (about 18.2) ticks per second. ; counts per second 18 ; counts per minute 1092 ; counts per hour 65543 ; counts per day 1573040 ; clock incremented approx every 55ms clockdoc: ; db ' ( -- D ) ' ; db ' number of clock ticks since midnight ' ; dw $-clock.doc clockh: .db TIME ; domain name prefix .dw glyphx .db "clock", 5 clockx: jmp nextopcode ; There are "issues" here. It is not always possible to ; set format for rtc time, so need to check format from status ; register B and convert if necessary. Also, should check for ; 2 values the same in a loop, so overcome updating problems rtcdoc: ; db ' ( -- secs mins hours days months years ) ' ; db 'return 6 values on stack representing real time and date. ' ; db 'called time&date in standard forths. ' ; dw $-rtc.doc rtch: .db TIME ; domain name prefix .dw clockx .db "rtc", 3 rtcx: jmp nextopcode noopdoc: ; db 'Does nothing. For some reason most machines ' ; db 'include this instruction. Also it is a good ' ; db 'end marker for the opcodes ' ; dw $-noop.doc nooph: .db LANG ; domain name prefix .dw rtcx .db "nop", 3 noopx: jmp nextopcode starh: .dw noopx .db "star", 4 starx: waithere: lds r16, UCSR0A ; usart control, status register sbrs r16, UDRE0 ; is UDR empty? rjmp waithere ; if not, then just wait ldi r17, '*' sts UDR0, r17 ; if empty tx next character jmp nextopcode ; a primitive method of debugging ; prints one digit from the top stack item digith: .dw starx .db ".digit", 6 digitx: repeat: lds r16, UCSR0A ; usart control, status register sbrs r16, UDRE0 ; is UDR empty? rjmp repeat ; if not, then just wait movw r17:r16, tosh:tosl ;ldi tosl, 5 adiw tosl, '0' ; convert opcode to asci digit sts UDR0, tosl ; if empty tx next character movw tosh:tosl, r17:r16 jmp nextopcode serialx: ldi r16,(1<