# Assembled with the script 'compile.pss' 
start:
 
#
#   pars/compile.pss 
#
#   This is a parse-script which compiles parse-scripts (!).
#
#   What is more, it can compile itself... so we can do
#     >> pep -f compile.pss compile.pss > asm.new.pp
#     
#   This is useful because the resulting 'assembler' program (saved in sav.pp
#   and printed to stdout) can be used as a replacement for 'asm.pp',
#   which is the default parse-script language compiler. The advantage 
#   is that it is easier to maintain and add new syntax to compile.pss
#   than it is to 'asm.pp'.
#
#   This script uses the virtual machine and engine implemented at 
#   http://bumble.sf.net/books/pars/object/ and implements a script language
#   with a syntax reminiscent of sed and awk (much simpler than awk, but
#   more complex than sed).
#   
#   This code was created in a straightforward manner by adapting the 
#   "assembled" code in 'asm.pp'. Some extra error checks were added.
#   Also, the EOF test was placed at the end of the script to remove
#   the 'last character' bug. It was evident that using the script
#   language is much more comfortable than hand-coding parse machine
#   assembler programs.
#
#HOW TO ADD A NEW COMMAND TO PEP
#
#   In general, we would like to avoid adding new commands 
#   (instructions) to the pep script-language/machine since we would
#   like to keep the machine and language as simple as possible.
#   However, after great thought and cogitation, sometimes new commands
#   or features seem like a good idea. To add a new command the 
#   process is as follows:
#
#   Add a constant for the command in command.h and an entry in the info[]
#   array in command.c. Implement the command in machine.interp.c in the big
#   switch statement. Type "make" in the object/ folder to compile the code.
#   Test the new code in the pep interpreter with
#     >> pep -Ii "some text"
#   and type the new command.
#
#   Then modify compile.pss to recognise the command when it is 
#   in a script. 
#   Then copy asm.pp to asm.old.pp. Important: be careful to save 
#   a working copy of "asm.pp", or else the whole system stops working!
#    
#   Then do
#     >> pep -f compile.pss compile.pss > asm.new.pp
#     >> cp asm.new.pp asm.pp
#     >> vim asm.pp 
#      (and delete the 2 extra print commands - search for "remove")
#   Now test your "newcommand" with eg
#     >> pep -e "r; newcommand; t;d;" -i "abcd"
#   If pep stops working completely, you will need to copy asm.old.pp
#   back to asm.pp to get things working again.
#
#   Now, if you wish, add the new command to the various 
#   translation scripts in the tr/ folder. The modifications 
#   required in the translation scripts are very similar to the 
#   modifications made to the "compile.pss" script, except that 
#   the target language is different.
#
#REPLACING ASMPP
#
#   We can use this script as a replacement for "asm.pp" or 
#   "asm.handcode.pp" which is a script assembler written by hand in
#   the parse machine assembly format (1 command per line, labels, jumps,
#   tests, etc). 
#
#   * replace asm.pp with compile.pss
#   -----
#    # generate the new script assembler
#    cp asm.pp asm.old.pp; pep -f compile.pss compile.pss > asm.new.pp
#    cp asm.new.pp asm.pp
#    # test the new assembler (the script "r;t;t;t;d;" will be compiled
#    # by the new asm.pp which we have just created)
#    pep -e "r;t;t;t;d;" -i "abcd"
#    # output: aaabbbcccddd
#   ,,,
#
#   The advantage of all this is that it is much easier to maintain and add
#   new syntax to "pars/compile.pss" than it is to do so directly in "pars/asm.pp".
#   
#   For example asm.handcode.pp still uses "rabbit hops" to compile "quoteset"
#   tokens (an old version of the "ortestset" token), which is very inefficient
#   but compile.pss uses the new look-ahead technique. Also, there are negated
#   tests implemented in compile.pss but not implemented in asm.handcode.pp.
#   
#   I will no longer continue to maintain asm.handcode.pp because its real
#   purpose was to "bootstrap" the current script. I will maintain working
#   copies of asm.pp as generated by this script in case of future errors.
#     
#NOTES
#
#   The accumulator register was being used to generate true-jump 
#   targets for testsets, but this is no longer the case.
#   
#   This script can be used as the basis for many others which transform
#   scripts in some way. 
#   
#   For example, to 'pretty-print' scripts, or to generate compilable c code
#   for a script using the functions in machine.methods.c. So, instead of
#   compiling to the "assembler" format for the machine (which is then
#   interpreted by the code in pep.c) we can compile to a series of c function
#   calls. This is c source code which can be compiled with gcc, producing an
#   executable version of the target script.
#
#   This is an interesting idea, because we can transform a script into
#   compilable or executable code in a different language with a different
#   'Machine' object. So, for example, we could write a Machine object in Ruby
#   or Java or Python or x86 assembler and then generate compilable or
#   executable code for that target environment. The compilable code would
#   consist of a series of method calls for the given object and test and
#   jumps. 
#
#   It will also be interesting to see if there is a significant performance
#   advantage in running compiled, rather than interpreted, scripts. See
#   tr/translate.c.pss or tr/translate.go.pss for creating executable parse
#   programs from scripts.
#
#GRAMMAR NOTES
#
#  The machine cannot directly implement the ebnf structures of repetition
#  "{...}", optionality "[...]" or grouping "(...)", so we need to express all
#  grammar rules only in terms of alternation |. Quotesets are a handy way to
#  express this in a script, eg
#
#     * bnf rule: alpha ::= a | b | c ;
#     >> 'a','b','c' { clear; add "alpha*"; push; .reparse } 
#
#  It is sometimes straightforward to factor out the above ebnf structures,
#  but the result is a greater number of rules.
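#
#  As an illustration (a sketch, not a rule taken from this script), a
#  repetition can be factored into a left-recursive pair of alternatives,
#  and an optional element into two alternatives. The rule names "items",
#  "item", "decl", "name" and "value" are hypothetical:
#
#     * ebnf rule: items ::= item { item } ;
#     * factored:  items ::= item | items item ;
#
#     * ebnf rule: decl ::= name [ value ] ;
#     * factored:  decl ::= name | name value ;
#
#  Each factored alternative can then be written as its own test block,
#  which is why the number of rules grows.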
#
#SEE ALSO
#   
#   At http://bumble.sf.net/books/pars/
#   object/pep.c 
#     the current implementation of the machine interpreter and debugger. 
#   object/*.c 
#     the virtual machine and components 
#   tr/translate.java.pss
#     compiles pep scripts to a stand-alone java source code file
#   tr/translate.go.pss
#   tr/translate.ruby.pss
#   tr/translate.python.pss
#     As above, for go, ruby and python.
#   asm.handcode.pp
#     a handcoded "assembly" compiler of the parse script language for 
#     a previous version of the script language. This was how I 
#     initially "bootstrapped" the pep language (before using the
#     current file, compile.pss to create new versions of the pep 
#     language).
#    
#USAGE
#
#   This script is used to replace the hand-coded assembler file
#   "asm.handcode.pp" since it is much easier to maintain and add new syntax
#   for the parse-script language. Comments are preserved (largely) in the
#   output file. 
#   
#   We can also do the seemingly strange operation
#     >> pep -f compile.pss compile.pss
#   which actually creates an 'assembler' version of itself in 'sav.pp'
#   which is then suitable for use as an 'asm.pp' substitute.
#   (This is how we modify the syntax of the pep language, if need be). 
#   This is quite tricky to think about since it is so self-referential.
#
#   This is analogous to the equally strange operation
#     >> pep -f tr/translate.c.pss tr/translate.c.pss > eg/clang/tr.clang.c
#   which generates a compilable c language program from the translation
#   script itself.
#
#   It is possible to compile this script into a stand-alone
#   executable with:
#   ----
#     pep -f tr/translate.c.pss tr/translate.c.pss > eg/clang/tr.clang.c
#     cd eg/clang/
#     # (adjust the -L and -I paths so that they point to the object/ folder)
#     gcc -o tr.clang tr.clang.c -Lobject -lmachine -Iobject
#   ,,,,
#
#TESTING
#
#   * view how this script compiles an inline script
#   >> pep -f compile.pss -i "r; [aeiou] {a '(vowel)'; } t;d;"
#
#   The result will also be saved in "sav.pp"
#
#   * see how the compiled script runs
#   >> pep -a sav.pp -i "abcde"
#   output: a(vowel)bcde(vowel)
#
#   * test "test chaining" compilation  
#   >> pep -f compile.pss -i "r;'a','b','c'{t;}t;d;"
#   >> pep -a sav.pp -i "axbxcx"
#   output should be: aaxbbxccx
#
#   * view/debug how compile.pss compiles test chains (or something else)
#   >> pep -If compile.pss -i "r;'a','b','c'{t;}t;d;"
#
#   This provides interactive debugging of the compilation process.
#
#FIXED BUGS
#
#  I was getting segmentation faults because of off-by-one errors etc. when
#  running
#  >> pep -f compile.pss compile.pss
#  Mainly fixed with "valgrind", but there is still a bug in "until" (in
#  object/machine.interp.c execute()); need to implement an endsWith()
#  function. And one other bug.
#  * didn't need 2 jumps after "tests", just 1 jumpfalse or jumptrue;
#    used "replace" to remove the unnecessary jump
#
#  * eg: add "\\"; threw an error and so did: replace "\\" "\\\\";
#    This was a problem with the "until" implementation in machine.interp.c
#    It was actually necessary to count the number of escape chars 
#    before the suffix. If even, break; if odd, don't.
#
#BUGS
#   
#  * missing or unbalanced braces in scripts don't produce good error
#    messages, just a cryptic "script could not be compiled".
#
#  * should I unescape single quotes in single quote blocks??
#    eg ' abc\'xyz' will become " abc'xyz"
#
#  * doesn't catch B[abc] or E[a-z] type errors in scripts. Also 
#    doesn't catch "r;r;d" type errors.
#
#  compile.pss should not write the compiled script to stdout
#  because then asm.pp will do the same thing. easy enough to fix
#  in asm.pp as well (comment out final 2 "print" commands).
#
#  Comments may not be parsing correctly.
#
#  Comments and multiline comments should not jump back to read
#  after deleting the comment, because there could be no more 
#  input, and read will throw an error. They should jump to 
#  the EOF end-of-file check. Or they could just call ".reparse"
#  which is safe but not very efficient.
#
#TODO
#
#  Add an "echar" command that changes the default escape
#  character. Also, in some languages a character actually
#  escapes itself, eg '' is ' escaped!
#
#  We could allow single argument "replace" command eg:
#    >> replace "x";
#  which is equivalent to
#    >> replace "x" "";
#  
#  Need to catch multiline quote errors when used with the 
#  "until" command.
#
#  Separate error checking into a new script, and make pep load
#  an assembled version of this error checker. This will allow
#  the same error checker to be used with the scripts
#    tr/translate.java.pss tr/translate.tcl.pss etc.
#
#HISTORY
#    
#  15 feb 2025
#    Added an error check for ".." and ",," but haven't compiled it yet
#  31 aug 2022
#    implemented until; (untiltape) which reads until the 
#    workspace ends with the current tape-cell. untiltape was 
#    prompted by the gnu sed syntax "s#a#b#", that is, any
#    character may substitute for "/" as the delimiter in s/a/b/
#  19 aug 2022
#    Need to add "esc" to change default escape char.
#    "write <filename>" to write to a given file
#    "writeappend <filename>" to append to a given file
#    "quit <exitcode>" to exit with an error code.
#    Try to change the grammar to allow expression syntax
#    as in tr/translate.perl.pss  Need to make better 
#    error checking.
#
#  15 june 2021
#    Adding the commands "upper" "lower" and "cap"
#    "nochars" "nolines"
#
#  13 march 2020
#    Added compilation for multiline arguments for the "add" 
#    command. Appears to be working.
#
#  15 sept 2019
#    Realised that I can have an eof error check block at 
#    the end of the script just before all the tokens are 
#    pushed back on the stack. See the 1 and 2 token eof error
#    check in this script.
#
#  13 september 2019
#    Adding "mark" and "go" commands here.
#    Improved unterminated quote '/" error messages. In general
#    it is much more helpful to catch the error when it happens 
#    and print an informative message (with line-number etc).
#
#  5 september 2019
#
#    Added a "stack" and "unstack" command to the machine and
#    to compile.pss
#
#  29 august 2019
#   
#    Improved some error checks. Could make the error check code
#    more succinct.
#
#    Changed the way testeof and testtape are parsed to include
#    them with other tests. This also allows them to be negated with
#    !(==) and !(eof) and also to be concatenated with other tests
#    eg: (eof),B"abc" {}
#    added extra syntax <eof> <EOF> and <==> for these tests.
#
#  25 august 2019
#
#    Realised that I dont need 2 jumps for OR test concatenation (with ',')
#    That will greatly improve script interpretation efficiency.
#
#    Added AND concatenation logic to tests so now we can do
#
#     * test if workspace begins with 'a' AND ends with 'z'
#     >> B"a".E"z" {}
#
#    Changed the way .reparse and .restart are parsed and compiled.
#    These are now parsed as 2 tokens ".*word*". This allows me to
#    use '.' for AND logic concatenation in tests. It also allows
#    me to provide special semantic meaning to commands beginning with
#    a dot, which seems like a good thing.
#
#    Added "delim" command here and in machine.c and machine.interp.c, 
#    to change the stack delimiter.
#
#  24 august 2019
#
#    The "state*" token should be separated into "testeof*" and 
#    "testtape*" and then the 2 tests can be elided.(done)
#
#    The conversion to a "test*{*" rule and elision of 
#    multiple tests will make this script much more compact and hopefully
#    just as readable. Also, as a side effect, negation of all
#    tests will be available soon. Also, it is possible to chain together
#    different types of tests.
#
#    Converted quoteset to "ortestset*" and "andtestset*". 
#    I will introduce a new notation namely:
#
#    * check if workspace begins with "abc" AND ends with "xyz"
#    >> B"abc" . E"xyz" { commands }
#
#    so the dot will become an "AND" (&&) concatenator of tests
#    and "," will remain as the "OR" (||) concatenator of tests
#    In these || and && test lists any type of test can be 
#    included for example
#     
#     * check if workspace starts with "a", only contains chars a|b|c
#     * and ends with the letter "z" (using "." AND concatenator)
#     >> B"abc" . [abc] . E"z" { ... } 
#
#    Experimenting with the new technique to create negated tokens
#    classes.
#
#    * test negated tokens for the equality test
#    >> pep -f compile.pss -i 'r;!"b",!"a"{nop;}'
#
#  23 august 2019
#    
#    Adding begintests B"..." { } and endtests E"..." {} to the quoteset logic.
#    But need to juggle the combinations. Also will add classes and negated
#    classes. More or less working. But should actually change parsing to
#    make quotesets more flexible, see the section of the script for details.
#
#    The new quoteset compilation seems to be working.
#    Needs more testing. We can now use compile.pss as a replacement
#    for asm.pp.
#
#    Converting to a new quoteset (eg: 'n','m' {...} ) lookahead compiling
#    technique.  Also we can compile comments with rules for
#    "comment*command*" and "command*comment*" and "comment*comment*" ->
#    "comment*". Instead of the current shenanigans.
#
#  14 august 2019
#
#    trying to preserve comments here but can't reduce comments
#    with tokens like {* }* !* etc because we never retrieve
#    the attributes for those tokens. more thought required.
#
#    Added a !"text" {...} syntax. very simple to add here. 
#    did the same in compilable.c.pss
#
#    Added a "begin" block to this (for start configurations of scripts).
#    Also need to improve the compilation of "quoteset*" tokens which produce
#    nifty but very poor code. need 'tapereplace' command for this?
#    
#  30 july 2019
#    Fixed the last character bug by putting the EOF test at the very end of
#    the file. The translation is complete and the script appears to be
#    working but no doubt will contain bugs.  Initially translated from
#    asm.pp.
#
#
read
#--------------
# make character number relative to line number for
# compile error messages etc.
# this is causing pep to hang/ infinite loop? not sure why
#"\n" { clear; nochars; (eof) { .reparse } .restart }
# [:space:] { clear; (eof) { .reparse } .restart }
testclass [:space:]
jumpfalse block.end.15951
  clear
  jump parse
block.end.15951:
#---------------
# We can elide all these single character tests, because
# the stack token is just the character itself with a *
# Braces {} are used for blocks, ',' and '.' for concatenating
# tests with OR or AND logic. 'B' and 'E' for begin and end
# tests. 
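#
# A note on the assembled form below. The script source shown here is an
# assumption, reconstructed from the assembled code and not copied from
# compile.pss. A test list such as
#   >> "{","}",";",",",".","!","B","E" { put; add "*"; push; .reparse }
# appears to compile to one "testis" per alternative, each followed by a
# relative "jumptrue N" that skips forward N instructions to the first
# command of the block ("put"). N shrinks by 2 for each alternative because
# every remaining alternative occupies 2 instructions (a test and a jump).
# If no alternative matches, the final "jump" skips the whole block.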
testis "{"
jumptrue 16
testis "}"
jumptrue 14
testis ";"
jumptrue 12
testis ","
jumptrue 10
testis "."
jumptrue 8
testis "!"
jumptrue 6
testis "B"
jumptrue 4
testis "E"
jumptrue 2 
jump block.end.16313
  put
  add "*"
  push
  jump parse
block.end.16313:
#---------------
# format: "text"
testis "\""
jumpfalse block.end.16666
  # save the line number in case there is no terminating
  # quote.
  clear
  ll
  put
  clear
  add "\""
  until "\""
  testends "\""
  jumptrue block.end.16608
    clear
    add "Unterminated quote (\") starting at line "
    get
    add " !\n"
    print
    quit
  block.end.16608:
  put
  clear
  add "quote*"
  push
  jump parse
block.end.16666:
#---------------
# format: 'text', single quotes are converted to double quotes
# but we must escape embedded double quotes.
testis "'"
jumpfalse block.end.17212
  # save the line number in case there is no terminating
  # quote.
  clear
  ll
  put
  clear
  until "'"
  testends "'"
  jumptrue block.end.17041
    clear
    add "Unterminated quote (') starting at line "
    get
    add "!\n"
    print
    quit
  block.end.17041:
  # should we unescape single quotes here??
  clip
  escape "\""
  put
  clear
  add "\""
  get
  add "\""
  put
  clear
  add "quote*"
  push
  jump parse
block.end.17212:
#---------------
# formats: [:space:] [a-z] [abcd] [:alpha:] etc 
testis "["
jumpfalse block.end.17360
  until "]"
  put
  clear
  add "class*"
  push
  jump parse
block.end.17360:
#---------------
# formats: (eof) (==) etc. I may change this syntax to just
# EOF and ==
testis "("
jumpfalse block.end.17869
  clear
  until ")"
  clip
  put
  testis "eof"
  jumptrue 4
  testis "EOF"
  jumptrue 2 
  jump block.end.17554
    clear
    add "eof*"
    push
    jump parse
  block.end.17554:
  testis "=="
  jumpfalse block.end.17607
    clear
    add "tapetest*"
    push
    jump parse
  block.end.17607:
  add " << unknown test near line "
  ll
  add " of script.\n"
  add " bracket () tests are \n"
  add "   (eof) test if end of stream reached. \n"
  add "   (==)  test if workspace is same as current tape cell \n"
  print
  clear
  quit
block.end.17869:
#---------------
# multiline and single line comments, eg #... and #* ... *#
testis "#"
jumpfalse block.end.18918
  clear
  read
  testis "\n"
  jumpfalse block.end.18005
    clear
    jump parse
  block.end.18005:
  # checking for multiline comments of the form "#* \n\n\n *#"
  # these are just ignored at the moment (deleted) 
  testis "*"
  jumpfalse block.end.18764
    # save the line number for possible error message later
    clear
    ll
    put
    clear
    until "*#"
    testends "*#"
    jumpfalse block.end.18509
      # convert to # single-line comments
      clip
      clip
      #put; clear; add "#*"; get; add "*#";
      replace "\n" "\n#"
      # create a "comment" parse token
      put
      clear
      add "comment*"
      push
      jump parse
    block.end.18509:
    # make an unterminated multiline comment an error
    # to ease debugging of scripts.
    clear
    add "unterminated multiline comment #* ... *# \n"
    add "starting at line number "
    get
    add "\n"
    print
    clear
    quit
  block.end.18764:
  # single line comments. some will get lost.
  put
  clear
  add "#"
  get
  until "\n"
  clip
  put
  clear
  add "comment*"
  push
  jump parse
block.end.18918:
#----------------------------------
# parse command words (and abbreviations)
# legal characters for keywords (commands)
testclass [abcdefghijklmnopqrstuvwxyzBEFKGOPRUWS+-<>0^]
jumptrue block.end.19301
  # error message about a misplaced character
  put
  clear
  add "!! Misplaced character '"
  get
  add "' in script near line "
  ll
  add " (character "
  cc
  add ") \n"
  print
  clear
  bail
block.end.19301:
# my testclass implementation cannot handle complex lists
# eg [a-z+-] this is why I have to write out the whole alphabet
while [abcdefghijklmnopqrstuvwxyzBEOFKGPRUWS+-<>0^]
#----------------------------------
# KEYWORDS 
# here we can test for all the keywords (command words) and their
# abbreviated one letter versions (eg: clip k, clop K etc). Then
# we can print an error message and abort if the word is not a 
# legal keyword for the parse-edit language
# make ll an alias for "lines" and cc an alias for chars
testis "lines"
jumpfalse block.end.19885
  clear
  add "ll"
block.end.19885:
testis "chars"
jumpfalse block.end.19917
  clear
  add "cc"
block.end.19917:
# one letter command abbreviations
testis "a"
jumpfalse block.end.19984
  clear
  add "add"
block.end.19984:
testis "k"
jumpfalse block.end.20014
  clear
  add "clip"
block.end.20014:
testis "K"
jumpfalse block.end.20044
  clear
  add "clop"
block.end.20044:
testis "D"
jumpfalse block.end.20077
  clear
  add "replace"
block.end.20077:
testis "d"
jumpfalse block.end.20108
  clear
  add "clear"
block.end.20108:
testis "u"
jumpfalse block.end.20139
  clear
  add "upper"
block.end.20139:
testis "U"
jumpfalse block.end.20170
  clear
  add "lower"
block.end.20170:
testis "A"
jumpfalse block.end.20199
  clear
  add "cap"
block.end.20199:
testis "t"
jumpfalse block.end.20230
  clear
  add "print"
block.end.20230:
testis "p"
jumpfalse block.end.20259
  clear
  add "pop"
block.end.20259:
testis "P"
jumpfalse block.end.20289
  clear
  add "push"
block.end.20289:
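# NOTE: this test and the "U" test below can never succeed, because "u" and
# "U" were already rewritten to "upper" and "lower" above.
testis "u"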
jumpfalse block.end.20322
  clear
  add "unstack"
block.end.20322:
testis "U"
jumpfalse block.end.20353
  clear
  add "stack"
block.end.20353:
testis "G"
jumpfalse block.end.20382
  clear
  add "put"
block.end.20382:
testis "g"
jumpfalse block.end.20411
  clear
  add "get"
block.end.20411:
testis "x"
jumpfalse block.end.20441
  clear
  add "swap"
block.end.20441:
testis ">"
jumpfalse block.end.20469
  clear
  add "++"
block.end.20469:
testis "<"
jumpfalse block.end.20497
  clear
  add "--"
block.end.20497:
testis "m"
jumpfalse block.end.20527
  clear
  add "mark"
block.end.20527:
testis "M"
jumpfalse block.end.20555
  clear
  add "go"
block.end.20555:
testis "n"
jumpfalse block.end.20586
  clear
  add "count"
block.end.20586:
testis "+"
jumpfalse block.end.20614
  clear
  add "a+"
block.end.20614:
testis "-"
jumpfalse block.end.20642
  clear
  add "a-"
block.end.20642:
testis "0"
jumpfalse block.end.20672
  clear
  add "zero"
block.end.20672:
testis "c"
jumpfalse block.end.20700
  clear
  add "cc"
block.end.20700:
testis "l"
jumpfalse block.end.20728
  clear
  add "ll"
block.end.20728:
testis "C"
jumpfalse block.end.20761
  clear
  add "nochars"
block.end.20761:
testis "L"
jumpfalse block.end.20794
  clear
  add "nolines"
block.end.20794:
testis "^"
jumpfalse block.end.20826
  clear
  add "escape"
block.end.20826:
testis "v"
jumpfalse block.end.20860
  clear
  add "unescape"
block.end.20860:
testis "z"
jumpfalse block.end.20891
  clear
  add "delim"
block.end.20891:
testis "S"
jumpfalse block.end.20922
  clear
  add "state"
block.end.20922:
testis "f"
jumpfalse block.end.20953
  clear
  add "write"
block.end.20953:
testis "F"
jumpfalse block.end.20985
  clear
  add "append"
block.end.20985:
testis "r"
jumpfalse block.end.21015
  clear
  add "read"
block.end.21015:
testis "R"
jumpfalse block.end.21046
  clear
  add "until"
block.end.21046:
testis "T"
jumpfalse block.end.21081
  clear
  add "untiltape"
block.end.21081:
testis "w"
jumpfalse block.end.21113
  clear
  add "while"
block.end.21113:
testis "W"
jumpfalse block.end.21147
  clear
  add "whilenot"
block.end.21147:
testis "o"
jumpfalse block.end.21176
  clear
  add "nop"
block.end.21176:
# we can probably omit tests and jumps since they are not
# designed to be used in scripts (only assembled parse programs).

#   "b" { clear; add "jump"; }
#   "j" { clear; add "jumptrue"; }
#   "J" { clear; add "jumpfalse"; }
#   "=" { clear; add "testis"; }
#   "?" { clear; add "testclass"; }
#   "b" { clear; add "testbegins"; }
#   "B" { clear; add "testends"; }
#   "E" { clear; add "testeof"; }
#   "*" { clear; add "testtape"; }
#   
testis "n"
jumpfalse block.end.21656
  clear
  add "count"
block.end.21656:
testis "+"
jumpfalse block.end.21684
  clear
  add "a+"
block.end.21684:
testis "-"
jumpfalse block.end.21712
  clear
  add "a-"
block.end.21712:
testis "0"
jumpfalse block.end.21742
  clear
  add "zero"
block.end.21742:
testis "c"
jumpfalse block.end.21774
  clear
  add "cc"
block.end.21774:
testis "chars"
jumpfalse block.end.21806
  clear
  add "cc"
block.end.21806:
testis "l"
jumpfalse block.end.21838
  clear
  add "ll"
block.end.21838:
testis "lines"
jumpfalse block.end.21870
  clear
  add "ll"
block.end.21870:
testis "^"
jumpfalse block.end.21902
  clear
  add "escape"
block.end.21902:
testis "v"
jumpfalse block.end.21936
  clear
  add "unescape"
block.end.21936:
testis "z"
jumpfalse block.end.21967
  clear
  add "delim"
block.end.21967:
testis "S"
jumpfalse block.end.21998
  clear
  add "state"
block.end.21998:
testis "q"
jumpfalse block.end.22028
  clear
  add "quit"
block.end.22028:
testis "Q"
jumpfalse block.end.22058
  clear
  add "bail"
block.end.22058:
testis "s"
jumpfalse block.end.22089
  clear
  add "write"
block.end.22089:
testis "o"
jumpfalse block.end.22118
  clear
  add "nop"
block.end.22118:
testis "rs"
jumpfalse block.end.22152
  clear
  add "restart"
block.end.22152:
testis "rp"
jumpfalse block.end.22186
  clear
  add "reparse"
block.end.22186:
# some extra syntax for testeof and testtape
testis "<eof>"
jumptrue 4
testis "<EOF>"
jumptrue 2 
jump block.end.22297
  put
  clear
  add "eof*"
  push
  jump parse
block.end.22297:
testis "<==>"
jumpfalse block.end.22355
  put
  clear
  add "tapetest*"
  push
  jump parse
block.end.22355:
testis "add"
jumptrue 104
testis "clip"
jumptrue 102
testis "clop"
jumptrue 100
testis "replace"
jumptrue 98
testis "clear"
jumptrue 96
testis "upper"
jumptrue 94
testis "lower"
jumptrue 92
testis "cap"
jumptrue 90
testis "print"
jumptrue 88
testis "pop"
jumptrue 86
testis "push"
jumptrue 84
testis "unstack"
jumptrue 82
testis "stack"
jumptrue 80
testis "put"
jumptrue 78
testis "get"
jumptrue 76
testis "swap"
jumptrue 74
testis "++"
jumptrue 72
testis "--"
jumptrue 70
testis "mark"
jumptrue 68
testis "go"
jumptrue 66
testis "read"
jumptrue 64
testis "until"
jumptrue 62
testis "while"
jumptrue 60
testis "whilenot"
jumptrue 58
testis "jump"
jumptrue 56
testis "jumptrue"
jumptrue 54
testis "jumpfalse"
jumptrue 52
testis "testis"
jumptrue 50
testis "testclass"
jumptrue 48
testis "testbegins"
jumptrue 46
testis "testends"
jumptrue 44
testis "testeof"
jumptrue 42
testis "testtape"
jumptrue 40
testis "count"
jumptrue 38
testis "a+"
jumptrue 36
testis "a-"
jumptrue 34
testis "zero"
jumptrue 32
testis "cc"
jumptrue 30
testis "ll"
jumptrue 28
testis "nochars"
jumptrue 26
testis "nolines"
jumptrue 24
testis "escape"
jumptrue 22
testis "unescape"
jumptrue 20
testis "delim"
jumptrue 18
testis "state"
jumptrue 16
testis "quit"
jumptrue 14
testis "bail"
jumptrue 12
testis "write"
jumptrue 10
testis "append"
jumptrue 8
testis "nop"
jumptrue 6
testis "reparse"
jumptrue 4
testis "restart"
jumptrue 2 
jump block.end.22880
  put
  clear
  add "word*"
  push
  jump parse
block.end.22880:
#------------ 
# the .reparse command and "parse label" is a simple way to 
# make sure that all shift-reductions occur. It should be used inside
# a block test, so as not to create an infinite loop.
testis "parse>"
jumpfalse block.end.23196
  clear
  add "parse:"
  put
  clear
  add "command*"
  push
  jump parse
block.end.23196:
# --------------------
# try to implement begin-blocks, which are only executed
# once, at the beginning of the script (similar to awk's BEGIN {} rules)
testis "begin"
jumpfalse block.end.23412
  put
  add "*"
  push
  jump parse
block.end.23412:
put
clear
add "Pep error: unknown command '"
get
add "' \n"
add "on line "
ll
add " (or character "
cc
add ")"
add " of input (file or stream). \n"
print
clear
quit
# ----------------------------------
# PARSING PHASE:
# the lexing phase finishes here, and below is the 
# parse/compile phase of the script. Here we pop tokens 
# off the stack and check for sequences of tokens eg word*semicolon*
# If we find a valid series of tokens, we "shift-reduce" or "resolve"
# the token series eg word*semicolon* --> command*
# At the same time, we manipulate (transform) the attributes on the 
# tape, as required. So Tape=|pop|;| becomes |\npop| where the 
# bars | indicate tape cells. (2 tape cells are merged into 1).
# Each time the stack is reduced, the tape must also be reduced.
# 
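# A minimal sketch of one such reduction, written in script form. This is
# an illustration under assumptions, not the exact rule used by this script:
# it assumes that "pop" moves the tape pointer back to the popped token's
# cell, so that after 2 pops "get" fetches the attribute of the first token.
#
#   * reduce "word*;*" to "command*" and merge the 2 tape cells (sketch)
#   -----
#    pop; pop;
#    "word*;*" {
#      clear; add "\n"; get; put;     # tape: |pop|;| becomes |\npop|
#      clear; add "command*"; push;   # stack: word*;* becomes command*
#      .reparse
#    }
#   ,,,
#
# If the test fails, the tokens stay in the workspace and have to be pushed
# back onto the stack (with "push; push;") before more input is read.
# 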
parse:
#-------------------------------------
# 2 tokens
#-------------------------------------
pop
pop
# All of the below are currently errors, but may not
# be in the future if we expand the syntax of the parse
# language. Also consider:
#    begintext* endtext* quoteset* notclass*, !* ,* ;* B* E*
# It is nice to trap the errors here because we can emit some
# hopefully not-very-cryptic error messages with a line number.
# Otherwise the script writer has to debug with
#   pep -a asm.pp scriptfile -I
testis "word*word*"
jumptrue 50
testis "word*}*"
jumptrue 48
testis "word*begintext*"
jumptrue 46
testis "word*endtext*"
jumptrue 44
testis "word*!*"
jumptrue 42
testis "word*,*"
jumptrue 40
testis "quote*word*"
jumptrue 38
testis "quote*class*"
jumptrue 36
testis "quote*state*"
jumptrue 34
testis "quote*}*"
jumptrue 32
testis "quote*begintext*"
jumptrue 30
testis "quote*endtext*"
jumptrue 28
testis "class*word*"
jumptrue 26
testis "class*quote*"
jumptrue 24
testis "class*class*"
jumptrue 22
testis "class*state*"
jumptrue 20
testis "class*}*"
jumptrue 18
testis "class*begintext*"
jumptrue 16
testis "class*endtext*"
jumptrue 14
testis "class*!*"
jumptrue 12
testis "notclass*word*"
jumptrue 10
testis "notclass*quote*"
jumptrue 8
testis "notclass*class*"
jumptrue 6
testis "notclass*state*"
jumptrue 4
testis "notclass*}*"
jumptrue 2 
jump block.end.25355
  push
  push
  add "error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script (missing semicolon/brace/unescaped quote??) \n"
  print
  clear
  quit
block.end.25355:
testis "{*;*"
jumptrue 6
testis ";*;*"
jumptrue 4
testis "}*;*"
jumptrue 2 
jump block.end.25544
  push
  push
  add "error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script: misplaced semi-colon? ; \n"
  print
  clear
  quit
block.end.25544:
# comma errors.
testis ",*;*"
jumptrue 6
testis ",*{*"
jumptrue 4
testis ",*}*"
jumptrue 2 
jump block.end.25744
  push
  push
  add "error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script: misplaced comma? ; \n"
  print
  clear
  quit
block.end.25744:
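# NOTE: this block is unreachable, since ",*{*" is already trapped by the
# comma-error block above.
testis ",*{*"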
jumpfalse block.end.25917
  push
  push
  add "Pep: error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script: extra comma in list? \n"
  print
  clear
  quit
block.end.25917:
testis "command*;*"
jumptrue 4
testis "commandset*;*"
jumptrue 2 
jump block.end.26109
  push
  push
  add "Pep: error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script: extra semi-colon? \n"
  print
  clear
  quit
block.end.26109:
testis "!*!*"
jumpfalse block.end.26375
  push
  push
  add "Pep: error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script: \n double negation '!!' is not implemented \n"
  add " and probably won't be, because what would be the point? \n"
  print
  clear
  quit
block.end.26375:
# untested block 
testis ".*.*"
jumpfalse block.end.26595
  push
  push
  add "Pep/nom: error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script: \n repeated dot '.' (AND concatenation operator)\n"
  print
  clear
  quit
block.end.26595:
# untested block 
testis ",*,*"
jumpfalse block.end.26816
  push
  push
  add "Pep/nom: error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script: \n repeated comma ',' (OR concatenation operator)\n"
  print
  clear
  quit
block.end.26816:
testis "!*{*"
jumptrue 4
testis "!*;*"
jumptrue 2 
jump block.end.27130
  push
  push
  add "Pep: error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script: misplaced negation operator (!)? \n"
  add " The negation operator precedes tests, for example: \n"
  add "   !B'abc'{ ... } or !(eof),!'abc'{ ... } \n"
  print
  clear
  quit
block.end.27130:
testis ",*command*"
jumpfalse block.end.27300
  push
  push
  add "error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script: misplaced comma? \n"
  print
  clear
  quit
block.end.27300:
testis "!*command*"
jumpfalse block.end.27499
  push
  push
  add "error near line "
  ll
  add " (at char "
  cc
  add ") \n"
  add " The negation operator (!) cannot precede a command \n"
  print
  clear
  quit
block.end.27499:
testis ";*{*"
jumptrue 6
testis "command*{*"
jumptrue 4
testis "commandset*{*"
jumptrue 2 
jump block.end.27702
  push
  push
  add "error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script: no test for brace block? \n"
  print
  clear
  quit
block.end.27702:
testis "{*}*"
jumpfalse block.end.27833
  push
  push
  add "error near line "
  ll
  add " of script: empty braces {}. \n"
  print
  clear
  quit
block.end.27833:
testis "B*class*"
jumptrue 4
testis "E*class*"
jumptrue 2 
jump block.end.28061
  push
  push
  add "error near line "
  ll
  add " of script:\n  classes ([a-z], [:space:] etc) \n"
  add "  cannot use the 'begin' or 'end' modifiers (B/E) \n"
  print
  clear
  quit
block.end.28061:
testis "}*command*"
jumpfalse block.end.28208
  push
  push
  add "error near line "
  ll
  add " of script: extra closing brace '}'? \n"
  print
  clear
  quit
block.end.28208:
testis "comment*{*"
jumpfalse block.end.28397
  push
  push
  add "error near line "
  ll
  add " of script: comments cannot occur between \n"
  add " a test and a brace ({). \n"
  print
  clear
  quit
block.end.28397:
#------------ 
# the .restart command just jumps to the start: label 
# (which is usually followed by a "read" command)
# but '.' is also the AND concatenator, which seems ambiguous,
# but the parsing works.
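# A sketch of the resulting assembly (illustrative, not taken from a
# real run): in a script, ".restart" is emitted as
#   jump start
# and ".reparse" is emitted as
#   jump parse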
testis ".*word*"
jumpfalse block.end.29091
  clear
  ++
  get
  --
  testis "restart"
  jumpfalse block.end.28769
    clear
    add "jump start"
    put
    clear
    add "command*"
    push
    jump parse
  block.end.28769:
  testis "reparse"
  jumpfalse block.end.28884
    clear
    add "jump parse"
    put
    clear
    add "command*"
    push
    jump parse
  block.end.28884:
  push
  push
  add "error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script:  \n"
  add " misplaced dot '.' (use for AND logic or in .reparse/.restart) \n"
  print
  clear
  quit
block.end.29091:
#-----------------------------------------
# compiling comments so as to transfer them to the compiled 
# file. 
# implement these rules to conserve comments
testis "comment*command*"
jumptrue 6
testis "command*comment*"
jumptrue 4
testis "commandset*comment*"
jumptrue 2 
jump block.end.29413
  clear
  get
  add "\n"
  ++
  get
  --
  put
  clear
  add "command*"
  push
  jump parse
block.end.29413:
testis "comment*comment*"
jumpfalse block.end.29527
  clear
  get
  add "\n"
  ++
  get
  --
  put
  clear
  add "comment*"
  push
  jump parse
block.end.29527:
# -----------------------
# negated tokens.
# This is a new more elegant way to negate a whole set of 
# tests (tokens) where the negation logic is stored on the 
# stack, not in the current tape cell. We just add "not" to 
# the stack token.
# eg: ![:alpha:] ![a-z] ![abcd] !"abc" !B"abc" !E"xyz"
#  This format is used to indicate a negative test for 
#  a brace block. eg: ![aeiou] { add "< not a vowel"; print; clear; }
testis "!*quote*"
jumptrue 12
testis "!*class*"
jumptrue 10
testis "!*begintext*"
jumptrue 8
testis "!*endtext*"
jumptrue 6
testis "!*eof*"
jumptrue 4
testis "!*tapetest*"
jumptrue 2 
jump block.end.30270
  # a simplification: just replace the token name with its
  # negative.
  replace "!*" "not"
  push
  # now get the token-value
  # added an extra ++ here.
  get
  --
  put
  ++
  clear
  jump parse
block.end.30270:
#-----------------------------------------
# format: E"text" or E'text'
#  This format is used to indicate a "workspace-ends-with" text before
#  a brace block.
testis "E*quote*"
jumpfalse block.end.30542
  clear
  add "endtext*"
  push
  get
  --
  put
  ++
  clear
  jump parse
block.end.30542:
#-----------------------------------------
# format: B"sometext" or B'sometext' 
#   A 'B' preceding some quoted text is used to indicate a 
#   'workspace-begins-with' test, before a brace block.
testis "B*quote*"
jumpfalse block.end.30853
  clear
  add "begintext*"
  push
  get
  --
  put
  ++
  clear
  jump parse
block.end.30853:
#--------------------------------------------
# ebnf: command := word, ';' ;
# formats: "pop; push; clear; print; " etc
# all commands need to end with a semi-colon except for 
# .reparse and .restart
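# As a sketch of the no-argument forms handled below (illustrative,
# derived from the rules in this block rather than from a real run):
#   until;   is assembled as   untiltape
#   write;   is assembled as   writefile "sav.pp"
#   append;  is assembled as   writefileappend "sav.pp"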
testis "word*;*"
jumpfalse block.end.31807
  clear
  # check if command requires parameter
  get
  testis "add"
  jumptrue 18
  testis "while"
  jumptrue 16
  testis "whilenot"
  jumptrue 14
  testis "mark"
  jumptrue 12
  testis "go"
  jumptrue 10
  testis "escape"
  jumptrue 8
  testis "unescape"
  jumptrue 6
  testis "delim"
  jumptrue 4
  testis "replace"
  jumptrue 2 
  jump block.end.31405
    put
    clear
    add "Pep: '"
    get
    add "'"
    add " << command needs an argument, on line "
    ll
    add " of script.\n"
    print
    clear
    quit
  block.end.31405:
  # the no-argument version of until
  testis "until"
  jumpfalse block.end.31479
    add "tape"
    put
  block.end.31479:
  # the no-argument version of write
  testis "write"
  jumpfalse block.end.31562
    add "file \"sav.pp\""
    put
  block.end.31562:
  # the no-argument version of 'append' (to file)
  testis "append"
  jumpfalse block.end.31677
    clear
    add "writefileappend \"sav.pp\""
    put
  block.end.31677:
  clear
  add "command*"
  # no need to format tape cells because current cell contains word
  push
  jump parse
block.end.31807:
#-----------------------------------------
# ebnf: commandset := command , command ;
testis "command*command*"
jumptrue 4
testis "commandset*command*"
jumptrue 2 
jump block.end.32131
  clear
  add "commandset*"
  push
  # format the tape attributes. Add the next command on a newline 
  --
  get
  add "\n"
  ++
  get
  --
  put
  ++
  clear
  jump parse
block.end.32131:
#-------------------
# here we begin to parse "test*" and "ortestset*" and "andtestset*"
# 
#-------------------
# eg: B"abc" {} or E"xyz" {}
testis "begintext*{*"
jumptrue 12
testis "endtext*{*"
jumptrue 10
testis "quote*{*"
jumptrue 8
testis "class*{*"
jumptrue 6
testis "eof*{*"
jumptrue 4
testis "tapetest*{*"
jumptrue 2 
jump block.end.33154
  # set accumulator == 0
  zero
  testbegins "begin"
  jumpfalse block.end.32447
    clear
    add "testbegins "
  block.end.32447:
  testbegins "end"
  jumpfalse block.end.32486
    clear
    add "testends "
  block.end.32486:
  testbegins "quote"
  jumpfalse block.end.32525
    clear
    add "testis "
  block.end.32525:
  testbegins "class"
  jumpfalse block.end.32567
    clear
    add "testclass "
  block.end.32567:
  # clear the tapecell for testeof and testtape because
  # they take no arguments. 
  testbegins "eof"
  jumpfalse block.end.32699
    clear
    put
    add "testeof "
  block.end.32699:
  testbegins "tapetest"
  jumpfalse block.end.32748
    clear
    put
    add "testtape "
  block.end.32748:
  get
  add "\n"
  add "jumptrue 2 \n"
  # this extra jump has utility when we parse ortestsets and
  # andtestsets.
  add "jump block.end."
  # the final jumpfalse + target will be added when
  # "test*{*commandset*}*" is parsed, or when
  # "ortestset*{*commandset*}*"
  # "andtestset*{*commandset*}*"
  put
  a+
  a+
  a+
  a+
  clear
  add "test*{*"
  push
  push
  jump parse
block.end.33154:
#-------------------
# negated tests
# eg: !B"xyz" {} 
#     !E"xyz" {} 
#     !"abc" {}
#     ![a-z] {}
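# A sketch of the final assembly for a single negated test, once the
# brace block has been parsed. NNN is a placeholder for the label
# number that "cc" supplies at the closing brace; it is assumed that
# the tape cell of the class token holds the text [a-z].
#   ![a-z] { clear; }
# should become
#   testclass [a-z]
#   jumptrue block.end.NNN
#     clear
#   block.end.NNN: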
testis "notbegintext*{*"
jumptrue 12
testis "notendtext*{*"
jumptrue 10
testis "notquote*{*"
jumptrue 8
testis "notclass*{*"
jumptrue 6
testis "noteof*{*"
jumptrue 4
testis "nottapetest*{*"
jumptrue 2 
jump block.end.34123
  # set accumulator == 0
  zero
  testbegins "notbegin"
  jumpfalse block.end.33453
    clear
    add "testbegins "
  block.end.33453:
  testbegins "notend"
  jumpfalse block.end.33495
    clear
    add "testends "
  block.end.33495:
  testbegins "notquote"
  jumpfalse block.end.33537
    clear
    add "testis "
  block.end.33537:
  testbegins "notclass"
  jumpfalse block.end.33582
    clear
    add "testclass "
  block.end.33582:
  # clear the tapecell for testeof and testtape because
  # they take no arguments. 
  testbegins "noteof"
  jumpfalse block.end.33717
    clear
    put
    add "testeof "
  block.end.33717:
  testbegins "nottapetest"
  jumpfalse block.end.33769
    clear
    put
    add "testtape "
  block.end.33769:
  get
  add "\n"
  add "jumpfalse 2 \n"
  # this extra jump has utility when we parse ortestsets and
  # andtestsets.
  add "jump block.end."
  # the final jumpfalse + target will be added later
  # use the accumulator to store the incremented jump target
  put
  a+
  a+
  a+
  a+
  clear
  add "test*{*"
  push
  push
  jump parse
block.end.34123:
#-------------------
# 3 tokens
#-------------------
pop
#-----------------------------
# some 3 token errors!!!
# there are many other errors of this kind, but I am not going
# to write rules for all of them.
testis "{*begintext*;*"
jumptrue 6
testis "{*endtext*;*"
jumptrue 4
testis "{*class*;*"
jumptrue 2 
jump block.end.34549
  push
  push
  push
  add "error near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script (misplaced semicolon?) \n"
  print
  clear
  quit
block.end.34549:
testis "{*quote*;*"
jumptrue 6
testis "commandset*quote*;*"
jumptrue 4
testis "command*quote*;*"
jumptrue 2 
jump block.end.34781
  push
  push
  push
  add "[error] near line "
  ll
  add " (char "
  cc
  add ")"
  add " of script (quoted text without a command?) \n"
  print
  clear
  quit
block.end.34781:
# to simplify subsequent tests, transmogrify a single command
# to a commandset (multiple commands).
testis "{*command*}*"
jumpfalse block.end.34977
  clear
  add "{*commandset*}*"
  push
  push
  push
  jump parse
block.end.34977:
# rule 
#',' ortestset ::= ',' test '{'
# trigger a transmogrification from test to ortestset token
# and 
# '.' andtestset ::= '.' test '{'
testis ",*test*{*"
jumpfalse block.end.35214
  clear
  add ",*ortestset*{*"
  push
  push
  push
  jump parse
block.end.35214:
# trigger a transmogrification from "test" to "andtestset" by
# looking backwards in the stack
testis ".*test*{*"
jumpfalse block.end.35451
  # the jump counter is 1 too high for AND tests
  a-
  clear
  add ".*andtestset*{*"
  push
  push
  push
  jump parse
block.end.35451:
# errors! mixing AND and OR concatenation
testis ",*andtestset*{*"
jumptrue 4
testis ".*ortestset*{*"
jumptrue 2 
jump block.end.35783
  # push the tokens back to make debugging easier
  push
  push
  push
  add " error: mixing AND (.) and OR (,) concatenation \n"
  add " in script near line "
  ll
  add " (character "
  cc
  add ") \n"
  print
  clear
  quit
block.end.35783:
#--------------------------------------------
# ebnf: command := keyword , quoted-text , ";" ;
# format: add "text";
testis "word*quote*;*"
jumpfalse block.end.37525
  clear
  get
  testis "replace"
  jumpfalse block.end.36123
    # error 
    add "< command requires 2 parameters, not 1 \n"
    add "near line "
    ll
    add " of script. \n"
    print
    clear
    quit
  block.end.36123:
  testis "add"
  jumptrue 20
  testis "until"
  jumptrue 18
  testis "while"
  jumptrue 16
  testis "whilenot"
  jumptrue 14
  testis "escape"
  jumptrue 12
  testis "mark"
  jumptrue 10
  testis "go"
  jumptrue 8
  testis "unescape"
  jumptrue 6
  testis "delim"
  jumptrue 4
  testis "write"
  jumptrue 2 
  jump block.end.37343
    # check here or in error.pss for multiline quoted arguments
    # for "mark" "go" "until" etc because they are not allowed.
    clear
    add "command*"
    push
    # a command plus argument, eg add "this" 
    --
    get
    # allow multiline text in (only) the add command
    # we do this by turning a multiline "add" command into a 
    # sequence of single line "add" commands (because that is what
    # the assembler format allows). Actually, I could just write
    # replace "\n" "\\n"; which should work but would be much less
    # readable in the assembled file.
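    # A sketch of the effect, assuming the tape cell of the quote token
    # holds the text including its double quotes. The script text
    #   add "one
    #   two";
    # should be assembled as
    #   add "one\n"
    #   add "two"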
    testis "add"
    jumpfalse block.end.36964
      add " "
      ++
      get
      replace "\n" "\\n\"\nadd \""
      --
      put
      ++
      clear
      jump parse
    block.end.36964:
    # maybe it would be useful for the until command to 
    # allow multiline as well
    testis "until"
    jumpfalse block.end.37187
      add " "
      ++
      get
      replace "\n" "\\n"
      --
      put
      ++
      clear
      jump parse
    block.end.37187:
    testis "write"
    jumpfalse block.end.37218
      add "file"
    block.end.37218:
    testis "append"
    jumpfalse block.end.37268
      clear
      add "writefileappend"
    block.end.37268:
    add " "
    ++
    get
    --
    put
    ++
    clear
    jump parse
  block.end.37343:
  # error, superfluous argument
  add ": command does not take an argument \n"
  add "near line "
  ll
  add " of script. \n"
  print
  #state
  quit
block.end.37525:
#----------------------------------
# format: "while [:alpha:] ;" or "whilenot [a-z] ;"
testis "word*class*;*"
jumpfalse block.end.38033
  clear
  get
  testis "while"
  jumptrue 4
  testis "whilenot"
  jumptrue 2 
  jump block.end.37883
    clear
    add "command*"
    push
    # a command plus argument, eg while [a-z] 
    --
    get
    add " "
    ++
    get
    --
    put
    ++
    clear
    jump parse
  block.end.37883:
  # error 
  add " < command cannot have a class argument \n"
  add "line "
  ll
  add ": error in script \n"
  print
  clear
  quit
block.end.38033:
# -------------------------------
# 4 tokens
# -------------------------------
pop
#-------------------------------------
# ebnf:     command := replace , quote , quote , ";" ;
# example:  replace "and" "AND" ; 
testis "word*quote*quote*;*"
jumpfalse block.end.38701
  clear
  get
  testis "replace"
  jumpfalse block.end.38574
    clear
    add "command*"
    push
    #---------------------------
    # a command plus 2 arguments, eg replace "this" "that"
    --
    get
    add " "
    ++
    get
    add " "
    ++
    get
    --
    --
    put
    ++
    clear
    jump parse
  block.end.38574:
  add " << command does not take 2 quoted arguments. \n"
  add " on line "
  ll
  add " of script.\n"
  print
  quit
block.end.38701:
#-------------------------------------
# format: begin { #* commands *# }
# "begin" blocks are only executed once (they are assembled
# before the "start:" label). They must come before
# all other commands.
# "begin*{*command*}*",
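# A sketch of the overall layout (illustrative, derived from the rules
# at the end of this file, not from a real run). A script such as
#   begin { add "x"; }
#   read; print; clear;
# should be assembled roughly as
#   add "x"
#   start:
#   read
#   print
#   clear
#   jump start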
testis "begin*{*commandset*}*"
jumpfalse block.end.39085
  clear
  ++
  ++
  get
  --
  --
  put
  clear
  add "beginblock*"
  push
  jump parse
block.end.39085:
# -------------
# parses and compiles concatenated tests
# eg: 'a',B'b',E'c',[def],[:space:],[g-k] { ...
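# A sketch of the assembly for an OR list (NNN is a placeholder for
# the label number that "cc" supplies at the closing brace). The jump
# offsets come from the accumulator: 4 after the last test, plus 2
# for each test added in front of it.
#   "a","b" { clear; }
# should become
#   testis "a"
#   jumptrue 4
#   testis "b"
#   jumptrue 2
#   jump block.end.NNN
#     clear
#   block.end.NNN: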
testis "begintext*,*ortestset*{*"
jumptrue 12
testis "endtext*,*ortestset*{*"
jumptrue 10
testis "quote*,*ortestset*{*"
jumptrue 8
testis "class*,*ortestset*{*"
jumptrue 6
testis "eof*,*ortestset*{*"
jumptrue 4
testis "tapetest*,*ortestset*{*"
jumptrue 2 
jump block.end.40009
  testbegins "begin"
  jumpfalse block.end.39414
    clear
    add "testbegins "
  block.end.39414:
  testbegins "end"
  jumpfalse block.end.39454
    clear
    add "testends "
  block.end.39454:
  testbegins "quote"
  jumpfalse block.end.39494
    clear
    add "testis "
  block.end.39494:
  testbegins "class"
  jumpfalse block.end.39537
    clear
    add "testclass "
  block.end.39537:
  # clear the tapecell for testeof and testtape because
  # they take no arguments. 
  testbegins "eof"
  jumpfalse block.end.39672
    clear
    put
    add "testeof "
  block.end.39672:
  testbegins "tapetest"
  jumpfalse block.end.39722
    clear
    put
    add "testtape "
  block.end.39722:
  get
  add "\n"
  add "jumptrue "
  count
  add "\n"
  ++
  ++
  get
  --
  --
  put
  clear
  # this works as long as we don't mix AND and OR concatenations 
  # (the token pushed here was formerly "test*{*")
  add "ortestset*{*"
  push
  push
  a+
  a+
  jump parse
block.end.40009:
# A collection of negated tests.
testis "notbegintext*,*ortestset*{*"
jumptrue 12
testis "notendtext*,*ortestset*{*"
jumptrue 10
testis "notquote*,*ortestset*{*"
jumptrue 8
testis "notclass*,*ortestset*{*"
jumptrue 6
testis "noteof*,*ortestset*{*"
jumptrue 4
testis "nottapetest*,*ortestset*{*"
jumptrue 2 
jump block.end.40806
  testbegins "notbegin"
  jumpfalse block.end.40281
    clear
    add "testbegins "
  block.end.40281:
  testbegins "notend"
  jumpfalse block.end.40324
    clear
    add "testends "
  block.end.40324:
  testbegins "notquote"
  jumpfalse block.end.40367
    clear
    add "testis "
  block.end.40367:
  testbegins "notclass"
  jumpfalse block.end.40413
    clear
    add "testclass "
  block.end.40413:
  testbegins "noteof"
  jumpfalse block.end.40460
    clear
    put
    add "testeof "
  block.end.40460:
  testbegins "nottapetest"
  jumpfalse block.end.40513
    clear
    put
    add "testtape "
  block.end.40513:
  get
  add "\n"
  add "jumpfalse "
  count
  add "\n"
  ++
  ++
  get
  --
  --
  put
  clear
  # this works as long as we don't mix AND and OR concatenations 
  add "ortestset*{*"
  push
  push
  a+
  a+
  jump parse
block.end.40806:
# -------------
# AND logic 
# parses and compiles concatenated AND tests
# eg: 'a'.B'b'.E'c'.[def].[:space:].[g-k] { ...
# (this works as long as we don't mix AND and OR concatenations)
# it is possible to merge this block with the negated block
# for compactness, but readability would probably suffer.
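# A sketch of the assembly for an AND list (NNN is a placeholder for
# the label number that "cc" supplies at the closing brace). A failed
# test jumps to the final "jump block.end" instruction, which is why
# the counter is one lower than in the OR case.
#   B"a".E"z" { clear; }
# should become
#   testbegins "a"
#   jumpfalse 3
#   testends "z"
#   jumptrue 2
#   jump block.end.NNN
#     clear
#   block.end.NNN: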
testis "begintext*.*andtestset*{*"
jumptrue 12
testis "endtext*.*andtestset*{*"
jumptrue 10
testis "quote*.*andtestset*{*"
jumptrue 8
testis "class*.*andtestset*{*"
jumptrue 6
testis "eof*.*andtestset*{*"
jumptrue 4
testis "tapetest*.*andtestset*{*"
jumptrue 2 
jump block.end.41736
  testbegins "begin"
  jumpfalse block.end.41350
    clear
    add "testbegins "
  block.end.41350:
  testbegins "end"
  jumpfalse block.end.41390
    clear
    add "testends "
  block.end.41390:
  testbegins "quote"
  jumpfalse block.end.41430
    clear
    add "testis "
  block.end.41430:
  testbegins "class"
  jumpfalse block.end.41473
    clear
    add "testclass "
  block.end.41473:
  testbegins "eof"
  jumpfalse block.end.41517
    clear
    put
    add "testeof "
  block.end.41517:
  testbegins "tapetest"
  jumpfalse block.end.41567
    clear
    put
    add "testtape "
  block.end.41567:
  get
  add "\n"
  add "jumpfalse "
  count
  add "\n"
  ++
  ++
  get
  --
  --
  put
  clear
  add "andtestset*{*"
  push
  push
  a+
  a+
  jump parse
block.end.41736:
# negated tests concatenated with AND logic (.). The 
# negated tests can be chained with non-negated tests.
# eg: B'http' . !E'.txt' { ... }
testis "notbegintext*.*andtestset*{*"
jumptrue 12
testis "notendtext*.*andtestset*{*"
jumptrue 10
testis "notquote*.*andtestset*{*"
jumptrue 8
testis "notclass*.*andtestset*{*"
jumptrue 6
testis "noteof*.*andtestset*{*"
jumptrue 4
testis "nottapetest*.*andtestset*{*"
jumptrue 2 
jump block.end.42537
  testbegins "notbegin"
  jumpfalse block.end.42137
    clear
    add "testbegins "
  block.end.42137:
  testbegins "notend"
  jumpfalse block.end.42180
    clear
    add "testends "
  block.end.42180:
  testbegins "notquote"
  jumpfalse block.end.42223
    clear
    add "testis "
  block.end.42223:
  testbegins "notclass"
  jumpfalse block.end.42269
    clear
    add "testclass "
  block.end.42269:
  testbegins "noteof"
  jumpfalse block.end.42316
    clear
    put
    add "testeof "
  block.end.42316:
  testbegins "nottapetest"
  jumpfalse block.end.42369
    clear
    put
    add "testtape "
  block.end.42369:
  get
  add "\n"
  add "jumptrue "
  count
  add "\n"
  ++
  ++
  get
  --
  --
  put
  clear
  add "andtestset*{*"
  push
  push
  a+
  a+
  jump parse
block.end.42537:
#-------------------------------------
# we should not have to check for the {*command*}* pattern
# because that has already been transformed to {*commandset*}*
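# A sketch of what this rule produces for a single positive test
# (NNN is a placeholder for the label number that "cc" supplies; the
# "jumptrue 2 / jump" pair is collapsed into one "jumpfalse", and the
# commands are indented by two spaces):
#   "abc" { clear; print; }
# should become
#   testis "abc"
#   jumpfalse block.end.NNN
#     clear
#     print
#   block.end.NNN: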
testis "test*{*commandset*}*"
jumptrue 6
testis "andtestset*{*commandset*}*"
jumptrue 4
testis "ortestset*{*commandset*}*"
jumptrue 2 
jump block.end.43608
  # indent the assembled code for readability
  testbegins "test*{*"
  jumpfalse block.end.43158
    clear
    # get rid of unnecessary jump but only in "test" cases 
    get
    # for positive tests (eg [a-z] {...})
    replace "jumptrue 2 \njump" "jumpfalse"
    put
    # for negative tests (eg ![a-z] {...})
    replace "jumpfalse 2 \njump" "jumptrue"
    put
  block.end.43158:
  clear
  ++
  ++
  add "  "
  get
  replace "\n" "\n  "
  put
  --
  --
  clear
  get
  # the final jump (to the closing brace) has already been
  # coded in the "test*{*" rule or the other rules.
  # we just need to add the label number with "cc"
  cc
  add "\n"
  ++
  ++
  get
  add "\nblock.end."
  cc
  add ":"
  --
  --
  put
  clear
  add "command*"
  push
  # always reparse/compile
  jump parse
block.end.43608:
# -------------
# multi-token end-of-stream errors
# not a comprehensive list of errors...
testeof 
jumpfalse block.end.44384
  testends "begintext*"
  jumptrue 10
  testends "endtext*"
  jumptrue 8
  testends "test*"
  jumptrue 6
  testends "ortestset*"
  jumptrue 4
  testends "andtestset*"
  jumptrue 2 
  jump block.end.43915
    add "  Error near end of script at line "
    ll
    add ". Test with no brace block? \n"
    print
    clear
    quit
  block.end.43915:
  testends "quote*"
  jumptrue 6
  testends "class*"
  jumptrue 4
  testends "word*"
  jumptrue 2 
  jump block.end.44127
    put
    clear
    add "Error end of script! (line "
    ll
    add ") missing semi-colon? \n"
    add "Parse stack: "
    get
    add "\n"
    print
    clear
    quit
  block.end.44127:
  testends "{*"
  jumptrue 16
  testends "}*"
  jumptrue 14
  testends ";*"
  jumptrue 12
  testends ",*"
  jumptrue 10
  testends ".*"
  jumptrue 8
  testends "!*"
  jumptrue 6
  testends "B*"
  jumptrue 4
  testends "E*"
  jumptrue 2 
  jump block.end.44380
    put
    clear
    add "Error: misplaced terminal character at end of script! (line "
    ll
    add "). \n"
    add "Parse stack: "
    get
    add "\n"
    print
    clear
    quit
  block.end.44380:
block.end.44384:
# put the 4 (or fewer) tokens back on the stack
push
push
push
push
testeof 
jumpfalse block.end.46425
  #add "end of script!! \n"
  print
  clear
  #---------------------
  # check if the script parsed correctly (there should only
  # be one token on the stack, namely "commandset*" or "command*")
  pop
  pop
  testis "commandset*"
  jumptrue 4
  testis "command*"
  jumptrue 2 
  jump block.end.45277
    push
    --
    add "# Assembled with the script 'compile.pss' \n"
    add "start:\n"
    get
    # an extra space because of a bug in compile()
    add "\njump start \n"
    # put a copy of the final compilation into the tapecell
    # so it can be inspected interactively.
    put
    # remove this print from asm.pp after generating a new asm.pp
    # with pep -f compile.pss compile.pss > asm.new.pp; cp asm.new.pp asm.pp
    print
    # remove!
    # save the compiled script to 'sav.pp'
    writefile "sav.pp"
    clear
    quit
  block.end.45277:
  testis "beginblock*commandset*"
  jumptrue 4
  testis "beginblock*command*"
  jumptrue 2 
  jump block.end.45917
    clear
    add "# Assembled with the script 'compile.pss' \n"
    get
    add "\n"
    ++
    add "start:\n"
    get
    # an extra space because of a bug in compile()
    add "\njump start \n"
    # put a copy of the final compilation into the tapecell
    # so it can be inspected interactively.
    put
    # remove this 'print' from asm.pp after generating a new asm.pp
    # with pep -f compile.pss compile.pss > asm.new.pp; cp asm.new.pp asm.pp
    print
    # remove!
    # also save the compiled script to 'sav.pp'
    writefile "sav.pp"
    clear
    quit
  block.end.45917:
  push
  push
  # state
  clear
  add "After compiling with 'compile.pss' (at EOF): \n "
  add "  parse error in input script, check syntax: \n "
  add "  To debug script try the -I switch with \n "
  add "   >> pep -If script -i 'some input' \n "
  add "  or to debug the compilation process try: \n "
  add "   >> pep -Ia asm.pp script \n "
  print
  clear
  # clear sav.pp because script could not be compiled
  writefile "sav.pp"
  # bail means exit with error
  bail
block.end.46425:
# not eof
# there is an implicit .restart command here (jump start)
jump start