From: Tim Vaughan Date: Sun, 24 Apr 2016 13:08:44 +0000 (+1200) Subject: Mostly copy-pasting forth code at this point! X-Git-Url: https://thelambdalab.xyz/gitweb/index.cgi?a=commitdiff_plain;h=21567de9ac591909bb28a454b03615a473917327;p=forth.jl.git Mostly copy-pasting forth code at this point! --- diff --git a/src/forth.jl b/src/forth.jl index e04e288..78d8a53 100644 --- a/src/forth.jl +++ b/src/forth.jl @@ -1033,10 +1033,10 @@ DOT = defPrimWord(".", () -> begin return NEXT end) -DOTS = defPrimWord(".s", () -> begin - printPS() - return NEXT -end) +#DOTS = defPrimWord(".s", () -> begin +# printPS() +# return NEXT +#end) DUMP = defPrimWord("DUMP", () -> begin count = popPS() diff --git a/src/lib.4th b/src/lib.4th new file mode 100644 index 0000000..b7cd560 --- /dev/null +++ b/src/lib.4th @@ -0,0 +1,257 @@ +: / /MOD SWAP DROP ; +: MOD /MOD DROP ; + +: '\n' 10 ; +: BL 32 ; + +: CR '\n' emit ; +: SPACE BL emit ; + +: NEGATE 0 swap - ; + +: TRUE -1 ; +: FALSE 0 ; +: NOT 0= ; + +: LITERAL IMMEDIATE ' LIT , , ; + +: ':' + [ + CHAR : + ] + LITERAL +; + +: ';' [ CHAR ; ] LITERAL ; +: '(' [ CHAR ( ] LITERAL ; +: ')' [ CHAR ) ] LITERAL ; +: '"' [ CHAR " ] LITERAL ; +: 'A' [ CHAR A ] LITERAL ; +: '0' [ CHAR 0 ] LITERAL ; +: '-' [ CHAR - ] LITERAL ; +: '.' [ CHAR . ] LITERAL ; + +\ While compiling, '[COMPILE] word' compiles 'word' if it would otherwise be IMMEDIATE. +: [COMPILE] IMMEDIATE + WORD \ get the next word + FIND \ find it in the dictionary + >CFA \ get its codeword + , \ and compile that +; + +\ RECURSE makes a recursive call to the current word that is being compiled. +\ +\ Normally while a word is being compiled, it is marked HIDDEN so that references to the +\ same word within are calls to the previous definition of the word. However we still have +\ access to the word which we are currently compiling through the LATEST pointer so we +\ can use that to compile a recursive call. +: RECURSE IMMEDIATE + LATEST @ \ LATEST points to the word being compiled at the moment + >CFA \ get the codeword + , \ compile it +; + +\ CONTROL STRUCTURES ---------------------------------------------------------------------- +\ +\ So far we have defined only very simple definitions. Before we can go further, we really need to +\ make some control structures, like IF ... THEN and loops. Luckily we can define arbitrary control +\ structures directly in FORTH. +\ +\ Please note that the control structures as I have defined them here will only work inside compiled +\ words. If you try to type in expressions using IF, etc. in immediate mode, then they won't work. +\ Making these work in immediate mode is left as an exercise for the reader. + +\ condition IF true-part THEN rest +\ -- compiles to: --> condition 0BRANCH OFFSET true-part rest +\ where OFFSET is the offset of 'rest' +\ condition IF true-part ELSE false-part THEN +\ -- compiles to: --> condition 0BRANCH OFFSET true-part BRANCH OFFSET2 false-part rest +\ where OFFSET if the offset of false-part and OFFSET2 is the offset of rest + +\ IF is an IMMEDIATE word which compiles 0BRANCH followed by a dummy offset, and places +\ the address of the 0BRANCH on the stack. Later when we see THEN, we pop that address +\ off the stack, calculate the offset, and back-fill the offset. +: IF IMMEDIATE + ' 0BRANCH , \ compile 0BRANCH + HERE @ \ save location of the offset on the stack + 0 , \ compile a dummy offset +; + +: THEN IMMEDIATE + DUP + HERE @ SWAP - \ calculate the offset from the address saved on the stack + SWAP ! \ store the offset in the back-filled location +; + +: ELSE IMMEDIATE + ' BRANCH , \ definite branch to just over the false-part + HERE @ \ save location of the offset on the stack + 0 , \ compile a dummy offset + SWAP \ now back-fill the original (IF) offset + DUP \ same as for THEN word above + HERE @ SWAP - + SWAP ! +; + +\ BEGIN loop-part condition UNTIL +\ -- compiles to: --> loop-part condition 0BRANCH OFFSET +\ where OFFSET points back to the loop-part +\ This is like do { loop-part } while (condition) in the C language +: BEGIN IMMEDIATE + HERE @ \ save location on the stack +; + +: UNTIL IMMEDIATE + ' 0BRANCH , \ compile 0BRANCH + HERE @ - \ calculate the offset from the address saved on the stack + , \ compile the offset here +; + +\ BEGIN loop-part AGAIN +\ -- compiles to: --> loop-part BRANCH OFFSET +\ where OFFSET points back to the loop-part +\ In other words, an infinite loop which can only be returned from with EXIT +: AGAIN IMMEDIATE + ' BRANCH , \ compile BRANCH + HERE @ - \ calculate the offset back + , \ compile the offset here +; + +\ BEGIN condition WHILE loop-part REPEAT +\ -- compiles to: --> condition 0BRANCH OFFSET2 loop-part BRANCH OFFSET +\ where OFFSET points back to condition (the beginning) and OFFSET2 points to after the whole piece of code +\ So this is like a while (condition) { loop-part } loop in the C language +: WHILE IMMEDIATE + ' 0BRANCH , \ compile 0BRANCH + HERE @ \ save location of the offset2 on the stack + 0 , \ compile a dummy offset2 +; + +: REPEAT IMMEDIATE + ' BRANCH , \ compile BRANCH + SWAP \ get the original offset (from BEGIN) + HERE @ - , \ and compile it after BRANCH + DUP + HERE @ SWAP - \ calculate the offset2 + SWAP ! \ and back-fill it in the original location +; + +\ UNLESS is the same as IF but the test is reversed. +\ +\ Note the use of [COMPILE]: Since IF is IMMEDIATE we don't want it to be executed while UNLESS +\ is compiling, but while UNLESS is running (which happens to be when whatever word using UNLESS is +\ being compiled -- whew!). So we use [COMPILE] to reverse the effect of marking IF as immediate. +\ This trick is generally used when we want to write our own control words without having to +\ implement them all in terms of the primitives 0BRANCH and BRANCH, but instead reusing simpler +\ control words like (in this instance) IF. +: UNLESS IMMEDIATE + ' NOT , \ compile NOT (to reverse the test) + [COMPILE] IF \ continue by calling the normal IF +; + +\ COMMENTS ---------------------------------------------------------------------- +\ +\ FORTH allows ( ... ) as comments within function definitions. This works by having an IMMEDIATE +\ word called ( which just drops input characters until it hits the corresponding ). +: ( IMMEDIATE + 1 \ allowed nested parens by keeping track of depth + BEGIN + KEY \ read next character + DUP '(' = IF \ open paren? + DROP \ drop the open paren + 1+ \ depth increases + ELSE + ')' = IF \ close paren? + 1- \ depth decreases + THEN + THEN + DUP 0= UNTIL \ continue until we reach matching close paren, depth 0 + DROP \ drop the depth counter +; + +( + From now on we can use ( ... ) for comments. + + STACK NOTATION ---------------------------------------------------------------------- + + In FORTH style we can also use ( ... -- ... ) to show the effects that a word has on the + parameter stack. For example: + + ( n -- ) means that the word consumes an integer (n) from the parameter stack. + ( b a -- c ) means that the word uses two integers (a and b, where a is at the top of stack) + and returns a single integer (c). + ( -- ) means the word has no effect on the stack +) + +( Some more complicated stack examples, showing the stack notation. ) +: NIP ( x y -- y ) SWAP DROP ; +: TUCK ( x y -- y x y ) DUP ROT ; +: PICK ( x_u ... x_1 x_0 u -- x_u ... x_1 x_0 x_u ) + 1+ ( add one because of 'u' on the stack ) + PSP@ SWAP - ( add to the stack pointer ) + @ ( and fetch ) +; + +( With the looping constructs, we can now write SPACES, which writes n spaces to stdout. ) +: SPACES ( n -- ) + BEGIN + DUP 0> ( while n > 0 ) + WHILE + SPACE ( print a space ) + 1- ( until we count down to 0 ) + REPEAT + DROP +; + +( Standard words for manipulating BASE. ) +: DECIMAL ( -- ) 10 BASE ! ; +: HEX ( -- ) 16 BASE ! ; + +( + PRINTING NUMBERS ---------------------------------------------------------------------- + + The standard FORTH word . (DOT) is very important. It takes the number at the top + of the stack and prints it out. However first I'm going to implement some lower-level + FORTH words: + + U.R ( u width -- ) which prints an unsigned number, padded to a certain width + U. ( u -- ) which prints an unsigned number + .R ( n width -- ) which prints a signed number, padded to a certain width. + + For example: + -123 6 .R + will print out these characters: + - 1 2 3 + + In other words, the number padded left to a certain number of characters. + + The full number is printed even if it is wider than width, and this is what allows us to + define the ordinary functions U. and . (we just set width to zero knowing that the full + number will be printed anyway). + + Another wrinkle of . and friends is that they obey the current base in the variable BASE. + BASE can be anything in the range 2 to 36. + + While we're defining . &c we can also define .S which is a useful debugging tool. This + word prints the current stack (non-destructively) from top to bottom. +) + +( This is the underlying recursive definition of U. ) +: U. ( u -- ) + BASE @ /MOD ( width rem quot ) + ?DUP IF ( if quotient <> 0 then ) + RECURSE ( print the quotient ) + THEN + + ( print the remainder ) + DUP 10 < IF + '0' ( decimal digits 0..9 ) + ELSE + 10 - ( hex and beyond digits A..Z ) + 'A' + THEN + + + EMIT +; + + diff --git a/src/lib.fs b/src/lib.fs deleted file mode 100644 index 652709e..0000000 --- a/src/lib.fs +++ /dev/null @@ -1,29 +0,0 @@ - -: / /MOD SWAP DROP ; -: MOD /MOD DROP ; - -: '\n' 10 ; -: BL 32 ; - -: CR '\n' emit ; -: SPACE BL emit ; - -: NEGATE 0 swap - ; - -: TRUE -1 ; -: FALSE 0 ; -: NOT 0= ; - -: LITERAL IMMEDIATE ' LIT , , ; - -: ':' [ CHAR : ] LITERAL ; - -: ';' [ CHAR ; ] LITERAL ; -: '(' [ CHAR ( ] LITERAL ; -: ')' [ CHAR ) ] LITERAL ; -: '"' [ CHAR " ] LITERAL ; -: 'A' [ CHAR A ] LITERAL ; -: '0' [ CHAR 0 ] LITERAL ; -: '-' [ CHAR - ] LITERAL ; -: '.' [ CHAR . ] LITERAL ; -