(2012-08-01) Forth Beginnings

Again with the blogging...

So because I was bored of having my main homepage as just a holding page, I decided to have this really simple blog.

I'm still (mas)debating exactly what content the site shall hold, but for now I'll just focus on programming.

The first thing I should mention: I work and play entirely on Linux (Arch Linux to be specific). I did own Windows 7 (semi-legit) and I have tried the Windows 8 BETA (which sucks wang, big time). However, I was sick of two things, the first being Windows and the second being dual booting; I really couldn't be arsed to reboot my computer just to play a couple of games.

Right now, I'm going to document a series of trials with brand new programming languages (to me) and how I struggle to work them out for the first few days. First up is FORTH (http://forth.com).

*** This is not a tutorial! Despite how much it may look like one! I do not profess to know what I am doing, that it is the easiest or best way to do it! ***

So I handily ran a search in yaourt (-Ss, just like pacman) for forth. The only relevant entry that would compile was gforth, so I installed that and ran it from a terminal:
~$ gforth
I spent the next 5 minutes trying to work out what the hell you do now. (I've trimmed the output slightly; it was stupidly long for an "undefined word" error.)
help
:1: Undefined word
HELP
:2: Undefined word
h
:3: Undefined word
H
:4: Undefined word
Nope, nothing normal. I then paid a little attention to the internet and suddenly got it.
A very long and close to unreadable list of commands, in no particular order, appears. Doing a brief read through, I see nothing that amounts to an echo or a print. So I just started taking stabs in the dark and quickly discovered it.
." oh I am sorry." oh I am sorry. ok
Yay! I can now print to screen. But what's that "ok" about? Is it condescendingly telling me I'm forgiven if it must? Well, it turns out that's FORTH being verbose about having understood your command. So how do I get some line breaks in there? I don't want everything to print out on the same line. More guesswork later.
CR CR ." Oh hello there." CR

Oh hello there.
 ok
Awesome -- so CR is a line break or, as I found out later, carriage return, and ." " is the equivalent of echo -n in Bash (suppress the appended newline). Right now, I'm using about 0.1% of what this language is capable of, but I'm feeling good. Next up, how to clear the screen. There must be something to do that.
Ah good. That's it. Now, how the hell do you make a script, or in FORTH; a word? I had to Google that.
: newWord compiled
CR CR ." Doing Maths!" CR compiled
." 1 + 5 = " 1 5 + . compiled
CR compiled
." 7 - 4 = " 7 4 - . compiled
CR ; ok
newWord

Doing Maths!
1 + 5 = 6
7 - 4 = 3
 ok
I can do science! But what the hell just happened there? Most of it is me padding the script out with CR and ." " all over the place. But the real meat is simply the "1 5 + ." and "7 4 - .". Why is the addition or subtraction sign at the end? How does it know which way to subtract? The answer is that FORTH is a stack-based language.

Now, I'm probably going to do a terrible job of explaining this, but I'll give it a shot anyway. Imagine that FORTH has two hands and lots of bricks. Each hand can hold two bricks. However, he can stack the bricks to save them for later, and he doesn't need to keep hold of them once they are stacked, so he can put as many bricks as he likes onto the stack. However, if he wants to pick up some bricks from the stack, he can only access the last two he put on, because they are all stuck on top of each other; pulling from the bottom would metaphorically topple the stack.

So to move on from bricks, let's do some work with it. To enter an integer (the only thing the stack knows what to do with) we can simply type the number at the interpreter:
5 ok
Okay, so we've now just added 5 to the stack. Let's get some more numbers on there cause 5 is looking lonely.
8 ok
4 ok
11 ok
24 ok
Okay, so we now have 5, 8, 4, 11 and 24 on the stack. What next? Well we can print the entire stack to screen if we want:
.S 5 8 4 11 24 ok
So we can see what's on the stack, and we can run the .S command over and over with the same results. But we want to start using those numbers. What if we want to add 24 and 11, then subtract everything else? Being stack based you assume it's as simple as working your way backwards on the stack, so we add, subtract, subtract and subtract:
+ - - - ok
.S -34 ok
That isn't right... What happened there? Well, to go back to the brick guy above, he's also left handed, so the second-to-last stack item is always used first. So we initially did 11 + 24 (not 24 + 11) and then did 4 - 35 (-31); this led to 8 - -31 (39), which was then 5 - 39, which gave us the final result of -34 (check it on a real calculator if you want). What we need to do is swap the two last stack items around each time. For this we can use the SWAP command; to break it down, let's start again and I'll use the inbuilt comment system ( \ ) to explain it:
.S ok
4 5 3 12 6 \ put some numbers in the stack ok
+ \ 12 + 6 = 18 ok
SWAP \ 3 and 18 change places on the stack ok
\ let's see what the stack looks like now ok
.S 4 5 18 3 ok
- \ 18 - 3 = 15 ok
SWAP \ 5 and 15 change places ok
- \ 15 - 5 = 10 ok
SWAP \ 4 and 10 change places ok
- \ 10 - 4 = 6 ok
\ Double check the stack ok
.S 6 ok
Okay, so we now know how to do maths (wow!) with one set. But back in my original maths example I used a command I've not covered yet: the period ("."). The period works very differently to ".S", though it looks similar. Instead of displaying all of the stack, it just displays the top (rightmost) item. And instead of leaving that number there to be reused, it removes it from the stack. So as an example:
5 8 56 34 27 18 ok \ stick some numbers in the stack
\ let's check the stack
.S 5 8 56 34 27 18 ok
. 18 ok \ output the top item in the stack
. 27 ok
. 34 ok
. 56 ok
. 8 ok
. 5 ok
\ let's check the stack again
.S ok
So how about multiplication and division? Well, as I said before the stack can only handle integers; so by using the normal "*" and "/", FORTH will round down to the nearest whole number where possible. For example:
5 13 * . 65 ok
20 5 / . 4 ok
21 5 / . 4 ok
24 5 / . 4 ok
As you can see, it handles the first two no problem, as there are no remainders; however, it rounds down on both 21 / 5 (4.20) and 24 / 5 (4.80). So how can we get an accurate result from a division? Well, we can get the remainder as another integer using the "/MOD" command.
23 5 /MOD . . 4 3 ok
This puts the original "/" result on the top of the stack and the remainder underneath, so we use the period twice to print both. You can then put it into a new word with something like:
: quarters compiled
DUP compiled
CR ." In the number " . ." there are " compiled
4 /MOD compiled
. ." wholes and " compiled
. ." quarters." compiled
; ok
This allows you to enter any value then the word quarters and it will add your number, then 4 to the stack and run the "/MOD" command on them. It also outputs everything into a tidier looking format:
25 quarters
In the number 25 there are 6 wholes and 1 quarters. ok
Okay, so the grammar isn't great. We'll look at how we can improve that in a minute. First though, what the shit is the "DUP" command you just did? Well, dupe (as it's pronounced) duplicates the last stack entry. I needed this so I could use the "In the number" sentence and output our initial number without losing it. You can do "DUP" as many times as you want, but you need to make sure that when you are displaying them again you account for the fact that they push all the other numbers back. In the above example, two "DUP" commands shouldn't break anything as it enters the 4 afterwards, but you don't want a load of duplicate stack items eating all your memory.

Before we go any further, let's get some conditional statements in here -- you know, the "if condition then do something, else do something different" statements. We'll take the above example as a base and redefine the word so "wholes" and "quarters" depend on whether there is exactly 1 or not. We need to rewrite the quarters command, which is not as easy in FORTH as it is in, say, PHP. You can't re-edit something you've compiled, so you need to start again. Now, the IF statements are a little bit clunky: first you need to DUP the value you are checking, because the comparison and the IF consume it in the process of checking. Also, THEN is the end of the if, equivalent to "}" or "fi" etc.
: quarters compiled
DUP \ duplicate the input value compiled
CR ." In the number " . ." there are " \ output the input value for verbose compiled
4 /MOD \ divide the input value by 4 compiled
DUP \ duplicate the whole number compiled
1 = IF \ check to see if the whole number is 1 compiled
. ." whole " \ output singular grammar compiled
ELSE \ if the whole number is anything else compiled
. ." wholes " \ output plural grammar compiled
THEN \ finish the whole number checking compiled
DUP \ duplicate the remainder compiled
." and " \ prettiness compiled
1 = IF \ if the remainder is 1 compiled
. ." quarter" \ output singular grammar compiled
ELSE \ if the remainder is anything else compiled
. ." quarters" \ output plural grammar compiled
THEN \ finish checking the remainder compiled
; redefined quarters ok
Okay so what happens if you run that?
20 quarters
In the number 20 there are 5 wholes and 0 quarters ok
5 quarters
In the number 5 there are 1 whole and 1 quarter ok
7 quarters
In the number 7 there are 1 whole and 3 quarters ok
3 quarters
In the number 3 there are 0 wholes and 3 quarters ok
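The DUP ... IF ... ELSE ... THEN pattern in that word is written out twice. A minimal sketch of one way to factor it out, assuming a made-up helper word called plural and a made-up name quarters2 (neither comes from the original session or from gforth itself):

```forth
\ Hypothetical helper: prints "s" unless n is exactly 1.
: plural ( n -- )  1 <> IF ." s" THEN ;

\ Hypothetical variant of quarters built on the helper.
: quarters2 ( n -- )
  DUP CR ." In the number " . ." there are "
  4 /MOD                       \ leaves: remainder quotient
  DUP . ." whole" plural       \ quotient, then "whole" or "wholes"
  ."  and "
  DUP . ." quarter" plural ;   \ remainder, then "quarter" or "quarters"

7 quarters2
```

The trade-off is that each number is DUPed once for printing and once for the singular/plural check, but the check itself lives in one place.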
That looks good to me! Well, still not real English ("there are" could be "there is" for singular instances of whole, for example), but you get the idea. Doing an IF ELSE THEN can be really handy even for something as simple as correct formatting (hell, I've used it to justify lines in bash scripts!); I don't care if you shouldn't do that, I just do... So how can we get some kind of arbitrary decimal numbers going on? Thanks to how FORTH (gforth at least) handles decimals being entered onto the stack, it gives us some consistent results:
.S ok
5.4 ok
.S 54 0 ok
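Why two numbers? A small sketch of what seems to be going on, assuming gforth's default number parsing, where a number typed with a dot in it is read as a double-cell integer and the dot itself is thrown away:

```forth
\ Assumption: gforth reads a number containing "." as a double-cell integer.
5.4 .S        \ two cells on the stack: 54 (low part) and 0 (high part)
D.            \ D. prints both cells as one double number: 54
5.423 D.      \ 5423 -- the position of the dot is not kept in the value
5.4 DROP .    \ dropping the high cell leaves a plain single-cell 54
```

So the 0 on top is the (empty) high half of a double-width number rather than a "this was a decimal" flag, although for small numbers it can be treated like one.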
Because of that we can surmise that dividing by 10 will move the stack number back into decimal format, and we can start off with the basics of just one decimal place like this:
: showdecimals compiled
DUP \ duplicate the last number on the stack compiled
0 = IF \ if 0, assume we are dealing with a decimal compiled
DROP \ drop the remaining 0 compiled
10 /MOD \ divide the number by 10 compiled
. ." point " . \ output the results compiled
ELSE \ if the number is not 0 compiled
. ." -- no decimal" \ give me some feedback compiled
THEN \ end compiled
; ok
Then you can run the command with a number that has a single decimal place -- more than that is out of the scope of the "divide by 10" line. You'll get something like this:
.S ok
8.9 showdecimals 8 point 9 ok
7.3 showdecimals 7 point 3 ok
21.6 showdecimals 21 point 6 ok
377.2 showdecimals 377 point 2 ok
.S ok
So all you need to do to get more decimal places is to input the matching power of ten (10 for one decimal place, 1000 for three, and so on) prior to the word, ergo:
: showDec compiled
SWAP compiled
DUP compiled
0 = IF compiled
DROP compiled
/MOD compiled
. ." point " . compiled
ELSE compiled
. ." -- no decimal" compiled
THEN compiled
; ok
This is strict about the number of decimal places: the power of ten has to match the number of digits you typed after the point, otherwise the output will be wonky. To get around that, you can pad the number with zeros ("***.0*") instead, like in my last output:
5.423 1000 showDec 5 point 423 ok
5.2 10 showDec 5 point 2 ok
8584.6532 10000 showDec 8584 point 6532 ok
5 1 showDec 5 -- no decimal ok
6.000 1000 showDec 6 point 0 ok
So how the hell can we use that in a practical sense? Maybe to multiply them? Or divide? Well, have no fear, here is a pretty straightforward set of scripts to do just that.
: dec compiled
\ Tidy up the decimal into usable numbers compiled
DUP compiled
0 = IF compiled
DROP compiled
THEN compiled
; ok
: decConMul compiled
\ use a number before the word to determine
\ how many decimals there are
/MOD compiled
. ." point " . compiled
; ok
So now we can run commands like these:
8.54 dec 6.7 dec * 1000 decConMul 57 point 218 ok
6.4 dec 3.54124 dec * 1000000 decConMul 22 point 663936 ok
2.123456 dec 4 * 1000000 decConMul 8 point 493824 ok
1078.42 dec 8 dec * 100 decConMul 8627 point 36 ok
The number of decimal places in use by the first two numbers must always add up to the number of zeros in the 1*** before "decConMul" though. It is, of course, quite limiting in that you can't get a more accurate result without adding more decimal places to the factoring numbers, but this is only day two of my experience with FORTH, so there may very well be an easier way to do this.

Division is harder. We can't just use the same command with a "/" instead of the "*" (believe me, I tried). With the division of decimals, we need to use a couple of techniques. First though, the code:
: decConDiv compiled
DUP \ duplicate the denominator compiled
ROT \ move the numerator to the top compiled
SWAP \ move the numerator to the middle compiled
/MOD \ divide with remainders compiled
SWAP \ move the whole numbers to the middle compiled
100 * \ times the remainders by 100 compiled
ROT \ move the denominator to the top compiled
/MOD \ divide with remainders compiled
SWAP \ bring the leftover remainder to the top compiled
DROP \ drop that leftover remainder compiled
SWAP \ push the decimal to the back compiled
. ." point " . \ output the results compiled
; ok
We need to do some elongated duplication and swapping, but from all my tests, this works. As the comments mention, we duplicate the denominator first; this is so that later on we can divide the remainder of the initial division by the denominator to work out the "percent" of it; e.g. 5/10 is just 5 divided by 10, which is 0.5 or 50%. So because FORTH doesn't use decimals, we have to work out how it will actually display the remainder in some way. I decided to just go for an accuracy of 2 decimal places -- all you do is times the remainder by 100 then divide that by the denominator, so it would be 500 divided by 10, which equals 50. We can then just reformat that with text to represent .50.

The drawback to this little script is that it will not work for unbalanced decimals; so 5.4 divided by 4.3232 will not work, but 5.4000 divided by 4.3232 will. Here are some examples:
6.3 dec 4.1 dec decConDiv 1 point 53 ok
15.4 dec 2.7 dec decConDiv 5 point 70 ok
45.37 dec 12.78 dec decConDiv 3 point 55 ok
63.3269493 dec 52.291112 dec decConDiv 12 point 11 ok
12.35753 dec 4.76000 dec decConDiv 2 point 59 ok
Then an example of a fail, using the last entry above without the padding decimals:
12.35753 dec 4.76 dec decConDiv 2596 point 11 ok
This division calculator isn't as flexible as the multiplication one, as it can only calculate to 2 decimal places (or the fixed amount defined by the script). There is a small "bug" in it as well (that I didn't find until after I created the previous code): if the result was, say, 4.01 (from 63.5 / 15.8) you would see something like this:
63.5 dec 15.8 dec decConDiv 4 point 1 ok
The point 1 is obviously wrong, it needs to be point 01; we can add an if statement to try and account for that, with something like this:
: decConDiv compiled
DUP \ duplicate the denominator compiled
ROT \ move the numerator to the top compiled
SWAP \ move the numerator to the middle compiled
/MOD \ divide with remainders compiled
SWAP \ move the whole numbers to the middle compiled
100 * \ times the remainders by 100 compiled
ROT \ move the denominator to the top compiled
/MOD \ divide with remainders compiled
SWAP \ bring the leftover remainder to the top compiled
DROP \ drop that leftover remainder compiled
SWAP \ push the decimal to the back compiled
. ." point " \ output the first part compiled
DUP \ duplicate the decimal compiled
10 < IF \ check if decimal is less than 10 compiled
." 0" . \ stick a zero in front compiled
ELSE \ otherwise it's okay compiled
. \ just output the number as is compiled
THEN \ done checking compiled
; redefined decConDiv ok
Remember, when dealing with numbers, that the two numbers go first, then the symbol, and it will work left to right.
63.5 dec 15.8 dec decConDiv 4 point 01 ok
63.5 dec 15.3 dec decConDiv 4 point 15 ok
63.5 dec 15.5 dec decConDiv 4 point 09 ok
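For what it's worth, the zero-padding can also be handed to Forth's pictured numeric output words (<# # #S HOLD #>), and the "multiply by 100, then divide" step to */. A minimal sketch, assuming non-negative inputs that have already been cleaned up into plain single-cell numbers; the names hundredths>str and div2dp are made up for this example:

```forth
\ Hypothetical names throughout; non-negative inputs only.
: hundredths>str ( n -- c-addr u )   \ 401 -> "4.01"
  0                    \ widen to a double, which <# ... #> expects
  <# # #               \ always emit the two lowest digits, zero-padded
     [CHAR] . HOLD     \ then the decimal point
     #S                \ then whatever digits are left (at least one)
  #> ;

: div2dp ( n d -- )    \ print n/d truncated to two decimal places
  100 SWAP */          \ n*100/d, using a double-width intermediate
  hundredths>str TYPE SPACE ;

635 158 div2dp   \ 63.5 / 15.8, prints 4.01
```

Like the word above it, this truncates rather than rounds, and it still needs both numbers to carry the same number of decimal places.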
Now that we've got that out of the way, I can go back and explain what the new command "ROT" is. ROT basically allows you to play Jenga with the top 3 stack items. It moves the third-from-top item onto the top and pushes the previous top two down to fill the gap. The easiest way to see what it does is:
1 2 3 ok
.S 1 2 3 ok
ROT ok
.S 2 3 1 ok
This was required for getting the correct numbers to divide in our script above, while keeping hold of a spare denominator for later use. One last thing on this division stuff, because let's be honest, adding "dec" after each number and then typing "decConDiv" is a pain in the balls. Fortunately the script also works on completely whole numbers (it's totally pointless going through it on whole numbers, but it does work). So what we can do is use a fancy "alias" style script like this:
: divide compiled
DUP \ duplicate the number to check compiled
0 = IF \ check we are dealing with decimals compiled
dec \ clean the second number compiled
SWAP \ push the clean number back compiled
dec \ clean the first number compiled
THEN \ done cleaning compiled
decConDiv \ run the divider compiled
; redefined divide ok
All that does is confirm the numbers are decimals and drop the extra 0's (using the dec command), then call the decConDiv command. The results end up looking like this:
1 2 divide 0 point 50 ok
2.5 2.0 divide 1 point 25 ok
76.542 12.498 divide 6 point 12 ok
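For comparison, gforth also has floating-point words that work on a separate floating-point stack, which sidesteps the scaling entirely. A minimal sketch, assuming a stock gforth build with floating-point support; a literal needs an exponent marker "e" to be read as a float:

```forth
\ Assumption: gforth's floating-point word set is loaded (it is by default).
8.54e 6.7e F* F.     \ multiply on the floating-point stack and print
63.5e 15.8e F/ F.    \ divide; no "dec" step and no power of ten needed
1e 2e F/ F.          \ prints 0.5
```

These values never appear in .S, because they do not live on the integer stack, and the results are binary floating point, so the printed digits will not always be exact decimals.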
Okay, so enough with the numbers now. Let's move on to something more interesting... Strings! And blocks! So first up we need to do the basics, which is work with limited strings. Thanks to the plethora of computer memory nowadays we can use the RAM to store near enough unlimited strings. However, some of the syntax can be a little bit backward (certainly in gforth).
: VAR VARIABLE compiled
; ok
: STR \ string compiled
32 ALLOT \ allocate some memory compiled
; ok
: DEF \ define string compiled
CR compiled
." >>" \ to show we are defining a variable compiled
32 ACCEPT \ get the new input compiled
; ok
: READ \ print the value compiled
CR compiled
32 TYPE \ display the memory block assigned compiled
; ok
So what did I just do? Well, first of all I added an alias for VARIABLE, so I can use the shorter VAR. Then I made a really simple word to allocate 32 bytes to whatever variable is entered. Then we need to get the input for said variable, and finally we need some way of reading it. The ALLOT, ACCEPT and TYPE words all require a memory amount, which makes sense but also causes some issues (which I'll talk about in a minute). However, we can do the following:
VAR $newVariable ok
$newVariable STR ok
$newVariable DEF
>>This is a test of the new WORDS. ok
$newVariable READ
This is a test of the new WORDS. ok
So we defined "$newVariable" as a variable; and the name could be anything, I just like making it clear what is a 'word' and what is a variable. Then we allocated 32 bytes to that variable, then we defined those 32 bytes as the exactly 32 character sentence "This is a test of the new WORDS." and finally we reprinted it. I can reprint that variable as many times as I want and nothing is on the stack:
$newVariable READ
This is a test of the new WORDS. ok
$newVariable READ
This is a test of the new WORDS. ok
$newVariable READ
This is a test of the new WORDS. ok
.S ok
However, the TYPE command goes a bit screwy when you try using a string that is shorter than the 32 assigned bytes:
VAR $newVar ok
$newVar STR ok
$newVar DEF
>> Short variable. ok
$newVar READ
Short variable._�_3 �b������� ok
So we need to limit the number of bytes that the TYPE command outputs; in the above example the string is 14 characters, so we can use:
$newVar 14 CR TYPE
short variable ok
This obviously negates the previous READ command. So we need some way of counting how many characters are in the string and then using that number with the TYPE command. Fortunately, with a little bit of internet digging I found something that mostly works (it's written for Pforth); I just hacked it a little to make it work nicely with gforth. We need to redefine STR, DEF and READ to accommodate a better value than 32:
: STR compiled
127 ALLOT \ allocate a long memory address compiled
; redefined STR ok
: DEF compiled
PAD 1+ compiled
CR compiled
." >>" 127 ACCEPT compiled
PAD 1+ compiled
SWAP compiled
; redefined DEF ok
: READ compiled
CR compiled
TYPE compiled
; redefined READ ok
Now, you can run the following:
VAR $aVariable ok
$aVariable STR ok
$aVariable DEF
>>I am a variable with length! ok
READ
I am a variable with length! ok
This however does cause an issue when you have multiple variables, as the READ command relies on the DEF entries being the last things on the stack; you would have to work backwards, making sure to clear each related stack item once you are done outputting the contents of the variable.

That is it for now; I will revisit forth with some more improvements, hopefully within the next few days!
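On that multiple-variable issue: one possible way around it is to keep each string's length next to its characters instead of on the stack. A minimal sketch, assuming a 127-character limit so the length fits in one byte; the words strvar, ask and show are made up for this example and are not the words defined earlier:

```forth
\ Hypothetical words; each string keeps its own length in its first byte.
: strvar ( "name" -- )  CREATE 0 C, 127 ALLOT ;   \ count byte + room for 127 chars
: ask  ( c-addr -- )  CR ." >>" DUP 1+ 127 ACCEPT SWAP C! ;  \ read a line, store its length
: show ( c-addr -- )  CR COUNT TYPE ;              \ print exactly the stored length

strvar $first
strvar $second
```

With those defined, "$first ask" and "$second ask" can be run in any order, and "$first show" or "$second show" prints the right text afterwards without anything being left on the stack.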