These notes are the very bare beginnings of a technical report. I felt
that it would be immoral to keep them to myself until I could publish
them, as that could take months or years; so, here they are. Enjoy,
and please get in touch if you have any comments.
Eduardo Ochs
http://angg.twu.net/
edrx@mat.puc-rio.br
2001nov29
Bootstrapping a Forth-like language in 50 lines of Lua code
===========================================================
If we define a Forth-like language as being one in which the
interpreter parses a word, executes it immediately and repeats the
process indefinitely, then the code below is an implementation of a
Forth-like language:
  res = {}
  -- re: compile the regex `def' once and cache it in `res' under `name'.
  re = function( res, name, def )
      if res[name] then return res[name] end
      res[name] = regex(def)
      return res[name]
    end
  re(res, "getline", "^([^\n]*)(\n?)")
  re(res, "getspaces", "^([^ \t]*)")
  re(res, "getword", "^[ \t]*([^ \t\n]*)")

  -- The program being interpreted, and the parser pointer into it.
  program = {}
  program.string = readfile(arg[1])
  program.pos = 0

  -- getword: skip blanks, then return the next run of non-blank chars.
  getword = function( )
      local _, mall, m1 = regmatch(res.getword, program.string, program.pos)
      program.pos = program.pos + strlen(mall)
      return m1
    end
  -- getline: return the rest of the current line and skip its newline.
  getline = function( )
      local _, mall, m1, nl = regmatch(res.getline, program.string, program.pos)
      program.pos = program.pos + strlen(mall)
      if mall ~= "" then return m1 end
    end
  -- getuntilre: return the text up to the next match of the regex
  -- `delimre', and skip past that match.
  getuntilre = function( delimre )
      local offset, mdelim =
        regmatch(re(res, delimre, delimre), program.string, program.pos)
      local m1 = strsub(program.string, program.pos+1, program.pos+offset)
      program.pos = program.pos+offset+strlen(mdelim)
      return m1
    end

  -- The initial dictionary: just two words.
  dict = {}
  dict[""] = function( ) getline() end
  dict["lua-until"] = function( )
      dostring(getuntilre(getword()))
    end

  -- The main loop.
  while 1 do
    dict[getword()]()
  end
The last block is the main loop: it parses a word with getword(),
converts it to a function by looking it up in the dictionary, and
executes that function. The second-to-last block defines the only two
words with which the dictionary starts. The first is "", which is
executed every time the parser reaches an end of line and simply
advances the parser pointer (stored in program.pos) past the
end-of-line char. The second is "lua-until", which parses a string up
to a certain delimiter and evaluates that string as Lua code. The idea
is that we can use that code to add more words to the dictionary, to
replace the interpreter's main loop by something else, or whatever;
thus "lua-until" is essentially all that is needed to bootstrap a more
powerful system.
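The regex and regmatch functions used above are not part of stock Lua,
so here is a rough port of the same loop to plain Lua, just to make
the mechanism easy to try. It is only a sketch, and several things in
it are assumptions that are not in the original: Lua 5.1 or later, Lua
patterns instead of regexes, a 1-based program.pos, a delimiter that
is matched as plain text (the name getuntil is made up), a run()
wrapper that takes the program as a string, and a main loop that stops
at the end of the input instead of looping forever.

```lua
-- Assumption: stock Lua 5.1+; loadstring exists in 5.1, load takes strings in 5.2+.
local compile = loadstring or load

-- Globals, as in the original, so that evaluated chunks can reach them.
program = { string = "", pos = 1 }   -- pos is 1-based in this sketch
dict = {}

-- Skip blanks, then return the next run of non-blank chars ("" if none).
function getword()
  local _, e, word = string.find(program.string, "^[ \t]*([^ \t\n]*)", program.pos)
  program.pos = e + 1
  return word
end

-- Return the rest of the current line and skip its newline.
function getline()
  local _, e, line = string.find(program.string, "^([^\n]*)\n?", program.pos)
  program.pos = e + 1
  return line
end

-- Unlike getuntilre, this matches the delimiter as plain text, not as a regex.
function getuntil(delim)
  local s, e = string.find(program.string, delim, program.pos, true)
  assert(s, "delimiter not found")
  local text = string.sub(program.string, program.pos, s - 1)
  program.pos = e + 1
  return text
end

dict[""] = function() getline() end
dict["lua-until"] = function() assert(compile(getuntil(getword())))() end

-- Unlike the original main loop, this one stops at the end of the input.
function run(source)
  program.string, program.pos = source, 1
  while program.pos <= #program.string do
    dict[getword()]()
  end
end

run([[
lua-until EOL
  dict["hello"] = function() greeting = "Hello" end
EOL
hello
]])
print(greeting)   --> Hello
```

As in the original, an unknown word is simply an error: dict[word] is
nil, and calling it aborts the interpreter.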
The execution of lua-until is a bit tricky, so let's see it in detail.
Consider the following miniforth program:
  lua-until EOL
  print("Hello")
  exit()
  EOL
  this is not executed
The meaning of "lua-until" is given by
  dict["lua-until"] = function( )
      dostring(getuntilre(getword()))
    end
so the execution of lua-until in the block above consists of parsing a
word ("EOL", in that case), then running getuntilre("EOL") to parse
everything up to its next occurrence -- getuntilre("EOL") will return
the string '\nprint("Hello")\nexit()\n' -- and evaluating that with
dostring, which will print "Hello" and leave miniforth. Note that the
parser won't ever touch what comes after the second EOL -- the "this
is not executed".
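The split that lua-until performs can be checked by hand with a few
lines of stock Lua. This is only a sketch: split_lua_until is a
made-up helper, it assumes Lua 5.1 or later, it uses Lua patterns and
a plain-text delimiter instead of the regexes above, and it returns
the code instead of evaluating it (so the exit() is never run).

```lua
-- Hypothetical helper: return the text that lua-until would evaluate,
-- plus the unparsed remainder of the program, without running anything.
function split_lua_until(source)
  local pos = 1                       -- 1-based, unlike program.pos above
  local function getword()
    local _, e, word = string.find(source, "^[ \t]*([^ \t\n]*)", pos)
    pos = e + 1
    return word
  end
  assert(getword() == "lua-until")    -- the word being executed
  local delim = getword()             -- "EOL" in the example
  local s, e = string.find(source, delim, pos, true)   -- plain-text search
  assert(s, "delimiter not found")
  return string.sub(source, pos, s - 1), string.sub(source, e + 1)
end

local code, rest = split_lua_until(
  'lua-until EOL\nprint("Hello")\nexit()\nEOL\nthis is not executed\n')
print(string.format("%q", code))
print(string.format("%q", rest))
```

Here code is '\nprint("Hello")\nexit()\n' and rest is
'\nthis is not executed\n'; the second value is the part of the
program that the parser never reaches.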
This is an example of a slightly less trivial miniforth program in
which the lua-until block is used to define two new words:
  lua-until EOL
    dict["hello"] = function( ) print("hello") end
    dict["bye"] = exit
  EOL
  hello
  bye
This is another one, in which we define two words that parse the
following words themselves (actually `#' parses all the rest of the
current line). Note that `p' evaluates the word as Lua code, and so it
is fairly versatile; "p exit()", for example, leaves miniforth. (The
helpers `pa' and `eval' are not defined in the listing above.)
  lua-until EOL
    dict["p"] = function( ) pa(eval(getword())) end
    dict["#"] = getline
  EOL
  p "Hello" p 1+2 p dict # comment
  p exit()
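To see how a word can consume the input that follows it, here is a
small sketch in stock Lua. It is not the original code: it assumes Lua
5.1 or later, uses Lua patterns instead of regexes, wraps everything
in a made-up run() function, and replaces pa(eval(...)) by a stand-in
that evaluates the word as a Lua expression and collects the value in
a table, so that the result can be inspected.

```lua
-- Assumption: stock Lua 5.1+; loadstring exists in 5.1, load takes strings in 5.2+.
local compile = loadstring or load

-- Run a tiny program that knows only "", "#" and "p"; return the values
-- that "p" evaluated (a stand-in for printing them).
function run(source)
  local program = { string = source, pos = 1 }   -- pos is 1-based here
  local dict, out = {}, {}

  local function getword()
    local _, e, word = string.find(program.string, "^[ \t]*([^ \t\n]*)", program.pos)
    program.pos = e + 1
    return word
  end

  local function getline()
    local _, e, line = string.find(program.string, "^([^\n]*)\n?", program.pos)
    program.pos = e + 1
    return line
  end

  dict[""] = getline           -- end of line: skip the newline
  dict["#"] = getline          -- comment: skip the rest of the line
  dict["p"] = function()       -- parse the next word and evaluate it
    out[#out + 1] = assert(compile("return " .. getword()))()
  end

  while program.pos <= #program.string do
    dict[getword()]()
  end
  return out
end

local out = run('p "Hello" p 1+2 # comment\np 10*10\n')
print(out[1], out[2], out[3])
```

The three collected values are "Hello", 3 and 100. Since a word ends
at the first blank, something like p "Hello world" would not work in
this sketch.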
And this is the classic ": square dup * ; : cube dup square * ;"
example, but without bytecodes:
  lua-until EOL
    dstack = {}
    rstack = {program}
    dpush = function( val ) tinsert(dstack, 1, val) end
    dpop = function( ) return tremove(dstack, 1) end
    rpush = function( prog ) tinsert(rstack, 1, prog); program = prog end
    rpop = function( ) tremove(rstack, 1); program = rstack[1] end
    dict[""] = function( )
        getline()
        if program.pos == strlen(program.string) then rpop() end
      end
    f = function( code ) rpush({string=code, pos=0}) end
    re(res, ";;", "[ \t\n];;([ \t\n]|$)")
    dict["::"] = function( )
        local word, code = getword(), getuntilre(";;")
        dict[word] = function( ) f(%code) end
      end
    dict["::lua"] = function()
        local word, code = getword(), getuntilre(";;")
        dict[word] = dostring(format("return function() %s\nend", code))
      end
  EOL
  ::lua * dpush(dpop()*dpop()) ;;
  ::lua dup dpush(dstack[1]) ;;
  ::lua . pa(dpop()) ;;
  ::lua val dpush(eval(getword())) ;;
  :: square dup * ;;
  :: cube dup square * ;;
  val 5 cube .
  val exit()
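The interplay between the data stack and the stack of nested programs
can be tried in stock Lua with the sketch below. It is not the
original code, and it rests on several assumptions: Lua 5.1 or later,
Lua patterns instead of regexes (so ";;" is recognized by the simpler
pattern "[ \t\n];;", in a made-up helper called getbody), a 1-based
pos, tonumber in place of eval, no "." word (the result is left on the
stack), and a run() wrapper whose loop ends when the last program is
popped instead of relying on exit().

```lua
-- Assumption: stock Lua 5.1+; loadstring exists in 5.1, load takes strings in 5.2+.
local compile = loadstring or load

-- Globals, so that the bodies of ::lua words can reach them.
dstack, rstack, dict = {}, {}, {}
program = nil

-- As in the original, index 1 is the top of each stack.
function dpush(val) table.insert(dstack, 1, val) end
function dpop() return table.remove(dstack, 1) end
function rpush(prog) table.insert(rstack, 1, prog); program = prog end
function rpop() table.remove(rstack, 1); program = rstack[1] end

function getword()
  local _, e, word = string.find(program.string, "^[ \t]*([^ \t\n]*)", program.pos)
  program.pos = e + 1
  return word
end

function getline()
  local _, e = string.find(program.string, "^[^\n]*\n?", program.pos)
  program.pos = e + 1
end

-- Text up to the next ";;" preceded by whitespace (a simplification).
function getbody()
  local s, e = string.find(program.string, "[ \t\n];;", program.pos)
  assert(s, "missing ;;")
  local body = string.sub(program.string, program.pos, s - 1)
  program.pos = e + 1
  return body
end

-- End of line; at the end of a program, return to the caller.
dict[""] = function()
  getline()
  if program.pos > #program.string then rpop() end
end

-- A "::" word runs its body as a nested miniforth program.
dict["::"] = function()
  local word, code = getword(), getbody()
  dict[word] = function() rpush({ string = code, pos = 1 }) end
end

-- A "::lua" word compiles its body as a Lua function.
dict["::lua"] = function()
  local word, code = getword(), getbody()
  dict[word] = assert(compile("return function() " .. code .. "\nend"))()
end

function run(source)
  rpush({ string = source, pos = 1 })
  while program do dict[getword()]() end
end

run([[
::lua * dpush(dpop()*dpop()) ;;
::lua dup dpush(dstack[1]) ;;
::lua val dpush(tonumber(getword())) ;;
:: square dup * ;;
:: cube dup square * ;;
val 5 cube
]])
print(dstack[1])   --> 125
```

When cube runs, its body " dup square *" becomes the current program;
square then pushes its own body on top of that, and each body is
popped by the "" word when its end is reached, which resumes the
caller right after the word that invoked it.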