258 lines
10 KiB
Org Mode
258 lines
10 KiB
Org Mode
|
#+SETUPFILE: ../../config.org
|
||
|
#+TAGS: Serene(s) Languages(l)
|
||
|
#+CATEGORY: Engineering
|
||
|
#+DATE: 2020-01-03
|
||
|
#+TITLE: Serene (simple versoin)
|
||
|
|
||
|
As you might already know I'm working on [[file:my-new-programming-language.org][my own programming language]]
|
||
|
for a while now. I'm still on early stages and working on [[file:choosing-the-target-platform.org][choosing the right platform]]
|
||
|
for [[https://serene-lang.org][Serene]] and trying to spend time on doing enough research and make decision based
|
||
|
on facts, scientific papers and collected data from experiments
|
||
|
rather than rushing into things and end up with a mess.
|
||
|
I believe those languages that take their time and move slowly
|
||
|
but with great research, plan and design are more successful in
|
||
|
the long term (Thanks to *Pouya* for pointing it out). Take *Clojure*
|
||
|
as an example. They are taking their time, experimenting and validating
|
||
|
their hypothesis. As a result, Clojure is a well designed, stable
|
||
|
and highly backward compatible language with amazing and productive
|
||
|
pace. However, some other languages like Python are extremely
|
||
|
popular and consequently has more contributors. Dealing with all
|
||
|
those contributors caused Python to move faster than it should and
|
||
|
they ended up with some bad choices and horrible designs that
|
||
|
fixing them requires an humongous effort. Gradually, it becomes
|
||
|
harder and harder to fix those and move away from them. GIL is a good example,
|
||
|
instead of fixing the issue and removing the GIL, they are introducing
|
||
|
(at the of writing this article they added some basic support to latest
|
||
|
python release but far from what they want) [[https://lwn.net/Articles/754162/][something else]] to fix the original
|
||
|
problem but it might become a pain point itself. In order to avoid these kind
|
||
|
of problem as much as possible I'm trying to take my time and do as
|
||
|
many as experiments as I need.
|
||
|
|
||
|
As I mentioned [[file:choosing-the-target-platform.org][earlier]] I think *GraalVM* and *Truffle* is the right answer for
|
||
|
Serene. But to verify my initial Idea I decided to run an experiment.
|
||
|
The experiment is all about implementing a Lisp in two environments.
|
||
|
A pure Java implementation vs a *Truffle* implementation.
|
||
|
|
||
|
I spent several days and implementing the pure java version. The repository
|
||
|
of the simple version is available in the [[https://devheroes.codes/Serene/serene-simple][repo]]
|
||
|
This is a dummy version, but good enough lisp that I didn't paid too much attention
|
||
|
to the details and just created a very simple lisp with the following specification.
|
||
|
|
||
|
#+BEGIN_QUOTE
|
||
|
Note: In this post whereever I use the name **Serene** for the implementation,
|
||
|
I'm referring to the simple version.
|
||
|
#+END_QUOTE
|
||
|
|
||
|
* Data structures :Serene:Languages:
|
||
|
Since I tried to avoid unnecessary work, I didn't do much and implemented
|
||
|
just one collection type which is the most important and essential data
|
||
|
structure in any Lisp, the mighty List. While my final goal is to have functional
|
||
|
data structures, this List is not a functional one and is a simple linked
|
||
|
list. You can find the implementation under =serene.simple.ListNode=.
|
||
|
|
||
|
For the number types I just added support for =Long= and =Double=
|
||
|
via `serene.simple.SNumber` class which acts as a dispatcher between two inner
|
||
|
classes.
|
||
|
|
||
|
For Strings, boolean types and =nil=, I just used the equivalent Java data
|
||
|
structures directly.
|
||
|
|
||
|
* Reader/Parser
|
||
|
Instead of using a parser generator or a sophisticated parser, I just created
|
||
|
a simple read ahead of position based parser that reads two chars and calls the
|
||
|
appropriate method to create the corresponding =Node=. the =serene.simple.Node=
|
||
|
is an abstract class that has just one important method, =eval=. The whole
|
||
|
purpose of the reader is to parse the code and create an AST like data structure
|
||
|
which each node extends the =Node= class (I should've create the interface for
|
||
|
it but too lazy to change it now). The =eval= method of `ListNode` is a bit
|
||
|
special. It calls the =eval= method on all the elements on the list
|
||
|
and then calls the first element as a function and pass the rest of the elements
|
||
|
as the arguments to that function. First rule of lisp :))
|
||
|
|
||
|
The =eval= method of =ListNode= contains lots more details regarding to java
|
||
|
interop as well which I leave it out of this blog post.
|
||
|
|
||
|
* Scope
|
||
|
Scopes are simply a mapping between symbol names and values. Serene consists of two
|
||
|
different scopes, both implemented in =serene.simple.IScope= and extend
|
||
|
=serene.simple.AScope= abstract class that contains the logic for symbol
|
||
|
lookup and insertion. These two classes are =serene.simple.Scope= which
|
||
|
is the general scope and it has a parent/child type of relationship with
|
||
|
other instances of the same class or =serene.simple.RootScope= that is
|
||
|
the top level scope. Beside that, =RootScope= is pre-populated with all
|
||
|
the built-in functions and types.
|
||
|
|
||
|
* Special forms
|
||
|
Serene's [[https://courses.cs.northwestern.edu/325/readings/special-forms.php][special forms]] are pretty limited. All of them all classes which extend
|
||
|
=serene.simple.SpecialForm= abstract class and inherit from =Node= indirectly. The
|
||
|
difference between special form evaluation and function evaluation is that in case
|
||
|
of special forms, Serene does not evaluate the arguments and leaves the evaluation
|
||
|
to the special form itself. Here is the list of Serene's special forms:
|
||
|
|
||
|
=def=: Creates a binding between a symbol name and the given value:
|
||
|
|
||
|
#+BEGIN_SRC lisp
|
||
|
(def name "serene")
|
||
|
#+END_SRC
|
||
|
|
||
|
=fn=: Creates an anonymous function:
|
||
|
#+BEGIN_SRC lisp
|
||
|
(def inc (fn (x) (+ 1 x)))
|
||
|
#+END_SRC
|
||
|
|
||
|
=quote=: Prevents the evaluation of the given argument and return it as it is:
|
||
|
|
||
|
#+BEGIN_SRC lisp
|
||
|
(quote (1 2 3 4)) ;; => (1 2 3 4)
|
||
|
#+END_SRC
|
||
|
|
||
|
=if=: Evaluates the body based on the return value of the given predicate.
|
||
|
|
||
|
#+BEGIN_SRC lisp
|
||
|
(if (= x 1)
|
||
|
(...) ;; if x is 1
|
||
|
(...)) ;; if x is not 1
|
||
|
#+END_SRC
|
||
|
|
||
|
=let=: Sets a local scope and runs its body using that scope.
|
||
|
|
||
|
#+BEGIN_SRC lisp
|
||
|
(let ((x 1)
|
||
|
(y 2))
|
||
|
(println x y))
|
||
|
#+END_SRC
|
||
|
|
||
|
=do=: Simply groups several expressions together.
|
||
|
|
||
|
#+BEGIN_SRC lisp
|
||
|
(do
|
||
|
(println ...)
|
||
|
(if ....))
|
||
|
#+END_SRC
|
||
|
|
||
|
=cond=: Gets several predicates and only evaluates the body corresponding
|
||
|
to the first truthy predicate.
|
||
|
|
||
|
#+BEGIN_SRC lisp
|
||
|
(cond
|
||
|
((= x 1) (body1...)
|
||
|
((= x 2) (body2...))
|
||
|
(true (else...))))
|
||
|
#+END_SRC
|
||
|
|
||
|
* Builtin Function
|
||
|
All the build in function are created by extending the =serene.simple.builtin.AFn=
|
||
|
abstract class and follow the same =Node= convention. Here is a list of the most
|
||
|
important built in functions:
|
||
|
|
||
|
=(println ....)=: Prints all the arguments on the stdout.
|
||
|
|
||
|
=(quit)=: Quits the program.
|
||
|
|
||
|
=(conj coll x...)=: Returns a new list by adding the given arguments.
|
||
|
|
||
|
=(count coll)=: Returns the number of elements in the given COLL.
|
||
|
|
||
|
=(reverse coll)=: Returns a new list which is the reverse of COLL.
|
||
|
|
||
|
=(list 1 2 3..)=: Creates a list from the given arguments.
|
||
|
|
||
|
=(first coll)=: Returns the first element of the given COLL.
|
||
|
|
||
|
=(rest coll)=: Returns all the elements beside the first element of the given COLL.
|
||
|
|
||
|
=(doc fn)=: Returns the documentation for the given symbol if any.
|
||
|
|
||
|
=(reduce f coll initial)=: Reduces the COLL by applying F to its elements with the
|
||
|
INITIAL as the default value. F takes two arguments 1) the accumulation 2) the element.
|
||
|
|
||
|
=(new Class arg1 arg2...)=: Create a new instance of the given CLASS by passing the given
|
||
|
arguments to its constructor.
|
||
|
|
||
|
** Example program
|
||
|
Here is an example program in Serene simple version (=benchmarks/fib.srns= in the repo):
|
||
|
|
||
|
#+BEGIN_SRC scheme
|
||
|
;; We have a reduce function but just in case...
|
||
|
(def reduce1
|
||
|
(fn (f xs initial-value)
|
||
|
(cond
|
||
|
((first xs) (reduce f (rest xs) (f initial-value (first xs))))
|
||
|
(true initial-value))))
|
||
|
|
||
|
;; A simple map function implementation in serene
|
||
|
(def map
|
||
|
(fn (f xs)
|
||
|
(reduce (fn (acc x) (cons acc (f x))) xs (list))))
|
||
|
|
||
|
(def range-list
|
||
|
(fn (x y init)
|
||
|
(if (< y x)
|
||
|
(do
|
||
|
(conj (range-list x (+ y 1) init) y))
|
||
|
init)))
|
||
|
|
||
|
(def range
|
||
|
(fn (x)
|
||
|
(range-list x 0 (list))))
|
||
|
|
||
|
(def fib
|
||
|
(fn (n)
|
||
|
(def fib-iter
|
||
|
(fn (x y z)
|
||
|
(if (= x 0)
|
||
|
z
|
||
|
(fib-iter (- x 1)
|
||
|
z
|
||
|
(+ y z)))))
|
||
|
(fib-iter n 0 1)))
|
||
|
|
||
|
(def benchmark-fn
|
||
|
(fn (x)
|
||
|
(let ((start (now)))
|
||
|
(println (fib x))
|
||
|
(- (now) start))))
|
||
|
|
||
|
|
||
|
(def run-benchmark
|
||
|
(fn (times)
|
||
|
(map (fn (x)
|
||
|
(println "Benchmark: " x)
|
||
|
(println "Took: " (benchmark-fn 500)))
|
||
|
(range times))))
|
||
|
|
||
|
(run-benchmark 20)
|
||
|
#+END_SRC
|
||
|
|
||
|
|
||
|
* What is missing ?
|
||
|
Since Serene (simple) is an experimental language and I'll abandon it eventually.
|
||
|
I didn't want to fall into the rabbit hole and just tried to get to the point as soon as possible.
|
||
|
So I sacrificed lots of details. Here is a list of the most important missing
|
||
|
details:
|
||
|
|
||
|
- A namespace functionality. Because creating and compiling dynamic classes
|
||
|
is a massive task and needs tons of work which doesn't make sense for a toy
|
||
|
project.
|
||
|
- Unified function interface.
|
||
|
- Requiring different namespaces.
|
||
|
- A sophisticated parser. My Reader implementation is really cheap that
|
||
|
suits a toy project. It might worth investigating on different solutions
|
||
|
including using a parser generator or ahead of time read implementation.
|
||
|
- Primitive functions in Serene. I used lots of primitive functions
|
||
|
from java rather than implementing them inSerene itself, mostly because
|
||
|
of two reasons. Lack of
|
||
|
macros and namespaces.
|
||
|
- Decent [functional] data structures. The only data structure I implemented
|
||
|
is list.
|
||
|
- Quality code. The general quality of this implementation is not great, I
|
||
|
sacrificed quality for time.
|
||
|
|
||
|
* Conclusion
|
||
|
I'm not going to improve the simple version anymore at this stage. I'm going to run
|
||
|
some benchmarks and measure different aspects of the current implementation and then
|
||
|
I'll move to the *Truffle* version and continue experimenting.
|
||
|
|
||
|
Please let me know if you have any comments or questions on this topic. As always
|
||
|
I'm available throw social media and email.
|