lxhome/_posts/2020-01-03-serene-simple-ve...

9.8 KiB

layout title date categories tags theme
post Step 2: Serene (Simple) version 2020-01-03 Programming Serene language dark

As you might already know I'm working on my own programming language for a while now. I'm still on early stages and I'm working on choosing the right platform for #Serence and trying to spend time on doing enough research and make decision based on facts, scientific papers and collected data from experiments rather than rushing into things and end up with a mess. I believe those languages that take their time and move slowly but with great research, plan and design are more successful in the long term (Thanks to Pouya for pointing it out). Take Clojure as an example. They are taking their time experimenting and validating their hypothesis. As a result, Clojure is a well designed, stable and highly backward compatible language with amazing productivity pace. However, some other languages like Python are extremely popular and consequently has more contributors. Dealing with all those contributors caused Python to move faster than it should and they ended up with some bad choices and horrible designs that fixing them requires an humongous effort. Little by little, it becomes harder and harder to fix those and move away from them. GIL is a good example, instead of fixing the issue and removing the GIL, they are introducing something else to fix the original problem but it might become a pain point itself. In order to avoid these kind of problem as much as possible I'm trying to take my time and do as many as experiments that I need.

As I mentioned earlier I think GraalVM and Truffle is the right answer for Serene. But to verify my initial Idea I decided to run an experiment. The experiment is all about implementing a Lisp in two environments. A pure Java implementation vs a Truffle implementation. I'll update the experiment files accordingly at the experiment reports repository.

I spend several days and implemented the pure java version. There repository of the simple version is available in the gitlab repo. This version is a dumb but good enough lisp that I didn't paid too much attention to the details and created a very simple lisp which follows the specification below.

Node: In this post where ever I use the name Serene for the implementation, I'm referring to the simple version.

Data structures

Since I tried to avoid unnecessary work I didn't do much and implemented only one collection type which is the most important and essential data structure of Lisp, the mighty List. While my final goal is to have functional data structures, this List is not a functional one and is a simple linked list. You can find the implementation under serene.simple.ListNode.

For the number types I just added the support for Long and Double numbers via serene.simple.SNumber class which act as a dispatcher between two inner classes.

For Strings, boolean types and nil, I just used the equivalent Java data structures directly.

Reader/Parser

Instead of using a parser generator or a sophisticated parser, I just created a simple read ahead position based parser that reads two chars and call the appropriate method to create the corresponding Node. the serene.simple.Node is an abstract class that has just one important method, eval. The whole purpose of the reader is to parse the code and create an AST like data structure which each node extends the Node class (I should've create the interface for it but too lazy to change it now). The eval method of ListNode is a bit special. It calls the eval method on all the elements on the list and the call the first element as a function and pass the rest of elements as the arguments to that function. First rule of lisp :))

The eval method of ListNode contains more details regarding to java interop as well which I leave it out of this blog post.

Scope

Scope are simply a mapping between symbol names and values. Serene consists of two different scopes, both implemented serene.simple.IScope and extends serene.simple.AScope abstract class that contains the logic for symbol lookup and insertion. These two classes are serene.simple.Scope which is the general scope and it has a parent/child type of relationship with other instances of the same class or serene.simple.RootScope that is the top level scope. Beside that, RootScope is pre-populated with all the builtin functions and types.

Special forms

Serene's special forms are pretty limited. All of them all classes which extend serene.simple.SpecialForm abstract class and inherit from Node indirectly. The difference between special form evaluation and function evaluation is that in case of special forms Serene does not evaluate the arguments and leaves the evaluation to the special form itself. Here is the list of Serene's special forms:

def: Creates a binding between a symbol name and the given value:

(def name "serene")

fn: Creates an anonymous function:

(def inc (fn (x) (+ 1 x)))

quote: Prevent the evaluation of the given argument and return it as it is:

(quote (1 2 3 4)) ;; => (1 2 3 4)

if: Evaluated the body based on the return value of the given predicate.

(if (= x 1)
  (...) ;; if x is 1
  (...)) ;; if x is not 1

let: Setup a local scope and runs its body using that scope.

(let ((x 1)
      (y 2))
  (println x y))

do: Simple group several expressions together.

(do
  (println ...)
  (if ....))

cond: Gets several predicates and only evaluates the body corresponding to the first truthy predicate.

(cond
  ((= x 1)  (body1...)
  ((= x 2)  (body2...))
  (true (else...))))

Builtin Function

All the build in function are created by extending the serene.simple.builtin.AFn abstract class and follow the same Node convention. Here is a list of the most important built in functions:

(println ....): Prints all the arguments on the stdout.

(quit): Quits the program.

(conj coll x...): Returns a new list by adding the given arguments.

(count coll): Returns the number of elements in the given COLL.

(reverse coll): Return a new list which is the reverse of COLL.

(list 1 2 3..): Creates a list from the given arguments.

(first coll): Returns the first element of the given COLL.

(rest coll): Return all the elements beside the first of the given COLL.

(doc fn): Return the documentation for the given symbol if any.

(reduce f coll initial): Reduce the COLL by applying F to its elements with the INITIAL as the default value. F takes two arguments 1) the accumulation 2) the element.

(new Class arg1 arg2...): Create a new instance of the given CLASS by passing the given arguments to its constructor.

Example program

Here is an example program in Serene simple version (benchmarks/fib.srns in the repo):

;; We have a reduce function but just in case...
(def reduce1
     (fn (f xs initial-value)
         (cond
          ((first xs) (reduce f (rest xs) (f initial-value (first xs))))
          (true initial-value))))

;; A simple map function implementation in serene
(def map
     (fn (f xs)
         (reduce (fn (acc x) (cons acc (f x))) xs (list))))

(def range-list
     (fn (x y init)
         (if (< y x)
             (do
                 (conj (range-list x (+ y 1) init) y))
             init)))

(def range
     (fn (x)
         (range-list x 0 (list))))

(def fib
  (fn (n)
    (def fib-iter
      (fn (x y z)
        (if (= x 0)
            z
            (fib-iter (- x 1)
                      z
                      (+ y z)))))
    (fib-iter n 0 1)))

(def benchmark-fn
     (fn (x)
         (let ((start (now)))
           (println (fib x))
           (- (now) start))))


(def run-benchmark
     (fn (times)
         (map (fn (x)
                  (println "Benchmark: " x)
                  (println "Took: " (benchmark-fn 500)))
              (range times))))

(run-benchmark 20)

What is missing ?

Since Serene (simple) is an experimental language and I'll abandon it eventually. I didn't want to go into a rabbit hole and tried to get to the point as soon as possible. So I sacrificed lots of details. Here is a list of the most important missing details:

  • A namespace functionality. Because creating and compiling dynamic classes is a massive task and needs tons of work which doesn't make sense for a toy project.
  • Unified function interface.
  • Requiring different namespaces
  • A sophisticated parser. My Reader implementation is really cheap that suits a toy project. It might worth investigating on different solutions including using a parser generator or a head of time read implementation.
  • Primitive functions in Serene. I implemented lots of primitive functions in java rather than Serene itself mostly because of two reasons. Lack of macros and namespaces.
  • Decent [functional] data structures. The only data structure I implemented is list.
  • Quality code. The general quality of this implementation is not great, I sacrificed quality for time.

Conclusion

I'm not going to improve the simple version anymore at this stage. I'm going to run some benchmarks to measure different aspect of the current implementation and then I'll move to the Truffle version to continue the experiment(001). Please let me know if you have any comment or question on this topic. As always I'm available throw social media and email.