Choosing the target platform

After wrapping my head around the rationale of my new programming language, I have a big decision to make: choosing a platform.

As programmers, we have a tough life when it comes to making decisions that have a direct impact on our product. I'm pretty sure you have gone through this process at least once, from choosing a semantically great name for a variable to choosing the right technology for your next billion-dollar startup. It is always hard to pick a tech stack for a new project, especially when the new product happens to be a new programming language. If I get my hands dirty with the wrong tech stack for a simple web application, it's no big deal. I can still rewrite the whole thing and pay the penalty. But that's not the case for programming languages. The wrong platform can easily destroy you. Since the dawn of computers, many smart people have created tons of languages, but only a few of them have made it to the top. While there are many reasons for their success, going with the right platform is one of the most important ones.

The obvious question that comes to mind when we're talking about "the platform" is: should we build a platform from scratch or should we piggyback on others? Creating a programming language and a virtual machine from scratch is a gigantic, bone-crushing task. It needs a crazy set of skills and knowledge. Even with such wisdom and experience, people who went through it have made many mistakes and had to iterate constantly to come up with the right implementation. The evolution of programming languages such as Scheme is a good example of this (for more information take a look at R6R5).

Building a VM is hard, and building a fast VM is even harder. While I think creating a programming language and a VM from scratch is really fun, it can be really frustrating as well. I don't want to get annoyed with myself during the process and abandon my goal. I should stand on the shoulders of giants and benefit from their great work. I should choose a platform that helps me move faster and iterate through different ideas more quickly.

From a technical perspective, starting from scratch means that I have to write a program that includes at least a parser and a compiler. Building a compiler is no joke. Hypothetically, let's say we have a working parser and compiler. What about libraries and an ecosystem? It would be really hard to convince people to use a programming language that does not have any useful libraries, so that they have to build everything by themselves. That might have been acceptable 30 years ago, but it is not in the modern age of programming languages.

So the idea of creating Serene from scratch is out of the picture. We need to find a good platform for it. But what are the options?

Racket

Racket is a general-purpose programming language as well as the world's first ecosystem for language-oriented programming. Make your dream language, or use one of the dozens already available.

Racket is a dialect of Lisp which allows us to build our own language by extending it. While Racket is really cool and has a long list of pros and many reasons to use it (it's a Lisp after all), it has a disadvantage that forced me to stop considering it for Serene. As I mentioned in the rationale, I'm not trying to build a toy language or a domain-specific one, and Racket's ecosystem isn't as great as a battle-tested and well-known ecosystem like Java's or Python's (or other popular ecosystems).

JavaScript

We're living in the age of the Web and one of the big players in this era is JavaScript. The number of programming languages that compile to JavaScript is increasing rapidly. JavaScript as a language sucks, but as a platform it is amazing. Lots of money and engineering effort have been spent on improving JavaScript engines. As a result, JavaScript is a crappy language with well-engineered engines such as V8.

Creating a language based on the JavaScript platform means that I have to be involved with the whole transpiling scenario and deal with the fact that this new language can be used in different browsers or on the backend. Or even on IE6 (just kidding). I don't want to deal with all this. I don't think the JavaScript platform is a good fit for what I need, so I won't go into details about it.

Python

Python is another famous platform for creating programming languages. Many people have built programming languages on top of Python (check out Lispy if you're a Python fan). Python is super popular these days and you'll see it everywhere. Creating a language on top of Python (just like JavaScript) gives me access to a rich, robust ecosystem with a huge number of libraries.

But as I mentioned in the rationale, I want built-in support for concurrency and parallelism, and Python isn't good at either. I have been using Python for more than 10 years now and I'm very familiar with it. I know about all the efforts to create useful concurrency and parallelism, such as asyncio. But the fact is that Python was not designed for this job. The GIL (the global interpreter lock of CPython, the reference implementation) is a huge problem: it prevents two threads in the same interpreter process from executing Python code in parallel, even when each one runs on its own kernel thread. That is a problem for me. If you can't do decent concurrent and parallel execution, you have no chance against modern languages like Clojure, Go, Elixir and others. Python is fine for now despite its problems, because it is good at other stuff and people have accepted it for what it is. Python has been around for about 25 years and has established a big community. If Guido van Rossum had created Python a year ago, I'm pretty sure it would have failed, because it couldn't compete with modern languages. Don't get me wrong, I'm not trying to trash Python. It is great and it has many good qualities, but a good concurrency and parallel execution model ain't one of them.
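To make the contrast concrete, here is a minimal sketch in Java of the kind of parallel execution I'm after: two CPU-bound tasks submitted to a two-thread pool. On the JVM there is no global interpreter lock, so the two worker threads are free to run on two cores at the same time. `ParallelSketch`, `Result`, `sumRange` and `runBoth` are hypothetical names made up for this illustration and have nothing to do with Serene itself.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; none of these names come from Serene.
final class ParallelSketch {

    // Result of one task: the computed sum and the name of the thread that ran it.
    static final class Result {
        final long sum;
        final String thread;

        Result(long sum, String thread) {
            this.sum = sum;
            this.thread = thread;
        }
    }

    // CPU-bound work: sum the integers in [from, to).
    static Result sumRange(long from, long to) {
        long sum = 0;
        for (long i = from; i < to; i++) {
            sum += i;
        }
        return new Result(sum, Thread.currentThread().getName());
    }

    // Splits [0, n) in two halves and runs each half on its own pool thread.
    static List<Result> runBoth(long n) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Result> low = pool.submit(() -> sumRange(0, n / 2));
            Future<Result> high = pool.submit(() -> sumRange(n / 2, n));
            return List.of(low.get(), high.get());
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (Result r : runBoth(1_000_000)) {
            System.out.println(r.thread + " computed " + r.sum);
        }
    }
}
```

Each task reports the thread it ran on, so the output shows two different worker threads. Whether they actually overlap in time depends on the machine and the scheduler, but nothing in the runtime forces them to take turns.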

BEAM

The Erlang ecosystem is amazing, robust and well tested. I have read a lot about it, and whenever I'm studying anything in computer science that can be related to Erlang, I ask myself "How is Erlang doing it?". The Erlang ecosystem has truly had a huge impact on the world today.

The problem with the Erlang ecosystem for me is that I have only ever read about it, so my knowledge of it is purely theoretical. Building a language on top of a platform also needs a good level of practical experience with that platform, which I don't have. So it's obvious that I have to pass.

The JVM

As much as I dislike Java (mostly because of the syntax and the fact that it is an object-oriented language), I like the JVM a lot. The JVM is a battle-tested, well-designed (well, sort of, but it's certainly evolving) and fast VM. It is probably the most popular VM in the world (I'm just guessing), and it is one of the world's most heavily optimized pieces of software. Plenty of research has gone into making it better and better.

The JVM has a mature ecosystem and a massive community of developers, which has resulted in an unbelievable number of libraries (not the largest number, though: npm is the largest artifact repository, but it has a huge amount of useless BS as well). By targeting the JVM, users will have an easy time adopting the new language because of the rich tool set provided by the Java ecosystem and by all the other languages that target the JVM. For example, it will be possible to use libraries written in Scala or Clojure.

Long story short, I think the JVM is the right platform for me. The fact that many languages have chosen it as their base platform shows how useful it can be. But there is a problem. Targeting a higher-level virtual machine like the JVM means that I'll have an easier job creating a compiler, but I still have to write one: a compiler that takes the source code and produces JVM bytecode. As I mentioned earlier, writing a compiler is an enormous task, and the chance of getting it wrong is very high for someone like me who has never built a compiler before.

One VM to rule them all

Luckily there is a solution. I can write an interpreter for a VM that is designed to optimize my interpreter with all that wonderful JIT compilation magic. Oracle has released a new VM that aims to make writing language interpreters both easy and fast. It can also leverage the huge ecosystem of the JVM. It is an enhanced JVM that contains a new JIT compiler which can speed up interpreters to near-Java speed. The new JIT compiler is called Graal. To use Graal's JIT magic, we can use the Truffle library to create the interpreter. We annotate the interpreter to give Graal some hints about invariants and type information. According to Graal's documentation, this integration effort gives us significant speedups in our interpreter without having to resort to writing a bytecode compiler.

GraalVM is a Java VM and JDK based on HotSpot/OpenJDK, implemented in Java. It supports different execution modes, like ahead-of-time compilation of Java applications for fast startup and low memory footprint.

GraalVM is a universal virtual machine for running applications written in JavaScript, Python, Ruby, R, JVM-based languages like Java, Scala, Groovy, Kotlin, Clojure, and LLVM-based languages such as C and C++.

GraalVM removes the isolation between programming languages and enables interoperability in a shared runtime. It can run either standalone or in the context of OpenJDK, Node.js or Oracle Database.

I copied the above paragraphs from GraalVM's official website. It is truly a VM to rule them all.

The Truffle library is one of the key players in GraalVM. The initial results of Truffle are super exciting. The Truffle implementation of Ruby has performance on the same order of magnitude as the much bigger JRuby project. Just check out TruffleRuby's website to be amazed by it. There is a JavaScript implementation as well, which has also shown great progress. Lots of research has been dedicated to this topic and the results are mind-blowing. The interesting thing is that these Truffle implementations were done by fewer people in a shorter period of time. This means you can create your own language on the JVM that takes advantage of all its existing libraries, native threading and JIT compiler without having to write your own compiler, and you get speeds that took other languages years to achieve.

Using GraalVM as the platform for my new language will help me move much faster, because all I need to do is build an AST interpreter and Graal will handle the rest. It means that I can start by building what is important, use a very well-engineered toolkit to my advantage to reach my goal quicker, and later replace any part I like with my own implementation. How cool is that?
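For readers who haven't met the term, an AST interpreter is a tree of node objects in which every node knows how to evaluate itself. The following is a minimal sketch in plain Java, with no Truffle dependency and without any of Truffle's own node classes or annotations. `AstSketch`, `Node`, `Literal`, `Add` and `Mul` are hypothetical names chosen for this illustration, and the real Serene interpreter may look nothing like it.

```java
// Hypothetical names throughout; plain Java, not the Truffle API.
final class AstSketch {

    // Every node of the tree knows how to evaluate itself.
    interface Node {
        long execute();
    }

    // Leaf node: an integer literal.
    static final class Literal implements Node {
        private final long value;

        Literal(long value) {
            this.value = value;
        }

        @Override
        public long execute() {
            return value;
        }
    }

    // Inner node: evaluates both children and adds the results.
    static final class Add implements Node {
        private final Node left;
        private final Node right;

        Add(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public long execute() {
            return left.execute() + right.execute();
        }
    }

    // Inner node: evaluates both children and multiplies the results.
    static final class Mul implements Node {
        private final Node left;
        private final Node right;

        Mul(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public long execute() {
            return left.execute() * right.execute();
        }
    }

    public static void main(String[] args) {
        // Tree for the expression 1 + (2 * 3).
        Node program = new Add(new Literal(1), new Mul(new Literal(2), new Literal(3)));
        System.out.println(program.execute()); // prints 7
    }
}
```

Evaluating the program is just calling `execute()` on the root. The parser's only job is to build such a tree; making those recursive `execute()` calls fast is the part that gets handed over to the VM.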

But as an engineer and a wannabe scientist, I'd like to see the proof with my own eyes. Not because I don't trust academic work, just because it feels good to experience the proof.

So to begin with, I'm going to create a dead simple Serene interpreter in Java on OpenJDK, then build the same interpreter in Java on GraalVM using the Truffle library, compare the results, and prove to myself that GraalVM is the right choice.
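The comparison itself can start as a crude timing loop. The sketch below is only an assumption about how such a measurement might look, not the benchmark actually used for Serene: `BenchSketch`, `averageNanos` and the stand-in workload are all hypothetical. It runs a workload a number of times unmeasured so that the JIT has a chance to warm up, then reports the average wall-clock time per run. For numbers worth publishing, a dedicated harness such as JMH is the safer choice.

```java
import java.util.function.LongSupplier;

// Hypothetical timing helper; not part of Serene, OpenJDK or GraalVM.
final class BenchSketch {

    // Results are accumulated here so the JIT is less likely to
    // discard the workload as dead code.
    static volatile long sink;

    // Runs `workload` `warmup` times unmeasured, then returns the average
    // wall-clock nanoseconds per run over `runs` measured runs.
    static double averageNanos(LongSupplier workload, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            sink += workload.getAsLong();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += workload.getAsLong();
        }
        return (double) (System.nanoTime() - start) / runs;
    }

    public static void main(String[] args) {
        // Stand-in workload; a real comparison would call the interpreter here.
        LongSupplier workload = () -> {
            long total = 0;
            for (int i = 0; i < 100_000; i++) {
                total += i % 7;
            }
            return total;
        };
        double nanos = averageNanos(workload, 1_000, 1_000);
        System.out.printf("average: %.0f ns per run%n", nanos);
    }
}
```

Running the same class once on a stock OpenJDK and once on GraalVM, with the interpreter call in place of the stand-in workload, would give a first rough pair of numbers to compare.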