#+SETUPFILE: ../../config.org
#+TAGS: Serene Languages
#+CATEGORY: Engineering
#+DATE: 2021-04-13
#+TITLE: Serene on the LLVM
#+DESC: The rationale behind Serene

As you may know, I'm trying to build [[./my-new-programming-language.org][my new programming language]]. After a ton of study and many experiments, I finally decided which platform I'll target for *Serene*. Here are the history and the rationale behind this decision.

* A little bit of history :Languages:Serene:
After the initial effort on [[./choosing-the-target-platform.org][choosing the right platform]], I studied the GraalVM a bit and experimented with it. While it's a nice tool and I see a bright future for it, I wasn't happy with some aspects of it. The most important one is the fact that Oracle is behind it (Why? Well, don't open that door :D), along with some other technical reasons which I'll get to later.

So I looked around again and re-evaluated my choices. I came across the [[https://llvm.org][LLVM]]. Previously I didn't pay much attention to the LLVM because I was blinded by the *GraalVM* and the fact that both work the same way in theory. I mean, with either of them we need to create the compiler frontend and they take care of the backend for us (more or less). Initially, one of the reasons I picked the *GraalVM* over the *LLVM* was its support for the *LLVM* itself, and it seemed obvious that later on we could bridge the LLVM world to *Serene*'s world via the *GraalVM*. But it turned out to be quite the opposite.

This time, I looked into the *LLVM* more thoroughly and boy, I was (and still am) impressed: well designed tools and libraries to build a compiler. Compared to the *GraalVM*, it is very mature, well documented and quite modular. Aaand using the *LLVM* I can still use the *GraalVM* via its support for LLVM IR.

Long story short, the more I read about the *LLVM* the more I got obsessed with it. So I decided to move away from the *GraalVM* and start playing with the *LLVM*.
* The challenge of the language again
By moving away from the *GraalVM*, I had to choose a host language again. While the official language of the *LLVM* is *C++*, I tried to avoid it since I'm not skilled enough in *C++*. So I ran a series of experiments (all of them available in dedicated branches on the repo) with *Rust*, *C*, *C++* (first attempt) and *Golang*. In each one I wrote the parser and an interpreter, both as an experiment and to evaluate the facilities of the language when it comes to working with the *LLVM API*. After many iterations, I ended up using *Golang* to create an interpreter with an *FFI* interface so we can write the compiler in *Serene* itself.

At the same time I started a journey into mathematics to learn more about the different type systems in theory and the different options that we might have for *Serene* (I'll write about that separately in the future). Most of my day went to my studies and I felt really good. But I always had a voice in my head that kept bugging me about [[https://mlir.llvm.org][MLIR]]. I had watched a few introductory talks on it before and had a rough idea of what it is and what it does. To shut that voice up, I decided to look it up and read more about it while I was blocked on my math study and, to my surprise, it totally blew me away. MLIR is such a brilliant tool, built on the experience gained in making several languages and compilers, and it follows some conventional and well designed principles for building intermediate representation languages.

Even after I read more and more about *MLIR*, which by the way is a sub-project of the *LLVM*, I still firmly believed that I should use *Golang* to create an interpreter as a bootstrap language and then provide an FFI interface via the interpreter to interact with *MLIR* through its *C API*. How naive I was.

During the course of my study on *MLIR*, I came across a beautiful thing called [[https://llvm.org/docs/TableGen/][TableGen]].
It's part of the LLVM and, generally speaking, it is designed to generate *C++* code from a description. It's a generic tool for which developers write backends in order to generate code for specific purposes and, in the case of *MLIR*, to generate IR [[https://mlir.llvm.org/docs/Dialects/LLVM/][dialects]]. The way MLIR utilizes TableGen to generate dialects and the majority of their operations and types is truly amazing. It makes the cumbersome task of building a multi-layer IR quite straightforward.

*MLIR* singlehandedly changed my mind about the approach I want to take to build the compiler. All of a sudden *C++* seemed like a reliable option, so I decided to give it a go. I revived the old C++ branch, forked it into a branch called =mlir= and started to work on it a bit. I made a prototype and enhanced it. After a lot of consideration I finally decided to merge the =mlir= branch into =master= and move the *Golang* implementation into its own branch, =golang-impl=.

I'm cleaning up the C++ implementation at the moment. Next I'll add a semantic analysis phase to the compiler and aim for a minimal lambda calculus implementation, to wire everything up in its most minimal state as a foundation to build upon. I'll also write another essay dedicated to the technical aspects of why the LLVM and MLIR are great for our use cases.