serene/docs/videos.org

How to build a compiler with LLVM and MLIR

DONE Episode 1 - Introduction

What is it all about?

  • Create a programming language
  • Guide for contributors
  • An LLVM/MLIR guide

The Plan

  • Git branches
  • No live coding
  • Feel free to contribute

Serene and a bit of history

DONE Episode 2 - Basic Setup

CLOSED: [2021-07-10 Sat 09:04]

Installing Requirements

LLVM and Clang

  • mlir-tblgen

ccache (optional)

Building Serene and the builder

  • git hooks

Source tree structure

dev.org resources and TODOs

DONE Episode 3 - Overview

CLOSED: [2021-07-19 Mon 09:41]

Generic Compiler

Common Steps

  • Frontend

    • Lexical analyzer (Lexer)
    • Syntax analyzer (Parser)
    • Semantic analyzer
  • Middleend

    • Intermediate code generation
    • Code optimizer
  • Backend

    • Target code generation
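
As a rough illustration of the steps above, the following Python sketch pushes a tiny prefix-arithmetic language through each stage. Everything in it (the function names, the stack-machine IR, the textual "target" output) is invented for this example and has nothing to do with LLVM or Serene internals.

```python
import re


def lex(source):
    """Frontend, lexer: split the source text into tokens (unknown characters are skipped)."""
    return re.findall(r"\d+|[()+*]", source)


def parse(tokens):
    """Frontend, parser: build a nested-list AST from prefix expressions."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return node
    return int(token) if token.isdigit() else token


def check(ast):
    """Frontend, semantic analyzer: a minimal check (known operator, two operands)."""
    if isinstance(ast, list):
        op, *args = ast
        if op not in ("+", "*") or len(args) != 2:
            raise ValueError(f"invalid form: {ast!r}")
        for arg in args:
            check(arg)
    return ast


def to_ir(ast, code=None):
    """Middle end, IR generation: flatten the tree into linear stack-machine IR."""
    code = [] if code is None else code
    if isinstance(ast, list):
        op, lhs, rhs = ast
        to_ir(lhs, code)
        to_ir(rhs, code)
        code.append(("add",) if op == "+" else ("mul",))
    else:
        code.append(("push", ast))
    return code


def optimize(ir):
    """Middle end, optimizer: fold `push a; push b; op` into a single push."""
    out = []
    for ins in ir:
        foldable = ins[0] in ("add", "mul") and len(out) >= 2
        if foldable and out[-1][0] == "push" and out[-2][0] == "push":
            b, a = out.pop()[1], out.pop()[1]
            out.append(("push", a + b if ins[0] == "add" else a * b))
        else:
            out.append(ins)
    return out


def emit(ir):
    """Backend: render the IR as textual 'target' code."""
    return "\n".join(" ".join(str(x) for x in ins) for ins in ir)


def compile_source(source):
    """Run every stage in order: frontend, middle end, backend."""
    return emit(optimize(to_ir(check(parse(lex(source))))))
```

Calling compile_source("(+ 1 (* 2 3))") returns "push 7", because the optimizer folds the whole constant expression. Skipping optimize leaves the five instructions push 1, push 2, push 3, mul, add.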

LLVM

https://llvm.org

Quick overview

Adapted from https://www.aosabook.org/en/llvm.html

Diagram: docs/imgs/llvm_dia.svg

  • It's a set of libraries to create a compiler.
  • Well engineered.
  • We can focus only on the frontend of the compiler, the part that actually matters to us, and leave the tricky stuff to LLVM.
  • LLVM IR enables us to use multiple languages together.
  • It supports many targets.
  • We can benefit from already made IR level optimizers.
  • ….

MLIR

https://mlir.llvm.org

Diagram: docs/imgs/mlir_dia.svg

  • MLIR dialects provide higher-level semantics than LLVM IR.
  • It's easier to reason about a higher-level IR that is modeled after the AST than about a low-level IR.
  • We can use the pass infrastructure to efficiently process and transform the IR.
  • With many ready-to-use dialects, we can focus on our own language and use the other dialects whenever necessary.

Serene

A Compiler frontend

Flow

  • serenec parses the command-line args.
  • The reader reads the input file and generates an AST.
  • The semantic analyzer walks the AST and generates a new AST, rewriting nodes where necessary.
  • The slir generator generates slir dialect code from the AST.
  • We lower slir to other MLIR dialects and call the result mlir.
  • Then we lower everything to the LLVMIR dialect and call it lir (lowered IR).
  • Next, we fully lower lir to LLVM IR and pass it to the object generator to generate object files.
  • Finally, call the default C compiler to link the object files and produce the final executable.
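
The slir, mlir, and lir steps in this flow are a case of progressive lowering: the same program is rewritten level by level until only low-level operations remain. The Python sketch below mimics that idea on a toy op list. It is only an illustration and does not use MLIR: the lower helper and the op name slir.value are made up, while arith.constant and llvm.mlir.constant are real MLIR op names borrowed purely as labels.

```python
def lower(ops, patterns):
    """Apply one lowering step: rewrite every op that has a matching pattern."""
    lowered = []
    for name, attrs in ops:
        rewrite = patterns.get(name)
        # Ops without a pattern are already at the right level and pass through.
        lowered.extend(rewrite(attrs) if rewrite else [(name, attrs)])
    return lowered


# Hypothetical patterns: "slir.value" is an invented op name for this sketch.
SLIR_TO_MLIR = {"slir.value": lambda attrs: [("arith.constant", attrs)]}
MLIR_TO_LIR = {"arith.constant": lambda attrs: [("llvm.mlir.constant", attrs)]}

# An op is (name, attributes); the "dialect." prefix imitates MLIR naming.
slir = [("slir.value", {"value": 4})]
mlir = lower(slir, SLIR_TO_MLIR)  # slir lowered to other dialects
lir = lower(mlir, MLIR_TO_LIR)    # everything lowered to the last level
```

Each step only needs to know about the level directly above it, which is what keeps the individual lowerings small.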

DONE Episode 4 - The reader

CLOSED: [2021-07-27 Tue 22:50]

What is a Parser?

To put it simply, a parser converts source code to an AST.

Our Parser

  • We have a hand-written, LL(1.5)-like parser/lexer, since Lisp already has a structure.
  ;; pseudo code
  (def some-fn (fn (x y)
                   (+ x y)))
  (defn main ()
    (println "Result: " (some-fn 3 8)))
  • LL(1.5)?
  • O(n)
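
To make the idea of a hand-written, single-pass reader concrete, here is a minimal Python sketch of a reader for the pseudo code above. It is not Serene's reader: the class and method names are hypothetical, it peeks at only one character, it ignores comments and string escapes, and it does not try to capture what LL(1.5) means for the real implementation. It does show why the cost is O(n): the position only ever moves forward.

```python
class Reader:
    """Single-pass reader: each character is consumed once, so reading is O(n)."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        """Look at the next character without consuming it ('' at end of input)."""
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def skip_whitespace(self):
        while self.peek().isspace():
            self.pos += 1

    def read_expr(self):
        """Decide what to read based on a single character of lookahead."""
        self.skip_whitespace()
        ch = self.peek()
        if ch == "":
            raise SyntaxError("unexpected end of input")
        if ch == ")":
            raise SyntaxError("unexpected ')'")
        if ch == "(":
            return self.read_list()
        if ch == '"':
            return self.read_string()
        return self.read_atom()

    def read_list(self):
        self.pos += 1  # consume "("
        items = []
        while True:
            self.skip_whitespace()
            if self.peek() == "":
                raise SyntaxError("missing ')'")
            if self.peek() == ")":
                self.pos += 1  # consume ")"
                return items
            items.append(self.read_expr())

    def read_string(self):
        # No escape handling in this sketch; an unterminated string raises ValueError.
        end = self.text.index('"', self.pos + 1)
        value = self.text[self.pos + 1:end]
        self.pos = end + 1
        return ("string", value)

    def read_atom(self):
        start = self.pos
        while self.peek() and not self.peek().isspace() and self.peek() not in "()":
            self.pos += 1
        token = self.text[start:self.pos]
        return int(token) if token.isdigit() else ("symbol", token)


def read_all(text):
    """Read every top-level form in the input."""
    reader = Reader(text)
    forms = []
    reader.skip_whitespace()
    while reader.peek():
        forms.append(reader.read_expr())
        reader.skip_whitespace()
    return forms
```

For example, read_all("(some-fn 3 8)") returns a list holding one form: a list with the symbol some-fn followed by the integers 3 and 8.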

Episode 5 - The Abstract Syntax Tree

What is an AST?

An AST is a tree representation of the abstract syntactic structure of source code. It's just a tree of nodes, where each node is a data structure describing a piece of the syntax.

  ;; pseudo code
  (def main (fn () 4))
  (prn (main))

Diagram: docs/imgs/ast.svg
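
One possible way to model the pseudo code above as a tree of nodes is sketched below in Python. The node types (Number, Symbol, Fn, Def, Call) and their fields are assumptions made for this example. They are not Serene's actual node classes and are not guaranteed to match the diagram.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Base class for every node in this sketch."""


@dataclass
class Number(Node):
    value: int


@dataclass
class Symbol(Node):
    name: str


@dataclass
class Fn(Node):
    params: List[Symbol]
    body: List[Node]


@dataclass
class Def(Node):
    name: str
    value: Node


@dataclass
class Call(Node):
    target: Node
    args: List[Node] = field(default_factory=list)


# (def main (fn () 4))
# (prn (main))
program = [
    Def("main", Fn(params=[], body=[Number(4)])),
    Call(Symbol("prn"), [Call(Symbol("main"))]),
]


def children(node):
    """Return the direct child nodes of a node."""
    if isinstance(node, Fn):
        return node.params + node.body
    if isinstance(node, Def):
        return [node.value]
    if isinstance(node, Call):
        return [node.target] + node.args
    return []


def dump(node, depth=0):
    """Render a node and its children as indented lines, one line per node."""
    label = type(node).__name__
    if isinstance(node, Number):
        label += f" {node.value}"
    elif isinstance(node, (Symbol, Def)):
        label += f" {node.name}"
    lines = ["  " * depth + label]
    for child in children(node):
        lines.extend(dump(child, depth + 1))
    return lines
```

Printing "\n".join(dump(program[0])) shows the Def node for main with an Fn child, which in turn holds the Number 4 as its body.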

The Expression abstract class

Expressions

  • Expressions vs Statements
  • Serene(Lisp) and expressions
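
In a Lisp every form is an expression, meaning it yields a value and can therefore appear anywhere a value is expected. The Python sketch below illustrates that with a hypothetical abstract Expression class. The evaluate method exists only to make "yields a value" visible; nothing here implies that Serene's Expression class has the same interface.

```python
from abc import ABC, abstractmethod


class Expression(ABC):
    """Hypothetical base class: every node must be able to yield a value."""

    @abstractmethod
    def evaluate(self):
        """Return the value this expression produces."""


class Number(Expression):
    def __init__(self, value):
        self.value = value

    def evaluate(self):
        return self.value


class If(Expression):
    def __init__(self, test, then, otherwise):
        self.test, self.then, self.otherwise = test, then, otherwise

    def evaluate(self):
        # The value of the chosen branch is the value of the whole form.
        # Truthiness follows Python here, purely for brevity.
        branch = self.then if self.test.evaluate() else self.otherwise
        return branch.evaluate()


class Add(Expression):
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

    def evaluate(self):
        return self.lhs.evaluate() + self.rhs.evaluate()


# (+ 1 (if 1 10 20)): the `if` form sits where a value is expected.
result = Add(Number(1), If(Number(1), Number(10), Number(20))).evaluate()
```

Here result is 11. In a statement-based language the conditional would usually have to assign to a temporary variable first, because an if statement produces no value of its own.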

Node & AST