How to build a compiler with LLVM and MLIR
- Episode 1 - Introduction
- Episode 2 - Basic Setup
- Episode 3 - Overview
- Episode 4 - The reader
- Episode 5 - The Abstract Syntax Tree
- Episode 6 - The Semantic Analyzer
- Episode 7 - The Context and Namespace
DONE Episode 1 - Introduction
What is it all about?
- Create a programming lang
- Guide for contributors
- An LLVM/MLIR guide
The Plan
- Git branches
- No live coding
- Feel free to contribute
Serene and a bit of history
- Other Implementations
Requirements
- C++ 14
- CMake
- Repository: https://devheroes.codes/Serene
- Website: lxsameer.com Email: lxsameer@gnu.org
DONE Episode 2 - Basic Setup
CLOSED: [2021-07-10 Sat 09:04]
Installing Requirements
LLVM and Clang
- mlir-tblgen
ccache (optional)
Building Serene and the builder
- git hooks
Source tree structure
dev.org
resources and TODOs
DONE Episode 3 - Overview
CLOSED: [2021-07-19 Mon 09:41]
Generic Compiler
- Modern Compiler Implementation in ML: Basic Techniques
- Compilers: Principles, Techniques, and Tools (The Dragon Book)
Common Steps
- Frontend
  - Lexical analyzer (Lexer)
  - Syntax analyzer (Parser)
  - Semantic analyzer
- Middleend
  - Intermediate code generation
  - Code optimizer
- Backend
  - Target code generation
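The stages above can be sketched on a toy language of integer sums and products. This is only an illustration: every name here (Token, Node, lex, parseSum, emit, run, compile) is hypothetical, the "intermediate code" is a made-up stack instruction list, and the interpreter at the end merely stands in for a real backend. It is not how Serene or LLVM implement these stages.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Frontend, step 1: the lexer turns characters into tokens.
struct Token {
  char kind;  // 'n' for a number, otherwise '+' or '*'
  long value; // only meaningful when kind == 'n'
};

std::vector<Token> lex(const std::string &src) {
  std::vector<Token> out;
  for (std::size_t i = 0; i < src.size();) {
    unsigned char c = static_cast<unsigned char>(src[i]);
    if (std::isspace(c)) { ++i; continue; }
    if (std::isdigit(c)) {
      long v = 0;
      while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i])))
        v = v * 10 + (src[i++] - '0');
      out.push_back({'n', v});
      continue;
    }
    assert(c == '+' || c == '*'); // toy language: anything else is a lexical error
    out.push_back({static_cast<char>(c), 0});
    ++i;
  }
  return out;
}

// Frontend, step 2: the parser builds an AST for the grammar
//   sum     := product ('+' product)*
//   product := number ('*' number)*
struct Node {
  char op; // 'n' for a leaf, otherwise '+' or '*'
  long value;
  std::unique_ptr<Node> lhs, rhs;
};
using NodePtr = std::unique_ptr<Node>;

NodePtr leaf(long v) { return NodePtr(new Node{'n', v, nullptr, nullptr}); }
NodePtr binary(char op, NodePtr l, NodePtr r) {
  return NodePtr(new Node{op, 0, std::move(l), std::move(r)});
}

NodePtr parseProduct(const std::vector<Token> &t, std::size_t &i) {
  assert(i < t.size() && t[i].kind == 'n');
  NodePtr node = leaf(t[i++].value);
  while (i < t.size() && t[i].kind == '*') {
    ++i;
    assert(i < t.size() && t[i].kind == 'n');
    node = binary('*', std::move(node), leaf(t[i++].value));
  }
  return node;
}

NodePtr parseSum(const std::vector<Token> &t, std::size_t &i) {
  NodePtr node = parseProduct(t, i);
  while (i < t.size() && t[i].kind == '+') {
    ++i;
    node = binary('+', std::move(node), parseProduct(t, i));
  }
  return node;
}

// Middle end: flatten the AST into linear intermediate code (postfix order).
void emit(const Node &n, std::vector<std::string> &code) {
  if (n.op == 'n') {
    code.push_back("push " + std::to_string(n.value));
    return;
  }
  emit(*n.lhs, code);
  emit(*n.rhs, code);
  code.push_back(n.op == '+' ? "add" : "mul");
}

// Stand-in for a backend: interpret the instructions instead of emitting machine code.
long run(const std::vector<std::string> &code) {
  std::vector<long> stack;
  for (const std::string &ins : code) {
    if (ins.compare(0, 5, "push ") == 0) {
      stack.push_back(std::stol(ins.substr(5)));
      continue;
    }
    long r = stack.back(); stack.pop_back();
    long l = stack.back(); stack.pop_back();
    stack.push_back(ins == "add" ? l + r : l * r);
  }
  return stack.back();
}

// Drives the stages in order: lex, parse, then generate intermediate code.
std::vector<std::string> compile(const std::string &src) {
  std::size_t i = 0;
  NodePtr ast = parseSum(lex(src), i);
  std::vector<std::string> code;
  emit(*ast, code);
  return code;
}
```

Calling compile("1 + 2 * 3") yields the instruction list push 1, push 2, push 3, mul, add, and run on that list returns 7. A real compiler would put an optimizer between emit and the backend.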
LLVM
llvm.org
Watch Introduction to LLVM
Quick overview
Summarized from https://www.aosabook.org/en/llvm.html
- It's a set of libraries to create a compiler.
- Well engineered.
- We can focus only on the frontend of the compiler and what is actually important to us, and leave the tricky stuff to LLVM.
- LLVM IR enables us to use multiple languages together.
- It supports many targets.
- We can benefit from already made IR level optimizers.
- ….
MLIR
mlir.llvm.org
- MLIR dialects provide higher-level semantics than LLVM IR.
- It's easier to reason about higher level IR that is modeled after the AST rather than a low level IR.
- We can use the pass infrastructure to efficiently process and transform the IR.
- With many ready-to-use dialects we can really focus on our language and use the other dialects whenever necessary.
- …
Serene
A Compiler frontend
Flow
- serenec parses the command line args.
- The reader reads the input file and generates an AST.
- The semantic analyzer walks the AST, generates a new AST, and rewrites the necessary nodes.
- The slir generator generates slir dialect code from the AST.
- We lower slir to other dialects of MLIR and call the result mlir.
- Then we lower everything to the LLVMIR dialect and call it lir (lowered IR).
- Finally, we fully lower lir to LLVM IR and pass it to the object generator to generate object files.
- Call the default C compiler to link the object files and generate the machine code.
DONE Episode 4 - The reader
CLOSED: [2021-07-27 Tue 22:50]
What is a Parser?
To put it simply, a parser converts the source code to an AST.
Algorithms
- LL(k)
- LR
- LALR
- PEG
- …..
Read More:
Our Parser
- We have a hand-written, LL(1.5)-like parser/lexer, since Lisp code already has a structure.
;; pseudo code
(def some-fn (fn (x y)
               (+ x y)))
(defn main ()
  (println "Result: " (some-fn 3 8)))
- LL(1.5)?
- O(n)
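As a rough illustration of why a hand-written reader for Lisp syntax can stay small and run in a single O(n) pass, here is a minimal recursive-descent sketch that decides what to read from one character of lookahead. The names Sexp and Reader are hypothetical and this is not Serene's actual reader; it also does not try to model whatever extra lookahead the "LL(1.5)" label refers to.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// A parsed form: either an atom (symbol, number, or string literal text) or a list.
struct Sexp {
  bool isList = false;
  std::string atom;           // used when !isList
  std::vector<Sexp> children; // used when isList
};

class Reader {
public:
  explicit Reader(std::string src) : src_(std::move(src)) {}

  // Reads every top-level form; each character is visited once.
  std::vector<Sexp> readAll() {
    std::vector<Sexp> forms;
    skipBlanks();
    while (pos_ < src_.size()) {
      forms.push_back(readForm());
      skipBlanks();
    }
    return forms;
  }

private:
  std::string src_;
  std::size_t pos_ = 0;

  // Skips whitespace and ';' comments that run to the end of the line.
  void skipBlanks() {
    while (pos_ < src_.size()) {
      char c = src_[pos_];
      if (c == ';') {
        while (pos_ < src_.size() && src_[pos_] != '\n') ++pos_;
      } else if (std::isspace(static_cast<unsigned char>(c))) {
        ++pos_;
      } else {
        return;
      }
    }
  }

  // One character of lookahead selects the rule to apply.
  Sexp readForm() {
    skipBlanks();
    if (pos_ >= src_.size()) throw std::runtime_error("unexpected end of input");
    char c = src_[pos_];
    if (c == '(') return readList();
    if (c == ')') throw std::runtime_error("unexpected ')'");
    if (c == '"') return readString();
    return readAtom();
  }

  Sexp readList() {
    assert(src_[pos_] == '(');
    ++pos_; // consume '('
    Sexp list;
    list.isList = true;
    for (;;) {
      skipBlanks();
      if (pos_ >= src_.size()) throw std::runtime_error("missing ')'");
      if (src_[pos_] == ')') { ++pos_; return list; }
      list.children.push_back(readForm());
    }
  }

  // Keeps the surrounding quotes in the atom text; no escape handling.
  Sexp readString() {
    std::size_t start = pos_++;
    while (pos_ < src_.size() && src_[pos_] != '"') ++pos_;
    if (pos_ >= src_.size()) throw std::runtime_error("unterminated string");
    ++pos_; // consume the closing quote
    Sexp s;
    s.atom = src_.substr(start, pos_ - start);
    return s;
  }

  Sexp readAtom() {
    std::size_t start = pos_;
    while (pos_ < src_.size() &&
           !std::isspace(static_cast<unsigned char>(src_[pos_])) &&
           src_[pos_] != '(' && src_[pos_] != ')' && src_[pos_] != ';')
      ++pos_;
    Sexp s;
    s.atom = src_.substr(start, pos_ - start);
    return s;
  }
};
```

Feeding the pseudo code above to Reader(...).readAll() gives two top-level lists, one for the def form and one for the defn form. The sketch returns a generic tree of lists and atoms; turning that into typed AST nodes is a separate step.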
DONE Episode 5 - The Abstract Syntax Tree
CLOSED: [2021-07-30 Fri 14:01]
What is an AST?
An AST is a tree representation of the abstract syntactic structure of source code. It's just a tree of nodes, where each node is a data structure describing a piece of the syntax.
;; pseudo code
(def main (fn () 4))
(prn (main))
The Expression abstract class
Expressions
- Expressions vs Statements
- Serene(Lisp) and expressions
Node & AST
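One possible shape for these pieces, as a sketch only: an abstract Expression base class, a Node alias that points to any expression, and an Ast alias that holds the top-level nodes in a std::vector. The concrete classes (Number, Symbol, List), the make helper, and sampleAst are hypothetical names chosen for this example and may differ from what Serene actually defines.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Abstract base: every syntactic element of the program is an expression.
class Expression {
public:
  virtual ~Expression() = default;
  virtual std::string toString() const = 0;
};

// Assumed aliases: a node points to one expression, an AST is a list of top-level nodes.
using Node = std::shared_ptr<Expression>;
using Ast = std::vector<Node>;

class Number : public Expression {
public:
  explicit Number(long v) : value(v) {}
  std::string toString() const override {
    return "<Number " + std::to_string(value) + ">";
  }
  long value;
};

class Symbol : public Expression {
public:
  explicit Symbol(std::string n) : name(std::move(n)) {}
  std::string toString() const override { return "<Symbol " + name + ">"; }
  std::string name;
};

class List : public Expression {
public:
  explicit List(Ast e) : elements(std::move(e)) {}
  std::string toString() const override {
    std::string out = "<List";
    for (const Node &n : elements) {
      assert(n != nullptr); // a list never holds an empty node
      out += " " + n->toString();
    }
    return out + ">";
  }
  Ast elements;
};

// Builds a node of any concrete expression type.
template <typename T, typename... Args>
Node make(Args &&...args) {
  return std::make_shared<T>(std::forward<Args>(args)...);
}

// The AST for the pseudo code:  (def main (fn () 4))  (prn (main))
Ast sampleAst() {
  Ast ast;
  ast.push_back(make<List>(Ast{
      make<Symbol>("def"), make<Symbol>("main"),
      make<List>(Ast{make<Symbol>("fn"), make<List>(Ast{}), make<Number>(4)})}));
  ast.push_back(make<List>(
      Ast{make<Symbol>("prn"), make<List>(Ast{make<Symbol>("main")})}));
  return ast;
}
```

Because a Lisp program is made of expressions only, a single base class covers every node, and there is no separate statement hierarchy. Calling toString on the first node of sampleAst() gives `<List <Symbol def> <Symbol main> <List <Symbol fn> <List> <Number 4>>>`.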
DONE Episode 6 - The Semantic Analyzer
CLOSED: [2021-08-21 Sat 18:44]
Qs
- Why didn't we implement a linked list?
- Why are we using std::vector instead of LLVM collections?
What is Semantic Analysis?
- Semantic Analysis makes sure that the given program is semantically correct.
- The type checker works as part of this step as well.
;; pseudo code
(4 main)
Semantic Analysis and rewrites
We need to reform the AST to reflect the semantics of Serene more closely.
;; pseudo code
(def main (fn () 4))
(prn (main))
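To make the idea of checking and rewriting concrete, here is a minimal sketch of a pass that walks a reader-level tree and produces a new one: (def name value) becomes a Def node, (fn (params) body...) becomes an Fn node, any other non-empty list becomes a Call node, and a call whose head is a number, such as (4 main), is rejected. The Kind enum, the Node layout, and the helper names are all hypothetical, and the rule "numbers are not callable" is an assumption made for this example, not a description of Serene's analyzer.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Number, Symbol, List, Def, Fn, Call };

struct Node {
  Kind kind;
  std::string text;           // number or symbol text; the bound name for Def
  std::vector<Node> children; // List: elements; Def: {value};
                              // Fn: {params, body...}; Call: {target, args...}
};

Node num(std::string t) { return Node{Kind::Number, std::move(t), {}}; }
Node sym(std::string t) { return Node{Kind::Symbol, std::move(t), {}}; }
Node list(std::vector<Node> c) { return Node{Kind::List, "", std::move(c)}; }

// Returns a new tree in which generic lists are replaced by specific nodes.
Node analyze(const Node &n) {
  if (n.kind != Kind::List) return n; // atoms pass through unchanged
  if (n.children.empty()) return n;   // the empty list stays a plain list

  const Node &head = n.children.front();

  // (def <symbol> <value>) => Def
  if (head.kind == Kind::Symbol && head.text == "def") {
    if (n.children.size() != 3 || n.children[1].kind != Kind::Symbol)
      throw std::runtime_error("def expects a symbol and a value");
    return Node{Kind::Def, n.children[1].text, {analyze(n.children[2])}};
  }

  // (fn (<params>) <body>...) => Fn; the parameter list is kept as-is
  if (head.kind == Kind::Symbol && head.text == "fn") {
    if (n.children.size() < 2 || n.children[1].kind != Kind::List)
      throw std::runtime_error("fn expects a parameter list");
    Node fn{Kind::Fn, "", {n.children[1]}};
    for (std::size_t i = 2; i < n.children.size(); ++i)
      fn.children.push_back(analyze(n.children[i]));
    return fn;
  }

  // Anything else is a call, so the head must be something callable.
  if (head.kind == Kind::Number)
    throw std::runtime_error("'" + head.text + "' is not callable");

  Node call{Kind::Call, "", {}};
  for (const Node &child : n.children)
    call.children.push_back(analyze(child));
  assert(call.children.size() == n.children.size()); // each element analyzed once
  return call;
}
```

With this sketch, analyzing the tree for (def main (fn () 4)) returns a Def named main whose value is an Fn node, analyzing (prn (main)) returns a Call whose only argument is another Call, and analyzing (4 main) throws because 4 cannot be called.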
Let's run the compiler to see the semantic analysis in action.