2021-07-02 23:37:40 +01:00
|
|
|
|
#+TITLE: How to build a compiler with LLVM and MLIR
|
|
|
|
|
#+SEQ_TODO: TODO(t/!) NEXT(n/!) BLOCKED(b@/!) | DONE(d%) CANCELLED(c@/!) FAILED(f@/!)
|
|
|
|
|
#+TAGS: READER(r) MISC(m)
|
2021-09-05 19:40:18 +01:00
|
|
|
|
#+STARTUP: logdrawer logdone logreschedule indent content align constSI entitiespretty overview
|
2021-07-02 23:37:40 +01:00
|
|
|
|
|
2021-07-03 16:52:57 +01:00
|
|
|
|
* DONE Episode 1 - Introduction
|
2021-07-02 23:37:40 +01:00
|
|
|
|
** What is it all about?
|
2021-12-19 21:45:49 +00:00
|
|
|
|
- Create a programming lang
|
|
|
|
|
- Guide for contributors
|
|
|
|
|
- A LLVM/MLIR guide
|
2021-07-02 23:37:40 +01:00
|
|
|
|
** The Plan
|
2021-12-19 21:45:49 +00:00
|
|
|
|
- Git branches
|
|
|
|
|
- No live coding
|
|
|
|
|
- Feel free to contribute
|
2021-07-02 23:37:40 +01:00
|
|
|
|
** Serene and a bit of history
|
2021-12-19 21:45:49 +00:00
|
|
|
|
- Other Implementations
|
|
|
|
|
- Requirements
|
|
|
|
|
- C++ 14
|
|
|
|
|
- CMake
|
|
|
|
|
- Repository: https://devheroes.codes/Serene
|
|
|
|
|
- Website: lxsameer.com
|
|
|
|
|
Email: lxsameer@gnu.org
|
2021-07-10 17:52:53 +01:00
|
|
|
|
* DONE Episode 2 - Basic Setup
|
|
|
|
|
CLOSED: [2021-07-10 Sat 09:04]
|
2021-07-03 16:52:57 +01:00
|
|
|
|
** Installing Requirements
|
|
|
|
|
*** LLVM and Clang
|
|
|
|
|
- mlir-tblgen
|
|
|
|
|
*** ccache (optional)
|
|
|
|
|
** Building Serene and the =builder=
|
|
|
|
|
- git hooks
|
|
|
|
|
** Source tree structure
|
|
|
|
|
** =dev.org= resources and TODOs
|
2021-07-19 15:44:39 +01:00
|
|
|
|
* DONE Episode 3 - Overview
|
|
|
|
|
CLOSED: [2021-07-19 Mon 09:41]
|
2021-07-10 17:52:53 +01:00
|
|
|
|
** Generic Compiler
|
|
|
|
|
- [[https://www.cs.princeton.edu/~appel/modern/ml/whichver.html][Modern Compiler Implementation in ML: Basic Techniques]]
|
|
|
|
|
- [[https://suif.stanford.edu/dragonbook/][Compilers: Principles, Techniques, and Tools (The Dragon Book)]]
|
|
|
|
|
*** Common Steps
|
2021-07-10 18:43:33 +01:00
|
|
|
|
- Frontend
|
|
|
|
|
- Lexical analyzer (Lexer)
|
|
|
|
|
- Syntax analyzer (Parser)
|
|
|
|
|
- Semantic analyzer
|
2021-07-12 23:39:29 +01:00
|
|
|
|
- Middleend
|
2021-07-10 18:43:33 +01:00
|
|
|
|
- Intermediate code generation
|
|
|
|
|
- Code optimizer
|
|
|
|
|
- Backend
|
|
|
|
|
- Target code generation
|
2021-07-10 17:52:53 +01:00
|
|
|
|
** LLVM
|
|
|
|
|
[[llvm.org]]
|
|
|
|
|
*** Watch [[https://www.youtube.com/watch?v=J5xExRGaIIY][Introdution to LLVM]]
|
|
|
|
|
*** Quick overview
|
|
|
|
|
Deducted from https://www.aosabook.org/en/llvm.html
|
|
|
|
|
[[./imgs/llvm_dia.svg]]
|
2021-07-10 18:43:33 +01:00
|
|
|
|
- It's a set of libraries to create a compiler.
|
|
|
|
|
- Well engineered.
|
|
|
|
|
- we can focus only on the fronted of the compiler and what is
|
|
|
|
|
actually important to us and leave the tricky stuff to LLVM.
|
|
|
|
|
- LLVM IR enables us to use multiple languages together.
|
|
|
|
|
- It supports many targets.
|
|
|
|
|
- We can benefit from already made IR level optimizers.
|
|
|
|
|
- ....
|
|
|
|
|
|
2021-07-10 17:52:53 +01:00
|
|
|
|
** MLIR
|
|
|
|
|
[[mlir.llvm.org]]
|
|
|
|
|
[[./imgs/mlir_dia.svg]]
|
2021-07-10 18:43:33 +01:00
|
|
|
|
|
|
|
|
|
- With MLIR dialects provide higher level semantics than LLVM IR.
|
|
|
|
|
- It's easier to reason about higher level IR that is modeled after
|
|
|
|
|
the AST rather than a low level IR.
|
|
|
|
|
- We can use the pass infrastructure to efficiently process and transform the IR.
|
|
|
|
|
- With many ready to use dialects we can really focus on our language and us the other
|
|
|
|
|
dialect when ever necessary.
|
|
|
|
|
- ...
|
2021-07-10 17:52:53 +01:00
|
|
|
|
** Serene
|
|
|
|
|
*** A Compiler frontend
|
2021-07-10 18:43:33 +01:00
|
|
|
|
*** Flow
|
2021-07-10 17:52:53 +01:00
|
|
|
|
- =serenec= in parses the command lines args
|
|
|
|
|
- =reader= reads the input file and generates an =AST=
|
2021-07-10 18:43:33 +01:00
|
|
|
|
- =semantic analyzer= walks the =AST= and generates a new =AST= and rewrites
|
2021-07-10 17:52:53 +01:00
|
|
|
|
the necessary nodes.
|
|
|
|
|
- =slir= generator generates =slir= dialect code from =AST=.
|
|
|
|
|
- We lower =slir= to other dialects of the *MLIR* which we call the result =mlir=.
|
|
|
|
|
- Then, We lower everything to the =LLVMIR dialect= and call it =lir= (lowered IR).
|
|
|
|
|
- Finally we fully lower =lir= to =LLVM IR= and pass it to the object generator
|
|
|
|
|
to generate object files.
|
|
|
|
|
- Call the default =c compiler= to link the object files and generate the machine code.
|
2021-07-30 12:17:41 +01:00
|
|
|
|
* DONE Episode 4 - The reader
|
|
|
|
|
CLOSED: [2021-07-27 Tue 22:50]
|
2021-07-17 16:40:02 +01:00
|
|
|
|
** What is a Parser ?
|
2021-07-19 15:44:39 +01:00
|
|
|
|
To put it simply, Parser converts the source code to an [[https://en.wikipedia.org/wiki/Abstract_syntax_tree][AST]]
|
2021-07-17 16:40:02 +01:00
|
|
|
|
*** Algorithms
|
|
|
|
|
- LL(k)
|
|
|
|
|
- LR
|
|
|
|
|
- LALR
|
|
|
|
|
- PEG
|
|
|
|
|
- .....
|
|
|
|
|
|
|
|
|
|
Read More:
|
|
|
|
|
- https://stereobooster.com/posts/an-overview-of-parsing-algorithms/
|
|
|
|
|
- https://tomassetti.me/guide-parsing-algorithms-terminology/
|
|
|
|
|
*** Libraries
|
|
|
|
|
- https://en.wikipedia.org/wiki/Comparison_of_parser_generators
|
|
|
|
|
*** Our Parser
|
|
|
|
|
- We have a hand written LL(1.5) like parser/lexer since lisp already has a structure.
|
2021-07-19 09:38:07 +01:00
|
|
|
|
#+BEGIN_SRC lisp
|
|
|
|
|
;; pseudo code
|
|
|
|
|
(def some-fn (fn (x y)
|
|
|
|
|
(+ x y)))
|
|
|
|
|
(defn main ()
|
|
|
|
|
(println "Result: " (some-fn 3 8)))
|
|
|
|
|
#+END_SRC
|
2021-07-19 15:44:39 +01:00
|
|
|
|
- LL(1.5)?
|
2021-07-10 18:43:33 +01:00
|
|
|
|
- O(n)
|
2021-08-07 17:41:19 +01:00
|
|
|
|
* DONE Episode 5 - The Abstract Syntax Tree
|
|
|
|
|
CLOSED: [2021-07-30 Fri 14:01]
|
2021-07-30 12:17:41 +01:00
|
|
|
|
** What is an AST?
|
|
|
|
|
Ast is a tree representation of the abstract syntactic structure of source code. It's just a tree made of nodes that each node is
|
|
|
|
|
a data structure describing the syntax.
|
|
|
|
|
|
|
|
|
|
#+BEGIN_SRC lisp
|
|
|
|
|
;; pseudo code
|
|
|
|
|
(def main (fn () 4))
|
|
|
|
|
(prn (main))
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
[[./imgs/ast.svg]]
|
|
|
|
|
** The =Expression= abstract class
|
|
|
|
|
*** Expressions
|
|
|
|
|
- Expressions vs Statements
|
|
|
|
|
- Serene(Lisp) and expressions
|
|
|
|
|
** Node & AST
|
2021-08-21 18:46:49 +01:00
|
|
|
|
* DONE Episode 6 - The Semantic Analyzer
|
|
|
|
|
CLOSED: [2021-08-21 Sat 18:44]
|
2021-08-07 17:41:19 +01:00
|
|
|
|
** Qs
|
|
|
|
|
- Why didn't we implement a linked list?
|
|
|
|
|
- Why we are using the =std::vector= instead of llvm collections?
|
|
|
|
|
** What is Semantic Analysis?
|
|
|
|
|
- Semantic Analysis makes sure that the given program is semantically correct.
|
|
|
|
|
- Type checkr works as part of this step as well.
|
|
|
|
|
|
|
|
|
|
#+BEGIN_SRC lisp
|
|
|
|
|
;; pseudo code
|
|
|
|
|
(4 main)
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
[[./imgs/incorrct_semantic.svg]]
|
|
|
|
|
** Semantic Analysis and rewrites
|
|
|
|
|
We need to reform the AST to reflect the semantics of Serene closly.
|
|
|
|
|
|
|
|
|
|
#+BEGIN_SRC lisp
|
|
|
|
|
;; pseudo code
|
|
|
|
|
(def main (fn () 4))
|
|
|
|
|
(prn (main))
|
|
|
|
|
#+END_SRC
|
|
|
|
|
[[./imgs/ast.svg]]
|
|
|
|
|
|
|
|
|
|
[[./imgs/semantic.svg]]
|
|
|
|
|
|
|
|
|
|
Let's run the compiler to see the semantic analysis in action.
|
|
|
|
|
** Let's check out the code
|
2021-09-05 19:40:18 +01:00
|
|
|
|
* DONE Episode 7 - The Context and Namespace
|
|
|
|
|
CLOSED: [2021-09-04 Sat 10:53]
|
2021-08-21 18:46:49 +01:00
|
|
|
|
** Namespaces
|
|
|
|
|
*** Unit of compilation
|
|
|
|
|
*** Usually maps to a file
|
|
|
|
|
*** keeps the state and evironment
|
|
|
|
|
** SereneContext vs LLVM Context vs MLIR Context
|
|
|
|
|
*** Compilers global state
|
|
|
|
|
*** The owner of LLVM/MLIR contexts
|
|
|
|
|
*** Holds the namespace table
|
|
|
|
|
*** Probably will contain the primitive types as well
|
2021-09-17 13:49:55 +01:00
|
|
|
|
* DONE Episode 8 - MLIR Basics
|
|
|
|
|
CLOSED: [2021-09-17 Fri 10:18]
|
2021-09-05 19:40:18 +01:00
|
|
|
|
** Serene Changes
|
|
|
|
|
- Introducing a SourceManager
|
|
|
|
|
- Reader changes
|
|
|
|
|
- *serenec* cli interface in changing
|
|
|
|
|
|
|
|
|
|
** Disclaimer
|
|
|
|
|
*I'm not an expert in MLIR*
|
|
|
|
|
|
|
|
|
|
** Why?
|
|
|
|
|
- A bit of history
|
|
|
|
|
- LLVM IR is to low level
|
|
|
|
|
- We need an IR to implement high level concepts and flows
|
|
|
|
|
*MLIR* is a framework to build a compiler with your own IR. kinda :P
|
|
|
|
|
|
|
|
|
|
- Reusability
|
|
|
|
|
- ...
|
|
|
|
|
** Language
|
|
|
|
|
*** Overview
|
|
|
|
|
- SSA Based (https://en.wikipedia.org/wiki/Static_single_assignment_form)
|
|
|
|
|
- Typed
|
|
|
|
|
- Context free(for lack of better words)
|
|
|
|
|
|
|
|
|
|
*** Dialects
|
|
|
|
|
- A collection of operations
|
|
|
|
|
- Custom types
|
|
|
|
|
- Meta data
|
|
|
|
|
- We can use a mixture of different dialects
|
|
|
|
|
**** builtin dialects:
|
|
|
|
|
- std
|
|
|
|
|
- llvm
|
|
|
|
|
- math
|
|
|
|
|
- async
|
|
|
|
|
- ...
|
|
|
|
|
|
|
|
|
|
*** Opetations
|
|
|
|
|
- Higher level of abstraction
|
|
|
|
|
- Not instructions
|
|
|
|
|
- SSA forms
|
|
|
|
|
- Tablegen backend
|
|
|
|
|
- Verifiers and printers
|
|
|
|
|
*** Attributes
|
|
|
|
|
*** Blocks & Regions
|
|
|
|
|
*** Types
|
|
|
|
|
- Extesible
|
|
|
|
|
** Pass Infrastructure
|
|
|
|
|
Analysis and transformation infrastructure
|
|
|
|
|
|
|
|
|
|
- We will implement most of our semantic analysis logic and type checker as passes
|
|
|
|
|
|
|
|
|
|
** Pattern Rewriting
|
|
|
|
|
- Tablegen backed
|
|
|
|
|
** Operation Definition Specification
|
|
|
|
|
** Examples
|
|
|
|
|
*Not*: You need =mlir-mode= and =llvm-mode= available to you for the code highlighting of
|
|
|
|
|
the following code blocks. Both of those are distributed with the LLVM.
|
|
|
|
|
|
|
|
|
|
*** General syntax
|
|
|
|
|
#+BEGIN_SRC mlir
|
2021-12-19 21:45:49 +00:00
|
|
|
|
%result:2 = "somedialect.blah"(%x#2) { some.attribute = true, other_attribute = 3 }
|
|
|
|
|
: (!somedialect<"example_type">) -> (!somedialect<"foo_s">, i8)
|
|
|
|
|
loc(callsite("main" at "main.srn":10:8))
|
2021-09-05 19:40:18 +01:00
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
*** Blocks and Regions
|
|
|
|
|
#+BEGIN_SRC mlir
|
|
|
|
|
func @simple(i64, i1) -> i64 {
|
|
|
|
|
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
|
2021-12-19 21:45:49 +00:00
|
|
|
|
cond_br %cond, ^bb1, ^bb2
|
2021-09-05 19:40:18 +01:00
|
|
|
|
|
|
|
|
|
^bb1:
|
2021-12-19 21:45:49 +00:00
|
|
|
|
br ^bb3(%a: i64) // Branch passes %a as the argument
|
2021-09-05 19:40:18 +01:00
|
|
|
|
|
|
|
|
|
^bb2:
|
2021-12-19 21:45:49 +00:00
|
|
|
|
%b = addi %a, %a : i64
|
|
|
|
|
br ^bb3(%b: i64) // Branch passes %b as the argument
|
2021-09-05 19:40:18 +01:00
|
|
|
|
|
|
|
|
|
// ^bb3 receives an argument, named %c, from predecessors
|
|
|
|
|
// and passes it on to bb4 along with %a. %a is referenced
|
|
|
|
|
// directly from its defining operation and is not passed through
|
|
|
|
|
// an argument of ^bb3.
|
|
|
|
|
^bb3(%c: i64):
|
2021-12-19 21:45:49 +00:00
|
|
|
|
//br ^bb4(%c, %a : i64, i64)
|
|
|
|
|
"serene.ifop"(%c) ({ // if %a is in-scope in the containing region...
|
|
|
|
|
// then %a is in-scope here too.
|
|
|
|
|
%new_value = "another_op"(%c) : (i64) -> (i64)
|
2021-09-05 19:40:18 +01:00
|
|
|
|
|
2021-12-19 21:45:49 +00:00
|
|
|
|
^someblock(%new_value):
|
|
|
|
|
%x = "some_other_op"() {value = 4 : i64} : () -> i64
|
2021-09-05 19:40:18 +01:00
|
|
|
|
|
2021-12-19 21:45:49 +00:00
|
|
|
|
}) : (i64) -> (i64)
|
2021-09-05 19:40:18 +01:00
|
|
|
|
^bb4(%d : i64, %e : i64):
|
2021-12-19 21:45:49 +00:00
|
|
|
|
%0 = addi %d, %e : i64
|
|
|
|
|
return %0 : i64 // Return is also a terminator.
|
2021-09-05 19:40:18 +01:00
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
*** SLIR example
|
|
|
|
|
Command line arguments to emir =slir=
|
|
|
|
|
#+BEGIN_SRC sh
|
|
|
|
|
./builder run --build-dir ./build -emit slir `pwd`/docs/examples/hello_world.srn
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
Output:
|
|
|
|
|
#+BEGIN_SRC mlir
|
|
|
|
|
module @user {
|
2021-12-19 21:45:49 +00:00
|
|
|
|
%0 = "serene.fn"() ( {
|
|
|
|
|
%2 = "serene.value"() {value = 0 : i64} : () -> i64
|
|
|
|
|
return %2 : i64
|
|
|
|
|
}) {args = {}, name = "main", sym_visibility = "public"} : () -> i64
|
|
|
|
|
|
|
|
|
|
%1 = "serene.fn"() ( {
|
|
|
|
|
%2 = "serene.value"() {value = 0 : i64} : () -> i64
|
|
|
|
|
return %2 : i64
|
|
|
|
|
}) {args = {n = i64, v = i64, y = i64}, name = "main1", sym_visibility = "public"} : () -> i64
|
2021-09-05 19:40:18 +01:00
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
*** Serene's MLIR (maybe we need a better name)
|
|
|
|
|
|
|
|
|
|
Command line arguments to emir =mlir=
|
|
|
|
|
#+BEGIN_SRC sh
|
|
|
|
|
./builder run --build-dir ./build -emit mlir `pwd`/docs/examples/hello_world.srn
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
Output:
|
|
|
|
|
#+BEGIN_SRC mlir
|
2021-12-19 21:45:49 +00:00
|
|
|
|
module @user {
|
2021-09-05 19:40:18 +01:00
|
|
|
|
func @main() -> i64 {
|
2021-12-19 21:45:49 +00:00
|
|
|
|
%c3_i64 = constant 3 : i64
|
|
|
|
|
return %c3_i64 : i64
|
2021-09-05 19:40:18 +01:00
|
|
|
|
}
|
|
|
|
|
func @main1(%arg0: i64, %arg1: i64, %arg2: i64) -> i64 {
|
2021-12-19 21:45:49 +00:00
|
|
|
|
%c3_i64 = constant 3 : i64
|
|
|
|
|
return %c3_i64 : i64
|
|
|
|
|
}
|
2021-09-05 19:40:18 +01:00
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
*** LIR
|
|
|
|
|
Command line arguments to emir =lir=
|
|
|
|
|
#+BEGIN_SRC sh
|
|
|
|
|
./builder run --build-dir ./build -emit lir `pwd`/docs/examples/hello_world.srn
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
Output:
|
|
|
|
|
#+BEGIN_SRC mlir
|
2021-12-19 21:45:49 +00:00
|
|
|
|
module @user {
|
2021-09-05 19:40:18 +01:00
|
|
|
|
llvm.func @main() -> i64 {
|
2021-12-19 21:45:49 +00:00
|
|
|
|
%0 = llvm.mlir.constant(3 : i64) : i64
|
|
|
|
|
llvm.return %0 : i64
|
2021-09-05 19:40:18 +01:00
|
|
|
|
}
|
|
|
|
|
llvm.func @main1(%arg0: i64, %arg1: i64, %arg2: i64) -> i64 {
|
2021-12-19 21:45:49 +00:00
|
|
|
|
%0 = llvm.mlir.constant(3 : i64) : i64
|
|
|
|
|
llvm.return %0 : i64
|
|
|
|
|
}
|
2021-09-05 19:40:18 +01:00
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
*** LLVMIR
|
|
|
|
|
Command line arguments to emir =llvmir=
|
|
|
|
|
#+BEGIN_SRC sh
|
|
|
|
|
./builder run --build-dir ./build -emit ir `pwd`/docs/examples/hello_world.srn
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
Output:
|
|
|
|
|
#+BEGIN_SRC llvm
|
2021-12-19 21:45:49 +00:00
|
|
|
|
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
|
|
|
|
|
target triple = "x86_64-unknown-linux-gnu"
|
2021-09-05 19:40:18 +01:00
|
|
|
|
|
2021-12-19 21:45:49 +00:00
|
|
|
|
declare i8* @malloc(i64 %0)
|
2021-09-05 19:40:18 +01:00
|
|
|
|
|
2021-12-19 21:45:49 +00:00
|
|
|
|
declare void @free(i8* %0)
|
2021-09-05 19:40:18 +01:00
|
|
|
|
|
2021-12-19 21:45:49 +00:00
|
|
|
|
define i64 @main() !dbg !3 {
|
2021-09-05 19:40:18 +01:00
|
|
|
|
ret i64 3, !dbg !7
|
2021-12-19 21:45:49 +00:00
|
|
|
|
}
|
2021-09-05 19:40:18 +01:00
|
|
|
|
|
2021-12-19 21:45:49 +00:00
|
|
|
|
define i64 @main1(i64 %0, i64 %1, i64 %2) !dbg !9 {
|
2021-09-05 19:40:18 +01:00
|
|
|
|
ret i64 3, !dbg !10
|
2021-12-19 21:45:49 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
!llvm.dbg.cu = !{!0}
|
|
|
|
|
!llvm.module.flags = !{!2}
|
|
|
|
|
|
|
|
|
|
!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "mlir", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
|
|
|
|
|
!1 = !DIFile(filename: "LLVMDialectModule", directory: "/")
|
|
|
|
|
!2 = !{i32 2, !"Debug Info Version", i32 3}
|
|
|
|
|
!3 = distinct !DISubprogram(name: "main", linkageName: "main", scope: null, file: !4, type: !5, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !6)
|
|
|
|
|
!4 = !DIFile(filename: "REPL", directory: "/home/lxsameer/src/serene/serene/build")
|
|
|
|
|
!5 = !DISubroutineType(types: !6)
|
|
|
|
|
!6 = !{}
|
|
|
|
|
!7 = !DILocation(line: 0, column: 10, scope: !8)
|
|
|
|
|
!8 = !DILexicalBlockFile(scope: !3, file: !4, discriminator: 0)
|
|
|
|
|
!9 = distinct !DISubprogram(name: "main1", linkageName: "main1", scope: null, file: !4, line: 1, type: !5, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !6)
|
|
|
|
|
!10 = !DILocation(line: 1, column: 11, scope: !11)
|
|
|
|
|
!11 = !DILexicalBlockFile(scope: !9, file: !4, discriminator: 0)
|
2021-09-05 19:40:18 +01:00
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
** Resources
|
|
|
|
|
- [[https://www.youtube.com/watch?v=Y4SvqTtOIDk][2020 LLVM Developers’ Meeting: M. Amini & R. Riddle “MLIR Tutorial”]]
|
|
|
|
|
- [[https://www.youtube.com/watch?v=qzljG6DKgic][2019 EuroLLVM Developers’ Meeting: T. Shpeisman & C. Lattner “MLIR: Multi-Level Intermediate Repr..”]]
|
|
|
|
|
- https://mlir.llvm.org/docs
|
|
|
|
|
- https://mlir.llvm.org/docs/LangRef
|
|
|
|
|
- https://en.wikipedia.org/wiki/Basic_block
|
2021-09-05 15:57:28 +01:00
|
|
|
|
|
2021-10-06 18:48:48 +01:00
|
|
|
|
* DONE Episode 9 - IR (SLIR) generation
|
|
|
|
|
CLOSED: [2021-10-01 Fri 18:56]
|
2021-09-23 19:24:51 +01:00
|
|
|
|
** Updates:
|
|
|
|
|
- Source manager
|
|
|
|
|
- Diagnostic Engine
|
|
|
|
|
- JIT
|
|
|
|
|
|
|
|
|
|
There will be an episode dedicated to eache of these
|
|
|
|
|
** How does IR generation works
|
|
|
|
|
- Pass around MLIR context
|
|
|
|
|
- Create Builder objects that creates operations in specific
|
|
|
|
|
locations
|
|
|
|
|
- ModuleOp
|
|
|
|
|
- Namespace
|
|
|
|
|
** How to define a new dialect
|
|
|
|
|
- Pure C++
|
|
|
|
|
- Tablegen
|
|
|
|
|
** SLIR
|
|
|
|
|
*** The SLIR goal
|
|
|
|
|
- An IR that follows the AST
|
|
|
|
|
- Rename?
|
|
|
|
|
*** Steps
|
|
|
|
|
- [X] Define the new dialect
|
|
|
|
|
- [X] Setup the tablegen
|
|
|
|
|
- [X] Define the operations
|
|
|
|
|
- [X] Walk the AST and generate the operations
|
2021-10-06 18:48:48 +01:00
|
|
|
|
|
2021-10-16 16:15:56 +01:00
|
|
|
|
* DONE Episode 10 - Pass Infrastructure
|
|
|
|
|
CLOSED: [2021-10-15 Fri 14:17]
|
2021-10-06 18:48:48 +01:00
|
|
|
|
** The next Step
|
|
|
|
|
** Updates:
|
|
|
|
|
*** CMake changes
|
|
|
|
|
** What is a Pass
|
|
|
|
|
*** Passes are the unit of abstraction for optimization and transformation in LLVM/MLIR
|
|
|
|
|
*** Compilation is all about transforming the input data and produce an output
|
|
|
|
|
|
|
|
|
|
Source code -> IR X -> IR Y -> IR Z -> ... -> Target Code
|
|
|
|
|
|
|
|
|
|
*** Almost like a function composition
|
|
|
|
|
*** The big picture
|
|
|
|
|
*** Pass Managers (Pipelines) are made out of a collection of passes and can be nested
|
|
|
|
|
*** The most of the interesting parts of the compiler reside in Passes.
|
|
|
|
|
*** We will probably spend most of our time working with passes
|
|
|
|
|
|
|
|
|
|
** Pass Infrastructure
|
|
|
|
|
*** ODS or C++
|
|
|
|
|
*** Operation is the main abstract unit of transformation
|
|
|
|
|
*** OperationPass is the base class for all the passes.
|
|
|
|
|
*** We need to override =runOnOperation=
|
|
|
|
|
*** There's some rules you need to follow when defining your Pass
|
|
|
|
|
**** Must not maintain any global mutable state
|
|
|
|
|
**** Must not modify the state of another operation not nested within the current operation being operated on
|
|
|
|
|
**** ...
|
|
|
|
|
|
|
|
|
|
*** Passes are either OpSpecific or OpAgnostic
|
|
|
|
|
**** OpSpecific
|
|
|
|
|
#+BEGIN_SRC C++
|
|
|
|
|
struct MyFunctionPass : public PassWrapper<MyFunctionPass,
|
|
|
|
|
OperationPass<FuncOp>> {
|
|
|
|
|
void runOnOperation() override {
|
|
|
|
|
// Get the current FuncOp operation being operated on.
|
|
|
|
|
FuncOp f = getOperation();
|
|
|
|
|
|
|
|
|
|
// Walk the operations within the function.
|
|
|
|
|
f.walk([](Operation *inst) {
|
|
|
|
|
// ....
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
/// Register this pass so that it can be built via from a textual pass pipeline.
|
|
|
|
|
/// (Pass registration is discussed more below)
|
|
|
|
|
void registerMyPass() {
|
|
|
|
|
PassRegistration<MyFunctionPass>();
|
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
**** OpAgnostic
|
|
|
|
|
#+BEGIN_SRC C++
|
|
|
|
|
struct MyOperationPass : public PassWrapper<MyOperationPass, OperationPass<>> {
|
|
|
|
|
void runOnOperation() override {
|
|
|
|
|
// Get the current operation being operated on.
|
|
|
|
|
Operation *op = getOperation();
|
|
|
|
|
// ...
|
|
|
|
|
}
|
|
|
|
|
};
|
|
|
|
|
#+END_SRC
|
|
|
|
|
*** How transformation works?
|
|
|
|
|
*** Analyses and Passes
|
|
|
|
|
*** Pass management and nested pass managers
|
|
|
|
|
#+BEGIN_SRC C++
|
|
|
|
|
// Create a top-level `PassManager` class. If an operation type is not
|
|
|
|
|
// explicitly specific, the default is the builtin `module` operation.
|
|
|
|
|
PassManager pm(ctx);
|
|
|
|
|
|
|
|
|
|
// Note: We could also create the above `PassManager` this way.
|
|
|
|
|
PassManager pm(ctx, /*operationName=*/"builtin.module");
|
|
|
|
|
|
|
|
|
|
// Add a pass on the top-level module operation.
|
|
|
|
|
pm.addPass(std::make_unique<MyModulePass>());
|
|
|
|
|
|
|
|
|
|
// Nest a pass manager that operates on `spirv.module` operations nested
|
|
|
|
|
// directly under the top-level module.
|
|
|
|
|
OpPassManager &nestedModulePM = pm.nest<spirv::ModuleOp>();
|
|
|
|
|
nestedModulePM.addPass(std::make_unique<MySPIRVModulePass>());
|
|
|
|
|
|
|
|
|
|
// Nest a pass manager that operates on functions within the nested SPIRV
|
|
|
|
|
// module.
|
|
|
|
|
OpPassManager &nestedFunctionPM = nestedModulePM.nest<FuncOp>();
|
|
|
|
|
nestedFunctionPM.addPass(std::make_unique<MyFunctionPass>());
|
|
|
|
|
|
|
|
|
|
// Run the pass manager on the top-level module.
|
|
|
|
|
ModuleOp m = ...;
|
|
|
|
|
if (failed(pm.run(m))) {
|
|
|
|
|
// Handle the failure
|
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
2021-11-02 22:05:34 +00:00
|
|
|
|
* DONE Episode 11 - Lowering SLIR
|
|
|
|
|
CLOSED: [2021-11-01 Mon 15:14]
|
2021-10-16 16:15:56 +01:00
|
|
|
|
** Overview
|
2021-10-16 20:48:14 +01:00
|
|
|
|
- What is a Pass?
|
|
|
|
|
- Pass Manager
|
2021-10-06 18:48:48 +01:00
|
|
|
|
** Dialect lowering
|
|
|
|
|
*** Why?
|
|
|
|
|
*** Transforming a dialect to another dialect or LLVM IR
|
|
|
|
|
*** The goal is to lower SLIR to LLVM IR directly or indirectly.
|
2021-10-16 16:15:56 +01:00
|
|
|
|
** Dialect Conversions
|
|
|
|
|
This framework allows for transforming a set of illegal operations to a set of legal ones.
|
|
|
|
|
*** Target Conversion
|
|
|
|
|
*** Rewrite Patterns
|
|
|
|
|
*** Type Converter
|
|
|
|
|
** Full vs Partial Conversion
|
2021-11-04 00:59:38 +00:00
|
|
|
|
* DONE Episode 12 - Target code generation
|
|
|
|
|
CLOSED: [2021-11-04 Thu 00:57]
|
2021-11-02 22:05:34 +00:00
|
|
|
|
** Updates:
|
|
|
|
|
*** JIT work
|
|
|
|
|
*** Emacs dev mode
|
|
|
|
|
** So far....
|
|
|
|
|
** Next Step
|
|
|
|
|
*** Compile to object files
|
|
|
|
|
*** Link object files to create an executable
|
2021-11-03 14:06:45 +00:00
|
|
|
|
** End of wiring for static compilers
|
2021-11-02 22:05:34 +00:00
|
|
|
|
** What is an object file?
|
|
|
|
|
*** Symbols
|
|
|
|
|
- A pair of a name and a value
|
|
|
|
|
- Value of a *defined symbol* is an offset in the =Content=
|
|
|
|
|
- *undefined symbols*
|
|
|
|
|
*** Relocations
|
|
|
|
|
Are computation to perform on the =Content=. For example, "set
|
|
|
|
|
this location in the contents to the value of this symbol plus this addend".
|
|
|
|
|
|
|
|
|
|
Linker will apply all the *relocations* in an object file on link time and if
|
|
|
|
|
it can not resolve an undefined symbol most of the time it will raise an
|
|
|
|
|
error (depending on the relocation and the symbol).
|
|
|
|
|
|
|
|
|
|
*** Contents
|
|
|
|
|
- Are what memory should look like during the execution
|
|
|
|
|
- Have a size
|
|
|
|
|
- Have a type
|
|
|
|
|
- Have an array of bytes
|
|
|
|
|
- Has sections like:
|
|
|
|
|
+ .text: The target code generated by the compiler
|
|
|
|
|
+ .data: The values of initialized variables
|
|
|
|
|
+ .rdata: Static unnamed data like literal strings, protocol tables and ....
|
|
|
|
|
+ .bss: Uninitialized variables (the content can be omitted or striped and assume
|
|
|
|
|
to contain only zeros)
|
|
|
|
|
** Linking process
|
|
|
|
|
During the linking process, linker assigns an address to each *defined* symbol
|
|
|
|
|
and tries to =resolve= *undefined* symbols
|
|
|
|
|
*** Linker will
|
|
|
|
|
- reads the object files
|
|
|
|
|
- reads the contents
|
|
|
|
|
+ as raw data
|
|
|
|
|
+ figures out the length
|
|
|
|
|
+ reads the symbols and create a symbol table
|
|
|
|
|
+ link undefined symbols to their definitions (possibly from other obj fils or libs)
|
|
|
|
|
+ decide where all the content should go in the memory
|
|
|
|
|
+ sort them based on the *type*
|
|
|
|
|
+ concat them together
|
|
|
|
|
+ apply relocations
|
|
|
|
|
+ write the result to a file as an executable
|
|
|
|
|
** AOT vs JIT
|
|
|
|
|
** Let's look at some code
|
2021-11-03 14:06:45 +00:00
|
|
|
|
** Resources:
|
|
|
|
|
- [[https://lwn.net/Articles/276782/][20 part linker essay]]
|
2021-12-19 21:45:49 +00:00
|
|
|
|
* DONE Episode 13 - Source Managers
|
|
|
|
|
CLOSED: [2021-12-18 Sat 11:17]
|
2021-11-27 17:49:24 +00:00
|
|
|
|
** FAQ:
|
|
|
|
|
- What tools are you using?
|
2021-11-26 15:43:47 +00:00
|
|
|
|
|
2021-11-04 00:59:38 +00:00
|
|
|
|
** Updates:
|
2021-11-26 15:43:47 +00:00
|
|
|
|
- Still JIT
|
|
|
|
|
- We're going to start the JIT discussion from next EP
|
|
|
|
|
|
2021-11-04 00:59:38 +00:00
|
|
|
|
** Forgot to show case the code generation
|
|
|
|
|
I didn't show it in action
|
2021-11-26 15:43:47 +00:00
|
|
|
|
|
|
|
|
|
** What is a source manager
|
|
|
|
|
- It owns and manages are the source buffers
|
|
|
|
|
- All of our interactions with source files will happen though Source manager
|
|
|
|
|
- Including reading files
|
|
|
|
|
- Loading namespaces
|
|
|
|
|
- Including namespaces
|
|
|
|
|
- ...
|
|
|
|
|
|
|
|
|
|
- LLVM provides a =SourceMgr= class that we're not using it
|
2022-01-09 11:55:34 +00:00
|
|
|
|
* DONE Episode 14 - JIT Basics
|
|
|
|
|
CLOSED: [2022-01-05 Wed 17:37]
|
2021-12-19 21:45:49 +00:00
|
|
|
|
** Updates:
|
|
|
|
|
- Lost my data :_(
|
|
|
|
|
- Fixed some compatibility issues
|
|
|
|
|
- New video series on *How to build an editor with Emacs Lisp*
|
|
|
|
|
|
|
|
|
|
** What is Just In Time Compilation?
|
|
|
|
|
- Compiling at "runtime" (air quote)
|
|
|
|
|
Or it might be better to say, "on demand compilation"
|
|
|
|
|
- Usually in interpreters and Runtimes
|
|
|
|
|
|
|
|
|
|
#+NAME: ep-14-jit-1
|
|
|
|
|
#+BEGIN_SRC graphviz-dot :file /tmp/jit.svg :cmdline -Kdot -Tsvg
|
|
|
|
|
digraph {
|
|
|
|
|
graph [bgcolor=transparent]
|
|
|
|
|
node [color=gray80 shape="box"]
|
|
|
|
|
edge [color=gray80]
|
|
|
|
|
rankdir = "LR"
|
|
|
|
|
|
|
|
|
|
a[label="Some kind of input code"]
|
|
|
|
|
b[label="JIT"]
|
|
|
|
|
c[label="Some sort of target code"]
|
|
|
|
|
d[label="Execute the result"]
|
|
|
|
|
a -> b -> c -> d
|
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
#+RESULTS: ep-14-jit-1
|
|
|
|
|
[[file:/tmp/jit.svg]]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
#+NAME: ep-14-jit-2
|
|
|
|
|
#+BEGIN_SRC graphviz-dot :file /tmp/jit-2.svg :cmdline -Kdot -Tsvg
|
|
|
|
|
|
|
|
|
|
digraph G {
|
|
|
|
|
graph [bgcolor=transparent]
|
|
|
|
|
node [color=gray80 shape="box"]
|
|
|
|
|
edge [color=gray80]
|
|
|
|
|
rankdir = "LR"
|
|
|
|
|
|
|
|
|
|
a[label="Source code"]
|
|
|
|
|
b[label="Parser"]
|
|
|
|
|
c[label="Semantic Analyzer"]
|
|
|
|
|
d[label="IR Generator"]
|
|
|
|
|
e[label="Pass Manager"]
|
|
|
|
|
f[label="Object Layer"]
|
|
|
|
|
g[label="Native Code"]
|
|
|
|
|
z[label="Preload Core libs"]
|
|
|
|
|
a -> b
|
|
|
|
|
b -> c {label="AST"}
|
|
|
|
|
c -> d
|
|
|
|
|
z -> f
|
|
|
|
|
subgraph cluster0 {
|
|
|
|
|
color=lightgrey;
|
|
|
|
|
d -> e -> f
|
|
|
|
|
label = "JIT Engine";
|
|
|
|
|
}
|
|
|
|
|
f -> g
|
|
|
|
|
g -> Store
|
|
|
|
|
g -> Execute
|
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
#+RESULTS: ep-14-jit-2
|
|
|
|
|
[[file:/tmp/jit-2.svg]]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- Trade off
|
|
|
|
|
Compilation speed vs Execution speed
|
|
|
|
|
*** JIT vs Typical interpreters
|
|
|
|
|
|
|
|
|
|
** Why to use JIT?
|
|
|
|
|
- Make the interpreter to run "faster" (air quote again)
|
|
|
|
|
- Speed up the compilation
|
|
|
|
|
Avoid generating the target code and generate some byte-code instead
|
|
|
|
|
and then use a JIT in runtime to execute the byte-code.
|
|
|
|
|
- Use runtime data to find optimization opportunities.
|
|
|
|
|
- Support more archs
|
|
|
|
|
- And many other reasons
|
|
|
|
|
** How we're going to use a JIT?
|
|
|
|
|
- We need a JIT engine to implement Lisp Macros
|
|
|
|
|
- Compile time vs Runtime
|
|
|
|
|
+ Abstraction
|
|
|
|
|
- A JIT engine to just compile Serene code
|
|
|
|
|
- Our compiler will be a fancy JIT engine
|
|
|
|
|
|
|
|
|
|
#+NAME: ep-14-jit-3
|
|
|
|
|
#+BEGIN_SRC graphviz-dot :file /tmp/jit-3.svg :cmdline -Kdot -Tsvg
|
|
|
|
|
digraph {
|
|
|
|
|
graph [bgcolor=transparent]
|
|
|
|
|
node [color=gray80 shape="box"]
|
|
|
|
|
edge [color=gray80]
|
|
|
|
|
rankdir = "LR"
|
|
|
|
|
|
|
|
|
|
a[label="Serene AST"]
|
|
|
|
|
b[label="For every node"]
|
|
|
|
|
c[label="is it a macro call?" shapp="diamond"]
|
|
|
|
|
d[label="Add it to JIT"]
|
|
|
|
|
e[label="Expand it (call it)"]
|
|
|
|
|
f[label="Generate target code"]
|
|
|
|
|
a -> b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
subgraph cluster0 {
|
|
|
|
|
color=lightgrey;
|
|
|
|
|
b -> c
|
|
|
|
|
c -> d [label="NO"]
|
|
|
|
|
c -> e [label="YES"]
|
|
|
|
|
e -> b
|
|
|
|
|
d -> f
|
|
|
|
|
label = "JIT Engine";
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
f -> Store
|
|
|
|
|
f -> Execute
|
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
#+RESULTS: ep-14-jit-3
|
|
|
|
|
[[file:/tmp/jit-3.svg]]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
** LLVM/MLIR and JIT
|
|
|
|
|
- 3 different approaches
|
|
|
|
|
- MLIR's JIT
|
|
|
|
|
- LLVM JITs
|
|
|
|
|
+ MCJIT (Deprecated)
|
|
|
|
|
+ LLJIT (Based on ORCv2)
|
|
|
|
|
+ LazyLLJIT (Based on LLJIT)
|
|
|
|
|
- Use LLVM's ORCv2 directly to create an engine
|
2022-01-30 20:01:32 +00:00
|
|
|
|
* DONE Episode 15 - LLVM ORC JIT
|
|
|
|
|
CLOSED: [2022-01-28 Fri 12:15]
|
2022-01-09 11:55:34 +00:00
|
|
|
|
** Uptades:
|
|
|
|
|
- Created a bare min JIT that:
|
|
|
|
|
- Eagrly compiles namespaces
|
|
|
|
|
- Reload namespacs
|
|
|
|
|
- I guess this the time to start the Serene's Spec
|
|
|
|
|
|
|
|
|
|
** What is ORCv2?
|
|
|
|
|
- On request compiler
|
|
|
|
|
- Replaces MCJIT
|
|
|
|
|
- ORCv2 docs and examples
|
|
|
|
|
- Kaleidoscope tutorial (not complete)
|
|
|
|
|
|
|
|
|
|
Before We can move to Serene's code we need to understand ORC first
|
|
|
|
|
|
|
|
|
|
** Terminology
|
|
|
|
|
*** Execution Session
|
|
|
|
|
A running JIT program. It contains the JITDylibs, error reporting mechanisms, and dispatches
|
|
|
|
|
the materializers.
|
|
|
|
|
|
|
|
|
|
*** JITAddress
|
|
|
|
|
It's just an address of a JITed code
|
|
|
|
|
|
|
|
|
|
*** JITDylib
|
|
|
|
|
Represents a JIT'd dynamic library.
|
|
|
|
|
|
|
|
|
|
This class aims to mimic the behavior of a shared object, but without requiring
|
|
|
|
|
the contained program representations to be compiled up-front. The JITDylib's
|
|
|
|
|
content is defined by adding MaterializationUnits, and contained MaterializationUnits
|
|
|
|
|
will typically rely on the JITDylib's links-against order to resolve external references.
|
|
|
|
|
|
|
|
|
|
JITDylibs cannot be moved or copied. Their address is stable, and useful as
|
|
|
|
|
a key in some JIT data structures.
|
|
|
|
|
|
|
|
|
|
*** MaterializationUnit
|
|
|
|
|
A =MaterializationUnit= represents a set of symbol definitions that can
|
|
|
|
|
be materialized as a group, or individually discarded (when
|
|
|
|
|
overriding definitions are encountered).
|
|
|
|
|
|
|
|
|
|
=MaterializationUnits= are used when providing lazy definitions of symbols to
|
|
|
|
|
JITDylibs. The JITDylib will call materialize when the address of a symbol
|
|
|
|
|
is requested via the lookup method. The =JITDylib= will call discard if a
|
|
|
|
|
stronger definition is added or already present.
|
|
|
|
|
|
|
|
|
|
MaterializationUnit stores in JITDylibs.
|
|
|
|
|
|
|
|
|
|
*** MaterializationResponsibility
|
|
|
|
|
Represents and tracks responsibility for materialization and mediates interactions between
|
|
|
|
|
=MaterializationUnits= and =JITDylibs=. It provides a way for Dylib to find out about the outcome
|
|
|
|
|
of the materialization.
|
|
|
|
|
|
|
|
|
|
*** Memory Manager
|
|
|
|
|
A class that manages how JIT engine should use memory, like allocations and deallocations.
|
|
|
|
|
=SectionMemoryManager= is a simple memory manager that is provided by ORC.
|
|
|
|
|
|
|
|
|
|
*** Layers
|
|
|
|
|
ORC based JIT engines are constructed from several layers. Each layer has a specific responsiblity
|
|
|
|
|
and passes the result of its operation to the next layer. E.g Compile Layer, Link layer and ....
|
|
|
|
|
|
|
|
|
|
*** Resource Tracker
|
|
|
|
|
The API to remove or transfer the ownership of JIT resources. Usually, a resource is a module.
|
|
|
|
|
|
|
|
|
|
*** ThreadSafeModule
|
|
|
|
|
A thread safe container for the LLVM module.
|
|
|
|
|
|
|
|
|
|
** ORC highlevel API
|
|
|
|
|
- ORC provides a Layer based design to that let us create our own JIT engine.
|
|
|
|
|
- It comes with two ready to use engines:
|
|
|
|
|
We will look at their implementaion later
|
|
|
|
|
+ LLJIT
|
|
|
|
|
+ LLLazyJIT
|
|
|
|
|
|
|
|
|
|
** Two major solutions to build a JIT
|
|
|
|
|
- Wrap LLJIT or LLLazyJIT
|
|
|
|
|
- Create your own JIT engine and the wrapper
|
|
|
|
|
|
|
|
|
|
** Resources
|
|
|
|
|
*** Docs
|
|
|
|
|
- https://www.llvm.org/docs/ORCv2.html
|
|
|
|
|
|
|
|
|
|
*** Examples
|
|
|
|
|
- https://github.com/llvm/llvm-project/tree/main/llvm/examples/HowToUseLLJIT
|
|
|
|
|
- https://github.com/llvm/llvm-project/tree/main/llvm/examples/OrcV2Examples/LLJITDumpObjects
|
|
|
|
|
- https://github.com/llvm/llvm-project/tree/main/llvm/examples/OrcV2Examples/LLJITWithInitializers
|
|
|
|
|
- https://github.com/llvm/llvm-project/tree/main/llvm/examples/OrcV2Examples/LLJITWithLazyReexports
|
|
|
|
|
|
|
|
|
|
*** Talks
|
|
|
|
|
- [[https://www.youtube.com/watch?v=i-inxFudrgI][ORCv2 -- LLVM JIT APIs Deep Dive]]
|
|
|
|
|
- [[https://www.youtube.com/watch?v=MOQG5vkh9J8][Updating ORC JIT for Concurrency]]
|
|
|
|
|
- [[https://www.youtube.com/watch?v=hILdR8XRvdQ][ORC -- LLVM's Next Generation of JIT API]]
|
2022-02-27 19:10:30 +00:00
|
|
|
|
* DONE Eposide 16 - ORC Layers
|
|
|
|
|
CLOSED: [2022-02-26 Sat 12:50]
|
2022-01-30 20:01:32 +00:00
|
|
|
|
** Updates:
|
|
|
|
|
*** Support for adding AST directly to the JIT
|
|
|
|
|
*** Minor change to SLIR (big changes are coming)
|
|
|
|
|
*** Started to unify the llvm::Errors with Serene errors
|
|
|
|
|
*** Tablegen backend for error classes
|
|
|
|
|
** The plan for today
|
|
|
|
|
- We had a brief look at LLJIT/LLLazyJIT
|
|
|
|
|
- Better understanding
|
|
|
|
|
To understand them better we need to understand other components first. Starting from *layers*.
|
|
|
|
|
- We'll have a look at how to define our own layers in the future episodes.
|
|
|
|
|
|
|
|
|
|
** What are Layers?
|
|
|
|
|
- Layers are the basic blocks of an engine
|
|
|
|
|
- They are composable (kinda)
|
|
|
|
|
- Each layer has it's own requirements and details
|
|
|
|
|
- Each layer holds a reference to it's downstream layer
|
|
|
|
|
|
|
|
|
|
#+NAME: ep-16-jit-1
|
|
|
|
|
#+BEGIN_SRC graphviz-dot :file /tmp/ep16-1.svg :cmdline -Kdot -Tsvg
|
|
|
|
|
digraph {
|
|
|
|
|
graph [bgcolor=transparent]
|
|
|
|
|
node [color=gray80 shape="box"]
|
|
|
|
|
edge [color=gray80]
|
|
|
|
|
rankdir = "LR"
|
|
|
|
|
|
|
|
|
|
a[label="Input Type A"]
|
|
|
|
|
b[label="Input Type B"]
|
|
|
|
|
c[label="Layer A"]
|
|
|
|
|
d[label="Layer B"]
|
|
|
|
|
e[label="Layer C"]
|
|
|
|
|
f[label="Layer D"]
|
|
|
|
|
g[label="Layer E"]
|
|
|
|
|
h[label="Target Code"]
|
|
|
|
|
a -> c
|
|
|
|
|
b -> d
|
|
|
|
|
|
|
|
|
|
subgraph cluster0 {
|
|
|
|
|
color=lightgrey;
|
|
|
|
|
c -> e
|
|
|
|
|
d -> e
|
|
|
|
|
e -> f
|
|
|
|
|
f -> g
|
|
|
|
|
|
|
|
|
|
label = "JIT Engine";
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
g -> h
|
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
#+RESULTS: ep-16-jit-1
|
|
|
|
|
[[file:/tmp/ep16-1.svg]]
|
|
|
|
|
|
|
|
|
|
** Kaleidoscope JIT
|
|
|
|
|
|
|
|
|
|
- Chapter 1
|
|
|
|
|
|
|
|
|
|
#+NAME: ep-16-jit-2
|
|
|
|
|
#+BEGIN_SRC graphviz-dot :file /tmp/ep16-2.svg :cmdline -Kdot -Tsvg
|
|
|
|
|
digraph {
|
|
|
|
|
graph [bgcolor=transparent]
|
|
|
|
|
node [color=gray80 shape="box"]
|
|
|
|
|
edge [color=gray80]
|
|
|
|
|
rankdir = "LR"
|
|
|
|
|
|
|
|
|
|
a[label="LLVM IR Module"]
|
|
|
|
|
b[label="Compiler Layer"]
|
|
|
|
|
c[label="Object Layer"]
|
|
|
|
|
d[label="Target Code"]
|
|
|
|
|
|
|
|
|
|
a -> b
|
|
|
|
|
|
|
|
|
|
subgraph cluster0 {
|
|
|
|
|
color=lightgrey;
|
|
|
|
|
b -> c
|
|
|
|
|
|
|
|
|
|
label = "Kaleidoscope JIT";
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
c -> d
|
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
#+RESULTS: ep-16-jit-2
|
|
|
|
|
[[file:/tmp/ep16-2.svg]]
|
|
|
|
|
|
|
|
|
|
- Chapter 2
|
|
|
|
|
#+NAME: ep-16-jit-3
|
|
|
|
|
#+BEGIN_SRC graphviz-dot :file /tmp/ep16-3.svg :cmdline -Kdot -Tsvg
|
|
|
|
|
digraph {
|
|
|
|
|
graph [bgcolor=transparent]
|
|
|
|
|
node [color=gray80 shape="box"]
|
|
|
|
|
edge [color=gray80]
|
|
|
|
|
rankdir = "LR"
|
|
|
|
|
|
|
|
|
|
a[label="LLVM IR Module"]
|
|
|
|
|
b[label="Compiler Layer"]
|
|
|
|
|
c[label="Object Layer"]
|
|
|
|
|
e[label="Target Code"]
|
|
|
|
|
d[label="Optimize layer"]
|
|
|
|
|
a -> d
|
|
|
|
|
|
|
|
|
|
subgraph cluster0 {
|
|
|
|
|
color=lightgrey;
|
|
|
|
|
d -> b
|
|
|
|
|
b -> c
|
|
|
|
|
|
|
|
|
|
label = "Kaleidoscope JIT";
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
c -> e
|
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
#+RESULTS: ep-16-jit-3
|
|
|
|
|
[[file:/tmp/ep16-3.svg]]
|
2022-02-27 19:10:30 +00:00
|
|
|
|
|
2022-03-29 19:55:42 +01:00
|
|
|
|
* DONE Episode 17 - Custom ORC Layers
|
|
|
|
|
CLOSED: [2022-03-28 Mon 14:00]
|
2022-02-27 19:10:30 +00:00
|
|
|
|
** Updates:
|
|
|
|
|
- Finished the basic compiler wiring
|
|
|
|
|
- Restructured the source tree
|
|
|
|
|
- Tweaked the build system mostly for install targets
|
|
|
|
|
- Refactoring, cleaning up the code and writing tests
|
|
|
|
|
** Quick overview an ORC based JIT engine
|
|
|
|
|
- JIT engines are made out of layers
|
|
|
|
|
- Engines have a hierarchy of layers
|
|
|
|
|
- Layers don't know about each other
|
|
|
|
|
- Layers wrap the program representation in a =MaterializationUnit=, which is
|
|
|
|
|
then stored in the =JITDylib=.
|
|
|
|
|
- =MaterializationUnits= are responsible for describing the definitions they provide,
|
|
|
|
|
and for unwrapping the program representation and passing it back to the layer when
|
|
|
|
|
compilation is required.
|
|
|
|
|
- When a =MaterializationUnit= hands a program representation back to the layer it comes
|
|
|
|
|
with an associated =MaterializationResponsibility= object. This object tracks the
|
|
|
|
|
definitions that must be materialized and provides a way to notify the =JITDylib= once
|
|
|
|
|
they are either successfully materialized or a failure occurs.
|
|
|
|
|
|
|
|
|
|
** In order to build a custom layer we need:
|
|
|
|
|
*** A custom materialization unit
|
|
|
|
|
Let's have a look at the =MaterializationUnit= class.
|
|
|
|
|
|
|
|
|
|
*** And the layer class itself
|
|
|
|
|
The layer classes are not special but conventionally the come with few functions
|
|
|
|
|
like: =add=, =emit= and =getInterface=.
|
2022-03-29 19:55:42 +01:00
|
|
|
|
|
2022-03-29 19:56:48 +01:00
|
|
|
|
* DONE Episode 18 - JIT Engine Part 1
|
|
|
|
|
CLOSED: [2022-03-29 Tue 19:56]
|
2022-03-29 19:55:42 +01:00
|
|
|
|
** =Halley= JIT Engine
|
|
|
|
|
- It's not the final implementation
|
|
|
|
|
- Wraps LLJIT and LLLazyJIT
|
|
|
|
|
- Uses object cache layer
|
|
|
|
|
- Supports ASTs and Namespaces
|
2022-06-13 12:45:06 +01:00
|
|
|
|
|
|
|
|
|
* DONE Episode 19 - JIT Engine Part 2
|
|
|
|
|
CLOSED: [2022-05-04 Wed 21:30]
|
|
|
|
|
** How Serene is different from other programming langs?
|
|
|
|
|
- Serene is just a JIT engine
|
|
|
|
|
- Compiletime vs Runtime
|
|
|
|
|
+ The borderline is not clear in case of Serene
|
|
|
|
|
|
|
|
|
|
The Big picture
|
|
|
|
|
#+NAME: ep-19-jit-1
|
|
|
|
|
#+BEGIN_SRC graphviz-dot :file /tmp/ep19-1.svg :cmdline -Kdot -Tsvg
|
|
|
|
|
digraph {
|
|
|
|
|
fontcolor="gray80"
|
|
|
|
|
|
|
|
|
|
graph [bgcolor=transparent]
|
|
|
|
|
node [color=gray80 shape="box", fontcolor="gray80"]
|
|
|
|
|
edge [color=gray80, fontcolor="gray80"]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
input_ns[label="Input Namespace"]
|
|
|
|
|
|
|
|
|
|
ast[label="AST"]
|
|
|
|
|
vast[label="Valid AST"]
|
|
|
|
|
ir[label="LLVM IR"]
|
|
|
|
|
|
|
|
|
|
subgraph cluster_2 {
|
|
|
|
|
label="REPL"
|
|
|
|
|
color="gray80"
|
|
|
|
|
|
|
|
|
|
graph [bgcolor=transparent, fontcolor="gray80"]
|
|
|
|
|
node [color=gray80 shape="box", fontcolor="gray80"]
|
|
|
|
|
|
|
|
|
|
input_form[label="Input Form"]
|
|
|
|
|
result[label="Evaluation result"]
|
|
|
|
|
result -> input_form[label="Loop"]
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
subgraph cluster_0 {
|
|
|
|
|
label="JIT"
|
|
|
|
|
color="gray80"
|
|
|
|
|
|
|
|
|
|
graph [bgcolor=transparent, fontcolor="gray80"]
|
|
|
|
|
node [color=gray80 shape="box", fontcolor="gray80"]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
execute[label="Execute native code"]
|
|
|
|
|
binary[label="Binary File"]
|
|
|
|
|
|
|
|
|
|
subgraph cluster_1 {
|
|
|
|
|
label="AddAst/Ns"
|
|
|
|
|
color="gray80"
|
|
|
|
|
|
|
|
|
|
graph [bgcolor=transparent, fontcolor="gray80"]
|
|
|
|
|
node [color=gray80 shape="box", fontcolor="gray80"]
|
|
|
|
|
|
|
|
|
|
vast -> ir [label="Compile (No optimization)"]
|
|
|
|
|
|
|
|
|
|
subgraph cluster_4 {
|
|
|
|
|
label="Macro Expansion"
|
|
|
|
|
color="gray80"
|
|
|
|
|
|
|
|
|
|
graph [bgcolor=transparent, fontcolor="gray80"]
|
|
|
|
|
node [color=gray80 shape="box", fontcolor="gray80"]
|
|
|
|
|
|
|
|
|
|
vast -> macros [label=" Find the macros"]
|
|
|
|
|
macros -> JITDylib [label=" look up the required\n Symbols in compiled code"]
|
|
|
|
|
JITDylib -> symbols [label=" lookup"]
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
wrapped_code[lable="Wrapped IR"]
|
|
|
|
|
ir -> wrapped_code[label= " Wrap top level Forms in \nfunctions/calls"]
|
|
|
|
|
wrapped_code -> native [label=" Compile (No optimization)"]
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
symbols -> execute [label="Execute the functions mapped to the symbols"]
|
|
|
|
|
execute -> vast
|
|
|
|
|
execute -> result [label=" Print"]
|
|
|
|
|
execute -> binary [label=" Dump"]
|
|
|
|
|
|
|
|
|
|
native -> execute [label="invoke"]
|
|
|
|
|
native -> JITDylib [label="Add"]
|
|
|
|
|
JITDylib -> Context [label="Store as part the namespace"]
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
subgraph cluster_3 {
|
|
|
|
|
label="CLI interface"
|
|
|
|
|
color="gray80"
|
|
|
|
|
|
|
|
|
|
graph [bgcolor=transparent, fontcolor="gray80"]
|
|
|
|
|
node [color=gray80 shape="box", fontcolor="gray80"]
|
|
|
|
|
|
|
|
|
|
input_ns -> file [label=" resolve to file"]
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
input_form -> ast [label=" read"]
|
|
|
|
|
file -> ast [label=" read"]
|
|
|
|
|
ast -> vast [label=" Semantic Analysis"]
|
|
|
|
|
}
|
|
|
|
|
#+END_SRC
|
|
|
|
|
|
|
|
|
|
#+RESULTS: ep-19-jit-1
|
|
|
|
|
[[file:/tmp/ep19-1.png]]
|
|
|
|
|
** Let's look at some code
|
2022-06-13 18:26:14 +01:00
|
|
|
|
* Episode 20 - Future Roadmap
|
|
|
|
|
** So Far
|
|
|
|
|
- We created a bare bone and minimal compiler
|
|
|
|
|
+ That is capable of just in time and ahead of time compilation
|
|
|
|
|
- We had an over of MLIR and pass management
|
|
|
|
|
- We didn't spend time on fundamentals
|
|
|
|
|
** Design change
|
|
|
|
|
- The current implementation is suitable for a static compiler
|
|
|
|
|
- We want to move toward a more dynamic compiler
|
|
|
|
|
** What's next?
|
|
|
|
|
- Part 2
|
|
|
|
|
- We're going to focus on some of the compiler fundamentals
|
|
|
|
|
- We will create simple utilities to help us in our journey
|
|
|
|
|
- Hopefully we will talk about type systems
|
|
|
|
|
- We're going to sharpen our skills on LLVM/MLIR
|
|
|
|
|
- I'm going to work on the new design
|