EDAN65 - Compilers

Ludde127 2024-10-20
L

Links

Course Homepage

F1

We use moodle and there is a week by week summary of what to do. The book is available online. The book is: Modern compiler implementation in java, A.W Appel, Jens Palberg. Available online through lund university.

There is a assignment 0 for the first week which is not mandatory. Then the real, mandatory labs start (there are 6 mandatory labs).

There are optional quizzes after almost all lectures, which can be used to reinforce what has been learned. There are some optional exercises which are good for the exam. There are some exercises in the book without solutions in the book.

We can ask and answer questions on moodle.

Estimated time per lab.

  1. 1-4h
  2. 5h
  3. 15h
  4. 12h
  5. 18h
  6. 15h
  7. 12h

What happens after compilation?

Source code -> compiler -> assembly ode -> assembler -> object code -> [lib object code ] -> linker -> executable -> loader

Object code contains global symbols and relocatable addresses. In executable code global symbols and relocatable addresses have been replaced by absolute addresses.

What about java?

A.java -> javac compiler -> A.class (class file, java bytecode)

Example csum = a + b + 1; -> javac compiler -> iload_1, iload_2, iadd, iconst_1, iadd, istore_3

Running java code?

A.class -> bytecode -> machine code (jit) -> optimized machine code

Sometimes interpret it without jit?

Why have the JVM

Becomes more dynamic. It could theoretically run faster than a precompiled program as the compiler has access to more information.

Inside the compiler

  1. Source code
  2. scanning (lexical analysis) [tokens]
  3. Syntactic analysis (parsing) [AST]
  4. Semantic analysis [Attributed AST]
  5. Intermediate code generation [Attributed AST]
  6. Optimization [Intermediate code] (Start of backend, independent on source language, while front end was independent on target language)
  7. Target code generation [target code]

Multiple languages and target machines

Different machines can use the same intermediate representation.

LLVM example

If you can compile your language to LLVM you can use al its optimizations.

Java example

For java there exists different jvms which can be used when some specific features are needed.

Lexical analysis

Takes the source code which is just text and turns it into tokens. We have a fixed amount of tokens which the language is made up of. The scanner then tries to identify each one of those tokens. The scanner turns a sequence of text into a sequence of tokens. Examples of tokens "while" "lpar" "id" "leq".

Syntactic analysis

The parser takes the tokens and tries to turn them into a ST (Syntax tree). It goes through the tokens and tries to build a tree which fits some rules.

A AST (Abstract syntax tree) is an abstraction of the ST. When we have a AST we can do a lot of checks and start to do interesting stuff with it.

Intermediate code generation

To do this we add attributes.

Generating the compiler

We can use regular expressions -> scanner generator -> lexical analyser

context-free grammar -> parser generator -> parser

attribute grammar -> attribute evaluator generator -> semantic analysis

...

Program errors

lexical errors when text could not be interpreted as a token. During parsing you could detect tokens in the wrong order. In semantic analysis wrong use of names or types could be detected. The intepreter/machine can then detect runtime errors such as null pointer exception or division by zero.

Flashcards for the exam

Card: 1/23 Score:  0.0
Press to start