Sunday, 19 May 2013

How does a compiler work?

A compiler is a program that translates the source code of another program from a programming language into executable code.

The source code is typically written in a high-level programming language (e.g. Pascal, C, C++, Java, Perl, or C#). The executable code may be a sequence of machine instructions that the CPU can execute directly, or it may be an intermediate representation that is interpreted by a virtual machine (e.g. Java byte code).

In short, a compiler converts a program from a human-readable format into a machine-readable format.
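To make the "intermediate representation" case concrete, here is a small sketch using Python's built-in compile() function. Python is not one of the languages listed above; it is used here only as an assumption-free way to watch source text become byte code that a virtual machine then runs, much like Java byte code on the JVM. The variable names are made up for the illustration.

```python
# Human-readable source text, held as an ordinary string.
source = "result = 6 * 7"

# compile() translates the text into a code object that holds
# byte code for the CPython virtual machine.
code = compile(source, "<example>", "exec")

# The byte code itself is just a sequence of bytes: machine-readable,
# not meant for humans.
print(type(code.co_code))

# The virtual machine executes the byte code; names it defines end up
# in the namespace dictionary we pass in.
namespace = {}
exec(code, namespace)
print(namespace["result"])  # 42
```

The exact bytes in co_code differ between Python versions, which is why the sketch prints only their type.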

How a compiler actually works is complicated: there are whole books and university courses on the subject. I will briefly outline the main stages of the process, but this is only a cursory overview.

1.    Lexing - break up the text of the program into "tokens" (keywords, identifiers, literals, operators, etc.)

2.    Parsing - convert the sequence of tokens into a parse tree, which is a data structure representing various language constructs: type declarations, variable declarations, function definitions, loops, conditionals, expressions, etc.

3.    Optimization - evaluate constant expressions, optimize away unused variables or unreachable code, unroll loops if possible, etc.

4.    Code generation - translate the parse tree into machine instructions (or JVM byte code)
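The four stages above can be sketched as a toy compiler for arithmetic expressions. This is a minimal illustration, not how any real compiler is built: the tuple-based tree, the PUSH/LOAD/OP instruction set, and the small stack machine that runs it are all assumptions invented for this example.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TOKEN_RE = re.compile(r"\s*([0-9]+|[A-Za-z_]\w*|[-+*()])")

def lex(source):
    """Stage 1: break the text into a list of token strings."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(tokens):
    """Stage 2: build a tree of ("num", n), ("var", name), ("binop", op, l, r)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def factor():
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        tok = advance()
        if tok.isdigit():
            return ("num", int(tok))
        if tok == "(":
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok[0].isalpha() or tok[0] == "_":
            return ("var", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("binop", "*", node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = ("binop", op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

def fold(node):
    """Stage 3: constant folding, i.e. evaluate constant subexpressions now."""
    if node[0] != "binop":
        return node
    _, op, left, right = node
    left, right = fold(left), fold(right)
    if left[0] == "num" and right[0] == "num":
        return ("num", OPS[op](left[1], right[1]))
    return ("binop", op, left, right)

def generate(node):
    """Stage 4: emit instructions for a hypothetical stack machine."""
    if node[0] == "num":
        return [("PUSH", node[1])]
    if node[0] == "var":
        return [("LOAD", node[1])]
    _, op, left, right = node
    return generate(left) + generate(right) + [("OP", op)]

def compile_expr(source):
    return generate(fold(parse(lex(source))))

def run(code, env):
    """A tiny virtual machine that executes the generated instructions."""
    stack = []
    for kind, arg in code:
        if kind == "PUSH":
            stack.append(arg)
        elif kind == "LOAD":
            stack.append(env[arg])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[arg](left, right))
    return stack.pop()

program = compile_expr("x * (2 + 3)")
print(program)                 # [('LOAD', 'x'), ('PUSH', 5), ('OP', '*')]
print(run(program, {"x": 4}))  # 20
```

Note that the constant expression 2 + 3 never reaches the generated code: the optimization stage has already replaced it with 5.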

Again, I stress that this is a very brief description. Modern compilers are very smart, and consequently very complicated.

