Re: [sdcc-devel] Dissection of a compiler


Erik, Borut,

Thanks for your comments.

>> - do all compilation steps process the whole source file before passing to
>> the next step? (My guess is - not, this is valid only for preprocessing,
>> assembling and linking - the rest is called from the parser
>> function-wise???).
>
>The preprocessing and the rest of the compiling (but not the linking and
>assembling) effectively operate in parallel. Although the preprocessor
>runs as a separate program, its output is piped back to the main compiler
>which parses the preprocessed results as they are generated.

Line by line?
But would there be any difference if the preprocessor were run standalone and its complete output then fed to the rest (parser etc.)? That is, is any information passed between the preprocessor and "the rest" other than the preprocessed lines themselves?
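
To make the question concrete, here is a minimal sketch of the two arrangements. It is a toy model, not SDCC code: `preprocess` and `parse` are hypothetical stand-ins, and the sketch assumes that text lines are the only thing handed from one stage to the other. Under that assumption the streamed and standalone runs give the same result; whether the assumption holds for SDCC is exactly what is being asked.

```python
import io

def preprocess(lines):
    """Toy preprocessor (hypothetical): drop '#' directive lines, yield the rest."""
    for line in lines:
        if not line.lstrip().startswith("#"):
            yield line

def parse(lines):
    """Toy stand-in for 'the rest': collect the non-empty lines, stripped."""
    return [line.strip() for line in lines if line.strip()]

source = "#define X 1\nint a;\n\nint b;\n"

# Streamed: the parser pulls each line as the preprocessor produces it.
streamed = parse(preprocess(io.StringIO(source)))

# Standalone: run the preprocessor to completion, then hand over the whole text.
whole = "".join(preprocess(io.StringIO(source)))
batch = parse(io.StringIO(whole))
```

If the stages also shared state behind the scenes (for example, variables set by one and read by the other), the two runs could diverge, and the toy model above would no longer apply.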

As far as I understand, "the rest" shares a complete set of variables, so it cannot simply be split. After the lexer/parser chews up a bit of code (how much: see the question below), it passes that bit on to the subsequent processing steps (as you describe below), which are all carried out one after another on this particular bit of code. Correct?

However, a question on the "processing bit": sdcc/doc/random-notes.txt briefly mentions "eBBlock" (basic block). Is this the "processing bit"? Is it the same as a function?

>> parser:
>> - any other sources for the parser (.y, .lex, port->keywords)?
>
>The only other thing I can think of is there's a hook that allows a port
>to look at the #pragma directives; it can parse that however it likes.

OK, thanks, I had noticed that already.

>
>> - any further reading on AST?
>
>The concept of the annotated syntax tree is discussed in many compiler
>textbooks. The implementation details, however, can widely vary because
>they are dependent on the language's syntax and what sort of annotations
>are desired/useful. The AST is essentially just a data structure to
>represent a program's source code so that the rest of the compiler doesn't
>have to worry about the details of parsing.
>

Well, I see the purpose; I was just hoping :-) that there is some explanation of the output format produced by --dumptree...

>> - what is the purpose of the .y and the .lex? Any documentation for the
>> syntax?
>
>The .lex file generates the lexer which is what recognizes groups of
>characters as "words" (tokens). The .y file generates the parser which
>takes the output of the lexer and attempts to fit them into the defined
>grammar. The traditional programs that processed these files are lex and
>yacc; common alternatives are flex and bison.
>
>I have a paper copy of _lex & yacc_ (ISBN: 1-56592-000-7) that I use for
>reference. Some online references can also be found at:
>
>  http://dinosaur.compilertools.net/
>

Thanks, I'll try to have a look...
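
As a note for the wiki, the lexer/parser split described in the quote above can be illustrated with a small hand-written sketch. This is not how the .lex and .y files work internally (lex/yacc generate table-driven code from the rules), and the tiny grammar and all names below are made up for illustration. It only shows the division of labour: the lexer groups characters into tokens, and the parser fits the tokens into a grammar and builds a tree.

```python
import re

# One token per match: either a run of digits or a single other character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")

def lex(text):
    """Lexer stage: group characters into (kind, value) tokens."""
    tokens = []
    for number, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("NUM", int(number)))
        elif op.strip():  # skip stray whitespace matched by '.'
            tokens.append(("OP", op))
    return tokens

# Parser stage for the toy grammar:
#   expr : term ('+' term)*
#   term : NUM ('*' NUM)*
def parse_expr(tokens, pos=0):
    node, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == ("OP", "+"):
        right, pos = parse_term(tokens, pos + 1)
        node = ("+", node, right)
    return node, pos

def parse_term(tokens, pos):
    node, pos = parse_num(tokens, pos)
    while pos < len(tokens) and tokens[pos] == ("OP", "*"):
        right, pos = parse_num(tokens, pos + 1)
        node = ("*", node, right)
    return node, pos

def parse_num(tokens, pos):
    if pos >= len(tokens) or tokens[pos][0] != "NUM":
        raise SyntaxError(f"number expected at token {pos}")
    return tokens[pos][1], pos + 1

tokens = lex("1 + 2 * 3")
tree, _ = parse_expr(tokens)
```

The nested tuple in `tree` plays the role of a (very bare) syntax tree: the later stages can walk it without knowing anything about characters or tokens.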

>> What is the next step in processing of the code?
>
>After each function is parsed: 1) the AST for the function is converted to
>intermediate code, 2) processor independent optimizations are performed on
>the intermediate code, 3) processor dependent optimizations are performed
>on the intermediate code and register usage is determined, 4) assembly
>code is generated from the intermediate code, and 5) the assembly code is
>peephole optimized.
>
>After reaching the end of the source file, the initializers for any
>non-const global or static variables also go through these same steps.

I'll update the "processing steps" in the wiki accordingly.
However, one more request, and I know I am asking for quite a lot now: can anybody describe this process in terms of source files/functions?
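
For reference, the control flow described in the quote above can be sketched as a driver loop. This is only a model of the ordering: every name below is a placeholder invented for illustration, not an actual SDCC source file or function, and mapping the stages to the real ones is precisely the open request.

```python
# Per-function stages, in the order given in the quoted description.
# All stage names are placeholders, not SDCC identifiers.
STAGES = [
    "ast_to_icode",                               # 1) AST -> intermediate code
    "optimize_independent",                       # 2) processor-independent optimizations
    "optimize_dependent_and_allocate_registers",  # 3) processor-dependent + register usage
    "generate_assembly",                          # 4) intermediate code -> assembly
    "peephole",                                   # 5) peephole optimization
]

def run_stages(name, log):
    """Record each stage being applied, in order, to one unit of code."""
    for stage in STAGES:
        log.append((name, stage))

def compile_unit(functions, has_global_initializers):
    """Run all stages on each function as it is parsed, then on the initializers."""
    log = []
    for fn in functions:
        run_stages(fn, log)
    if has_global_initializers:
        # After the end of the source file: initializers for non-const
        # global or static variables go through the same steps.
        run_stages("<initializers>", log)
    return log

log = compile_unit(["main", "helper"], True)
```

The point of the sketch is that all five stages finish for one function before the next function is touched, rather than each stage sweeping over the whole file.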

Jan Waclawek
