Hey all,

This is my first time posting on GoExpert, so I'm excited to see what the community here has to offer! I am currently writing a compiler in C (using Flex and Bison) for a subset of the Pascal programming language in my Compilers course... and to be honest, I'm a bit stuck!

I think I've made a lot of progress, but I'm stuck on the transition from semantic analysis to code generation... and what's worse, I'm not entirely sure I've gotten the semantic analysis part down!

I've created a Lex file (.l), a Bison/YACC file for my grammar (the parser), a symbol table file that holds my insert() and lookup() methods for the symbol records, and a code generation file (although there are some pieces of code in there that I'm unsure about, such as how I translated the STORE/JUMP instructions).

Anyway, let me get to the specific question/request: I need some help modifying my .y Bison file to install identifiers, generate code, and back-patch labels for the different parts of the Pascal grammar. For example:

Here's a small snippet of my grammar (starting symbol):

program:            program_header   block   PERIOD
;
program_header: PROGRAM IDENT SEMICOLON { install( $2 ); }
;


** where install( $2 ) installs the identifier (symbol name) into the symbol table. **
** where lowercase are non-terminals and UPPERCASE are terminals (Keywords in Pascal) **

I really have plenty of questions and would probably bog down this forum post if I went into any more random detail... if you have any clue about Compiler Design using Flex/Lex and Bison/YACC, I would REALLY appreciate your attention and help. I could and probably would need to go into more detail if I get some responses.

Just let me know what I need to elaborate on or to upload any files you need to view like: the mini-Pascal grammar, my lex file, my bison file, my symbol table, code generator, etc.

Thanks in advance,

-Krysis09

This is no GoExpert, yet we do "have a clue".
So, what is your question?

Heh, sorry for the slip up there... anyway, on to my question. I've been referencing Anthony A. Aaby's book: Compiler Construction using Flex and Bison at the following link: Aaby's Book

I am at the point in my compiler where I need to modify my parser file (.y file) to both install identifiers and generate code/labels for the grammar (pg. 38 in Aaby's book) of the Pascal subset... but I'm confused as to how to go about it. More specifically, I don't understand WHEN to "install" identifiers, generate code, or, as Aaby put it, "context_check" identifiers in the grammar. I tried adding my calls to install and gen_code, but I doubt I'm doing it correctly...

Install, gen_code(), and context_check() are all methods described by Mr. Aaby which:
1. Place a symbol into the symbol table
2. Generate the code for a particular terminal symbol (I think!)
and
3. Check whether a particular symbol exists in the symbol table... (or something to that effect).

Here are the methods I've written (basically based off of Aaby's methods) to install, generate code, and check the context of symbols of the grammar:

/* Adds sym_name to the symbol table; reports an error if it is already there. */
void install( char *sym_name )
{
        symrec *s;

        s = lookup( sym_name );
        if( s == 0 )
                s = insert( sym_name );
        else
        {
                errors++;
                printf( "%s is already defined\n", sym_name );
        }
}

/* Generates code at the current offset.
   Defined before context_check() so the call there sees this definition. */
void gen_code( enum code_ops operation, int arg )
{
        code[code_offset].op = operation;
        code[code_offset++].arg = arg;
}

/* Emits `operation` on the identifier's offset if it is declared;
   otherwise reports an error. */
void context_check( enum code_ops operation, char *sym_name )
{
        symrec *identifier;

        identifier = lookup( sym_name );

        if( identifier == 0 )
        {
                errors++;
                printf( "%s is an undeclared identifier.\n", sym_name );
        }
        else
                gen_code( operation, identifier->offset );
}

Here insert() and lookup() are methods in my symbol table header file that insert a new symbol and look up a symbol name in the symbol table, respectively.

Attached is my .y file and the grammar for the subset of Pascal that I'll be creating the compiler for.

Hopefully, I've explained myself well enough.

EDIT: Symtab.h contains my symbol table, codeGen.h = code generator, stackMachine = contains instructions, pascalScanner = lex file, pascalParser = parser (Bison/YACC) file, pascalGrammar = actual Pascal grammar.

I only had a brief look at the grammar file.

One thing that strikes me immediately is that you call context_check in the variable_decl rule. This is definitely wrong. The variable declaration introduces a new identifier, so you want to install it into the symbol table, rather than make sure it already exists.

Another is that you do not need to (and in fact, cannot) context_check on expressions, as you do in the 2nd and 3rd choices in the statement rule. You are correctly checking at the IDENT choice, and that should be enough.
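To make the declaration/use distinction concrete, here is a minimal sketch of the two helpers a grammar action might call. It is not Aaby's code or the poster's: the symrec layout, the linked-list table, and the version of context_check that returns the record (instead of generating code) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol record: a name plus a data offset. */
typedef struct symrec {
    char *name;
    int offset;
    struct symrec *next;
} symrec;

static symrec *sym_table = NULL;   /* head of a singly linked list */
static int next_offset = 0;        /* next free data slot */
int errors = 0;

/* Returns the record for name, or NULL if it was never installed. */
symrec *lookup(const char *name)
{
    assert(name != NULL);
    for (symrec *p = sym_table; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}

/* Unconditionally adds a new record at the head of the list. */
symrec *insert(const char *name)
{
    assert(name != NULL);
    symrec *p = malloc(sizeof *p);
    char *copy = malloc(strlen(name) + 1);
    if (p == NULL || copy == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    strcpy(copy, name);
    p->name = copy;
    p->offset = next_offset++;
    p->next = sym_table;
    sym_table = p;
    return p;
}

/* For DECLARATIONS (e.g. a variable_decl rule): the name must not exist yet. */
void install(const char *name)
{
    if (lookup(name) == NULL) {
        insert(name);
    } else {
        errors++;
        fprintf(stderr, "%s is already defined\n", name);
    }
}

/* For USES (e.g. IDENT inside a statement): the name must already exist. */
symrec *context_check(const char *name)
{
    symrec *s = lookup(name);
    if (s == NULL) {
        errors++;
        fprintf(stderr, "%s is an undeclared identifier\n", name);
    }
    return s;
}
```

In a .y file the declaration rule's action would call install( $n ) on the identifier's semantic value, and only rules where an identifier is referenced would call context_check.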

Regarding gen_code, in my opinion, it does not belong to the context_check. A check is a check, no more, no less. Code should be generated later, when you have a sensible piece of code (such as a statement) parsed into an AST. But, at the state the project is right now, I wouldn't worry about code generation at all.

Hmm, I see.

Soo, I made the suggested corrections to the grammar (removed the unnecessary context_checks), but when I came to this part of your reply... I was confused. What do you mean "gen_code does not belong to the context_check"?

Can you clarify as to what direction I need to move towards to get my grammar looking more "sensible"? Other than removing the context_checks which were placed in error, I really didn't know where to progress from there.

Thanks.

You have the context_check function which calls gen_code(). In my opinion this is wrong.

context_check( enum code_ops operation, char *sym_name )
{
        symrec *identifier;
        identifier = lookup( sym_name );
        if( identifier == 0 )  {
            ...
        } else
                gen_code( operation, identifier->offset );
}

That opinion stems partly from what context_check actually does: it only validates that the symbol exists. Really checking the context is much more involved (see below).
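One way to separate the two concerns, sketched under assumptions: context_check only looks the name up and reports an error, and a separate caller decides what code to emit. The opcode names, the pre-filled table, and the gen_load helper are hypothetical and stand in for the real symbol table and stack machine.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of the stack machine's opcodes. */
enum code_ops { LD_VAR, STORE };

struct instruction {
    enum code_ops op;
    int arg;
};

#define MAX_CODE 1000
struct instruction code[MAX_CODE];
int code_offset = 0;
int errors = 0;

/* A tiny pre-filled table standing in for the real symbol table. */
typedef struct {
    const char *name;
    int offset;
} symrec;

static symrec table[] = { { "x", 0 }, { "y", 1 } };

symrec *lookup(const char *name)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/* Appends one instruction at the current offset. */
void gen_code(enum code_ops operation, int arg)
{
    assert(code_offset < MAX_CODE);
    code[code_offset].op = operation;
    code[code_offset++].arg = arg;
}

/* A check is only a check: it reports the error and emits nothing. */
symrec *context_check(const char *sym_name)
{
    symrec *identifier = lookup(sym_name);
    if (identifier == NULL) {
        errors++;
        fprintf(stderr, "%s is an undeclared identifier.\n", sym_name);
    }
    return identifier;
}

/* The caller (for example a grammar action) decides what to generate. */
void gen_load(const char *sym_name)
{
    symrec *identifier = context_check(sym_name);
    if (identifier != NULL)
        gen_code(LD_VAR, identifier->offset);
}
```

With this split, the same context_check can be reused later by passes that never emit code, such as a pure validator.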

The "sensible" referred to the Pascal code being parsed. Speaking of direction, I'd concentrate efforts first on the symbol table. Right now you only store symbol names there. You need much more, of course.
First of all, symbol types. I am not sure if your grammar supports any type system, but at the very least you have to be able to tell variables apart from procedures.
Second, local variables. Currently your symbol table is flat, which means that every symbol belongs to the global scope. Each procedure should have a private symbol table, and a proper name lookup discipline should be devised.

That will give you a more or less solid validator.
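A rough sketch of both suggestions together, assuming a stack of scopes where each scope owns its own linked list of symbols and lookup walks from the innermost scope outward. All names here (sym_kind, scope, enter_scope, leave_scope, declare) are invented for the example; they are not from Aaby's book or the attached files.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* What a name refers to; extend as the grammar grows. */
enum sym_kind { SYM_VARIABLE, SYM_PROCEDURE };

typedef struct symrec {
    char *name;
    enum sym_kind kind;
    struct symrec *next;
} symrec;

/* One table per program/procedure body, linked to the enclosing one. */
typedef struct scope {
    symrec *symbols;
    struct scope *parent;
} scope;

static scope *current_scope = NULL;

static void *xmalloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Call when the parser enters a program or procedure block. */
void enter_scope(void)
{
    scope *s = xmalloc(sizeof *s);
    s->symbols = NULL;
    s->parent = current_scope;
    current_scope = s;
}

/* Call at the end of the block; its local names disappear. */
void leave_scope(void)
{
    assert(current_scope != NULL);
    scope *dead = current_scope;
    current_scope = dead->parent;
    while (dead->symbols != NULL) {
        symrec *next = dead->symbols->next;
        free(dead->symbols->name);
        free(dead->symbols);
        dead->symbols = next;
    }
    free(dead);
}

static symrec *find_in(const scope *s, const char *name)
{
    for (symrec *p = s->symbols; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}

/* Lookup discipline: innermost scope first, then each enclosing scope. */
symrec *lookup(const char *name)
{
    for (scope *s = current_scope; s != NULL; s = s->parent) {
        symrec *p = find_in(s, name);
        if (p != NULL)
            return p;
    }
    return NULL;
}

/* Returns NULL if name is already declared in the current scope. */
symrec *declare(const char *name, enum sym_kind kind)
{
    assert(current_scope != NULL);
    if (find_in(current_scope, name) != NULL)
        return NULL;
    symrec *p = xmalloc(sizeof *p);
    p->name = xmalloc(strlen(name) + 1);
    strcpy(p->name, name);
    p->kind = kind;
    p->next = current_scope->symbols;
    current_scope->symbols = p;
    return p;
}
```

Redeclaration is only an error within the same scope here, so a procedure may shadow a global name; whether that matches the Pascal subset in question would need checking against its grammar.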

In parallel, you must create a bunch of test cases (the more the better). Think of the most incomprehensible yet valid constructs; think of the most subtle syntax errors. Once your validator correctly parses valid programs and correctly flags erroneous ones, you may start to worry about code generation.

Thanks.

Always welcome.

Hey Krysis

I have a similar project for one of my courses now, with a changed grammar. Will you be able to upload the whole project so that I can use it for my project?

Thanks in advance

If you check the dates, you would see that this thread is four years old, and that Krysis has not been on Daniweb since then.

Please avoid thread necromancy in the future, thank you.
