wlp4cc

Project site for WLP to MIPS compiler

As this project contains course assignment material, due to the University of Waterloo Policy 71 academic freedom rules, the source code will be available by request. Please send me an email with your GitHub account ID.

Source code: Repository URL

Overview

The Waterloo Programming Language v4 (WLP4) is developed for the CS241 course. The language is a subset of C (and so can be compiled with minimum modifications with an off-the-shelf C compiler).

IR and code optimization was not taught during the course and all material below was created on my own volition. This is separate from assignment material.

The program input will undergo a series of steps:

Tokenized
Parsed into a concrete syntax tree (CST)
CST is streamed into our wlp4cc program
1. The CST is walked and an IR graph is emitted
2. The IR is represented internally by a graph but can be printed linearly for debugging purposes.
Optimizations are performed (see below)
1. Initialize control flow graph (CFG)
2. Convert to SSA form
3. Perform constant folding + copy propagation
4. Perform dead code elimination (DCE)
5. Remote PHI nodes
IR is lowered into machine code (MC) and placed through the backend

The following snippet of the compiler is shown below. This is not critical in solving the assignment (so hopefully not a Policy 71 violation) but details the extra optimization steps implemented:

// 2. Generate codegen procs and begin optimizations
Codegen::Context cg_ctx(ctx);
for (const auto &p : cg_ctx.procs) {
    // 2.1. Initialize CFG
    p.second->compute_predecessors();
    log_ir("1-preopt.ir", &ctx);
    log_ir("1-preopt.dot", &ctx, true);

    // 2.2. Convert to SSA form
    p.second->compute_dominators();
    p.second->compute_post_dominators();
    p.second->insert_phi();
    p.second->rename_vars();
    log_ir("2-" + p.first + "-ssa.ir", p.second);
    log_ir("2-" + p.first + "-ssa.dot", p.second, true);

    // 2.3. Perform constant folding and copy propagation
    p.second->recompute_all();
    p.second->propagate_const_mov();
    log_ir("3-" + p.first + "-prop.ir", p.second, p.second->liveness_annotator);
    log_ir("3-" + p.first + "-prop.dot", p.second, true);

    // 2.4. Perform DCE
    p.second->dce();
    log_ir("4-" + p.first + "-dce.ir", p.second, p.second->liveness_annotator);
    log_ir("4-" + p.first + "-dce.dot", p.second, true);

    // 2.5. Lower PHI and analyze live intervals
    p.second->lower_phi();
    p.second->recompute_all();
    p.second->analyze_liveness();
    log_ir("5-" + p.first + "-nophi.ir", p.second, p.second->pc_annotator);
    log_ir("5-" + p.first + "-nophi.dot", p.second, true);

    // 2.6. Allocate registers
    p.second->allocate_registers();
    // 2.7. Emit
#ifdef DEBUG
    {
        std::ofstream f;
        f.open(logging_dir + "/out." + p.first + ".s");
        p.second->emit_machinecode(f);
        f.close();
    }
#else
    p.second->emit_machinecode(std::cout);
#endif
}

Example Program

Consider the following input program:

int wain(int a, int b) {
    int c = 0;
    while(a > b) {
        c = c + 1;
    }
    c = 2 + 3;
    return a + b + c;
}

We will present the IR at each optimization step, in graphical form.

1-preopt

This is the pre-optimization IR, the IR emitted from the CST. You can see .L0 forms line 2, .L1 and .L2 form the loop and body, while .L3 forms lines 6 to 7.

2-wain-ssa

PHI nodes are inserted while converting IR into SSA form.

3-wain-prop

Constants and copy propagation happens. Most notably, c=2+3; return a+b+c; is simplified into return a+b+5; in block .L3

4-wain-dce

Dead code elimination. Since values assigned to c is not used outside the loop, the entire loop is optimized away and blocks .L1 and .L2 are replaced with NOPs.

5-wain-nophi

PHI nodes are eliminated, returning IR into non SSA form allowing for MC lowering.

The final IR emitted looks like:

wain(int, int) { 
.L0:
    NOP                            // 0
.L1:
    NOP                            // 1
    BRANCH LE -1 -2 .L3            // 2
.L2:
    NOP                            // 3
    NOP                            // 4
    NOP                            // 5
    BRANCH NONE 0 0 .L1            // 6
.L3:
    NOP                            // 7
    NOP                            // 8
    5 = imm 5                      // 9
    NOP                            // 10
    6 = ADD -1 -2                  // 11
    7 = ADD 6 5                    // 12
    RET 7                          // 13
}

Some notes about the IR:

Syntax is similar to LLVM IR
Each operation is OP x y where x and y are registers
Negative register numbers indicate parameter values
Labels are prefixed with .

Previouscc3k NextMicrokernel Design

Last updated 2 years ago