A challenging but rewarding project: build a compiler for a small subset of the C language. The compiler takes a source file written in your 'Toy C' dialect and produces the corresponding x86-64 assembly code. Building it shows concretely how high-level code is transformed into machine-executable instructions.
What you'll build
In this project you create a 'Toy C' compiler: a program that translates a simplified subset of the C language into working x86-64 assembly code. You define the dialect yourself, specifying its grammar and features from the ground up.

The compiler works in three stages: lexical analysis to tokenize the source code, parsing to construct an Abstract Syntax Tree (AST), and code generation to produce an assembly file. That file can then be assembled and linked with standard tools such as NASM and ld to create a native executable.

The initial scope covers the fundamental features: the int data type, arithmetic operators (+, -, *, /), comparison operators (==, !=, <, >, <=, >=), variable declaration and assignment, if/else statements, while loops, and function definitions with parameters and return values.

The project is a deep dive into the core mechanics of how programming languages work, and it is designed to be a significant portfolio piece. The roadmap includes clear enhancement paths for adding pointers, arrays, and new data types, which could grow the project into a more complete language toolchain.
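To make the three stages concrete, here is a minimal sketch in Python of a pipeline that handles only integer arithmetic expressions. It is not the project's prescribed design. The function names (tokenize, parse, gen, compile_expr), the tuple-based AST, the stack-based code generation strategy, and the Linux exit syscall used to return the result are all assumptions made for illustration.

```python
import re

def tokenize(src):
    """Lexical analysis: turn source text into (kind, value) tokens."""
    tokens = []
    for text in re.findall(r"\d+|\S", src):
        if text.isdecimal():
            tokens.append(("num", int(text)))
        elif text in "+-*/()":
            tokens.append(("op", text))
        else:
            raise SyntaxError(f"unexpected character {text!r}")
    return tokens

def parse(tokens):
    """Parsing: build a nested-tuple AST with the usual precedence."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", None)

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def primary():
        kind, value = advance()
        if kind == "num":
            return ("num", value)
        if (kind, value) == ("op", "("):
            node = expr()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    def term():
        node = primary()
        while peek() in (("op", "*"), ("op", "/")):
            op = advance()[1]
            node = ("bin", op, node, primary())
        return node

    def expr():
        node = term()
        while peek() in (("op", "+"), ("op", "-")):
            op = advance()[1]
            node = ("bin", op, node, term())
        return node

    tree = expr()
    if peek()[0] != "eof":
        raise SyntaxError("trailing input")
    return tree

def gen(node, out):
    """Code generation: emit NASM-syntax code that leaves the value in rax."""
    if node[0] == "num":
        out.append(f"    mov rax, {node[1]}")
        return
    _, op, left, right = node
    gen(left, out)
    out.append("    push rax")      # save the left operand on the stack
    gen(right, out)
    out.append("    mov rcx, rax")  # right operand in rcx
    out.append("    pop rax")       # left operand back in rax
    if op == "+":
        out.append("    add rax, rcx")
    elif op == "-":
        out.append("    sub rax, rcx")
    elif op == "*":
        out.append("    imul rax, rcx")
    else:
        out.append("    cqo")       # sign-extend rax into rdx:rax
        out.append("    idiv rcx")  # signed divide, quotient in rax

def compile_expr(src):
    """Run all three stages and wrap the result in a tiny program."""
    lines = ["global _start", "section .text", "_start:"]
    gen(parse(tokenize(src)), lines)
    # Assumption: Linux x86-64, where syscall 60 is exit(status in rdi).
    lines += ["    mov rdi, rax", "    mov rax, 60", "    syscall"]
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    print(compile_expr("1 + 2 * 3"))
```

On a Linux machine, the printed text could be saved to a file, assembled with nasm -f elf64, and linked with ld; the process exit status then carries the value of the expression. A full Toy C compiler would extend each stage with identifiers, statements, and functions, but the division of work between the stages stays the same.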
What you'll learn
Roadmap
11 steps · 97 tasks