pengc is a tiny C-to-NASM compiler for the little penguin (x86_64 Linux).
Lexer → Parser → AST → Codegen → NASM
- Lexer: tokenizes input C subset
- Parser: recursive descent expression parser
- AST: structured representation of expressions
- Codegen: emits NASM x86_64 assembly
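The four stages above can be sketched in a few dozen lines of C. This is only an illustration, not pengc's actual source: the `Node` type, the function names (`parse_expr`, `emit`, `compile_expr`) and the choice of `rax`/`rcx` as working registers are all assumptions, and the grammar covers only integer literals, parentheses and `+ - * /`.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical AST node: an integer literal (op == 0) or a binary operation. */
typedef struct Node {
    char op;                 /* 0 for a literal, otherwise one of + - * / */
    long value;              /* only meaningful when op == 0 */
    struct Node *lhs, *rhs;
} Node;

static const char *cur;      /* lexer cursor into the source text */

static void skip_ws(void) { while (isspace((unsigned char)*cur)) cur++; }

static Node *new_node(char op, long value, Node *lhs, Node *rhs) {
    Node *n = malloc(sizeof *n);   /* never freed: fine for a short-lived sketch */
    if (!n) abort();
    n->op = op; n->value = value; n->lhs = lhs; n->rhs = rhs;
    return n;
}

static Node *parse_expr(void);

/* primary := integer | '(' expr ')' */
static Node *parse_primary(void) {
    skip_ws();
    if (*cur == '(') {
        cur++;
        Node *n = parse_expr();
        skip_ws();
        if (*cur == ')') cur++;
        return n;
    }
    char *end;
    long v = strtol(cur, &end, 10);
    cur = end;
    return new_node(0, v, NULL, NULL);
}

/* term := primary (('*' | '/') primary)*   binds tighter than + and - */
static Node *parse_term(void) {
    Node *n = parse_primary();
    for (;;) {
        skip_ws();
        if (*cur != '*' && *cur != '/') return n;
        char op = *cur++;
        n = new_node(op, 0, n, parse_primary());
    }
}

/* expr := term (('+' | '-') term)* */
static Node *parse_expr(void) {
    Node *n = parse_term();
    for (;;) {
        skip_ws();
        if (*cur != '+' && *cur != '-') return n;
        char op = *cur++;
        n = new_node(op, 0, n, parse_term());
    }
}

/* Codegen: post-order walk that leaves each result in rax. */
static void emit(const Node *n, FILE *out) {
    if (n->op == 0) { fprintf(out, "    mov rax, %ld\n", n->value); return; }
    emit(n->rhs, out);
    fprintf(out, "    push rax\n");          /* save right operand */
    emit(n->lhs, out);
    fprintf(out, "    pop rcx\n");           /* rax = lhs, rcx = rhs */
    switch (n->op) {
    case '+': fprintf(out, "    add rax, rcx\n"); break;
    case '-': fprintf(out, "    sub rax, rcx\n"); break;
    case '*': fprintf(out, "    imul rax, rcx\n"); break;
    case '/': fprintf(out, "    cqo\n    idiv rcx\n"); break;
    }
}

/* Runs the whole pipeline on one expression and writes NASM text to out. */
void compile_expr(const char *src, FILE *out) {
    cur = src;
    emit(parse_expr(), out);
    fprintf(out, "    ret\n");
}
```

Calling `compile_expr("1 + 2 * 3", stdout)` from a `main` would print the instruction sequence for that expression. Operator precedence falls out of the grammar: `parse_term` handles `*` and `/` before `parse_expr` ever sees `+` or `-`, so no explicit precedence table is needed at this size.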
- Integer literals
- Arithmetic operations (+ - * /)
- Bitwise operations (>> | & ^ ~)
- Relational operators (< >=)
- Logic operations (|| && != ==)
- Ternary conditional (? :)
- Unary operations (+ - ! ~)
- Indirection (*)
- Address-of (&)
- Size-of (sizeof)
- Cast ((type))
- Prefix increment and decrement (++ --)
- Suffix increment and decrement (++ --)
- Parentheses support
- Correct operator precedence
- Return statement parsing
- AST construction
- Variables
- Assignments
- Control flow (if, while, for)
- Function calls
- Types beyond implicit 64-bit int
- Strings
- Floats
- Arrays
- Structures
- Unions
- Enums
- Multiple source files
- Libraries
- Including other files
- Macros (define undef)
- Conditionals (if ifdef else elif ...)
- Implementation defined behaviour (pragma)
- Throw error (error)
- File name and line information
- Token dump mode (--dump-tokens)
- Parser trace mode (--trace-parser)
- AST dump (--dump-ast)
- Improved error reporting (line/column + context)
- Colored diagnostics
- Source highlighting (caret-based errors)
- Panic recovery mode
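As a rough idea of what line/column reporting with a caret could look like, here is a minimal sketch in C. The function name `format_diag`, its parameters and the exact message layout are hypothetical and not taken from pengc; columns are assumed to be 1-based, and tabs in the source line are not handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "file:line:col: error: msg", then the offending
   source line, then a caret placed under the 1-based column, into buf.
   Returns what snprintf returns (the length the full message needs). */
int format_diag(char *buf, size_t size, const char *file, int line, int col,
                const char *src_line, const char *msg) {
    char pad[128];
    size_t n = col > 0 ? (size_t)col - 1 : 0;   /* spaces before the caret */
    if (n >= sizeof pad) n = sizeof pad - 1;    /* clamp very long lines */
    memset(pad, ' ', n);
    pad[n] = '\0';
    return snprintf(buf, size, "%s:%d:%d: error: %s\n%s\n%s^\n",
                    file, line, col, msg, src_line, pad);
}
```

A caller would pass the text of the failing line along with the position recorded by the lexer, then write the buffer to stderr. Colored output could be layered on by wrapping the `error:` label in ANSI escape sequences when stderr is a terminal.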
mkdir build
cd build
cmake ..
cmake --build .
# optional:
ctest