The Zen compiler is a sophisticated, multi-stage pipeline designed to translate Zen source code into executable binaries. Built in Rust, it adheres to the language's core principles, from its "zero keywords" philosophy to allocator-driven concurrency. The compiler's architecture is modular, with distinct components handling lexical analysis, syntactic parsing, semantic validation, compile-time execution, and efficient code generation via LLVM.
The compilation process generally follows these steps:
1. A .zen source file is fed into the compiler.
2. The Lexer turns the source text into tokens, and the Parser builds an AST from them.
3. The Module System resolves imports, and the Type Checker validates the AST.
4. @meta blocks and comptime expressions are evaluated, potentially modifying the AST or generating new code before runtime compilation.
5. Generic code is monomorphized, and the LLVM backend emits the final binary.

Lexer (src/lexer.rs)

The Lexer is the first stage of the compiler. Its primary responsibility is to read the Zen source code character by character and convert it into a sequence of tokens. These tokens are the fundamental building blocks for the parser.
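As an illustration of the token-plus-span idea, here is a minimal sketch in Rust. The Token variants, field names, and the handful of recognized symbols are hypothetical; the real src/lexer.rs handles far more (strings, interpolation, the @ identifiers).

```rust
// Minimal sketch of span-carrying tokenization. Types and names are
// illustrative, not Zen's actual lexer API.
#[derive(Debug)]
enum Token {
    Ident(String),
    Integer(i64),
    DeclareMut, // "::="
    Question,   // "?"
}

#[derive(Debug)]
struct TokenWithSpan {
    token: Token,
    start: usize, // byte offset into the source, for error reporting
    end: usize,
}

fn lex(src: &str) -> Vec<TokenWithSpan> {
    let bytes = src.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        let c = bytes[i] as char;
        if c.is_whitespace() {
            i += 1;
        } else if src[i..].starts_with("::=") {
            out.push(TokenWithSpan { token: Token::DeclareMut, start: i, end: i + 3 });
            i += 3;
        } else if c == '?' {
            out.push(TokenWithSpan { token: Token::Question, start: i, end: i + 1 });
            i += 1;
        } else if c.is_ascii_digit() {
            let start = i;
            while i < bytes.len() && (bytes[i] as char).is_ascii_digit() { i += 1; }
            out.push(TokenWithSpan { token: Token::Integer(src[start..i].parse().unwrap()), start, end: i });
        } else if c.is_ascii_alphabetic() || c == '_' {
            let start = i;
            while i < bytes.len() && ((bytes[i] as char).is_ascii_alphanumeric() || bytes[i] == b'_') { i += 1; }
            out.push(TokenWithSpan { token: Token::Ident(src[start..i].to_string()), start, end: i });
        } else {
            i += 1; // this sketch silently skips unknown characters
        }
    }
    out
}

fn main() {
    let toks = lex("count ::= 42");
    assert_eq!(toks.len(), 3);
    assert_eq!(toks[2].start, 10); // "42" begins at byte 10
    println!("{:?}", toks);
}
```

Carrying the span alongside each token is what lets later stages point error messages back at exact source locations.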
- Responsibility: produce tokens (identifiers, literals, operators, symbols, and special @ keywords).
- Output: a Vec<TokenWithSpan>, where each token includes its location in the source code for error reporting.
- Key features: recognizes Zen-specific syntax such as ? for pattern matching, ::= for mutable assignment, range operators (.., ..=), and string interpolation ("${expr}"). It also handles the pub, @std, @this, @meta, and @export special identifiers.

Parser (src/parser/ module)

The Parser consumes the token stream produced by the lexer and constructs the Abstract Syntax Tree (AST). The AST is a hierarchical representation of the program's structure, abstracting away the concrete syntax.
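The token-to-AST step can be sketched as a small recursive-descent parser. The Tok and Expr types below are stand-ins, not Zen's real ast:: types; the point is how operator precedence falls out of the grammar rather than out of keywords.

```rust
// Sketch of recursive-descent AST construction over a pre-lexed token
// stream. Names are illustrative only.
#[derive(Debug, PartialEq)]
enum Tok { Int(i64), Plus, Star }

#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

struct Parser { toks: Vec<Tok>, pos: usize }

impl Parser {
    fn peek(&self) -> Option<&Tok> { self.toks.get(self.pos) }

    // expr := term ('+' term)*   -- lower precedence
    fn expr(&mut self) -> Expr {
        let mut lhs = self.term();
        while self.peek() == Some(&Tok::Plus) {
            self.pos += 1;
            lhs = Expr::Add(Box::new(lhs), Box::new(self.term()));
        }
        lhs
    }

    // term := atom ('*' atom)*   -- higher precedence
    fn term(&mut self) -> Expr {
        let mut lhs = self.atom();
        while self.peek() == Some(&Tok::Star) {
            self.pos += 1;
            lhs = Expr::Mul(Box::new(lhs), Box::new(self.atom()));
        }
        lhs
    }

    fn atom(&mut self) -> Expr {
        match self.toks[self.pos] {
            Tok::Int(n) => { self.pos += 1; Expr::Int(n) }
            _ => panic!("expected integer literal"),
        }
    }
}

fn main() {
    // 1 + 2 * 3 parses as Add(1, Mul(2, 3)): precedence lives in the grammar.
    let mut p = Parser { toks: vec![Tok::Int(1), Tok::Plus, Tok::Int(2), Tok::Star, Tok::Int(3)], pos: 0 };
    let ast = p.expr();
    assert_eq!(
        ast,
        Expr::Add(
            Box::new(Expr::Int(1)),
            Box::new(Expr::Mul(Box::new(Expr::Int(2)), Box::new(Expr::Int(3))))
        )
    );
}
```

A keyword-free language like Zen leans harder on this kind of lookahead-driven structure, since there is no match or if token to anchor each construct.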
- Responsibility: build the Program AST from tokens. This involves recognizing declarations (functions, structs, enums, traits, imports), statements, and expressions.
- Input: the stream of TokenWithSpan from the Lexer.
- Output: an ast::Program structure, containing a list of ast::Declarations and top-level statements.
- Key features: handles Zen's symbol-driven grammar (?, |, :=, ::=) and uses contextual lookaheads to infer intent where keywords would traditionally guide parsing (e.g., value ? | pattern { ... } instead of match). It also includes dedicated logic for parsing complex constructs like structs, enums, closures, pattern matches, and method calls.

Module System (src/module_system/ module)

The Module System handles the organization, loading, and resolution of Zen code across multiple files, including the standard library.
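The recursive loading of nested module dependencies can be sketched with a visited set for cycle and diamond protection. Here a HashMap stands in for the filesystem and the Module struct for a parsed .zen file; the real src/module_system/ code of course works with paths and ASTs.

```rust
use std::collections::{HashMap, HashSet};

// Sketch of recursive module loading. A HashMap stands in for the
// filesystem; "imports"/"symbols" stand in for parsed .zen declarations.
struct Module { imports: Vec<&'static str>, symbols: Vec<&'static str> }

fn load(name: &str, fs: &HashMap<&str, Module>, seen: &mut HashSet<String>, out: &mut Vec<String>) {
    if !seen.insert(name.to_string()) {
        return; // already loaded: handles diamond imports and cycles
    }
    let module = &fs[name];
    for dep in &module.imports {
        load(dep, fs, seen, out); // depth-first: dependencies merge first
    }
    for sym in &module.symbols {
        out.push(format!("{name}::{sym}"));
    }
}

fn main() {
    let mut fs = HashMap::new();
    fs.insert("main", Module { imports: vec!["io", "maths"], symbols: vec!["main"] });
    fs.insert("io", Module { imports: vec![], symbols: vec!["print"] });
    fs.insert("maths", Module { imports: vec!["io"], symbols: vec!["sqrt"] });

    let (mut seen, mut out) = (HashSet::new(), Vec::new());
    load("main", &fs, &mut seen, &mut out);
    // "io" is imported twice but loaded once.
    assert_eq!(out, vec!["io::print", "maths::sqrt", "main::main"]);
}
```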
- Responsibility: process import declarations (e.g., { io, maths } = @std), resolve module paths, load .zen files, extract exported symbols, and merge program components.
- Input: an ast::Program potentially containing ModuleImport declarations, and filesystem paths.
- Output: a unified ast::Program with all imported declarations integrated and references resolved.
- Key features: handles both @std and custom module imports. It performs a recursive loading process to handle nested module dependencies. It's crucial for providing context for semantic analysis and code generation.

Type Checker (src/typechecker/ module)

The Type Checker is the semantic analysis phase, validating the AST for type correctness, variable usage, and adherence to language rules. It ensures the program is semantically sound before code generation.
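The scope stack that such a checker maintains is worth making concrete. This is a generic sketch of lexical scoping with shadowing, assuming only what the section states (a stack of scopes tracking variable information); the actual scopes field in src/typechecker/ will carry richer entries than a type name.

```rust
use std::collections::HashMap;

// Sketch of a lexical scope stack: name -> type name per scope level.
struct Scopes { stack: Vec<HashMap<String, String>> }

impl Scopes {
    fn new() -> Self { Scopes { stack: vec![HashMap::new()] } }
    fn push(&mut self) { self.stack.push(HashMap::new()); }
    fn pop(&mut self) { self.stack.pop(); }
    fn declare(&mut self, name: &str, ty: &str) {
        self.stack.last_mut().unwrap().insert(name.into(), ty.into());
    }
    // Walk from innermost to outermost scope, so inner declarations
    // shadow outer ones.
    fn lookup(&self, name: &str) -> Option<&str> {
        self.stack.iter().rev().find_map(|s| s.get(name).map(|t| t.as_str()))
    }
}

fn main() {
    let mut scopes = Scopes::new();
    scopes.declare("x", "i32");
    scopes.push();                 // enter a block
    scopes.declare("x", "String"); // shadows the outer x
    assert_eq!(scopes.lookup("x"), Some("String"));
    scopes.pop();                  // leave the block
    assert_eq!(scopes.lookup("x"), Some("i32"));
    assert_eq!(scopes.lookup("y"), None);
}
```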
- Responsibility: validate types, variable usage, and language rules, maintaining a stack of scopes to track variable and function information.
- Input: the ast::Program from the parser, including all declarations.
- Output: a validated program, or compile errors (error::CompileError) if semantic issues are found. It also enriches the AST with inferred type information where necessary (e.g., Void return types).
- Type resolution: resolves generic placeholders to concrete types (e.g., Generic { name: "MyStruct" } to Struct { name: "MyStruct", fields: [...] }). This is crucial for correctly modeling complex data structures and resolving self-referential types.
- Scope management: uses a stack of scopes to manage local variable declarations and their visibility.
- Behaviors (src/typechecker/behaviors.rs): manages trait definitions and implements/requires blocks, ensuring that types correctly implement required behaviors. It handles replacing Self types with concrete types in trait implementations.
- Type inference (src/typechecker/inference.rs): automatically deduces the types of expressions and variables when not explicitly provided, adhering to Zen's type inference rules.

Comptime Interpreter (src/comptime/mod.rs)

The Comptime Interpreter is a unique feature of Zen, allowing a subset of the language to be executed during compilation. This enables powerful metaprogramming and compile-time code generation.
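The core transformation, evaluating a comptime subtree and splicing the result back into the AST, amounts to constant folding. The Expr type below is a toy stand-in for Zen's AST, and Comptime marks a comptime expr node; the real interpreter's ComptimeValue/Environment machinery is much richer.

```rust
// Sketch of compile-time evaluation: comptime subtrees are evaluated
// deterministically and replaced by literal nodes before codegen.
#[derive(Debug, PartialEq)]
enum Expr {
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
    Comptime(Box<Expr>), // a `comptime expr` marker
}

// Rewrite the AST, replacing each Comptime node with its computed value.
fn fold(e: Expr) -> Expr {
    match e {
        Expr::Comptime(inner) => Expr::Int(eval(&inner)),
        Expr::Add(a, b) => Expr::Add(Box::new(fold(*a)), Box::new(fold(*b))),
        other => other,
    }
}

// Deterministic, side-effect-free evaluation of the comptime subset.
fn eval(e: &Expr) -> i64 {
    match e {
        Expr::Int(n) => *n,
        Expr::Add(a, b) => eval(a) + eval(b),
        Expr::Comptime(inner) => eval(inner),
    }
}

fn main() {
    // comptime(2 + 3) + 40: only the comptime subtree is folded away.
    let ast = Expr::Add(
        Box::new(Expr::Comptime(Box::new(Expr::Add(
            Box::new(Expr::Int(2)),
            Box::new(Expr::Int(3)),
        )))),
        Box::new(Expr::Int(40)),
    );
    let folded = fold(ast);
    assert_eq!(folded, Expr::Add(Box::new(Expr::Int(5)), Box::new(Expr::Int(40))));
}
```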
- Responsibility: evaluate comptime { ... } blocks and comptime expr expressions within the AST. It can generate new AST nodes, evaluate constants, and perform complex compile-time computations.
- Input: an ast::Program containing ComptimeBlock or Comptime expressions.
- Output: a transformed ast::Program where compile-time logic has been evaluated and its effects integrated into the AST.
- Key features: uses ComptimeValue and Environment to ensure deterministic, side-effect-free execution. It allows for advanced scenarios like generating data structures, validating configurations, and optimizing code based on compile-time knowledge.

FFI (src/ffi/mod.rs)

The FFI module provides a safe and structured way to interact with external C libraries, aligning with Zen's philosophy of explicit control.
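The raw mechanism an FFI layer wraps looks like this in the compiler's own implementation language: declare an external C symbol, then confine the call behind an explicit unsafe boundary. This uses libc's strlen (linked statically at build time) purely as a demonstration; Zen's actual FFI module adds dynamic loading via libloading, FnSignature checking, and error handling on top.

```rust
// Sketch of bare C interop: an extern declaration plus an unsafe call
// site. Zen's FFI module layers dynamic loading and safety checks on this.
use std::ffi::CString;
use std::os::raw::c_char;

extern "C" {
    fn strlen(s: *const c_char) -> usize; // resolved against libc at link time
}

fn safe_strlen(s: &str) -> usize {
    let c = CString::new(s).expect("no interior NUL bytes");
    // The unsafe block is the explicit boundary: the caller vouches that
    // the pointer is valid and NUL-terminated, which CString guarantees.
    unsafe { strlen(c.as_ptr()) }
}

fn main() {
    assert_eq!(safe_strlen("zen"), 3);
    assert_eq!(safe_strlen(""), 0);
}
```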
- Input: ast::ExternalFunction declarations and runtime library paths.
- Implementation: uses libloading to load dynamic libraries and resolve symbols.
- Key features: provides RawPtr<T> for unsafe operations, FnSignature for function types, and robust error handling. It allows specifying calling conventions, safety checks, and platform-specific configurations.

Monomorphization (src/type_system/monomorphization.rs)

Monomorphization is the process of converting generic code (e.g., List<T>) into concrete code for each specific type it's used with (e.g., List_i32, List_String). This happens before LLVM code generation.
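The "pending instantiations" queue mentioned below can be sketched as a worklist: specializing one generic may reveal further generics to specialize (here a hypothetical Node<T> used inside List<T>), and a done-set guarantees termination. Names and the List/Node relationship are illustrative, not taken from the Zen codebase.

```rust
use std::collections::HashSet;

// "List" + "i32" -> "List_i32", the concrete name used after specialization.
fn mangle(name: &str, arg: &str) -> String {
    format!("{name}_{arg}")
}

// Worklist-driven specialization: process requests until no new
// instantiations are discovered.
fn specialize(roots: &[(&str, &str)]) -> Vec<String> {
    let mut pending: Vec<(String, String)> =
        roots.iter().map(|(n, a)| (n.to_string(), a.to_string())).collect();
    let mut done = HashSet::new();
    let mut emitted = Vec::new();

    while let Some((name, arg)) = pending.pop() {
        if !done.insert((name.clone(), arg.clone())) {
            continue; // already specialized for this type argument
        }
        emitted.push(mangle(&name, &arg));
        // Specializing a generic may force generics it uses internally.
        if name == "List" {
            pending.push(("Node".to_string(), arg));
        }
    }
    emitted
}

fn main() {
    let out = specialize(&[("List", "i32")]);
    assert_eq!(out, vec!["List_i32", "Node_i32"]);
}
```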
- Input: an ast::Program which might still contain generic constructs.
- Output: an ast::Program where all generic types and functions have been replaced by concrete, specialized implementations, ready for LLVM code generation.
- Key features: uses TypeEnvironment and TypeInstantiator to manage generic definitions and their instantiated versions. It uses a "pending instantiations" queue to iteratively discover and specialize generics used within other specialized generics until all are resolved.

LLVM Code Generator (src/codegen/llvm/ module)

The LLVM Code Generator is the backend of the Zen compiler, translating the monomorphized AST into LLVM Intermediate Representation (IR).
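For a feel of the output, here is the textual LLVM IR for a trivial add function, produced by string formatting. This is only to show the shape of the target representation; the real backend constructs IR in memory through inkwell's Context/Module/Builder objects rather than as text.

```rust
// Emit textual LLVM IR for `i32 name(i32 a, i32 b) { return a + b; }`.
// Purely illustrative; the actual backend uses inkwell, not strings.
fn emit_add_fn(name: &str) -> String {
    let mut ir = String::new();
    ir.push_str(&format!("define i32 @{name}(i32 %a, i32 %b) {{\n"));
    ir.push_str("entry:\n");                  // every function gets an entry block
    ir.push_str("  %sum = add i32 %a, %b\n"); // SSA: each value defined once
    ir.push_str("  ret i32 %sum\n");
    ir.push_str("}\n");
    ir
}

fn main() {
    let ir = emit_add_fn("zen_add");
    assert!(ir.contains("define i32 @zen_add"));
    assert!(ir.contains("%sum = add i32 %a, %b"));
    print!("{ir}");
}
```

Control-flow constructs like Zen's ? operator lower to additional basic blocks joined by br and phi instructions in the same representation.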
- Responsibility: lower control flow (the ? operator, loops) into LLVM basic blocks and branches, and generate LLVM instructions for expressions and declarations.
- Input: the monomorphized ast::Program.
- Output: an inkwell::module::Module object containing the LLVM IR, which can then be optimized, linked, and compiled into an executable binary or run via a JIT engine.
- Key features: uses inkwell (Rust bindings for LLVM) to manage the LLVM Context, Module, and Builder objects for IR construction.
- Method resolution: resolves object.method() calls by looking up implementations in impl blocks or by checking whether method(object, args) exists.
- Allocator-driven concurrency: code generation depends on the chosen allocator (e.g., GPA.init(), AsyncPool.init()), which determines the synchronous or asynchronous nature of memory operations, eliminating the need for explicit async/await keywords in Zen.
- Behaviors (src/codegen/llvm/behaviors.rs): currently implements static dispatch for traits. A future goal might include dynamic dispatch via vtables for runtime polymorphism.
- Pointers: maps Ptr<T>, MutPtr<T>, and RawPtr<T> to LLVM pointer types, with safety semantics handled at the language level rather than by the backend.
- C runtime interop: declares and calls external C functions (e.g., malloc, free, memcpy, strlen, sprintf).

The Zen compiler's components are highly interconnected.
This modular yet integrated architecture allows Zen to achieve its minimalist design goals while providing a robust and extensible compilation pipeline.
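The staged flow described above can be summarized schematically, with strings standing in for the real intermediate representations (token vectors, ASTs, LLVM modules). This is a data-flow diagram in code form, not the compiler's actual driver.

```rust
// Schematic pipeline: each stage consumes the previous stage's output.
// Strings stand in for Vec<TokenWithSpan>, ast::Program, and LLVM IR.
fn pipeline(source: &str) -> Result<String, String> {
    let tokens = format!("tokens({source})");   // Lexer
    let ast = format!("ast({tokens})");         // Parser + Module System
    let checked = format!("checked({ast})");    // Type Checker + Comptime
    let mono = format!("mono({checked})");      // Monomorphization
    Ok(format!("llvm-ir({mono})"))              // LLVM Code Generator
}

fn main() {
    let out = pipeline("main = () { ... }").unwrap();
    // Each stage wraps the previous one, innermost first.
    assert!(out.starts_with("llvm-ir(mono(checked(ast(tokens("));
    println!("{out}");
}
```

Because each stage takes and returns an explicit representation, stages can be tested in isolation, which is the practical payoff of the modular design the text describes.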