MLP-MLIR

MLP-MLIR is a research compiler built on LLVM MLIR for experimenting with neural-network dialects, lowering pipelines, and heterogeneous CPU/GPU partitioning. It demonstrates end-to-end compilation of synthetic MLP (Multi-Layer Perceptron) programs through custom MLIR dialects to executable code.

Features

Custom MLIR Dialect: Defines mlp operations for neural network primitives
Progressive Lowering: Multi-stage compilation pipeline from high-level ops to LLVM IR
Heterogeneous Partitioning: Automatic CPU/CUDA placement for operations
JIT Compilation: Runtime code generation and execution for the CPU path
GPU Dialect Generation: CUDA operations lowered to the MLIR GPU dialect
Extensible Backend System: Support for CPU, CUDA, Metal, ROCm, and RISC-V targets

Architecture

The compiler follows a layered architecture:

```
High-Level IR (mlp dialect)
    ↓ Lowering
Linalg/Arith/Tensor Operations
    ↓ Partitioning
CPU/CUDA Annotated Operations
    ↓ Bufferization
Memory Operations + GPU Launch
    ↓ Target Lowering
LLVM IR / GPU Kernels
```

Key Components

Dialect Definition (include/Ops.td): TableGen definitions for MLP operations
Builder (src/Builder.cpp): Constructs synthetic neural network programs
Passes: Lowering and optimization passes in src/
Backends (targets/): Target-specific code generation
JIT Runtime (src/Jit.cpp): Execution engine for the CPU path

Requirements

C++17 compatible compiler (GCC 7+, Clang 5+, MSVC 2017+)
CMake 3.13.4 or later
LLVM/MLIR development build with the following components:
- MLIR Core libraries
- LLVM Core libraries
- TableGen
- OrcJIT

LLVM/MLIR Setup

This project requires a full LLVM/MLIR build and has been developed against recent LLVM trunk. If your MLIR installation uses different library names, update the target_link_libraries call in CMakeLists.txt.

Installation

Clone the repository:
```
git clone <repository-url>
cd mlp_mlir
```
Set up the LLVM/MLIR environment: in CMakeLists.txt, make sure LLVM_DIR and MLIR_DIR point to the CMake package directories inside your LLVM build directory:
```
set(LLVM_DIR "/path/to/llvm/build/lib/cmake/llvm")
set(MLIR_DIR "/path/to/llvm/build/lib/cmake/mlir")
```

Build

Initial Build

```
mkdir -p build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
```

Rebuild

```
cd build
make -j$(nproc)
```

Clean Build

```
cd build
make clean
make -j$(nproc)
```

Quick Start

After building, run the CPU JIT to execute a synthetic MLP program:

```
cd build
./mlp_mlir -emit=jit
```

Expected Output:

```
8.000000 17.000000
12.000000 14.000000
```

This executes a linear → relu → print neural network computation.
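
To make the dataflow concrete, here is a minimal Python model of a linear → relu → print chain. It is not the project's code. The input and weight values are hypothetical, `linear` is assumed to be a plain matrix product without bias, and the six-decimal print format is inferred from the expected output above. It does not reproduce the exact numbers printed by `mlp_mlir`.

```python
# Toy model of the linear -> relu -> print chain (not the project's implementation).
# Assumptions: linear is a bias-free matrix product; all values are hypothetical.

def linear(x, w):
    """Matrix product of x (m x k) and w (k x n), both given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def relu(t):
    """Element-wise max(0, v)."""
    return [[max(0.0, v) for v in row] for row in t]

def format_tensor(t):
    """One row per line, six decimals, space-separated."""
    return "\n".join(" ".join(f"{v:.6f}" for v in row) for row in t)

x = [[1.0, 2.0], [3.0, 4.0]]    # hypothetical input
w = [[1.0, -1.0], [-1.0, 1.0]]  # hypothetical weights
result = relu(linear(x, w))
print(format_tensor(result))
```

With these values the negative entries of the product are clamped to zero by `relu`, which is the only non-linear step in the chain.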

Usage

The main executable mlp_mlir supports various emit modes for inspecting the compilation pipeline:

Basic Commands

```
# Inspect initial MLIR
./mlp_mlir -emit=mlir

# View linalg lowering
./mlp_mlir -emit=mlir-linalg

# See CPU/CUDA partitioning
./mlp_mlir -emit=mlir-hetero

# Inspect GPU dialect lowering
./mlp_mlir -emit=mlir-gpu

# Generate LLVM IR
./mlp_mlir -emit=llvm

# JIT compile and run
./mlp_mlir -emit=jit
```

With Optimizations

Enable MLIR optimizations:

```
./mlp_mlir -emit=jit -opt
```

Emit Modes

| Mode | Description |
| --- | --- |
| `mlir` | Initial MLIR module with custom `mlp` operations |
| `mlir-linalg` | Lowered to `linalg`, `arith`, and tensor operations |
| `mlir-hetero` | Operations annotated with CPU/CUDA device placement |
| `mlir-gpu` | CUDA operations lowered to `gpu.launch` kernels |
| `mlir-llvm` | CPU path lowered to LLVM dialect |
| `llvm` | Translated to LLVM IR text format |
| `jit` | JIT-compiled and executed on CPU |
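
When comparing stages side by side, it can help to dump every emit mode to a file in one go. The following Python helper is a sketch, not part of the repository. It assumes the binary is reachable as `./mlp_mlir` from the working directory, the `stage_<mode>.txt` file names are hypothetical, and it is an untested assumption that `-opt` is accepted alongside modes other than `jit`.

```python
# Sketch: run mlp_mlir once per emit mode and save each stage's output.
# Assumptions: the binary is at ./mlp_mlir; output file names are hypothetical.
import subprocess
from pathlib import Path

EMIT_MODES = ["mlir", "mlir-linalg", "mlir-hetero", "mlir-gpu",
              "mlir-llvm", "llvm", "jit"]

def build_command(mode, binary="./mlp_mlir", optimize=False):
    """Return the argv list for one emit mode, optionally with -opt."""
    if mode not in EMIT_MODES:
        raise ValueError(f"unknown emit mode: {mode}")
    cmd = [binary, f"-emit={mode}"]
    if optimize:
        cmd.append("-opt")
    return cmd

def dump_all(out_dir="stages", binary="./mlp_mlir"):
    """Run every mode and write stdout plus stderr to out_dir/stage_<mode>.txt."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for mode in EMIT_MODES:
        proc = subprocess.run(build_command(mode, binary),
                              capture_output=True, text=True)
        # Both streams are kept because IR dumps may be written to stderr.
        (out / f"stage_{mode}.txt").write_text(proc.stdout + proc.stderr)
```

Calling `dump_all()` from the build directory would leave one text file per stage, which can then be diffed to see what each lowering step changed.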

Heterogeneous Support

MLP-MLIR demonstrates an early-stage form of heterogeneous compilation by partitioning operations across CPU and CUDA devices:

CUDA Partition: linalg.matmul operations marked for GPU execution
CPU Partition: Element-wise operations (relu) and I/O (print) on CPU

Example Partitioned IR

```
module attributes {mlp.targets = ["cpu", "cuda"]} {
  %0 = linalg.matmul {device = "cuda"} ...  // GPU matrix multiplication
  %1 = linalg.generic {device = "cpu"} ...   // CPU element-wise ReLU
  mlp.print {device = "cpu"} ...             // CPU output
}
```
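
The placement rule described above can be modeled in a few lines of Python. This is a toy model rather than the project's partitioning pass. It assumes placement depends only on the operation name, and the `CUDA_OPS` set and function names are made up for illustration.

```python
# Toy model of the placement rule (not the project's pass).
# Assumption: placement depends only on the operation name.

CUDA_OPS = {"linalg.matmul"}  # ops the README describes as GPU-placed

def place(op_name):
    """Return the device label for one operation name."""
    return "cuda" if op_name in CUDA_OPS else "cpu"

def partition(op_names):
    """Annotate a sequence of op names with a device, preserving order."""
    return [(name, place(name)) for name in op_names]

program = ["linalg.matmul", "linalg.generic", "mlp.print"]
for name, device in partition(program):
    print(f'{name} {{device = "{device}"}}')
```

Keeping the rule as a lookup on the operation name means that moving another operation to the GPU in this model is a one-line change to `CUDA_OPS`.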

GPU Lowering

CUDA-marked operations are lowered to the MLIR GPU dialect:

```
gpu.launch ... {
  scf.for ... {
    // Matrix multiplication kernel
    memref.load ...
    arith.mulf ... arith.addf ...
    memref.store ...
  }
  gpu.terminator
} {device = "cuda"}
```

Note: The CUDA path currently generates MLIR GPU IR for inspection. Full CUDA runtime integration (kernel launching, memory transfers) is planned for future development.
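
For readers less familiar with buffer-level IR, the load/multiply/add/store pattern in the kernel body corresponds to an ordinary matrix-multiplication loop nest. The Python reference below is a model and not generated code. It assumes row-major flat buffers, and the mapping of the two outer loops to the `gpu.launch` index space is a guess, since the elided IR does not show it.

```python
# Reference loop nest for the matmul kernel body (a model, not generated code).
# Assumptions: row-major flat buffers; the two outer loops stand in for the
# gpu.launch index space; the inner loop corresponds to scf.for.

def matmul_kernel(a, b, c, m, n, k):
    """Accumulate a (m x k) times b (k x n) into c (m x n), all flat lists."""
    for i in range(m):          # hypothetical mapping to a launch dimension
        for j in range(n):      # hypothetical mapping to a launch dimension
            for p in range(k):  # reduction loop (scf.for)
                lhs = a[i * k + p]              # memref.load
                rhs = b[p * n + j]              # memref.load
                acc = c[i * n + j]              # memref.load
                c[i * n + j] = acc + lhs * rhs  # mulf, addf, memref.store
    return c

a = [1.0, 2.0, 3.0, 4.0]  # 2 x 2, row-major
b = [5.0, 6.0, 7.0, 8.0]  # 2 x 2, row-major
c = matmul_kernel(a, b, [0.0] * 4, 2, 2, 2)
print(c)
```

A reference like this is mainly useful for checking the numerical output of a kernel once a runtime path exists.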

Current Pipeline

The synthetic program (src/Builder.cpp) creates a simple MLP:

```
mlp.constant  →  mlp.linear  →  mlp.relu  →  mlp.print
```

Lowering Flow:

```
mlp dialect
  ↓ Shape inference, canonicalization
linalg + arith + tensor operations
  ↓ Partitioning pass
CPU/CUDA placement annotations
  ↓ Bufferization
memref operations + gpu.launch
  ↓ Target-specific lowering
LLVM IR (CPU) / GPU kernels (CUDA)
```

Contributing

We welcome contributions! Please follow these guidelines:

Fork the repository
Create a feature branch: git checkout -b feature/your-feature
Commit changes: git commit -am 'Add your feature'
Push to the branch: git push origin feature/your-feature
Submit a Pull Request

Development Setup

Use the provided build.sh script for consistent builds
Run ./mlp_mlir -emit=jit as a smoke test and check that the output matches the expected output shown in Quick Start
Follow the existing code style and naming conventions

Areas for Contribution

Complete CUDA runtime integration
Add more neural network operations
Implement additional backends (Vulkan, OpenCL)
Performance optimizations and benchmarking
Documentation improvements

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Acknowledgments

Built on LLVM MLIR infrastructure
Inspired by research in heterogeneous compilation for machine learning
Part of ongoing work in compiler design for neural networks

Note: This is research software under active development. APIs and behavior may change without notice.
