FastAIModel 0.1.0 [ALPHA] — Native Local Inference Runtime for Java

💡 Ultra-fast local LLM and embedding inference directly inside your JVM process — Zero-copy, Zero HTTP overhead, C++ native speed.

FastAIModel is a retained-memory local inference engine for Java that wraps llama.cpp (for GGUF) and ONNX Runtime (for ONNX) using direct JNI bindings. It is the engine that drives offline execution in the FastJava Ecosystem, giving Java developers native LLM and embedding capabilities without keeping heavy external apps (like LM Studio or Ollama) open.

Quick Start

import fastaimodel.FastAIModel;

public class Demo {
    public static void main(String[] args) {
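        // Load a local GGUF model into memory; try-with-resources closes it on exit.
        // The meaning of 2048 and 0.7f is not stated here (likely context size and
        // sampling temperature); see REFERENCE.md.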
        try (FastAIModel model = new FastAIModel("models/qwen2.5-coder-1.5b.gguf", 2048, 0.7f)) {
            model.generate("Write a quicksort in Java:", token -> {
                System.out.print(token);
                System.out.flush();
            });
        }
    }
}
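
The Quick Start prints each token as it arrives. To keep the full completion as well, the callback can also append to a buffer. The following is a minimal sketch, assuming `generate` accepts a callback compatible with `java.util.function.Consumer<String>` (the callback type is not stated here; check REFERENCE.md). `TokenCollector` and `fakeGenerate` are hypothetical names, and `fakeGenerate` is only a stand-in so the example runs without the native library.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: forwards each streamed token and keeps the full text. */
final class TokenCollector implements Consumer<String> {
    private final StringBuilder text = new StringBuilder();
    private final Consumer<String> downstream;
    private int tokenCount;

    TokenCollector(Consumer<String> downstream) {
        this.downstream = downstream;
    }

    @Override
    public void accept(String token) {
        text.append(token);       // accumulate the completion
        tokenCount++;
        downstream.accept(token); // still stream to the caller, e.g. System.out::print
    }

    String text() {
        return text.toString();
    }

    int tokenCount() {
        return tokenCount;
    }

    /** Stand-in for model.generate(prompt, callback); NOT the FastAIModel API. */
    static void fakeGenerate(List<String> tokens, Consumer<String> callback) {
        tokens.forEach(callback);
    }

    public static void main(String[] args) {
        TokenCollector collector = new TokenCollector(System.out::print);
        fakeGenerate(List.of("int", " x", " =", " 1", ";"), collector);
        System.out.println();
        System.out.println(collector.tokenCount() + " tokens collected");
    }
}
```

If the real callback type matches, the collector could be passed in place of the lambda, for example `model.generate(prompt, collector)`, and `collector.text()` read once generation returns.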

Why FastAIModel?

Running LLMs locally in Java typically requires invoking external subprocesses or running local HTTP servers. FastAIModel eliminates this bloat by running the model directly inside your Java process:

True In-Process Execution — Runs the model in the same process space, avoiding inter-process communication and network sockets.
Zero HTTP/JSON Overhead — Text and tokens flow directly between Java and C++ memory.
Low Memory Overhead — Eliminates the footprint of keeping GUI-based desktop inference servers running in the background.

Key Features

🚀 Native llama.cpp Performance — Direct integration with CPU AVX2/AVX-512 instruction sets and GPU backends (Vulkan/CUDA).
🌊 Direct Token Streaming — Native callbacks stream tokens back to your Java consumer in real time.
📦 GGUF Support — Compatible with GGUF-quantized models such as Llama, Qwen, Mistral, and Gemma.
🧠 Zero-Copy Memory — Shared token handling minimizes garbage-collection pressure on the JVM.
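
For context on the zero-copy point: the standard JNI mechanism for sharing memory between Java and C++ is a direct `ByteBuffer`, whose storage lives outside the garbage-collected heap and can be addressed from native code through `GetDirectBufferAddress`. Whether FastAIModel uses exactly this mechanism is not stated here. The sketch below only illustrates the general technique in pure Java; `DirectBufferSketch` and its methods are hypothetical, and `writeTokenIds` stands in for the native side.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

/** Illustration of off-heap buffer sharing; not FastAIModel's actual internals. */
final class DirectBufferSketch {

    /** Allocates off-heap storage for up to `capacity` 32-bit token ids. */
    static IntBuffer allocateTokenBuffer(int capacity) {
        return ByteBuffer.allocateDirect(capacity * Integer.BYTES)
                .order(ByteOrder.nativeOrder()) // match the byte order native code uses
                .asIntBuffer();
    }

    /** Stand-in for native code filling the shared buffer in place. */
    static void writeTokenIds(IntBuffer buffer, int... ids) {
        buffer.clear();
        buffer.put(ids);
        buffer.flip(); // position = 0, limit = number of ids written
    }

    /** Copies the ids to a heap array for inspection; the buffer itself is untouched. */
    static int[] readTokenIds(IntBuffer buffer) {
        int[] out = new int[buffer.remaining()];
        buffer.duplicate().get(out); // duplicate shares storage but has its own position
        return out;
    }

    public static void main(String[] args) {
        IntBuffer tokens = allocateTokenBuffer(8);
        writeTokenIds(tokens, 101, 202, 303);
        System.out.println("direct=" + tokens.isDirect() + " count=" + tokens.remaining());
    }
}
```

The design point is that the buffer is allocated once and reused, so repeated generations do not create new heap arrays for the garbage collector to trace.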

Installation

Option 1: Maven (via JitPack)

Add the JitPack repository and the dependencies to your pom.xml:

<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

<dependencies>
    <!-- FastAIModel Engine -->
    <dependency>
        <groupId>com.github.andrestubbe</groupId>
        <artifactId>FastAIModel</artifactId>
        <version>0.1.0</version>
    </dependency>

    <!-- FastCore (Mandatory Native DLL Loader) -->
    <dependency>
        <groupId>com.github.andrestubbe</groupId>
        <artifactId>FastCore</artifactId>
        <version>0.1.0</version>
    </dependency>
</dependencies>

Option 2: Gradle (via JitPack)

repositories {
    maven { url 'https://jitpack.io' }
}

dependencies {
    implementation 'com.github.andrestubbe:FastAIModel:0.1.0'
    implementation 'com.github.andrestubbe:FastCore:0.1.0'
}

Documentation

ROADMAP.md: Planned milestone features and performance extensions.
REFERENCE.md: JNI contracts and configuration options.
PHILOSOPHY.md: In-process design decisions.
CHANGELOG.md: Release history.

Platform Support

Platform	Status
Windows 10/11 (x64)	✅ Fully Supported
Linux	🚧 Planned
macOS	🚧 Planned
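
Because only Windows x64 is supported for now, an application may want to check the platform before touching the native engine. The following is a minimal sketch using the standard `os.name` and `os.arch` system properties. `PlatformGuard` is a hypothetical helper, not part of FastAIModel, and the accepted architecture strings (`amd64`, `x86_64`) are the values JVMs commonly report for x64. It does not distinguish Windows 10/11 from older Windows releases.

```java
import java.util.Locale;

/** Hypothetical pre-flight check; not part of the FastAIModel API. */
final class PlatformGuard {

    /** Returns true only for a Windows OS name combined with an x64 architecture. */
    static boolean isSupported(String osName, String osArch) {
        if (osName == null || osArch == null) {
            return false;
        }
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);
        boolean windows = os.startsWith("windows");
        boolean x64 = arch.equals("amd64") || arch.equals("x86_64");
        return windows && x64;
    }

    /** Applies the check to the JVM this code is running on. */
    static boolean isCurrentPlatformSupported() {
        return isSupported(System.getProperty("os.name"), System.getProperty("os.arch"));
    }

    public static void main(String[] args) {
        if (isCurrentPlatformSupported()) {
            System.out.println("Platform OK: the native engine can be loaded.");
        } else {
            System.out.println("Unsupported platform: "
                    + System.getProperty("os.name") + " / " + System.getProperty("os.arch"));
        }
    }
}
```

Running the check first lets the application fall back or show a clear message instead of failing later when the native library cannot be loaded.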

License

MIT License — See LICENSE file for details.

Part of the FastJava Ecosystem — Making the JVM faster. Small package. Maximum speed. Zero bloat. 🚀📋
