Move to LLAMA_CACHE 🤗 #1

Description

@Vaibhavs10

Hi @cocktailpeanut,

Big fan of your work, and I love all that you're doing to democratise ML. Congratulations on llamanet, it looks rad!

I saw that you are creating your own cache, llamanet, and persisting models there (correct me if I'm wrong).
We recently upstreamed changes to llama.cpp that allow it to download and cache models directly from the Hugging Face Hub (note: for this you'd need to compile the server with LLAMA_CURL=1).

With curl support, all you'd need to do is pass --hf-repo and --hf-file, and the model checkpoint will automatically be downloaded and cached in LLAMA_CACHE (ref).

This would make it easier for people to reuse model checkpoints they have already cached, and llamanet would pick up any improvements we make to the overall caching system.

AFAICT, you should be able to benefit from this directly by changing this line:

let args = ["-m", req.file]
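Something along these lines might work. This is only a rough sketch: `buildModelArgs` and the `req.repo` field are hypothetical names I made up for illustration (I haven't checked what llamanet's request object actually carries), while `--hf-repo`, `--hf-file` and `-m` are the llama.cpp flags mentioned above.

```javascript
// Hypothetical helper, not part of llamanet: builds the model-related
// arguments for the llama.cpp server from a request object.
// ASSUMPTION: `req.repo` holds a Hugging Face repo id and `req.file`
// holds the checkpoint file name inside that repo.
function buildModelArgs(req) {
  if (req.repo && req.file) {
    // Let llama.cpp download the checkpoint and cache it in LLAMA_CACHE.
    return ["--hf-repo", req.repo, "--hf-file", req.file];
  }
  // No repo given: fall back to a local model path, as the current line does.
  return ["-m", req.file];
}
```

The fallback branch keeps the existing `-m` behaviour, so requests that only carry a local file path would continue to work unchanged.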

Let me know what you think!
VB
