Move to `LLAMA_CACHE` 🤗

Hi @cocktailpeanut,

Big fan of your work, and I love all that you're doing to democratise ML. Congratulations on `llamanet`, it looks rad!

I saw that you are creating your own cache, `llamanet`, and persisting models there (correct me if I'm wrong).
We recently upstreamed changes to `llama.cpp` that allow one to directly download and cache models from the Hugging Face Hub (note: for this you'd need to compile the server with `LLAMA_CURL=1`).

With curl support, all you'd need to do is pass `--hf-repo` and `--hf-file`, and the model checkpoint would automatically be downloaded and cached in `LLAMA_CACHE` ([ref](https://github.com/ggerganov/llama.cpp/blob/3b38d48609280aa5f8ab7ea135a4351b2a5ee240/examples/main/README.md?plain=1#L329)).
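
To make that concrete, here is a rough sketch of how the launch arguments could be assembled. It is not taken from `llamanet`: the `buildServerLaunch` helper, its types, and the repo and file names are all hypothetical, and only `--hf-repo`, `--hf-file`, and `LLAMA_CACHE` come from `llama.cpp` itself.

```typescript
// Sketch only: builds the argv and environment overrides for a llama.cpp
// server process that fetches its model from the Hugging Face Hub.
// `buildServerLaunch` and these types are hypothetical, not llamanet code.

interface HubModel {
  repo: string; // Hub repository that hosts the GGUF file
  file: string; // GGUF file name inside that repository
}

interface ServerLaunch {
  args: string[];
  env: Record<string, string>;
}

function buildServerLaunch(model: HubModel, cacheDir?: string): ServerLaunch {
  if (!model.repo || !model.file) {
    throw new Error("both repo and file are required");
  }
  // --hf-repo / --hf-file are the flags mentioned above; they require a
  // server binary compiled with LLAMA_CURL=1.
  const args = ["--hf-repo", model.repo, "--hf-file", model.file];
  // Only set LLAMA_CACHE when a directory is given; otherwise the
  // environment is left untouched.
  const env: Record<string, string> = {};
  if (cacheDir) {
    env["LLAMA_CACHE"] = cacheDir;
  }
  return { args, env };
}

// Placeholder repo and file names, for illustration only.
const launch = buildServerLaunch(
  { repo: "example-user/example-model-GGUF", file: "example-model.Q4_K_M.gguf" },
  "/tmp/llama-cache",
);
console.log(launch.args.join(" "));
```

The result could then be handed to something like Node's `spawn(serverPath, launch.args, { env: { ...process.env, ...launch.env } })`, where `serverPath` is wherever the compiled server binary lives.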

This would make it easier for people to reuse already cached model checkpoints, and `llamanet` would automatically benefit from any improvements we make to the overall caching system.

AFAICT, you should be able to benefit from this directly by changing this line: https://github.com/pinokiocomputer/llamanet/blob/16fc9521f97549c657f80ce51c5c7a787eac4e8d/llamacpp.js#L20

Let me know what you think!
VB
