Hi @cocktailpeanut,
Big fan of your work, and I love all that you're doing to democratise ML. Congratulations on llamanet, it looks rad!
I saw that you are creating your own cache, llamanet, and persisting models there (correct me if I'm wrong).
We recently upstreamed changes to llama.cpp that allow one to download and cache models directly from the Hugging Face Hub (note: for this you'd need to compile the server with LLAMA_CURL=1).
With curl support, all you'd need to do is pass --hf-repo and --hf-file, and the model checkpoint would automatically be downloaded and cached in LLAMA_CACHE (ref).
This would make it easier for people to reuse already cached model checkpoints, and llamanet would pick up any improvements we make to the overall caching system too.
AFAICT, you should be able to benefit from this directly by changing this line (llamanet/llamacpp.js, line 20 in 16fc952):

    let args = ["-m", req.file]
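
A minimal sketch of what that change could look like, assuming the request carried a Hub repo id alongside the file name. `buildServerArgs` and `req.repo` are hypothetical names used only for illustration (they are not in llamanet today), and the repo and file values are placeholders. Only `-m`, `--hf-repo` and `--hf-file` are llama.cpp flags.

```javascript
// Hypothetical helper: builds the argument list passed to the llama.cpp server.
// `req.repo` is an assumed field; `req.file` is the field the current line already uses.
function buildServerArgs(req) {
  if (req.repo) {
    // Let llama.cpp download the checkpoint from the Hub and cache it in LLAMA_CACHE.
    // This path needs a server compiled with LLAMA_CURL=1.
    return ["--hf-repo", req.repo, "--hf-file", req.file];
  }
  // No repo given: keep the current behaviour and load a local model file.
  return ["-m", req.file];
}

// Placeholder values, not a real repository.
let args = buildServerArgs({ repo: "some-user/some-model-GGUF", file: "model.Q4_K_M.gguf" });
```

Keeping the `-m` fallback means models that llamanet has already downloaded into its own cache would keep working, while new requests could go through the shared llama.cpp cache.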
Let me know what you think!
VB