On some devices, WebLLM starts loading a model and never finishes, sometimes because the device lacks enough video RAM.
For example, my Fairphone downloads the model to persistent storage, then stalls at 4 GB while loading it into the GPU.
Is there a way to detect this in advance and either block the load or advise against it?
Can I recover from a failed load?