First of all, thanks for the great work!
Context
I was trying to use it with my new build (2x P40) on Ubuntu 24.04 (Pop!_OS), and it seems to run. I had to check the code, and there I found an endpoint with some info:


Great. I also found some errors.

Also, the endpoint named /gui was not working.
Request
Besides the existing quickstart, it would be nice to have a few more ways to help users get started with the program. Some things that, as a user, I would find useful:
- A cURL example to check that gppm is working, for example:
  curl http://localhost:5001/get_llamacpp_subprocesses
- A reference of the available endpoints, with usage examples.
- An example of using a single P40, with a guide that goes all the way to calling the llama.cpp instance with a query like "how many squares are on a chessboard?". I noticed this is partially covered already; it would just be nice to have a "now query llama.cpp with a prompt" section or similar.
- An example of using 2 or more P40s, and maybe changing a config and then inspecting the change.
- Just in case it is needed, how to disable/uninstall.
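To make the request more concrete, here is a rough sketch of the kind of snippet I have in mind for the "check gppm is working" and "now query llama.cpp with a prompt" items. The gppm address is the one from the curl call in the list. Everything else is an assumption on my side: the llama.cpp address (localhost:8080 is llama.cpp's default server port, but the port gppm actually uses depends on the config) and the helper names gppm_url, gppm_check, completion_payload and ask_llama, which are made up for illustration and are not part of gppm. The /completion route with prompt and n_predict is llama.cpp's own server API.

```shell
# Source this file from bash, then call the helpers by hand.

# Assumption: gppm listens on localhost:5001, as in the curl example in the list.
GPPM_HOST="${GPPM_HOST:-localhost}"
GPPM_PORT="${GPPM_PORT:-5001}"

# Assumption: the llama.cpp server started by gppm listens on localhost:8080
# (llama.cpp's default); adjust to whatever port the gppm config sets.
LLAMA_URL="${LLAMA_URL:-http://localhost:8080}"

# Build the full URL for a gppm endpoint path such as /get_llamacpp_subprocesses.
gppm_url() {
  printf 'http://%s:%s%s' "$GPPM_HOST" "$GPPM_PORT" "$1"
}

# Succeed (exit status 0) if the gppm endpoint answers without an HTTP error.
gppm_check() {
  curl --silent --fail --max-time 5 "$(gppm_url "$1")" > /dev/null
}

# Build the JSON body for llama.cpp's /completion endpoint.
# Only backslashes and double quotes are escaped, so keep prompts to one line.
completion_payload() {
  local prompt=$1 n_predict=${2:-128}
  prompt=${prompt//\\/\\\\}
  prompt=${prompt//\"/\\\"}
  printf '{"prompt": "%s", "n_predict": %s}' "$prompt" "$n_predict"
}

# Send a prompt to the llama.cpp server and print the raw JSON response.
ask_llama() {
  curl --silent --fail --max-time 120 \
    --header 'Content-Type: application/json' \
    --data "$(completion_payload "$@")" \
    "$LLAMA_URL/completion"
}
```

With that sourced, `gppm_check /get_llamacpp_subprocesses && echo "gppm is up"` would cover the first item, and `ask_llama "how many squares are on a chessboard?" 64` the third one; the generated text should be in the `content` field of the JSON that llama.cpp returns.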
Other considerations
It worked fine. This is probably not needed, but here are my PC specs:
CPU: AMD Ryzen 3 3200G
RAM: 8 GB (3200 MHz)
GPU(s): 2x Nvidia P40
OS: Pop!_OS 24.04