We have some benchmarks to see model performance using Ollama, and some profiling scripts to check CPU and memory usage.
You can run the tests as follows:

```sh
make test # unit tests
```

To measure the speed of some Ollama models, we have a benchmark that runs them over a few prompts:

```sh
cargo run --release --example ollama
```

You can also benchmark these models using a larger task list at a given path, with the following command:
```sh
JSON_PATH="./path/to/your.json" cargo run --release --example ollama
```

We have scripts to profile both CPU and memory usage. A special build is created for profiling, via a custom `profiling` feature, such that the output inherits release mode but also carries debug symbols.
Furthermore, the profiling build exits automatically after a set time, as if CTRL+C had been pressed. This is required by the memory profiling tool in particular.
To create a flamegraph of the application, run:

```sh
make profile-cpu
```

This will create a profiling build that inherits release mode, except with debug information.
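Such a build is typically configured with a custom Cargo profile; a minimal sketch of what this could look like in `Cargo.toml` (the profile name is an assumption):

```toml
# Hypothetical profiling profile: release optimizations plus debug symbols.
[profile.profiling]
inherits = "release"
debug = true
```

A build made with `cargo build --profile profiling` then keeps release-level optimizations while remaining legible to profilers.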
> **Note:** CPU profiling may require super-user access.
To profile memory usage, we make use of `cargo-instruments`:

```sh
make profile-mem
```

Note that `cargo-instruments` is built on Xcode's Instruments and therefore only works on macOS.