Update since we now use the streaming API
The cost comprises two components:
- Retrieval -> For vanilla retrieval, the cost is the R2R server cost per unit time * the retrieval duration. Complex retrieval also incurs token costs; at this stage, just estimate them.
- Generation -> Depends on input tokens, output tokens, and the LLM's per-token prices. Here we compare two options: the actual API and a hypothetical self-hosted deployment.
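
The two components above can be sketched as a few small functions. This is only a rough illustration: the function names, the per-hour and per-1k-token units, and the idea of modelling the self-hosted option as machine rate * generation time are all assumptions, and no real prices are implied.

```python
# Hypothetical cost helpers; names, units and the self-hosted model are assumptions.

def retrieval_cost(server_cost_per_hour, duration_seconds, estimated_token_cost=0.0):
    """Server time cost * duration, plus an estimated token cost for complex retrieval."""
    return server_cost_per_hour * duration_seconds / 3600.0 + estimated_token_cost


def generation_cost_api(tokens_in, tokens_out, price_in_per_1k, price_out_per_1k):
    """API option: input and output tokens are billed at separate per-1k-token prices."""
    return tokens_in / 1000.0 * price_in_per_1k + tokens_out / 1000.0 * price_out_per_1k


def generation_cost_self_hosted(machine_cost_per_hour, generation_seconds):
    """Self-hosted option (assumed model): pay for machine time, not for tokens."""
    return machine_cost_per_hour * generation_seconds / 3600.0


def total_cost(retrieval, generation):
    """Total cost is the sum of the retrieval and generation components."""
    return retrieval + generation
```

Calling `total_cost(retrieval_cost(...), generation_cost_api(...))` and `total_cost(retrieval_cost(...), generation_cost_self_hosted(...))` with the same retrieval inputs gives the two figures to compare.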