Migrate from chat.completions to Responses API #65
@filip-komarzyniec please revert the changes that reformat lines to 80 characters. The 120-character limit has been fine here, and these changes add noise to the PR.
rh-pre-commit.version: 2.3.2 rh-pre-commit.check-secrets: ENABLED Signed-off-by: Filip Komarzyniec <fkomarzy@redhat.com>
…s API rh-pre-commit.version: 2.3.2 rh-pre-commit.check-secrets: ENABLED Signed-off-by: Filip Komarzyniec <fkomarzy@redhat.com>
3a91573 to 59d7320
Signed-off-by: Filip Komarzyniec <fkomarzy@redhat.com> rh-pre-commit.version: 2.3.2 rh-pre-commit.check-secrets: ENABLED
adbece5 to 8317802
jakub-walaszczyk left a comment
Please follow the requested changes and consider handling chroma within the ai4rag context.
38210dc to 0cde922
52fead2 to 71460bd
rh-pre-commit.version: 2.3.2 rh-pre-commit.check-secrets: ENABLED
  """Constants used for setting the generation (inference) parameters for chat models only."""

- MAX_COMPLETION_TOKENS = 2048
+ MAX_TOKENS = 2048
If the parameter name is now max_output_tokens, why do we use max_tokens? Wouldn't it be better to reflect the name of the parameter?
It would, if we dropped chat.completions support. The problem is that in the older API (completions) this parameter is called max_completion_tokens, while in the newer API (responses) it's max_output_tokens.
Now that we have an abstract constants class, I've decided to call it just max_tokens there because it's consumed by both API interfaces.
What's more, max_tokens is a deprecated name once used by the chat.completions API, so I didn't completely make it up.
The whole problem comes from our earlier decision to keep support for the chat method. No one uses it now, so maybe it's better to just replace it with responses-aligned methods.
Let's replace it and keep the proper parameter names. Thanks.
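For reference, the naming mismatch discussed in this thread could be handled roughly as in the sketch below while both interfaces are still supported. This is only an illustration, not code from this PR: MAX_OUTPUT_TOKENS, TOKEN_LIMIT_PARAM, and token_limit_kwargs are hypothetical names, and the only facts taken from the thread are the two parameter names (max_completion_tokens for chat.completions, max_output_tokens for responses) and the value 2048.

```python
"""Sketch: one token-limit constant mapped to the parameter name each API expects."""

# Hypothetical constant, named after the Responses API parameter.
MAX_OUTPUT_TOKENS = 2048

# Keyword each OpenAI-compatible API uses for the generation token limit.
TOKEN_LIMIT_PARAM = {
    "chat.completions": "max_completion_tokens",
    "responses": "max_output_tokens",
}


def token_limit_kwargs(api: str, limit: int = MAX_OUTPUT_TOKENS) -> dict[str, int]:
    """Return the keyword argument that carries the token limit for `api`."""
    try:
        return {TOKEN_LIMIT_PARAM[api]: limit}
    except KeyError:
        # Fail loudly instead of silently sending a parameter the API ignores.
        raise ValueError(f"unknown API: {api!r}") from None


if __name__ == "__main__":
    print(token_limit_kwargs("responses"))
    print(token_limit_kwargs("chat.completions"))
```

If the chat method is dropped as agreed above, the mapping becomes unnecessary and the constant can simply carry the Responses name.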
Description
chat/completions to responses (both OpenAI-compatible)
Motivation
Changes
Testing
A commit with updated tests will be added once manual testing is complete.
Checklist