Commit 3eb32ab

release: 0.6.1-alpha.1 (#321)
Automated Release PR

---

## 0.6.1-alpha.1 (2026-03-13)

Full Changelog: [v0.5.0-alpha.2...v0.6.1-alpha.1](v0.5.0-alpha.2...v0.6.1-alpha.1)

### ⚠ BREAKING CHANGES

* improve consistency of post-training API endpoints

### Features

* accept list content blocks in Responses API function_call_output ([f6f1fc3](f6f1fc3))
* Add prompt_cache_key parameter support ([6b45699](6b45699))
* add skip_model_availability to openai_mixin for remote models ([7ef952b](7ef952b))
* add support for 'frequency_penalty' param to Responses API ([56d39cc](56d39cc))
* add support for 'presence_penalty' param to Responses API ([4f57d15](4f57d15))
* add support for /responses background parameter ([4f8bf45](4f8bf45))
* Add top_logprobs parameter support ([2196986](2196986))
* add top_p parameter support to responses API ([23e3b9f](23e3b9f))
* Add truncation parameter support ([7501365](7501365))
* improve consistency of post-training API endpoints ([99057fd](99057fd))
* **inference:** bidirectional reasoning token passthrough for chat completions ([c314639](c314639))
* **vector_io:** Implement Contextual Retrieval for improved RAG search quality ([89ec5a7](89ec5a7))

### Bug Fixes

* align chat completion usage schema with OpenAI spec ([3974d5d](3974d5d))
* Enabled models list works ([#314](#314)) ([acd5e64](acd5e64))
* **inference:** use flat response message model for chat/completions ([e58e2e4](e58e2e4))
* **responses:** achieve full OpenResponses conformance — 6/6 tests passing ([631ab2c](631ab2c))
* **stainless:** handle [DONE] SSE terminator in streaming responses ([17f0029](17f0029))
* **vector_io:** align Protocol signatures with request models ([ea58fd8](ea58fd8))

### Chores

* **api:** minor updates ([17a2705](17a2705))
* **ci:** bump uv version ([f014d4c](f014d4c))
* **ci:** skip uploading artifacts on stainless-internal branches ([dbddad9](dbddad9))
* **docs:** add missing descriptions ([f1a093b](f1a093b))
* format all `api.md` files ([0e3e262](0e3e262))
* **internal:** add request options to SSE classes ([2ecc682](2ecc682))
* **internal:** bump dependencies ([612291e](612291e))
* **internal:** fix lint error on Python 3.14 ([a0f6975](a0f6975))
* **internal:** make `test_proxy_environment_variables` more resilient ([6bc2bb4](6bc2bb4))
* **internal:** make `test_proxy_environment_variables` more resilient to env ([44bbae1](44bbae1))
* **test:** do not count install time for mock server timeout ([185de33](185de33))
* update mock server docs ([92cb087](92cb087))
* update placeholder string ([406b9bb](406b9bb))

### Refactors

* **types:** use `extra_items` from PEP 728 ([629ca09](629ca09))

---

This pull request is managed by Stainless's [GitHub App](https://github.com/apps/stainless-app). The [semver version number](https://semver.org/#semantic-versioning-specification-semver) is based on included [commit messages](https://www.conventionalcommits.org/en/v1.0.0/). Alternatively, you can manually set the version number in the title of this pull request.

For a better experience, it is recommended to use either rebase-merge or squash-merge when merging this pull request.

🔗 Stainless [website](https://www.stainlessapi.com)
📚 Read the [docs](https://app.stainlessapi.com/docs)
🙋 [Reach out](mailto:support@stainlessapi.com) for help or questions

Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
1 parent 862e900 · commit 3eb32ab

76 files changed · 2587 additions & 1103 deletions


.github/workflows/ci.yml

Lines changed: 9 additions & 5 deletions
```diff
@@ -25,7 +25,7 @@ jobs:
       - name: Install uv
         uses: astral-sh/setup-uv@v5
         with:
-          version: '0.9.13'
+          version: '0.10.2'
 
       - name: Install dependencies
         run: uv sync --all-extras
@@ -47,7 +47,7 @@ jobs:
       - name: Install uv
         uses: astral-sh/setup-uv@v5
         with:
-          version: '0.9.13'
+          version: '0.10.2'
 
       - name: Install dependencies
         run: uv sync --all-extras
@@ -56,14 +56,18 @@ jobs:
         run: uv build
 
       - name: Get GitHub OIDC Token
-        if: github.repository == 'stainless-sdks/llama-stack-client-python'
+        if: |-
+          github.repository == 'stainless-sdks/llama-stack-client-python' &&
+          !startsWith(github.ref, 'refs/heads/stl/')
         id: github-oidc
         uses: actions/github-script@v8
         with:
           script: core.setOutput('github_token', await core.getIDToken());
 
       - name: Upload tarball
-        if: github.repository == 'stainless-sdks/llama-stack-client-python'
+        if: |-
+          github.repository == 'stainless-sdks/llama-stack-client-python' &&
+          !startsWith(github.ref, 'refs/heads/stl/')
         env:
           URL: https://pkg.stainless.com/s
           AUTH: ${{ steps.github-oidc.outputs.github_token }}
@@ -81,7 +85,7 @@ jobs:
       - name: Install uv
         uses: astral-sh/setup-uv@v5
         with:
-          version: '0.9.13'
+          version: '0.10.2'
 
       - name: Bootstrap
         run: ./scripts/bootstrap
```

.release-please-manifest.json

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,3 +1,3 @@
 {
-  ".": "0.5.0-alpha.2"
+  ".": "0.6.1-alpha.1"
 }
```

.stats.yml

Lines changed: 3 additions & 3 deletions
```diff
@@ -1,4 +1,4 @@
 configured_endpoints: 108
-openapi_spec_url: https://storage.googleapis.com/stainless-sdk-openapi-specs/llamastack%2Fllama-stack-client-958e990011d6b4c27513743a151ec4c80c3103650a80027380d15f1d6b108e32.yml
-openapi_spec_hash: 5b49d825dbc2a26726ca752914a65114
-config_hash: 19b84a0a93d566334ae134dafc71991f
+openapi_spec_url: https://storage.googleapis.com/stainless-sdk-openapi-specs/llamastack%2Fllama-stack-client-1b387ba7b0e0d1aa931032ac2101e5a473b9fa42975e6575cf889feace342b80.yml
+openapi_spec_hash: a144868005520bd3f8f9dc3d8cac1c22
+config_hash: ef1f9b33e203c71cfc10d91890c1ed2d
```

CHANGELOG.md

Lines changed: 55 additions & 0 deletions
```diff
@@ -1,5 +1,60 @@
 # Changelog
 
+## 0.6.1-alpha.1 (2026-03-13)
+
+Full Changelog: [v0.5.0-alpha.2...v0.6.1-alpha.1](https://github.com/llamastack/llama-stack-client-python/compare/v0.5.0-alpha.2...v0.6.1-alpha.1)
+
+### ⚠ BREAKING CHANGES
+
+* improve consistency of post-training API endpoints
+
+### Features
+
+* accept list content blocks in Responses API function_call_output ([f6f1fc3](https://github.com/llamastack/llama-stack-client-python/commit/f6f1fc36008f4fdb7af19aa2aabfcd2482d4a1bc))
+* Add prompt_cache_key parameter support ([6b45699](https://github.com/llamastack/llama-stack-client-python/commit/6b45699185d934a5f8395c5cc3046f6c5aceb770))
+* add skip_model_availability to openai_mixin for remote models ([7ef952b](https://github.com/llamastack/llama-stack-client-python/commit/7ef952b78a5c1b8bd49509c9be7ba8781dfb7462))
+* add support for 'frequency_penalty' param to Responses API ([56d39cc](https://github.com/llamastack/llama-stack-client-python/commit/56d39cc9ff9d6f54e303fc377d605ae17bac9584))
+* add support for 'presence_penalty' param to Responses API ([4f57d15](https://github.com/llamastack/llama-stack-client-python/commit/4f57d159caba431676dced864f8f0871c3692f7b))
+* add support for /responses background parameter ([4f8bf45](https://github.com/llamastack/llama-stack-client-python/commit/4f8bf4526e529a74b9c53cac6df8e4beb2808d60))
+* Add top_logprobs parameter support ([2196986](https://github.com/llamastack/llama-stack-client-python/commit/21969867a82596e8be0aeeddbb6d8ccedf3e0f8b))
+* add top_p parameter support to responses API ([23e3b9f](https://github.com/llamastack/llama-stack-client-python/commit/23e3b9fcf7a23378c200604d0f57dc5a9e6a8527))
+* Add truncation parameter support ([7501365](https://github.com/llamastack/llama-stack-client-python/commit/7501365fe89795e87accfb6b1f2329da25d0efeb))
+* improve consistency of post-training API endpoints ([99057fd](https://github.com/llamastack/llama-stack-client-python/commit/99057fdc74bafdf54479674ba75b447cd4681cb6))
+* **inference:** bidirectional reasoning token passthrough for chat completions ([c314639](https://github.com/llamastack/llama-stack-client-python/commit/c314639b35a234ca340a08b5615a38ec838ab4f4))
+* **vector_io:** Implement Contextual Retrieval for improved RAG search quality ([89ec5a7](https://github.com/llamastack/llama-stack-client-python/commit/89ec5a7bf405e688bd404877e49ab1ee9b49bf7e))
+
+
+### Bug Fixes
+
+* align chat completion usage schema with OpenAI spec ([3974d5d](https://github.com/llamastack/llama-stack-client-python/commit/3974d5db8270e2548d0cdd54204c1603ca7a84a8))
+* Enabled models list works ([#314](https://github.com/llamastack/llama-stack-client-python/issues/314)) ([acd5e64](https://github.com/llamastack/llama-stack-client-python/commit/acd5e64a9e82083192a31f85f9c810291cabcadb))
+* **inference:** use flat response message model for chat/completions ([e58e2e4](https://github.com/llamastack/llama-stack-client-python/commit/e58e2e4dee9c9bbb72e4903e30f169991d10e545))
+* **responses:** achieve full OpenResponses conformance — 6/6 tests passing ([631ab2c](https://github.com/llamastack/llama-stack-client-python/commit/631ab2c19c7cd33ac81598a795ae8be93bdd5a4b))
+* **stainless:** handle [DONE] SSE terminator in streaming responses ([17f0029](https://github.com/llamastack/llama-stack-client-python/commit/17f0029a3bd6719c4f71ab7b14af8cac23f9e7f1))
+* **vector_io:** align Protocol signatures with request models ([ea58fd8](https://github.com/llamastack/llama-stack-client-python/commit/ea58fd88201ef59e580443688100cafe45f305c0))
+
+
+### Chores
+
+* **api:** minor updates ([17a2705](https://github.com/llamastack/llama-stack-client-python/commit/17a270528b503591de15f9e9fcbc378007b75eda))
+* **ci:** bump uv version ([f014d4c](https://github.com/llamastack/llama-stack-client-python/commit/f014d4ca0301a48078c4692cfa828016cb92c52e))
+* **ci:** skip uploading artifacts on stainless-internal branches ([dbddad9](https://github.com/llamastack/llama-stack-client-python/commit/dbddad9711a0ba0d2396a654e5b5220537acfc6b))
+* **docs:** add missing descriptions ([f1a093b](https://github.com/llamastack/llama-stack-client-python/commit/f1a093b71b5ae56f23143268ab68d851b6336ae9))
+* format all `api.md` files ([0e3e262](https://github.com/llamastack/llama-stack-client-python/commit/0e3e2626081ca9268297742990368c7ed6493b40))
+* **internal:** add request options to SSE classes ([2ecc682](https://github.com/llamastack/llama-stack-client-python/commit/2ecc682c1fccc86c643ad3da40e5134352745525))
+* **internal:** bump dependencies ([612291e](https://github.com/llamastack/llama-stack-client-python/commit/612291e2142b710cdd643af16bbe83e514f7a44e))
+* **internal:** fix lint error on Python 3.14 ([a0f6975](https://github.com/llamastack/llama-stack-client-python/commit/a0f69750827b016bb27a52bdd77fcbbacd311020))
+* **internal:** make `test_proxy_environment_variables` more resilient ([6bc2bb4](https://github.com/llamastack/llama-stack-client-python/commit/6bc2bb4e81b16d23e20090f45dbd8a53a63c158d))
+* **internal:** make `test_proxy_environment_variables` more resilient to env ([44bbae1](https://github.com/llamastack/llama-stack-client-python/commit/44bbae12bb8b4f72d1fb50db29bedd69f30340b7))
+* **test:** do not count install time for mock server timeout ([185de33](https://github.com/llamastack/llama-stack-client-python/commit/185de33c3b15256972df173610aa2d0d2fcb5f87))
+* update mock server docs ([92cb087](https://github.com/llamastack/llama-stack-client-python/commit/92cb087355ffa1fd50e3a35b8e888853784c9fe9))
+* update placeholder string ([406b9bb](https://github.com/llamastack/llama-stack-client-python/commit/406b9bbd327d9ce4c2423a553c15d4a7889025f9))
+
+
+### Refactors
+
+* **types:** use `extra_items` from PEP 728 ([629ca09](https://github.com/llamastack/llama-stack-client-python/commit/629ca09b3c8ca32dc95082900e41df21c9dd4855))
+
 ## 0.5.0-alpha.2 (2026-02-05)
 
 Full Changelog: [v0.5.0-alpha.1...v0.5.0-alpha.2](https://github.com/llamastack/llama-stack-client-python/compare/v0.5.0-alpha.1...v0.5.0-alpha.2)
```

CONTRIBUTING.md

Lines changed: 1 addition & 2 deletions
````diff
@@ -88,8 +88,7 @@ $ pip install ./path-to-wheel-file.whl
 Most tests require you to [set up a mock server](https://github.com/stoplightio/prism) against the OpenAPI spec to run the tests.
 
 ```sh
-# you will need npm installed
-$ npx prism mock path/to/your/openapi.yml
+$ ./scripts/mock
 ```
 
 ```sh
````

README.md

Lines changed: 44 additions & 0 deletions
````diff
@@ -128,6 +128,50 @@ async def main() -> None:
 asyncio.run(main())
 ```
 
+## Streaming responses
+
+We provide support for streaming responses using Server Side Events (SSE).
+
+```python
+from llama_stack_client import LlamaStackClient
+
+client = LlamaStackClient()
+
+stream = client.chat.completions.create(
+    messages=[
+        {
+            "content": "string",
+            "role": "user",
+        }
+    ],
+    model="model",
+    stream=True,
+)
+for completion in stream:
+    print(completion.id)
+```
+
+The async client uses the exact same interface.
+
+```python
+from llama_stack_client import AsyncLlamaStackClient
+
+client = AsyncLlamaStackClient()
+
+stream = await client.chat.completions.create(
+    messages=[
+        {
+            "content": "string",
+            "role": "user",
+        }
+    ],
+    model="model",
+    stream=True,
+)
+async for completion in stream:
+    print(completion.id)
+```
+
 ## Using types
 
 Nested request parameters are [TypedDicts](https://docs.python.org/3/library/typing.html#typing.TypedDict). Responses are [Pydantic models](https://docs.pydantic.dev) which also provide helper methods for things like:
````

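The streaming support added to the README above relates to one of this release's fixes: handling the `[DONE]` SSE terminator. As a rough illustration of why that sentinel needs special-casing, here is a self-contained sketch of an SSE `data:` line parser; it is independent of the SDK's actual implementation, and all names are hypothetical:

```python
import json
from typing import Iterator


def iter_sse_data(lines: Iterator[str]) -> Iterator[dict]:
    """Yield parsed JSON payloads from SSE `data:` lines, stopping at the
    `[DONE]` sentinel that OpenAI-style streaming endpoints emit last."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives, comments, and event-name lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # terminator is not JSON and must not reach json.loads
        yield json.loads(payload)


raw = [
    'data: {"id": "chunk-1"}',
    "",
    'data: {"id": "chunk-2"}',
    "data: [DONE]",
]
ids = [event["id"] for event in iter_sse_data(iter(raw))]
# ids == ["chunk-1", "chunk-2"]
```

Without the sentinel check, the final `data: [DONE]` frame would raise a `json.JSONDecodeError` instead of ending the stream cleanly.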
api.md

Lines changed: 3 additions & 3 deletions
```diff
@@ -474,9 +474,9 @@ from llama_stack_client.types.alpha.post_training import (
 Methods:
 
 - <code title="get /v1alpha/post-training/jobs">client.alpha.post_training.job.<a href="./src/llama_stack_client/resources/alpha/post_training/job.py">list</a>() -> <a href="./src/llama_stack_client/types/alpha/post_training/job_list_response.py">JobListResponse</a></code>
-- <code title="get /v1alpha/post-training/job/artifacts">client.alpha.post_training.job.<a href="./src/llama_stack_client/resources/alpha/post_training/job.py">artifacts</a>() -> <a href="./src/llama_stack_client/types/alpha/post_training/job_artifacts_response.py">JobArtifactsResponse</a></code>
-- <code title="post /v1alpha/post-training/job/cancel">client.alpha.post_training.job.<a href="./src/llama_stack_client/resources/alpha/post_training/job.py">cancel</a>() -> None</code>
-- <code title="get /v1alpha/post-training/job/status">client.alpha.post_training.job.<a href="./src/llama_stack_client/resources/alpha/post_training/job.py">status</a>() -> <a href="./src/llama_stack_client/types/alpha/post_training/job_status_response.py">JobStatusResponse</a></code>
+- <code title="get /v1alpha/post-training/jobs/{job_uuid}/artifacts">client.alpha.post_training.job.<a href="./src/llama_stack_client/resources/alpha/post_training/job.py">artifacts</a>(job_uuid) -> <a href="./src/llama_stack_client/types/alpha/post_training/job_artifacts_response.py">JobArtifactsResponse</a></code>
+- <code title="post /v1alpha/post-training/jobs/{job_uuid}/cancel">client.alpha.post_training.job.<a href="./src/llama_stack_client/resources/alpha/post_training/job.py">cancel</a>(job_uuid) -> None</code>
+- <code title="get /v1alpha/post-training/jobs/{job_uuid}/status">client.alpha.post_training.job.<a href="./src/llama_stack_client/resources/alpha/post_training/job.py">status</a>(job_uuid) -> <a href="./src/llama_stack_client/types/alpha/post_training/job_status_response.py">JobStatusResponse</a></code>
 
 ## Benchmarks
```

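The breaking change above moves `job_uuid` from an implicit parameter into the URL path (`/v1alpha/post-training/jobs/{job_uuid}/…`). A minimal sketch of how such path templates are typically filled in, with percent-encoding so an unusual id cannot break the route; this helper is hypothetical and not the SDK's internals:

```python
from urllib.parse import quote


def post_training_job_path(action: str, job_uuid: str) -> str:
    """Build a new-style path such as /v1alpha/post-training/jobs/{job_uuid}/status.

    The id is percent-encoded with no safe characters, so a slash in the
    value becomes %2F instead of introducing an extra path segment.
    """
    return f"/v1alpha/post-training/jobs/{quote(job_uuid, safe='')}/{action}"


print(post_training_job_path("status", "job-123"))
# /v1alpha/post-training/jobs/job-123/status
```

Encoding the path segment is the design point: with the old flat routes (`/post-training/job/status`), the id traveled out-of-band, while path-parameter routes make it part of the resource identity, consistent with the `list` endpoint's plural `jobs` collection.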
pyproject.toml

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 [project]
 name = "llama_stack_client"
-version = "0.5.0-alpha.2"
+version = "0.6.1-alpha.1"
 description = "The official Python library for the llama-stack-client API"
 dynamic = ["readme"]
 license = "MIT"
```

requirements-dev.lock

Lines changed: 12 additions & 12 deletions
```diff
@@ -3,12 +3,12 @@
 -e .
 annotated-types==0.7.0
     # via pydantic
-anyio==4.12.0
+anyio==4.12.1
     # via
     #   httpx
     #   llama-stack-client
 black==26.1.0
-certifi==2025.11.12
+certifi==2026.1.4
     # via
     #   httpcore
     #   httpx
@@ -52,7 +52,7 @@ idna==3.11
     #   anyio
     #   httpx
     #   requests
-importlib-metadata==8.7.0
+importlib-metadata==8.7.1
 iniconfig==2.3.0
     # via pytest
 markdown-it-py==4.0.0
@@ -64,11 +64,11 @@ mypy-extensions==1.1.0
     # via
     #   black
     #   mypy
-nodeenv==1.9.1
+nodeenv==1.10.0
     # via
     #   pre-commit
     #   pyright
-numpy==2.4.1
+numpy==2.4.2
     # via pandas
 packaging==25.0
     # via
@@ -89,7 +89,7 @@ pluggy==1.6.0
 pre-commit==4.5.1
 prompt-toolkit==3.0.52
     # via llama-stack-client
-pyaml==25.7.0
+pyaml==26.2.1
     # via llama-stack-client
 pydantic==2.12.5
     # via llama-stack-client
@@ -100,15 +100,15 @@ pygments==2.19.2
     #   pytest
     #   rich
 pyright==1.1.399
-pytest==9.0.1
+pytest==9.0.2
     # via
     #   pytest-asyncio
     #   pytest-xdist
 pytest-asyncio==1.3.0
 pytest-xdist==3.8.0
 python-dateutil==2.9.0.post0
     # via pandas
-pytokens==0.4.0
+pytokens==0.4.1
     # via black
 pyyaml==6.0.3
     # via
@@ -119,7 +119,7 @@ requests==2.32.5
 respx==0.22.0
 rich==14.2.0
     # via llama-stack-client
-ruff==0.14.7
+ruff==0.14.13
 six==1.17.0
     # via python-dateutil
 sniffio==1.3.1
@@ -128,8 +128,8 @@ termcolor==3.3.0
     # via
     #   fire
     #   llama-stack-client
-time-machine==3.1.0
-tqdm==4.67.1
+time-machine==3.2.0
+tqdm==4.67.3
     # via llama-stack-client
 typing-extensions==4.15.0
     # via
@@ -149,7 +149,7 @@ urllib3==2.6.3
     # via requests
 virtualenv==20.36.1
     # via pre-commit
-wcwidth==0.3.1
+wcwidth==0.6.0
     # via prompt-toolkit
 zipp==3.23.0
     # via importlib-metadata
```

scripts/format

Lines changed: 1 addition & 1 deletion
```diff
@@ -11,4 +11,4 @@ uv run ruff check --fix .
 uv run ruff format
 
 echo "==> Formatting docs"
-uv run python scripts/utils/ruffen-docs.py README.md api.md
+uv run python scripts/utils/ruffen-docs.py README.md $(find . -type f -name api.md)
```

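With this change, `scripts/format` formats every `api.md` in the repository via command substitution rather than only the top-level file. A quick demonstration of what the `find` expression matches, using a scratch directory (all paths here are made up for the demo):

```shell
# Build a scratch tree with api.md files at two depths plus a decoy
tmp=$(mktemp -d)
mkdir -p "$tmp/docs/sub"
touch "$tmp/api.md" "$tmp/docs/sub/api.md" "$tmp/docs/other.md"

# -type f -name api.md matches regular files named exactly api.md, at any depth;
# $(...) in the script expands each matched path as a separate argument
matches=$(find "$tmp" -type f -name api.md | wc -l)
echo "found $matches api.md files"

rm -rf "$tmp"
```

One caveat of the unquoted `$(find …)` form: paths containing whitespace would be word-split, which is fine for a repo that controls its own file names.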
0 commit comments