feat(providers): support profile updates#1914
Conversation
|
🌿 Preview your docs: https://nvidia-preview-pr-1914.docs.buildwithfern.com/openshell |
PR Review StatusValidation: Project-valid. PR #1914 implements approved provider profile update work from #1881, is authored by a repo admin, is non-draft, and DCO/branch checks are passing. Review findings:
Docs: Fern docs were updated in existing provider pages; no navigation change is needed. Next state: |
Re-check After Author UpdateI re-evaluated latest head Disposition: partially resolved. Resolved items:
Remaining items:
Docs: Fern docs are updated in the existing provider pages; no navigation change appears needed. E2E: pending. This PR touches provider credential/policy behavior, so Next state: |
Re-check After Author UpdateI re-evaluated latest head Disposition: partially resolved. Resolved items:
Remaining items:
Docs: Fern docs are updated in the existing provider pages; no navigation change appears needed. E2E: still pending. This PR touches provider credential/policy behavior, so Next state: |
PR Review StatusValidation: Project-valid. PR #1914 implements the validated provider profile update work from #1881 and remains non-draft. Review findings:
Docs: Fern docs are updated in the existing provider pages; no navigation change appears needed. E2E: pending. This PR touches provider credential/policy behavior, so Next state: |
Signed-off-by: John Myers <9696606+johntmyers@users.noreply.github.com>
Signed-off-by: John Myers <9696606+johntmyers@users.noreply.github.com>
Signed-off-by: John Myers <9696606+johntmyers@users.noreply.github.com>
f0209f9 to
8ee094b
Compare
Re-check After Author UpdateI re-evaluated latest head Disposition: partially resolved. Resolved items:
Remaining items:
Docs: Fern docs are updated in the existing provider pages; no navigation change appears needed. E2E: still pending. This PR touches provider credential/policy behavior, so Independent reviewer result: confirmed the serialization issue as blocking; CAS/export/docs/test coverage otherwise looks satisfactory. Next state: |
Signed-off-by: John Myers <9696606+johntmyers@users.noreply.github.com>
|
Addressed the latest provider profile concurrency feedback in commit 3e0fe13. Profile imports now hold the shared sandbox/provider invariant guard from conflict validation through persistence, and sandbox creation with initial providers holds the same guard across provider existence/env-key validation through create. Added regression coverage for both guarded paths.\n\nVerification:\n- |
Re-check After Author UpdateI re-evaluated latest head Disposition: partially resolved. Resolved items:
Remaining items:
Checks: required non-E2E checks are still running for this head ( Next state: |
PR Review StatusValidation: Project-valid. PR #1914 implements the validated provider profile update work from #1881 and remains non-draft. Review findings:
Docs: Fern docs are updated in existing provider pages; no navigation change appears needed. Checks: required non-E2E checks are green. E2E is still not applied; this PR should get Next state: |
Signed-off-by: John Myers <9696606+johntmyers@users.noreply.github.com>
|
Addressed both follow-up review items in commit Changes:
Verification:
|
|
Label |
Re-check After Author UpdateI re-evaluated latest head Disposition: resolved. Remaining items:
Review status:
Docs: Fern docs are updated in the existing provider pages; no navigation change appears needed. Checks: Next state: |
Pipeline Follow-Up NeededHead SHA:
The failing test is I did not find an existing open issue or PR documenting this exact test failure. Because this PR touches provider, policy, and sandbox behavior, and the failure is in a required E2E policy/proxy path, this needs author follow-up before maintainer handoff. Next action: @johntmyers, please investigate the rust-podman L7 allowed-request failure or point to evidence that it is an unrelated CI flake or infrastructure incident. Gator will continue watching after an update. Next state: |
Pipeline Re-checkHead SHA: The required Because the required E2E job is actively running again, gator is moving this PR back to pipeline watch and will re-evaluate the E2E result after the rerun completes. Next state: |
Pipeline Follow-Up NeededHead SHA: The rerun of the required The repeated failing test is Because this is a repeated failure in a required policy/proxy E2E path and this PR touches provider, policy, and sandbox behavior, this still needs author follow-up before maintainer handoff. Next action: @johntmyers, please investigate the rust-podman L7 allowed-request failure or point to evidence that it is unrelated to this PR. Next state: |
Pipeline Re-checkHead SHA: The required Because the required E2E job is active again, gator is moving this PR back to pipeline watch and will re-evaluate the E2E result after the rerun completes. Next state: |
Maintainer Approval NeededGator validation and PR monitoring are complete. Validation: PR #1914 is project-valid because it implements the validated provider profile update work from #1881 and remains non-draft. Human maintainer approval or merge decision is now required. |
Maintainer Review NudgeThis PR has been in @NVIDIA/openshell-maintainers @NVIDIA/openshell-codeowners @mrunalp @maxamillion @derekwaynecarr, can someone review and either approve, request changes, or close this out? |
|
This looks good. One thing that I found from investigating this with an agent is an edge case around This was original comment from when it was introduced: https://github.com/NVIDIA/OpenShell/pull/1037/changes#diff-15bf711f0b9e1a293ee9f731c3997e6eb85e4695ff777aa4744b3a0f6ff38cceR88-R93 The problematic path could look something like this:
This means that any update that is not additive (e.g. remove an endpoint/modify an existing endpoint) won't take an effect. Interestingly, when reproducing this, I run into another issue: That policy comes back as: The reason for this is that the filesystem enrichment happens on the sandbox and on the effective policy. After the enrichment, that policy is synced backed to the gateway and the user policy becomes the one with This is a very long way of saying that we likely need an issue for an enforcement around EDIT: here's a script from codex to reproduce this with pauses to explore the live state: https://gist.github.com/pimlock/d10de2a4ca5611a9a9ae78fce7e0a489?permalink_comment_id=6212304#gistcomment-6212304 |
| .into_inner(); | ||
| diagnostics.extend(response.diagnostics); | ||
| if response.updated { | ||
| println!("Updated provider profile."); |
There was a problem hiding this comment.
nit: I don't think we're doing this anywhere else yet, but it could be useful to include the new resource_version, e.g. Updated provider profile, new version: {resource_version}.
Summary
Add safe custom provider profile updates through a new
UpdateProviderProfilesRPC andopenshell provider profile update. The update path validates profile batches before writing, preserves stored custom profile metadata, rejects built-in and missing profiles, and keeps provider-derived policy JIT-composed from current profiles.Related Issue
Closes #1881
Changes
proto/openshell.proto: AddedUpdateProviderProfilesRPC and request/response messages.crates/openshell-server/src/grpc/provider.rs: Added custom profile update handling with validation, metadata preservation, built-in/missing rejection, and attached-sandbox dynamic token grant ambiguity checks.crates/openshell-server/src/grpc/policy.rs: Added tests proving updated profile policy reaches sandbox effective config without rewriting provider instances or persisted sandbox source policy, and that profile changes affect provider env revision.crates/openshell-cli/src/main.rs,crates/openshell-cli/src/run.rs: Addedopenshell provider profile update -f|--from.docs/sandboxes/providers-v2.mdx,docs/sandboxes/manage-providers.mdx: Documented custom profile update semantics and rollout behavior.Deviations from Plan
The plan preferred all-or-none server-side batch updates if cleanly supported. The existing persistence API does not provide transactional multi-object CAS, so this implementation validates the full batch before writes and documents the remaining concurrent-write/storage-error retry behavior instead of adding a broad transaction layer.
Testing
cargo test -p openshell-server update_provider_profile -- --nocapturecargo test -p openshell-server sandbox_config_uses_updated_custom_provider_profile -- --nocapturecargo test -p openshell-server provider_env_revision_changes_when_custom_profile_token_grant_changes -- --nocapturecargo test -p openshell-cli provider_profile_commands_parse -- --nocapturecargo test -p openshell-cli provider_profile_cli_run_functions_support_custom_profiles -- --nocapturemise run pre-commite2e/files changedTests added:
Checklist