Skip to content

test(ISV-6236): add integration tests for builder content#385

Open
bclindner wants to merge 25 commits into
mainfrom
ISV-6236
Open

test(ISV-6236): add integration tests for builder content#385
bclindner wants to merge 25 commits into
mainfrom
ISV-6236

Conversation

@bclindner
Copy link
Copy Markdown
Contributor

Pre-merged with #376 as it requires that PR's functionality. Will rebase when merged.

Early pass for now, still working through what's actually going on here.

@bclindner bclindner requested a review from ezopezo May 11, 2026 13:27
@bclindner
Copy link
Copy Markdown
Contributor Author

@ezopezo slack's dead so I guess I've gotta ping you here- this is the draft PR we discussed last week, I'm definitely still getting something wrong, i may need your help spotting the issue on our meeting later today.

my amateur guess is I've messed something fundamental up with the SPDX package structure, or missed some key field in the capo json. still tracing things myself

Copy link
Copy Markdown
Contributor

@ezopezo ezopezo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ezopezo slack's dead so I guess I've gotta ping you here- this is the draft PR we discussed last week, I'm definitely still getting something wrong, i may need your help spotting the issue on our meeting later today.

my amateur guess is I've messed something fundamental up with the SPDX package structure, or missed some key field in the capo json. still tracing things myself

From the output of the integration tests I can see that your tests are attempting to access real base image in repository during the build:

2026-05-12 14:23:56,790 [ERROR] mobster.cmd.generate.oci_image: Contextual SBOM workflow failed.
Traceback (most recent call last):
  File "/home/emravec/old/Documents/konflux/mobster/src/mobster/cmd/generate/oci_image/__init__.py", line 263, in _assess_and_dispatch_contextual_workflow
    contextual_sbom = await self._execute_contextual_workflow(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<4 lines>...
    )
    ^
  File "/home/emravec/old/Documents/konflux/mobster/src/mobster/cmd/generate/oci_image/__init__.py", line 190, in _execute_contextual_workflow
    parent_image_sbom = await download_parent_image_sbom(parent_image_ref, arch)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/emravec/old/Documents/konflux/mobster/src/mobster/cmd/generate/oci_image/contextual_sbom/contextualize.py", line 117, in download_parent_image_sbom
    image_or_index = await Image.from_repository_digest_manifest(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        parent_image.repository, parent_image.digest
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/emravec/old/Documents/konflux/mobster/src/mobster/image.py", line 160, in from_repository_digest_manifest
    manifest = await get_image_manifest(image.reference)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/emravec/old/Documents/konflux/mobster/src/mobster/oci/__init__.py", line 50, in get_image_manifest
    raise SBOMError(f"Could not get manifest of {reference}: {stderr.decode()}")
mobster.error.SBOMError: Could not get manifest of quay.io/konflux-ci/syft@sha256:4ab0d32a67e22a27ea3ba4ad00a3a5aee008386ae4f0086c9a720401ab1aca45: Error response from registry: failed to fetch the content of "quay.io/konflux-ci/syft@sha256:4ab0d32a67e22a27ea3ba4ad00a3a5aee008386ae4f0086c9a720401ab1aca45": unauthorized: access to the requested resource is not authorized: map[]

This is not wanted, we want to mock and store these images ( builder / base ) via oci_client.create_image() in local storage. Check also how other helpers in test_contextual_parent.py are used (but might need an update for builder content!)

bclindner and others added 7 commits May 13, 2026 13:32
this is necessary to keep AnnotatedPackage/BuilderPkgMetadataItem
generations lined up for the next commits

AI used exclusively to fix my sloppy SBOMPackage conversions

Assisted-By: claude-opus-4-6
forgot to commit+push this on friday, sorry
still trying to understand how the exact case is supposed to look in
this scaffolding unfortunately

Assisted-By: claude-opus-4-6, claude-sonnet-4-6
bclindner added 4 commits May 19, 2026 16:00
i think this is very close to what we need but adding the grandparent
and parent stuff breaks it despite those origins still being correct

Assisted-By: claude-opus-4-6
this causes the tests to pass, but i'm a little lost as to why
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.97%. Comparing base (9c54302) to head (4fae800).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #385   +/-   ##
=======================================
  Coverage   94.97%   94.97%           
=======================================
  Files          58       58           
  Lines        3921     3921           
=======================================
  Hits         3724     3724           
  Misses        197      197           
Flag Coverage Δ
unit-tests 94.97% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@bclindner
Copy link
Copy Markdown
Contributor Author

bclindner commented May 19, 2026

OK, finally something workable here. @ezopezo & maybe @jedinym does this initial test accurately portray how this builder content should look? I have a couple notes here as well:

  • I had to do some refactoring to fixtures to make this make sense. Note the new SBOMPackage interchange format used in both the SPDX and BuilderPkgMetadata conversion. Also some slight changes to make_metadata_yaml, which did require me to make changes to test_contextual_parent (sorry!).
  • Question: the test breaks if I attempt to add packages other than the one in the builder to the build metadata - that leads me to believe I've done something wrong in declaring that metadata in the way we'd expect. Is there anything you can spot about the data I'm generating and feeding into mobster that might be incorrect, or is this just something I'm misunderstanding w/ what capo actually outputs?
    • My current understanding is capo's output would have basically every package in the final image listed along with its source image (or the purl of the package that needed it) - you can see what I thought would work at a4162ed

If by some miracle this test looks good as-is I can proceed to build out more, but I've been tripping over myself trying to understand what's going on here & how to represent it so I assume there are still some disconnects here and there.

@bclindner bclindner requested a review from ezopezo May 20, 2026 13:05
@ezopezo
Copy link
Copy Markdown
Contributor

ezopezo commented May 27, 2026

  • My current understanding is capo's output would have basically every package in the final image listed along with its source image

Nope capo's output only contains package metadata of the packages that are COPY-ied to final image. Let's draw general example and propose some integration test cases:

Let's assume that builder_image:tag builder base image contain builder_package1 in /opt/app1 dir and builder_package1 in /opt/app2. Also imagine that base image of built component base_image:tag contains package base_package

FROM builder_image:tag AS builder_alias 
# These two following instructions are creating intermediate_image 
RUN install intermediate_package1  # assume that this added package to /opt/app1
RUN install intermediate_package2  # assume that this somehow added package to /opt/app2

FROM base_image:tag
RUN install component_only_package
COPY --from=builder_alias /opt/app1 /opt/app1
|
produces: final_image

After build and capo scan, capo output should look like this;

{
    PackageURL: "intermediate_package1, # here should be valid purl but lets pretend
    OriginType: "intermediate",
    Pullspec:   "builder_image", # valid digest here but again let's pretend
    StageAlias: "builder_alias",
},
{
    PackageURL: "builder_package1",
    OriginType: "builder",
    Pullspec:   "builder_image",
    StageAlias: "builder_alias",
},

Quick reminder note: stuff from /opt/app2 in builder has never reached to capo output because it was never COPY-ied and thus never present in final image. Also content from base_image and installed in final image (component_only_package) is not relevant for capo.

Now this is consumed by mobster and used for builder content contextualization. How? By matching purls from capo output against processed SBOM. Before this builder content resolution, parent content should be already resolved and SBOM should look like this (simplifying down to basic relevant relationships):

final_image DESCENDANT_OF base_image
builder_image BUILD_TOOL_OF final_image

base_image CONTAINS base_package
final_image CONTAINS component_only_package
final_image CONTAINS intermediate_package1
final_image CONTAINS builder_package1

After builder content contextualization the state of the SBOM should be;

final_image DESCENDANT_OF base_image
builder_image BUILD_TOOL_OF final_image
intermediate_image DESCENDANT_OF builder_image # this is added in mobster before builder content contextualization

base_image CONTAINS base_package
final_image CONTAINS component_only_package
intermediate_image CONTAINS intermediate_package1
builder_image CONTAINS builder_package1

This is happy path scenario of course.

I was thinking and I think this is other edge cases that should be covered:

  1. Imagine that we are iterating capo output and searching it in SBOM in mobster. It might happen that capo has some package in the output that has not been found in SBOM packages. This might happen when you something COPY and remove in next instructions. Or there is a bug. We should log capo item with warning and continue (matching in parent or builder contextualization is best effort). All other matched packages should be properly contextualized.

  2. I would expect a a functionality in mobster (outside of the scope of this ticket but lets write a test) - imagine that capo item is matched against package in SBOM that has been already assigned to base image (parent content). Of course we can guess that parent content has been rewritten by builder content, but we do not want to guess here. In this situation it is fair to say that;
    base_image CONTAINS package (relationship came from parent content contextualization)
    but also
    builder_image (or intermediate_image) CONTAINS package (this one is new relationship that should be inserted into SBOMsuch ocurrence in builder content contextualization). From the perspective of the CVE we know that package came from base and also from builder and in such case we need to remediate both.

  3. Same applies when we have same package copied from different builders (capo contains two packages with same purl but from different sources);
    Match capo item: SBOM package relationship is already present and should be modified;
    final_image CONTAINS builder_package1 -> builder1 CONTAINS builder_package1
    But we should not rewrite it with another capo item!
    We need to add another relationship for same package;
    builder2 CONTAINS builder_package1

  4. capo's output is missing purl (empty string) -> log item and continue, no SBOM package relationship should be modified for this capo item

  5. SBOM packages have same purl present multiple times per single capo item - I will think about this, but we should probably not contextualize this

  6. [edgecase] capo has multiple identical items (same source same purl) in output - not sure if such thing is possible, will check

@bclindner
Copy link
Copy Markdown
Contributor Author

Nope capo's output only contains package metadata of the packages that are COPY-ied to final image.

This + the capo explainer made it click, thanks! I'm building out cases based on what you described & stuff like the oci-archive edgecase from the other PR.

@ezopezo
Copy link
Copy Markdown
Contributor

ezopezo commented May 28, 2026

This + the capo explainer made it click, thanks! I'm building out cases based on what you described & stuff like the oci-archive edgecase from the other PR.

Oh and do not worry, if some things are not working as we described here in mobster - feel free to discuss it, let test fail (skip) and then we create a ticket for fixing that.

@bclindner bclindner marked this pull request as ready for review May 29, 2026 19:20
@bclindner
Copy link
Copy Markdown
Contributor Author

bclindner commented May 29, 2026

I added tests that roughly map to what you specified - hopefully I'm not missing anything else?

More notes:

  • Two tests fail - test_builder_content_duplicate_from_parent and test_builder_content_duplicate_different_pullspecs.
    • I'm not really sure if these are the right asserts for the task, but they make the most sense to me, let me know if I'm off base here.
    • These have been auto skipped w/ pytest.mark.skip(reason="not currently supported").
  • Big weird overhaul to make the builder content flow generic. Hope this is OK.
    • Part of this was capturing stdout/stderr from the mobster run to capture logs for a test - that's the only way I could really assert anything for test_builder_content_missing_purl.
    • Wanted to make this a parametrized test, but since it requires fixture content I don't think that's possible w/o some lambda weirdness.
  • Wasn't sure what to do with the oci-archive test we discussed, as I don't think that even shows up in capo and I'm not sure how we'd denote it in the data here for mobster. Left it out for now.

@bclindner
Copy link
Copy Markdown
Contributor Author

bclindner commented May 29, 2026

Adding @jedinym as well and marking as ready for review - apologies for adding even more code review load, I just don't have confidence on what I'm doing here & wanna make sure both you and Erik as the process SMEs can catch the things I'm probably doing wrong

@bclindner bclindner requested a review from jedinym May 29, 2026 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants