Skip to content

Fix: traceparent variable issue#1219

Open
Zenith1415 wants to merge 9 commits intojenkinsci:mainfrom
Zenith1415:fix-traceparent-variable-issue
Open

Fix: traceparent variable issue#1219
Zenith1415 wants to merge 9 commits intojenkinsci:mainfrom
Zenith1415:fix-traceparent-variable-issue

Conversation

@Zenith1415
Copy link
Copy Markdown

Context

Fix for #1202.
Currently, the TRACEPARENT environment variable injected into the Groovy env object is stuck on the Root Span of the build. It does not update when entering new stages or nodes, breaking distributed tracing context for scripts that rely on env.TRACEPARENT.

Changes

  • Modified OtelEnvironmentContributor.java to use Span.current() instead of otelTraceService.getSpan(run).
  • Added a fallback: if no active span is found on the thread, it defaults back to the Run's root span.

Testing done

Reproduced the issue locally with a pipeline script. verified that env.TRACEPARENT now changes between stages.

Before Fix:
Stage 1: 00-abc...
Stage 2: 00-abc...
image

After Fix:
Stage 1: 00-abc...
Stage 2: 00-xyz...
image

Copilot AI review requested due to automatic review settings January 19, 2026 01:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a bug where the TRACEPARENT environment variable was not updating when entering new stages or nodes in Jenkins pipelines. The fix changes the implementation to use the current active span from the thread instead of always using the root span of the build.

Changes:

  • Modified OtelEnvironmentContributor to use Span.current() to get the active span on the thread, with a fallback to the run's root span if no valid span is found
  • Migrated ConfigurationKeyTest from JUnit 4 to JUnit 5 assertions

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
src/main/java/io/jenkins/plugins/opentelemetry/job/OtelEnvironmentContributor.java Updated to use Span.current() instead of otelTraceService.getSpan(run) to capture the active span context, enabling TRACEPARENT to change between stages/nodes
src/test/java/io/jenkins/plugins/opentelemetry/semconv/ConfigurationKeyTest.java Migrated from JUnit 4 (org.junit.Assert) to JUnit 5 (org.junit.jupiter.api.Assertions)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Zenith1415 Zenith1415 force-pushed the fix-traceparent-variable-issue branch from 6823eba to 03618fd Compare January 19, 2026 18:25
@Zenith1415 Zenith1415 force-pushed the fix-traceparent-variable-issue branch from 03618fd to b72cbd8 Compare January 19, 2026 18:30
@Zenith1415
Copy link
Copy Markdown
Author

Zenith1415 commented Jan 19, 2026

sorry for this, don't know how but i ran mvn spot less:apply before also, but i ran it again now, and commited only the required changes

@kuisathaverat
Copy link
Copy Markdown
Contributor

The change needs a unit test to verify that the traces are correct and that the order is as expected.

@Zenith1415
Copy link
Copy Markdown
Author

@kuisathaverat I attempted to add a regression test using JenkinsRule and tried both Controller execution and a DummySlave agent, but the test fails to propagate the OpenTelemetry Context correctly in the test harness.
specifically, Span.current() returns the Root Span in the test environment, likely due to how JenkinsRule manages thread contexts compared to a real Jenkins production instance.
Since I have visually verified the fix on a running instance which you can see by the screenshots in the PR description showing env.TRACEPARENT updating correctly. I am happy to revisit the test if there is a known pattern for testing Span.current() in this plugin's test suite.

@kuisathaverat
Copy link
Copy Markdown
Contributor

@kuisathaverat I attempted to add a regression test using JenkinsRule and tried both Controller execution and a DummySlave agent, but the test fails to propagate the OpenTelemetry Context correctly in the test harness. specifically, Span.current() returns the Root Span in the test environment, likely due to how JenkinsRule manages thread contexts compared to a real Jenkins production instance. Since I have visually verified the fix on a running instance which you can see by the screenshots in the PR description showing env.TRACEPARENT updating correctly. I am happy to revisit the test if there is a known pattern for testing Span.current() in this plugin's test suite.

The JenkinsRule works as a regular Jenkins (mostly). We have tests that verify traces in memory in other places, and the tests should be more or less similar. Without a unit test, we cannot validate the change.

@Zenith1415
Copy link
Copy Markdown
Author

@kuisathaverat I attempted to add a regression test using JenkinsRule and tried both Controller execution and a DummySlave agent, but the test fails to propagate the OpenTelemetry Context correctly in the test harness. specifically, Span.current() returns the Root Span in the test environment, likely due to how JenkinsRule manages thread contexts compared to a real Jenkins production instance. Since I have visually verified the fix on a running instance which you can see by the screenshots in the PR description showing env.TRACEPARENT updating correctly. I am happy to revisit the test if there is a known pattern for testing Span.current() in this plugin's test suite.

The JenkinsRule works as a regular Jenkins (mostly). We have tests that verify traces in memory in other places, and the tests should be more or less similar. Without a unit test, we cannot validate the change.

okay, working on it again !!

@Zenith1415
Copy link
Copy Markdown
Author

@kuisathaverat I attempted to add a regression test using JenkinsRule and tried both Controller execution and a DummySlave agent, but the test fails to propagate the OpenTelemetry Context correctly in the test harness. specifically, Span.current() returns the Root Span in the test environment, likely due to how JenkinsRule manages thread contexts compared to a real Jenkins production instance. Since I have visually verified the fix on a running instance which you can see by the screenshots in the PR description showing env.TRACEPARENT updating correctly. I am happy to revisit the test if there is a known pattern for testing Span.current() in this plugin's test suite.

The JenkinsRule works as a regular Jenkins (mostly). We have tests that verify traces in memory in other places, and the tests should be more or less similar. Without a unit test, we cannot validate the change.

so i have added a unit test file in it!

mockedSpan.when(Span::current).thenReturn(currentSpan);
contributor.buildEnvironmentFor(run, envVars, listener);
// Verification
verify(environmentContributorService).addEnvironmentVariables(run, envVars, currentSpan);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test does not verify the current change in the behavior.

mockedSpan.when(Span::current).thenReturn(invalidSpan);
contributor.buildEnvironmentFor(run, envVars, listener);
// Verification
verify(environmentContributorService).addEnvironmentVariables(run, envVars, rootSpan);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test does not verify the current change in the behavior.

@Zenith1415
Copy link
Copy Markdown
Author

I've updated the test to be more explicit about the behavior change.
In testBuildEnvironmentFor_UsesCurrentSpanIfValid , I added verify(otelTraceService, never()).getSpan(any());
This explicitly verifies that the code bypasses the old logic when a valid current span is present, which effectively verifies the fix.

verify(environmentContributorService).addEnvironmentVariables(run, envVars, currentSpan);

// PROOF OF CHANGE
verify(otelTraceService, never()).getSpan(any());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It verifies that the method otelTraceService.getSpan(...) is never called when the current span is valid. But do not verify the spans of the step is the parant span and not the root span

verify(environmentContributorService).addEnvironmentVariables(run, envVars, rootSpan);

// Verify that it was actually called the fallback service
verify(otelTraceService).getSpan(run);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@Zenith1415
Copy link
Copy Markdown
Author

I've updated the test to strictly verify the span identity.
In testBuildEnvironmentFor_UsesCurrentSpanIfValid , I now setup both a rootSpan and a currentSpan as distinct objects. I then verify that addEnvironmentVariables is called with currentSpan and addEnvironmentVariables is never called with rootSpan.
This explicitly proves that the step uses the current span and ignores the root span.

@Zenith1415
Copy link
Copy Markdown
Author

@kuisathaverat, Gentle ping

@kuisathaverat
Copy link
Copy Markdown
Contributor

The requested changes on testing are not addressed. The test provided do not verify the context pass between layers in a pipeline.

@Zenith1415
Copy link
Copy Markdown
Author

Thanks for the feedback @kuisathaverat
​I understand that the current test checks for negative behavior rather than explicitly verifying the context propagation.
​I will rewrite the test to assert that the child span correctly inherits the parent's ID as requested. I have also manually verified the fix locally using a Jenkins + Jaeger setup, and the traces are now nesting correctly. I've posted the visual proof here:
https://community.jenkins.io/t/rfc-architecture-design-scalable-opentelemetry-for-jenkins/36060/2

@kuisathaverat
Copy link
Copy Markdown
Contributor

Manual test does not remove the need of a proper test.

@Zenith1415 Zenith1415 force-pushed the fix-traceparent-variable-issue branch from ef25e0f to 7c1d73d Compare February 8, 2026 04:43
@Zenith1415
Copy link
Copy Markdown
Author

@kuisathaverat Thanks for the guidance. I have updated the PR with a new dedicated integration test that addresses your concerns.
Instead of relying on negative verification (verify(..., never())), this test now:

  1. It creates a Root Span and a Child Span using the OpenTelemetry SDK.
  2. It explicitly asserts that the injected TRACEPARENT environment variable matches the Child Span ID and is different from the Root Span ID.
  3. I used Mockito.RETURNS_DEEP_STUBS to mock the ReconfigurableOpenTelemetry configuration, allowing us to test the OtelEnvironmentContributor logic without needing a full Jenkins instance.
    I have verified this passes locally . This proves that the context is correctly switching between layers.
Screenshot from 2026-02-08 10-03-04

@Zenith1415 Zenith1415 force-pushed the fix-traceparent-variable-issue branch from 2130fc2 to 02f6e9d Compare February 8, 2026 04:59
@Zenith1415 Zenith1415 force-pushed the fix-traceparent-variable-issue branch from 52a9bab to cc87aea Compare February 8, 2026 07:42
@Zenith1415
Copy link
Copy Markdown
Author

@kuisathaverat Thanks for your review, this new test file created by me adds a regression test that validates TRACEPARENT propagation and span creation across nested pipeline stages, if there are any changes that to be done pls specifically guide me, also it passes mvn clean test perfectly.

@Zenith1415
Copy link
Copy Markdown
Author

@kuisathaverat i have added an extra step WithEnv, also a two stage check and try and catch block. also, it passes mvn clean test. Happy to do further adjustments if needed!

@Zenith1415
Copy link
Copy Markdown
Author

Applied the suggestion and pushed the update!

@kuisathaverat kuisathaverat added the bug Something isn't working label Feb 15, 2026
@kuisathaverat
Copy link
Copy Markdown
Contributor

kuisathaverat commented Feb 15, 2026

I need to verify an issue is not new (not related to this PR), before merging it and releasing it

@kuisathaverat
Copy link
Copy Markdown
Contributor

I am testing the incremental version on main and the incremental of this PR and the change cause #1234

@Zenith1415
Copy link
Copy Markdown
Author

I think the propagation now links Jenkins internal operations to the pipeline span.
Happy to help, add filtering rules or reproduce the behavior locally if that helps.

@ArpanC6
Copy link
Copy Markdown

ArpanC6 commented Mar 19, 2026

Hi @kuisathaverat and @Zenith1415, I reviewed this PR and the
related issue #1234.

The concern about Span.current() linking Jenkins internal
operations to the pipeline span makes sense — in a multi-threaded
environment, Span.current() depends on the thread-local OTel
context, which may capture spans from Jenkins internals if they
happen to be active on the same thread.

One possible approach to investigate: could we check
Span.current() but also verify that the current span belongs
to the current pipeline run before using it? Something like
checking if the span's trace ID matches the run's root span
trace ID before preferring it over the root span.

This might prevent Jenkins internal spans from leaking into
the pipeline's TRACEPARENT while still fixing the original issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants