Add support for dynamically querying multiple tenants in trace domain#229
Add support for dynamically querying multiple tenants in trace domain#229shwetaap wants to merge 1 commit intoopenshift:mainfrom
Conversation
WalkthroughThis pull request introduces trace store configuration management for Korrel8r. It adds a new StoreConfig interface and replaceTraceStore function to the client, creates utility functions to extract trace context from URL parameters and build store configurations, introduces Redux state management for trace context, and integrates trace store updates into the Korrel8rPanel component. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: shwetaap The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/src/hooks/useTroubleshootingPanel.tsx`:
- Around line 16-35: The open callback currently awaits replaceTraceStore(...)
which can stall and prevent dispatch(openTP()) from running; update
useTroubleshootingPanel's open function so the replaceTraceStore call is bounded
by a timeout/fallback (e.g., Promise.race with a short timeout or a cancellable
promise) and ensure dispatch(openTP()) always runs after the timeout/fallback
rather than waiting indefinitely; locate the open function and adjust the logic
around extractTraceContext(), buildTraceStoreConfig(), replaceTraceStore(), and
dispatch(openTP()) so failures or timeouts in replaceTraceStore do not block
opening the panel.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0c473d50-d4ce-48e4-a5ee-e449dc6660de
📒 Files selected for processing (3)
web/src/hooks/useTroubleshootingPanel.tsxweb/src/korrel8r-client.tsweb/src/korrel8r/trace.ts
| const open = React.useCallback(async () => { | ||
| // Check if we're on a traces page with tenant information | ||
| const traceContext = extractTraceContext(); | ||
|
|
||
| if (traceContext) { | ||
| try { | ||
| // Update korrel8r's trace store to match the selected tenant | ||
| const storeConfig = buildTraceStoreConfig(traceContext); | ||
| await replaceTraceStore(storeConfig); | ||
| // eslint-disable-next-line no-console | ||
| console.log('Trace store updated for tenant:', traceContext.tenant); | ||
| } catch (error) { | ||
| // Log error but don't block panel from opening | ||
| // The panel will still work with other domains even if trace store update fails | ||
| // eslint-disable-next-line no-console | ||
| console.error('Failed to update trace store:', error); | ||
| } | ||
| } | ||
|
|
||
| dispatch(openTP()); |
There was a problem hiding this comment.
Panel opening can hang indefinitely on a stalled network call.
At Line 24, await replaceTraceStore(...) runs before dispatch(openTP()), and there is no timeout. If the request never resolves, the panel never opens.
Suggested fix (bound wait time before fallback)
const open = React.useCallback(async () => {
@@
if (traceContext) {
try {
// Update korrel8r's trace store to match the selected tenant
const storeConfig = buildTraceStoreConfig(traceContext);
- await replaceTraceStore(storeConfig);
+ await Promise.race([
+ replaceTraceStore(storeConfig),
+ new Promise((_, reject) =>
+ setTimeout(() => reject(new Error('Timed out replacing trace store')), 3000),
+ ),
+ ]);
// eslint-disable-next-line no-console
console.log('Trace store updated for tenant:', traceContext.tenant);
} catch (error) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const open = React.useCallback(async () => { | |
| // Check if we're on a traces page with tenant information | |
| const traceContext = extractTraceContext(); | |
| if (traceContext) { | |
| try { | |
| // Update korrel8r's trace store to match the selected tenant | |
| const storeConfig = buildTraceStoreConfig(traceContext); | |
| await replaceTraceStore(storeConfig); | |
| // eslint-disable-next-line no-console | |
| console.log('Trace store updated for tenant:', traceContext.tenant); | |
| } catch (error) { | |
| // Log error but don't block panel from opening | |
| // The panel will still work with other domains even if trace store update fails | |
| // eslint-disable-next-line no-console | |
| console.error('Failed to update trace store:', error); | |
| } | |
| } | |
| dispatch(openTP()); | |
| const open = React.useCallback(async () => { | |
| // Check if we're on a traces page with tenant information | |
| const traceContext = extractTraceContext(); | |
| if (traceContext) { | |
| try { | |
| // Update korrel8r's trace store to match the selected tenant | |
| const storeConfig = buildTraceStoreConfig(traceContext); | |
| await Promise.race([ | |
| replaceTraceStore(storeConfig), | |
| new Promise((_, reject) => | |
| setTimeout(() => reject(new Error('Timed out replacing trace store')), 3000), | |
| ), | |
| ]); | |
| // eslint-disable-next-line no-console | |
| console.log('Trace store updated for tenant:', traceContext.tenant); | |
| } catch (error) { | |
| // Log error but don't block panel from opening | |
| // The panel will still work with other domains even if trace store update fails | |
| // eslint-disable-next-line no-console | |
| console.error('Failed to update trace store:', error); | |
| } | |
| } | |
| dispatch(openTP()); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/src/hooks/useTroubleshootingPanel.tsx` around lines 16 - 35, The open
callback currently awaits replaceTraceStore(...) which can stall and prevent
dispatch(openTP()) from running; update useTroubleshootingPanel's open function
so the replaceTraceStore call is bounded by a timeout/fallback (e.g.,
Promise.race with a short timeout or a cancellable promise) and ensure
dispatch(openTP()) always runs after the timeout/fallback rather than waiting
indefinitely; locate the open function and adjust the logic around
extractTraceContext(), buildTraceStoreConfig(), replaceTraceStore(), and
dispatch(openTP()) so failures or timeouts in replaceTraceStore do not block
opening the panel.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
web/src/hooks/useTroubleshootingPanel.tsx (1)
16-35:⚠️ Potential issue | 🟠 MajorPanel open is still vulnerable to indefinite wait on network stall.
dispatch(openTP())is gated behindawait replaceTraceStore(...)(Line 24). If that request hangs, the panel open action can hang too.Suggested fix (bounded wait before fallback)
const open = React.useCallback(async () => { @@ if (traceContext) { try { // Update korrel8r's trace store to match the selected tenant const storeConfig = buildTraceStoreConfig(traceContext); - await replaceTraceStore(storeConfig); + await Promise.race([ + replaceTraceStore(storeConfig), + new Promise((_, reject) => + setTimeout(() => reject(new Error('Timed out replacing trace store')), 3000), + ), + ]); // eslint-disable-next-line no-console console.log('Trace store updated for tenant:', traceContext.tenant); } catch (error) {As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/src/hooks/useTroubleshootingPanel.tsx` around lines 16 - 35, The open callback can hang because replaceTraceStore is awaited with no timeout; change open (the async function using extractTraceContext, buildTraceStoreConfig and replaceTraceStore) to bound the wait by wrapping replaceTraceStore in a timeout (e.g., Promise.race between replaceTraceStore(...) and a short reject/resolve timeout) or run it without awaiting by letting it proceed in background, but ensure dispatch(openTP()) is always invoked after the bounded wait or in a finally block so the panel opens even if replaceTraceStore stalls; locate the replaceTraceStore call inside open and implement the timeout/fallback logic there.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@web/src/utils/traceStoreUtils.ts`:
- Around line 17-23: The query params namespace, name and tenant are used raw
when building the Tempo endpoint (symbols: namespace, name, tenant) which can
create malformed hosts/paths; validate that namespace and name conform to
DNS/host-label rules (e.g., allowed chars and length) and return null on invalid
input, and encode the tenant when interpolating into the path using
path-encoding (e.g., encodeURIComponent) so special chars are safe; update the
code that builds the host/path to perform these checks and encoding before
returning the endpoint (fail closed by returning null if validation fails).
---
Duplicate comments:
In `@web/src/hooks/useTroubleshootingPanel.tsx`:
- Around line 16-35: The open callback can hang because replaceTraceStore is
awaited with no timeout; change open (the async function using
extractTraceContext, buildTraceStoreConfig and replaceTraceStore) to bound the
wait by wrapping replaceTraceStore in a timeout (e.g., Promise.race between
replaceTraceStore(...) and a short reject/resolve timeout) or run it without
awaiting by letting it proceed in background, but ensure dispatch(openTP()) is
always invoked after the bounded wait or in a finally block so the panel opens
even if replaceTraceStore stalls; locate the replaceTraceStore call inside open
and implement the timeout/fallback logic there.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: df6ff5bf-7c89-4f8d-85b1-222702475269
📒 Files selected for processing (4)
web/src/hooks/useTroubleshootingPanel.tsxweb/src/korrel8r-client.tsweb/src/korrel8r/trace.tsweb/src/utils/traceStoreUtils.ts
✅ Files skipped from review due to trivial changes (1)
- web/src/korrel8r/trace.ts
| const namespace = searchParams.get('namespace'); | ||
| const name = searchParams.get('name'); | ||
| const tenant = searchParams.get('tenant'); | ||
|
|
||
| // If any of these are missing, return null (not on traces page or no tenant selected) | ||
| if (!namespace || !name || !tenant) { | ||
| return null; |
There was a problem hiding this comment.
Validate/encode URL-derived context before building the Tempo endpoint.
namespace, name, and tenant come from query params, but Line 37 interpolates them raw into host/path. This can produce malformed targets; tenant should be path-encoded, and host labels should be validated.
Suggested fix (input guard + path encoding)
export const extractTraceContext = (): TraceContext | null => {
const searchParams = new URLSearchParams(window.location.search);
+ const isDnsLabel = (value: string): boolean =>
+ /^[a-z0-9]([-a-z0-9]*[a-z0-9])?$/.test(value);
@@
- if (!namespace || !name || !tenant) {
+ if (!namespace || !name || !tenant || !isDnsLabel(namespace) || !isDnsLabel(name)) {
return null;
}
return { namespace, name, tenant };
};
@@
export const buildTraceStoreConfig = (context: TraceContext): StoreConfig => {
const { namespace, name, tenant } = context;
+ const encodedTenant = encodeURIComponent(tenant);
@@
- const tempoStackURL = `https://tempo-${name}-gateway.${namespace}.svc.cluster.local:8080/api/traces/v1/${tenant}/tempo/api/search`;
+ const tempoStackURL = `https://tempo-${name}-gateway.${namespace}.svc.cluster.local:8080/api/traces/v1/${encodedTenant}/tempo/api/search`;As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
Also applies to: 37-37
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/src/utils/traceStoreUtils.ts` around lines 17 - 23, The query params
namespace, name and tenant are used raw when building the Tempo endpoint
(symbols: namespace, name, tenant) which can create malformed hosts/paths;
validate that namespace and name conform to DNS/host-label rules (e.g., allowed
chars and length) and return null on invalid input, and encode the tenant when
interpolating into the path using path-encoding (e.g., encodeURIComponent) so
special chars are safe; update the code that builds the host/path to perform
these checks and encoding before returning the endpoint (fail closed by
returning null if validation fails).
alanconway
left a comment
There was a problem hiding this comment.
I was assuming there was a persistent notion of the "current" tempo instance, but it looks like the user has to pick the instance every time they open the panel.
Maybe korrel8r should create a store for all the tempo instances from the tracing UIPlugin instead. In that case, maybe its a job for the COO, not the panel.
I'm getting a bit nervous about making korrel8r responsible for turning transient user actions as persistent state.
Signed-off-by: Shweta Padubidri <spadubid@redhat.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
web/src/components/Korrel8rPanel.tsx (1)
178-185: Consider whether failed store updates should be retried.Currently,
dispatch(setTraceContext(currentTraceContext))is called in the catch block, which marks the trace context as "processed" even when the backend update fails. This prevents retries on subsequent effect runs, which may be intentional to avoid retry loops, but also means transient failures won't auto-recover.If the intent is to retry on next user action (e.g., refresh), the current approach is fine. If automatic retry for transient failures is desired, consider keeping
storedTraceContextunchanged on failure so the next effect run retries.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web/src/components/Korrel8rPanel.tsx` around lines 178 - 185, The catch block currently dispatches setTraceContext(currentTraceContext) which marks the trace as processed even when the backend update fails; change the flow so that dispatch(setTraceContext(...)) is only called on successful store updates (i.e., move/remove the dispatch from the .catch handler in Korrel8rPanel.tsx), leaving storedTraceContext unchanged on failure so the next effect run will retry; if you want explicit retries instead, implement a limited retry mechanism around performFetch or the update call rather than marking the context as processed on error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@web/src/components/Korrel8rPanel.tsx`:
- Around line 178-185: The catch block currently dispatches
setTraceContext(currentTraceContext) which marks the trace as processed even
when the backend update fails; change the flow so that
dispatch(setTraceContext(...)) is only called on successful store updates (i.e.,
move/remove the dispatch from the .catch handler in Korrel8rPanel.tsx), leaving
storedTraceContext unchanged on failure so the next effect run will retry; if
you want explicit retries instead, implement a limited retry mechanism around
performFetch or the update call rather than marking the context as processed on
error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8898c933-2a00-46d6-a922-65c3ba45ca65
📒 Files selected for processing (6)
web/src/components/Korrel8rPanel.tsxweb/src/korrel8r-client.tsweb/src/korrel8r/trace.tsweb/src/redux-actions.tsweb/src/redux-reducers.tsweb/src/utils/traceStoreUtils.ts
✅ Files skipped from review due to trivial changes (1)
- web/src/korrel8r/trace.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- web/src/utils/traceStoreUtils.ts
- web/src/korrel8r-client.ts
|
@shwetaap: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
I agree, before moving forward with this PR lets figure out how to handle the backend and the selection of the tempo instance |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@PeterYurkovich can you make this PR a draft? Apparently I have "asked for permission" to use this feature via some automated github something but I can't do it. /draft |
|
Moved it to draft |
No description provided.