
Fix accidental job cancellation #105

Merged
yaqi-lyu merged 6 commits into master from auto-cancel
May 8, 2026

@yaqi-lyu yaqi-lyu commented May 7, 2026

What changed

This PR fixes accidental job cancellation from cancel-link visits (#104) and improves the cancellation flow.

Changes:

  • Makes GET /CancelProcessing non-destructive.
  • Shows a confirmation page before stopping analysis.
  • Requires POST /CancelProcessing to actually cancel a job.
  • Checks whether the target Container App Job execution is still running before showing the stop button.
  • Shows “No running job executions found. It may have already completed.” when the job is already finished or cannot be found.
  • Uses the correct Azure Container Apps SDK stop API: client.jobs.beginStopExecutionAndWait(...).
  • Removes the invalid client.jobsExecutions.delete(...) usage.
  • Adds request diagnostics to cancellation logs: method, user agent, forwarded IP, and referer.
  • Sends best-effort cancelled Teams notifications from inside the container when cancellation is confirmed.
  • Reuses the container’s existing notification metadata so cancelled notifications can go to the same participant list as failed/completed notifications.
  • Updates signal handling so SIGTERM exits 143, SIGINT exits 130, and user-requested cancellation can exit 0.
  • Removes the javascript:window.close() link from the confirmation page.
  • Updates the confirmation page copy and layout.
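The request diagnostics listed above (method, user agent, forwarded IP, referer) could be gathered roughly as follows. This is a minimal sketch, not the code in this PR: `extractRequestDiagnostics` is a hypothetical helper, and reading the forwarded IP from the first `X-Forwarded-For` entry is an assumption about how the proxy chain is reported.

```javascript
// Hypothetical helper: collects the four diagnostics named in the PR description.
// `headers` is any object with a get(name) method, for example a Fetch API
// Headers object or (as in the demo below) a Map with lowercase keys.
function extractRequestDiagnostics(method, headers) {
  // X-Forwarded-For may hold a comma-separated proxy chain; by convention the
  // first entry is the original client address.
  const forwardedFor = headers.get("x-forwarded-for") || "";
  const forwardedIp = forwardedFor.split(",")[0].trim() || null;

  return {
    method,
    userAgent: headers.get("user-agent") || null,
    forwardedIp,
    referer: headers.get("referer") || null,
  };
}

// Demo with made-up values: a scanner-style GET with no referer.
const sample = extractRequestDiagnostics(
  "GET",
  new Map([
    ["user-agent", "ExampleScanner/1.0"],
    ["x-forwarded-for", "203.0.113.7, 10.0.0.1"],
  ]),
);
console.log(sample);
```

Logging these fields on every call is what makes it possible to tell an automated visit (scanner user agent, no referer) from a real click.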

Why it worked before but started failing now

The old cancellation endpoint accepted both GET and POST, and a GET request immediately cancelled the job.

That worked for months because nothing was regularly visiting the cancel URL before the user clicked it. Recently, something in the message/link path appears to have started touching the URL automatically, likely one of:

  • Microsoft Defender Safe Links scanning
  • Teams link preview or URL safety checks
  • Browser/mobile client prefetching
  • Tenant security policy changes or rollout behavior

The Azure logs showed CancelProcessing being invoked even when the user had not clicked the cancel button. Once that endpoint was invoked, it marked the execution as cancelled, then the container’s CheckCancellation poll saw the cancelled state and terminated the job.

The root issue was that cancellation was implemented as a destructive GET, which is unsafe because links can be visited by systems other than the user.

Fix

GET /CancelProcessing now only renders a confirmation/status page.

The actual cancellation only happens when the user submits the confirmation form, which sends POST /CancelProcessing.
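As a rough illustration of the GET/POST split, the dispatch can be modelled as a pure function. `routeCancelRequest` and its return shape are hypothetical, and rejecting other methods with 405 is an assumption rather than something this PR states.

```javascript
// Hypothetical sketch of the method dispatch for /CancelProcessing.
// Only POST is allowed to have a side effect; GET is always safe to repeat.
function routeCancelRequest(method) {
  switch (String(method).toUpperCase()) {
    case "GET":
      // Link scanners and previews land here: render a page, change nothing.
      return { action: "render-confirmation", destructive: false };
    case "POST":
      // Reached only when the user submits the confirmation form.
      return { action: "cancel-execution", destructive: true };
    default:
      // Assumption: any other method is rejected without side effects.
      return { action: "reject", destructive: false, status: 405 };
  }
}

console.log(routeCancelRequest("GET").action);
console.log(routeCancelRequest("POST").action);
```

Keeping the decision in one place makes it easy to check that no non-POST path can reach the stop call.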

The confirmation page checks Container App Job execution status first:

  • If the execution is still Running or Processing, the user sees the stop confirmation.
  • If the execution has already completed, failed, or cannot be found, the user sees a completed/not-running message instead.
  • If there are multiple running jobs and the specific execution cannot be identified, the page refuses to cancel because it cannot safely determine which job to stop.
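The three cases above can be sketched as a small decision function. The names (`decideConfirmationView`, the view labels, the `{ name, status }` shape) are hypothetical, and treating a single running execution as the target when no specific execution is known is an assumption.

```javascript
// Statuses the PR description treats as "still running".
const ACTIVE_STATUSES = new Set(["Running", "Processing"]);

// executions: array of { name, status }; targetExecutionName may be null
// when the specific execution could not be identified.
function decideConfirmationView(executions, targetExecutionName) {
  const active = executions.filter((e) => ACTIVE_STATUSES.has(e.status));

  if (targetExecutionName) {
    const target = active.find((e) => e.name === targetExecutionName);
    // Completed, failed, or missing executions all get the same message.
    return target
      ? { view: "confirm-stop", executionName: target.name }
      : { view: "not-running" };
  }

  if (active.length === 0) return { view: "not-running" };
  // Assumption: exactly one running execution is unambiguous.
  if (active.length === 1) {
    return { view: "confirm-stop", executionName: active[0].name };
  }
  // Several candidates and no way to pick one: refuse rather than guess.
  return { view: "refuse-ambiguous" };
}

console.log(
  decideConfirmationView(
    [
      { name: "job-a", status: "Running" },
      { name: "job-b", status: "Succeeded" },
    ],
    "job-b",
  ),
);
```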

On confirmed cancel, the Function marks the execution as cancelled, then stops the Container App Job execution using client.jobs.beginStopExecutionAndWait(...).
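A sketch of that ordering, with the client and the marker function injected so it can be exercised without Azure. `confirmCancel` and the `deps`/`target` shapes are hypothetical; the argument order (resource group, job name, execution name) is how I understand `beginStopExecutionAndWait` in `@azure/arm-appcontainers` and should be checked against the installed SDK version.

```javascript
// deps.client is expected to look like a ContainerAppsAPIClient
// (only client.jobs.beginStopExecutionAndWait is used here).
// deps.markAsCancelled is the Function's marker setter.
async function confirmCancel(deps, target) {
  const { client, markAsCancelled } = deps;
  const { resourceGroup, jobName, executionName, executionId } = target;

  // Step 1: set the marker so CheckCancellation can report "cancelled"
  // to the container before it is stopped.
  markAsCancelled(executionId);

  // Step 2: stop the execution through the Container Apps API and wait
  // for the long-running operation to finish.
  await client.jobs.beginStopExecutionAndWait(
    resourceGroup,
    jobName,
    executionName,
  );

  return { cancelled: true, executionId, executionName };
}
```

Marking first gives the container a chance to see the marker and send its notification; the API stop is what actually guarantees the job ends.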

The container handles user cancellation as a best-effort graceful path:

  • It receives SIGTERM.
  • It calls CheckCancellation.
  • If cancellation is confirmed, it sends NOTIFICATION_TYPE=cancelled through the existing notification script.
  • It exits 0 for user-requested cancellation.
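The exit codes follow the usual shell convention of 128 plus the signal number (SIGTERM is 15, SIGINT is 2). The real handling lives in entrypoint.sh; the JavaScript below only models the decision, and `exitCodeFor` is a hypothetical name.

```javascript
// Standard POSIX signal numbers for the two signals the entrypoint handles.
const SIGNAL_NUMBERS = { SIGINT: 2, SIGTERM: 15 };

// Returns the exit code the container should use after receiving a signal.
function exitCodeFor(signalName, userRequestedCancel) {
  // A confirmed user cancellation is treated as a clean exit.
  if (userRequestedCancel) return 0;

  const signalNumber = SIGNAL_NUMBERS[signalName];
  if (signalNumber === undefined) {
    throw new Error(`Unsupported signal: ${signalName}`);
  }
  // Shell convention: 128 + signal number.
  return 128 + signalNumber;
}

console.log(exitCodeFor("SIGTERM", false)); // 143
console.log(exitCodeFor("SIGINT", false)); // 130
console.log(exitCodeFor("SIGTERM", true)); // 0
```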

Cancelled notification behavior

Cancelled notifications are sent from inside the container, not directly from the Azure Function.

This lets cancelled notifications reuse the same metadata already available for failed/completed notifications:

  • PARTICIPANTS_JSON
  • MEETING_SUBJECT
  • PROJECT_NAME
  • MEETING_DURATION

That means cancelled notifications can go to the same participant list as failed/completed notifications.
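To make the reuse concrete, here is a sketch of assembling the notification input from what the container already has. It assumes the four metadata fields are environment variables, which their names suggest but this PR does not state; `buildNotificationEnv` is a hypothetical helper.

```javascript
// Metadata fields the PR description lists as shared by all notification types.
const NOTIFICATION_METADATA_KEYS = [
  "PARTICIPANTS_JSON",
  "MEETING_SUBJECT",
  "PROJECT_NAME",
  "MEETING_DURATION",
];

// Copies the shared metadata and sets only the notification type, so
// "cancelled" goes to the same recipients as "failed" or "completed".
function buildNotificationEnv(containerEnv, notificationType) {
  const env = { NOTIFICATION_TYPE: notificationType };
  for (const key of NOTIFICATION_METADATA_KEYS) {
    if (containerEnv[key] !== undefined) env[key] = containerEnv[key];
  }
  return env;
}

// Demo with made-up values.
console.log(
  buildNotificationEnv(
    { PARTICIPANTS_JSON: "[]", MEETING_SUBJECT: "Weekly sync", UNRELATED: "x" },
    "cancelled",
  ),
);
```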

Known limitation / TODO

The cancelled marker currently uses Azure Function in-memory state.

That means the cancelled Teams notification is best-effort only: if CheckCancellation is served by a different Function instance than the one that handled the cancel POST, the container may not see the cancelled marker before it is stopped.

The job stop itself is reliable because it uses the Container Apps API, but the cancelled Teams notification can still be missed in a cross-instance case.

TODO:

  • Move the cancellation marker to shared storage, such as Cosmos DB, Azure Table Storage, or Blob Storage.
  • Make CheckCancellation read from that shared store.
  • Keep the current ARM stop (the Container Apps API call) as the reliable fallback.
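The cross-instance problem and the proposed fix can be illustrated with a marker API that takes its backing store as a parameter. Everything here is hypothetical: `createCancellationMarkers` is not in the PR, the `Map` merely stands in for Cosmos DB, Table Storage, or Blob Storage, and a real shared store would be asynchronous.

```javascript
// Marker API over an injected store exposing get(key) and set(key, value).
function createCancellationMarkers(store) {
  return {
    markAsCancelled(executionId) {
      store.set(executionId, true);
    },
    isCancelled(executionId) {
      return store.get(executionId) === true;
    },
  };
}

// Current behaviour: each Function instance has its own in-memory state,
// so a marker written by one instance is invisible to another.
const instanceA = createCancellationMarkers(new Map());
const instanceB = createCancellationMarkers(new Map());
instanceA.markAsCancelled("exec-1");
console.log(instanceB.isCancelled("exec-1")); // false

// After the TODO: both instances read and write the same shared store.
const sharedStore = new Map(); // stand-in for an external store
const instanceC = createCancellationMarkers(sharedStore);
const instanceD = createCancellationMarkers(sharedStore);
instanceC.markAsCancelled("exec-1");
console.log(instanceD.isCancelled("exec-1")); // true
```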

User impact

Users now get an explicit confirmation step before stopping analysis.

Security scanners, link previews, or accidental URL visits can no longer cancel a meeting analysis job.

If the job has already finished by the time the user opens the cancel link, the page tells them it has likely completed instead of showing a misleading stop button.

When cancellation is confirmed and the container observes the cancellation marker, participants receive a cancelled Teams notification.

Validation

  • Confirmed the previous Azure logs showed CancelProcessing being invoked before CheckCancellation returned cancelled.
  • Verified the Azure SDK exposes beginStopExecutionAndWait(...) on client.jobs; jobsExecutions only supports list().
  • Removed usage of the invalid client.jobsExecutions.delete(...).
  • Verified entrypoint.sh syntax with bash -n entrypoint.sh.

@yaqi-lyu yaqi-lyu marked this pull request as ready for review May 7, 2026 05:55

Copilot AI left a comment


Pull request overview

This PR hardens the job cancellation flow to prevent unintended cancellations caused by non-user link visits (e.g., link scanners/previews), by making GET /CancelProcessing non-destructive and requiring an explicit POST confirmation to cancel. It also adjusts container shutdown behavior so user-initiated cancellation can exit cleanly.

Changes:

  • Updated CancelProcessing to render a confirmation UI on GET, and only cancel on POST, with execution-status checks before showing the stop option.
  • Enhanced cancellation logging with request diagnostics and refined fallback behavior when execution mapping is unavailable.
  • Updated container entrypoint cancellation handling to attempt a “successful exit” on user-requested cancellation and slowed the polling interval.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| entrypoint.sh | Adds signal handling and cancellation marker file; changes cancellation polling interval and termination behavior. |
| azure-function/src/functions/CancelProcessing.js | Implements non-destructive GET confirmation flow, status checks, richer logging, and cooperative cancellation behavior. |
| .gitignore | Ignores additional AI agent working directories. |
Comments suppressed due to low confidence (1)

azure-function/src/functions/CancelProcessing.js:515

  • After a user opens the confirmation page, the execution could finish before they submit the POST. In the mapping/cooperative branch, the POST handler does not re-check whether the target execution is still running before returning success, so users can get a “cancelled successfully” response even though nothing was stopped. Consider re-checking execution status in the POST flow (or returning a completed/not-running message) before marking cancelled / removing the mapping.
        // Found in cache, let the container's cancellation checker stop itself.
        structuredLog(
          context,
          "info",
          "Marked job execution for cooperative cancellation",
          {
            jobName,
            executionName,
            resourceGroup,
          },
        );

        // Mark as cancelled only after confirming the cooperative path is valid
        markAsCancelled(executionId);
        // Remove from mapping cache
        removeExecutionMapping(executionId);
      }

      return createResponse(
        request,
        true,
        "Processing has been cancelled successfully.",
        200,
        { executionId, executionName },
      );
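One way to act on the suggestion above is to re-check the execution status at POST time, before marking anything cancelled. This is a sketch only: `cancelIfStillRunning` is a hypothetical helper, not code from this PR, and it reuses the two response messages quoted elsewhere on this page.

```javascript
// Statuses the PR description treats as "still running".
const STILL_RUNNING = new Set(["Running", "Processing"]);

// executionStatus: the status fetched again at POST time.
// cancel: callback that marks the execution cancelled and removes its mapping.
function cancelIfStillRunning(executionStatus, cancel) {
  if (!STILL_RUNNING.has(executionStatus)) {
    // The job finished between the GET and the POST: report that
    // instead of claiming a successful cancellation.
    return {
      success: false,
      message:
        "No running job executions found. It may have already completed.",
    };
  }
  cancel();
  return {
    success: true,
    message: "Processing has been cancelled successfully.",
  };
}

console.log(cancelIfStillRunning("Succeeded", () => {}));
```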


yaqi-lyu and others added 3 commits May 7, 2026 14:17
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@suiyangqiu suiyangqiu left a comment


good change, lgtm

@yaqi-lyu yaqi-lyu merged commit cefa77c into master May 8, 2026
4 checks passed
@yaqi-lyu yaqi-lyu deleted the auto-cancel branch May 8, 2026 01:24