fix: BQ analytics plugin — fork detection, GCS offload, and agent response logging #1

Draft
caohy1988 wants to merge 1 commit into haiyuan-eng-google:main from caohy1988:fix/bqaa-fork-and-offload

@caohy1988 caohy1988 commented May 4, 2026

Summary

Three fixes to the BigQuery Agent Analytics Plugin in one PR, with full test coverage.

Fix 1: False-positive fork detection after pickle (google#5528)

When deployed via Vertex AI Agent Engine, __getstate__ sets _init_pid = 0. On the server, _ensure_started() checks os.getpid() != self._init_pid — since os.getpid() is never 0, this always triggers _reset_runtime_state(), producing a misleading "Fork detected (parent PID 0, child PID xx)" warning and adding cold-start latency.

Fix:

  • Skip _reset_runtime_state when _init_pid == 0 (pickle sentinel)
  • Record _init_pid = os.getpid() after successful startup
  • Real forks are still caught by os.register_at_fork and by the PID check (when _init_pid != 0)
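
The PID bookkeeping described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the class `ForkAwarePlugin`, the `reset_count` counter, the `_started` flag, and the helper `restore_from_state` are hypothetical names introduced here, and the `os.register_at_fork` hook is left out for brevity.

```python
import os


class ForkAwarePlugin:
    """Hypothetical stand-in for the plugin; only PID bookkeeping is modeled."""

    def __init__(self):
        self._init_pid = os.getpid()
        self._started = False
        self.reset_count = 0  # illustration only: counts reset calls

    def __getstate__(self):
        state = self.__dict__.copy()
        state["_init_pid"] = 0  # sentinel: restored from pickle, PID unknown
        state["_started"] = False
        return state

    def _reset_runtime_state(self):
        self.reset_count += 1
        self._started = False

    def _ensure_started(self):
        current_pid = os.getpid()
        # Treat a PID mismatch as a fork only when a real PID was recorded.
        if self._init_pid != 0 and current_pid != self._init_pid:
            self._reset_runtime_state()
        if not self._started:
            self._started = True  # placeholder for real client setup
        # Record the PID after startup so later forks remain detectable.
        self._init_pid = current_pid


def restore_from_state(state):
    """Mimic default unpickling: bypass __init__ and restore __dict__."""
    plugin = ForkAwarePlugin.__new__(ForkAwarePlugin)
    plugin.__dict__.update(state)
    return plugin
```

With this shape, `restore_from_state(plugin.__getstate__())._ensure_started()` performs no reset, while an instance carrying a stale non-zero `_init_pid` still resets once.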

Fix 2: GCS text offload byte/character unit mismatch (google#5561)

The offload decision mixed inline_text_limit (32KB, byte-based) and max_content_length (character-based) in a single min(). For multi-byte text (CJK, emoji), this produced false offloads.

Fix: Evaluate each limit in its own unit — byte_len vs inline_text_limit, char_len vs max_length. Offload if either is exceeded.
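
As a rough sketch of the per-unit check, the decision could look like the function below. The function name `should_offload`, the strict `>` comparisons, the reading of 32KB as 32 * 1024 bytes, and treating `max_length=None` as "no character limit" are all assumptions made for illustration; the plugin's real signature may differ.

```python
def should_offload(text, inline_text_limit=32 * 1024, max_length=None):
    """Return True if text exceeds the byte limit or the character limit.

    Hypothetical helper: each limit is compared in its own unit.
    """
    char_len = len(text)                  # Unicode code points
    byte_len = len(text.encode("utf-8"))  # encoded size in bytes
    over_bytes = byte_len > inline_text_limit
    over_chars = max_length is not None and char_len > max_length
    return over_bytes or over_chars


EMOJI = "\U0001F600"  # one code point, four bytes in UTF-8

# 10,000 emoji are 40,000 bytes, which is over the 32 KiB byte limit.
big = should_offload(EMOJI * 10_000)
# 3,000 emoji are 12,000 bytes and 3,000 characters: under both limits.
small = should_offload(EMOJI * 3_000, max_length=10_000)
```

The two calls mirror the multi-byte cases in the test plan: the first is offloaded on the byte limit alone, the second stays inline because neither limit is exceeded in its own unit.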

Fix 3: Missing final agent response (haiyuan-eng-google/BigQuery-Agent-Analytics-SDK#87)

The plugin logged LLM_RESPONSE (pre-callback raw output) and AGENT_COMPLETED (latency only), but never captured the final response events emitted by agents after callback modifications.

Fix: Detect final response events in on_event_callback via a strict guard and log them as AGENT_RESPONSE:

is_agent_response = (
    event.content
    and event.content.parts
    and event.is_final_response()
    and event.partial is not True
    and not event.get_function_calls()
    and not event.get_function_responses()
    and not event.long_running_tool_ids
)

Each AGENT_RESPONSE row includes:

  • response_text — the actual formatted response content
  • source_event_author — the agent that produced the response (from event.author)
  • source_event_id / source_event_branch — for cross-referencing
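
To make the guard and the row fields concrete, here is a small self-contained sketch against a stub event. `StubEvent` and `agent_response_row` are hypothetical; the stub flattens `event.content.parts` into a plain list of strings, and joining the parts into `response_text` is an assumption, not the plugin's confirmed behavior.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StubEvent:
    """Hypothetical minimal event; the real event type carries more fields."""
    id: str
    author: str
    parts: list = field(default_factory=list)  # simplified: list of str
    branch: Optional[str] = None
    partial: Optional[bool] = None
    final: bool = True
    function_calls: list = field(default_factory=list)
    function_responses: list = field(default_factory=list)
    long_running_tool_ids: set = field(default_factory=set)

    def is_final_response(self):
        return self.final

    def get_function_calls(self):
        return self.function_calls

    def get_function_responses(self):
        return self.function_responses


def agent_response_row(event):
    """Return an AGENT_RESPONSE-style dict, or None if the guard rejects it."""
    is_agent_response = bool(
        event.parts
        and event.is_final_response()
        and event.partial is not True
        and not event.get_function_calls()
        and not event.get_function_responses()
        and not event.long_running_tool_ids
    )
    if not is_agent_response:
        return None
    return {
        "response_text": "".join(event.parts),  # assumption: parts are joined
        "source_event_author": event.author,
        "source_event_id": event.id,
        "source_event_branch": event.branch,
    }
```

Wrapping the chain in `bool()` is a design choice for the sketch: a bare `and` chain evaluates to its last operand or first falsy operand, which may be an empty list or `None` rather than `False`.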

Query the latest agent response for an invocation:

SELECT response_text, source_event_author
FROM v_agent_response
WHERE invocation_id = @id
ORDER BY timestamp DESC
LIMIT 1

Test plan

  • 225 tests pass (213 existing + 12 new), 0 regressions

Fork detection tests (2)

  • test_no_reset_after_unpickle — unpickled plugin skips reset, records _init_pid == os.getpid()
  • test_reset_on_real_fork — stale non-zero PID triggers reset

GCS offload tests (5)

  • test_multibyte_text_offloaded_by_byte_limit — 10K emoji (40KB) offloaded via byte limit
  • test_ascii_under_both_limits_stays_inline — small ASCII stays inline
  • test_text_exceeding_char_limit_offloaded — ASCII over char limit offloaded
  • test_multibyte_under_char_and_byte_limits_stays_inline — regression: 3K emoji (12K bytes) with max_length=10000 stays inline
  • test_no_offloader_falls_back_to_truncate — truncates inline without offloader

AGENT_RESPONSE tests (5)

  • test_logs_final_text_response — logs response with source_event_author from event.author
  • test_skips_function_call_events — function calls not logged
  • test_skips_function_response_events — function responses (even with skip_summarization) not logged
  • test_skips_partial_events — partial streaming chunks not logged
  • test_skips_long_running_tool_events — long-running tool pauses not logged

🤖 Generated with Claude Code

@caohy1988 caohy1988 force-pushed the fix/bqaa-fork-and-offload branch from 79104f8 to 79245d9 Compare May 4, 2026 23:32
@caohy1988 caohy1988 changed the title fix: BQ analytics plugin — GCS offload unit mismatch and dataset location inference fix: BQ analytics plugin — fork detection and GCS offload unit mismatch May 4, 2026
@caohy1988 caohy1988 force-pushed the fix/bqaa-fork-and-offload branch from 79245d9 to 93d5d81 Compare May 4, 2026 23:45
@caohy1988 caohy1988 changed the title fix: BQ analytics plugin — fork detection and GCS offload unit mismatch fix: BQ analytics plugin — fork detection, GCS offload, and agent response logging May 4, 2026
@caohy1988 caohy1988 force-pushed the fix/bqaa-fork-and-offload branch 2 times, most recently from 3bd8b6a to 609883c Compare May 5, 2026 06:49
…ponse logging

Three fixes to the BigQuery Agent Analytics Plugin:

1. **False-positive fork detection after pickle (google#5528):** When
   deployed via Vertex AI Agent Engine, __getstate__ sets
   _init_pid = 0.  On the server, _ensure_started() checked
   os.getpid() != 0 which always triggered _reset_runtime_state(),
   producing misleading "Fork detected" warnings and adding cold-start
   latency.  Fix: skip reset when _init_pid == 0 (pickle sentinel),
   and record os.getpid() after successful startup so fork detection
   works for the rest of the instance lifetime.

2. **GCS text offload byte/character unit mismatch (google#5561):** The
   offload decision mixed inline_text_limit (32KB, byte-based) and
   max_content_length (character-based) in a single min() comparison.
   For multi-byte text (CJK, emoji), this produced false offloads.
   Fix: evaluate each limit in its own unit — byte_len vs
   inline_text_limit, char_len vs max_length — offload if either
   is exceeded.

3. **Missing final agent response in BQ (issue google#87):** The plugin
   logged LLM_RESPONSE (pre-callback raw output) and AGENT_COMPLETED
   (latency only, no content), but never captured the final response
   events emitted by agents after callback modifications.  Fix: detect
   final response events in on_event_callback via a strict guard and
   log as AGENT_RESPONSE with source_event_author from event.author.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@caohy1988 caohy1988 force-pushed the fix/bqaa-fork-and-offload branch from 609883c to b2eca40 Compare May 5, 2026 06:58