Storage Read API returns full Arrow IPC streams instead of bare messages

### What happened?

Reading a table over the Storage Read API (Arrow) with the official `cloud.google.com/go/bigquery` client returns no rows. With the Storage Read client enabled, `RowIterator` reads zero rows, and on the current `main` it panics in `RowIterator.Next` (`index out of range [0] with length 0`).

The cause is the Arrow framing: each field should be a single bare IPC message, but the emulator writes a whole IPC stream into it.

- `serialized_schema`: `[schema][empty record batch][EOS]` — should be just a schema message.
- `serialized_record_batch`: `[schema][record][EOS]` — should be just a record-batch message.

The client reads the schema field first, consumes its empty batch (zero rows), and stops at the EOS before the real data — so `RowIterator.Next` panics.
Low-level access (`CreateReadSession` / `ReadRows` decoded by hand) tolerates this, which is why it goes unnoticed.


### What did you expect to happen?

Per the Storage v1 proto ([`google/cloud/bigquery/storage/v1/arrow.proto`](https://github.com/googleapis/googleapis/blob/master/google/cloud/bigquery/storage/v1/arrow.proto)), `serialized_schema` is an "IPC serialized Arrow schema" and `serialized_record_batch` is an "IPC-serialized Arrow RecordBatch" — i.e. each field is a single Arrow [IPC-encapsulated message](https://arrow.apache.org/docs/format/Columnar.html#serialization-and-interprocess-communication-ipc) (no leading schema, no end-of-stream marker), as real BigQuery sends them. With that framing the official client's `RowIterator` returns the table rows.

<details>
<summary>Why a single message, and not an IPC stream</summary>

Arrow IPC has two layers: a single [encapsulated message](https://arrow.apache.org/docs/format/Columnar.html#encapsulated-message-format) (one schema, or one record batch), and a [stream](https://arrow.apache.org/docs/format/Columnar.html#ipc-streaming-format) — the sequence `<schema><record batch>…<EOS optional>` built from those messages.

These proto fields each carry one encapsulated message. The consumer builds the stream itself by concatenating `serialized_schema` + each `serialized_record_batch` — that concatenation is the stream.

The emulator instead writes a whole stream into every field (`serialized_schema` = `[schema][empty batch][EOS]`, each `serialized_record_batch` = `[schema][record][EOS]`). Concatenating those gives two schemas and a mid-stream EOS — not a valid stream.

Note on dictionaries: a dictionary-encoded column requires `DictionaryBatch` message(s) before the record batch. The emulator emits plain (non-dictionary) batches, so a single record-batch message is complete here; dictionary support would be a separate change.

</details>

### How can we reproduce it (as minimally and precisely as possible)?

A minimal failing test reproduces it in two steps.

1. Save this as `server/storage_read_repro_test.go`:

<details>
<summary>storage_read_repro_test.go</summary>

```go
package server_test

import (
	"context"
	"testing"

	"cloud.google.com/go/bigquery"
	"github.com/goccy/bigquery-emulator/server"
	"github.com/goccy/bigquery-emulator/types"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

func TestStorageReadRepro(t *testing.T) {
	ctx := context.Background()

	// Emulator holding one table: dataset1.users = {id: 42, name: "alice"}.
	emulator, err := server.New(server.TempStorage)
	if err != nil {
		t.Fatal(err)
	}
	source := server.StructSource(
		types.NewProject("test",
			types.NewDataset("dataset1",
				types.NewTable("users",
					[]*types.Column{
						types.NewColumn("id", types.INT64),
						types.NewColumn("name", types.STRING),
					},
					types.Data{
						{"id": 42, "name": "alice"},
					},
				),
			),
		),
	)
	if err := emulator.Load(source); err != nil {
		t.Fatal(err)
	}
	testServer := emulator.TestServer() // in-process REST + gRPC, no TLS
	defer testServer.Close()

	// Official client with the Storage Read API enabled.
	client, err := bigquery.NewClient(ctx, "test",
		option.WithEndpoint(testServer.URL), option.WithoutAuthentication())
	if err != nil {
		t.Fatal(err)
	}
	defer client.Close()
	grpcOptions, err := testServer.GRPCClientOptions(ctx)
	if err != nil {
		t.Fatal(err)
	}
	if err := client.EnableStorageReadClient(ctx, grpcOptions...); err != nil {
		t.Fatal(err)
	}

	// Read the table over the Storage Read API.
	rowIterator := client.Dataset("dataset1").Table("users").Read(ctx)
	var rows []map[string]bigquery.Value
	for {
		row := map[string]bigquery.Value{}
		err := rowIterator.Next(&row)
		if err == iterator.Done {
			break
		}
		if err != nil {
			t.Fatalf("read failed: %v", err)
		}
		rows = append(rows, row)
	}

	// Expect exactly the row we stored to come back.
	if len(rows) != 1 {
		t.Fatalf("want 1 row, got %d: %v", len(rows), rows)
	}
	gotID := rows[0]["id"]
	gotName := rows[0]["name"]
	if gotID != int64(42) || gotName != "alice" {
		t.Fatalf("want {id: 42, name: alice}, got {id: %v, name: %v}", gotID, gotName)
	}
}
```

</details>


2. Run it:

```
go test ./server/ -run TestStorageReadRepro -count=1 -v
```

#### Expected

PASS — the test reads `{id: 42, name: "alice"}` back.

#### Actual (on current `main`)

FAIL — the official client decodes the non-conformant framing into zero rows, then `RowIterator.Next` panics:

```
=== RUN   TestStorageReadRepro
--- FAIL: TestStorageReadRepro (0.12s)
panic: runtime error: index out of range [0] with length 0 [recovered, repanicked]
...
cloud.google.com/go/bigquery.(*RowIterator).Next(...)
	cloud.google.com/go/bigquery@v1.77.0/iterator.go:177
github.com/goccy/bigquery-emulator/server_test.TestStorageReadRepro(...)
	server/storage_read_repro_test.go
FAIL	github.com/goccy/bigquery-emulator/server
```

### Anything else we need to know?

References:

- Storage v1 proto (the contract): https://github.com/googleapis/googleapis/blob/master/google/cloud/bigquery/storage/v1/arrow.proto
- Arrow columnar format — serialization & IPC (encapsulated message format): https://arrow.apache.org/docs/format/Columnar.html#serialization-and-interprocess-communication-ipc


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Storage Read API returns full Arrow IPC streams instead of bare messages #493

What happened?

What did you expect to happen?

How can we reproduce it (as minimally and precisely as possible)?

Expected

Actual (on current `main`)

Anything else we need to know?

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

Storage Read API returns full Arrow IPC streams instead of bare messages #493

Description

What happened?

What did you expect to happen?

How can we reproduce it (as minimally and precisely as possible)?

Expected

Actual (on current main)

Anything else we need to know?

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

Actual (on current `main`)