Storage API returns records byte array containing schema bytes #398

@vladDotH

What happened?

When running the default Storage API example against the emulator, the schema and messages are fetched correctly, but Arrow decoding always produces an empty table.

What did you expect to happen?

The decoded table data is printed.

How can we reproduce it (as minimally and precisely as possible)?

BigQuery storage API example: https://cloud.google.com/bigquery/docs/reference/storage/libraries#use

The client needs a custom gRPC connection option pointing at the emulator URL:

grpcclient, err := grpc.NewClient("0.0.0.0:9060", grpc.WithTransportCredentials(insecure.NewCredentials()))

if err != nil {
    log.Fatal(err)
}

bqReadClient, err := bqStorage.NewBigQueryReadClient(
    ctx,
    option.WithGRPCConn(grpcclient),
    option.WithoutAuthentication(),
)

Anything else we need to know?

I noticed that the official example (in the processArrow function) creates the decoding buffer from the serialized schema bytes and then appends the record batch to it:

undecoded := rows.GetArrowRecordBatch().GetSerializedRecordBatch()
if len(undecoded) > 0 {
	buf = bytes.NewBuffer(schema)
	buf.Write(undecoded)
	r, err = ipc.NewReader(buf, ipc.WithAllocator(mem), ipc.WithSchema(aschema))
	//... other code
}

But your test does not use the schema bytes, only the record batch:

undecoded := rows.GetArrowRecordBatch().GetSerializedRecordBatch()
if len(undecoded) > 0 {
	buf = bytes.NewBuffer(undecoded)
	r, err = ipc.NewReader(buf, ipc.WithAllocator(mem), ipc.WithSchema(aschema))
	// ... other code
}

After hours of debugging I saw that your record batches already contain the schema bytes. When I tried the second approach against the real BigQuery service, I got:

error processing arrow: arrow/ipc: invalid message type (got=RecordBatch, want=Schema)

So this is more of a question: why do you send the schema bytes in the batch? It is not a problem in itself, but it forces client code to branch depending on whether it is talking to the emulator or to real BigQuery.
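
For anyone who needs one code path for both backends in the meantime, here is a minimal sketch of a try-then-fall-back approach. It is only an illustration: decodeFunc, decodeBatch and fakeDecode are hypothetical names, not part of the emulator or the BigQuery client. In real code the decode hook would presumably wrap ipc.NewReader as in the snippets above. The sketch relies on the behaviour reported here, namely that decoding a batch without the schema prefix fails up front with an "invalid message type" error before any record is read.

```go
package main

import (
	"errors"
	"fmt"
)

// decodeFunc is a hypothetical hook: it receives a complete Arrow IPC payload
// and decodes it. In real code it would wrap ipc.NewReader and read the records.
type decodeFunc func(payload []byte) error

// decodeBatch first tries the batch bytes alone (the layout the emulator is
// reported to send), then falls back to schema+batch (the layout the official
// example builds). It assumes decode fails before consuming any record, so a
// retry does not process rows twice.
func decodeBatch(schema, batch []byte, decode decodeFunc) error {
	if len(batch) == 0 {
		return nil // nothing to decode, same guard as in the official example
	}
	firstErr := decode(batch)
	if firstErr == nil {
		return nil
	}
	// Build schema+batch in a fresh slice so neither input is modified.
	payload := make([]byte, 0, len(schema)+len(batch))
	payload = append(payload, schema...)
	payload = append(payload, batch...)
	if err := decode(payload); err != nil {
		return fmt.Errorf("batch alone: %v; schema+batch: %w", firstErr, err)
	}
	return nil
}

// fakeDecode is a stand-in for a real Arrow reader, used only for this demo.
// It accepts a payload only if it starts with the marker byte 'S', which
// plays the role of a leading schema message.
func fakeDecode(payload []byte) error {
	if len(payload) == 0 || payload[0] != 'S' {
		return errors.New("invalid message type (got=RecordBatch, want=Schema)")
	}
	return nil
}

func main() {
	schema := []byte("S")
	// Emulator-style batch: schema bytes are already included.
	fmt.Println("emulator-style:", decodeBatch(schema, []byte("Sb1"), fakeDecode))
	// BigQuery-style batch: schema bytes must be prepended.
	fmt.Println("bigquery-style:", decodeBatch(schema, []byte("b2"), fakeDecode))
}
```

The batch-alone attempt comes first because, per this report, it is the one that fails with a detectable error on real BigQuery, whereas prepending the schema on the emulator does not fail but silently yields an empty table.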

Labels: bug