
Adding job labels for FinOps#673

Open
JulianUmbhau wants to merge 4 commits into r-dbi:main from JulianUmbhau:job_labels

JulianUmbhau commented Feb 23, 2026

This PR adds labels to job creation calls for FinOps purposes.
Labels can be set globally via:
options(bigrquery.labels = list(env = "prod", team = "analytics"))
Labels are then automatically attached to all BigQuery job requests (query, load, extract, copy).
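
As a rough illustration of the flow described above, the sketch below sets the option, reads it back, and attaches it to a job request body. The `build_query_body()` helper is hypothetical and not part of bigrquery. Placing the labels under `configuration$labels` is based on the BigQuery REST API's job configuration, where `labels` sits alongside `query`, `load`, `extract`, and `copy`.

```r
# Hypothetical helper (not part of bigrquery): builds a query job body and
# attaches whatever labels are set in the global option.
build_query_body <- function(sql) {
  labels <- getOption("bigrquery.labels")  # NULL when the option is unset
  body <- list(
    configuration = list(
      query = list(query = sql, useLegacySql = FALSE)
    )
  )
  if (!is.null(labels)) {
    body$configuration$labels <- labels
  }
  body
}

# Set the option, build a body, then restore the previous option value.
old <- options(bigrquery.labels = list(env = "prod", team = "analytics"))
body <- build_query_body("SELECT 1")
str(body$configuration$labels)
options(old)
```

With the option unset, the same helper returns a body with no `labels` element, so requests are unchanged for users who do not opt in.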

  • Added check_labels() to validate labels against BigQuery's key/value constraints before sending
  • Added bigrquery.labels default option (NULL) in .onLoad()
  • Added tests for check_labels()
  • Fixed pre-existing inconsistency: auto_unbox = TRUE was already present in bq_post/bq_patch but was missing from the toJSON calls in bq_perform_upload and bq_parse_single
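
For readers unfamiliar with the constraints mentioned in the first bullet, here is a minimal sketch of what such a validator might look like. It is not the PR's actual `check_labels()`; the name `check_labels_sketch()` is hypothetical. It assumes the documented BigQuery rules that keys are 1 to 63 characters and start with a lowercase letter, values are 0 to 63 characters, and both may contain lowercase letters, digits, underscores, and dashes. As a simplification it accepts ASCII only, although BigQuery also permits international characters.

```r
# Hypothetical validator; the PR's real check_labels() may differ.
# Simplification: ASCII only (BigQuery also allows international characters).
check_labels_sketch <- function(labels) {
  if (is.null(labels)) {
    return(NULL)
  }

  keys <- names(labels)
  if (is.null(keys) || any(is.na(keys)) || any(keys == "")) {
    stop("`labels` must be fully named.", call. = FALSE)
  }
  if (anyDuplicated(keys) > 0) {
    stop("Label keys must be unique.", call. = FALSE)
  }

  # Every value must be a single, non-missing string.
  values <- vapply(labels, function(x) {
    if (!is.character(x) || length(x) != 1 || is.na(x)) {
      stop("Each label value must be a single string.", call. = FALSE)
    }
    x
  }, character(1))

  # Keys: 1-63 chars, starting with a lowercase letter.
  bad_keys <- !grepl("^[a-z][a-z0-9_-]{0,62}$", keys, perl = TRUE)
  # Values: 0-63 chars (an empty value is allowed).
  bad_values <- !grepl("^[a-z0-9_-]{0,63}$", values, perl = TRUE)

  if (any(bad_keys)) {
    stop("Invalid label key(s): ", paste(keys[bad_keys], collapse = ", "), call. = FALSE)
  }
  if (any(bad_values)) {
    stop("Invalid label value(s) for key(s): ", paste(keys[bad_values], collapse = ", "), call. = FALSE)
  }

  labels
}

check_labels_sketch(list(env = "prod", team = "analytics"))
```

Validating on the client side means a malformed label fails fast with a readable message instead of surfacing as an API error after the request is sent.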

Fixes #652

hadley (Member) commented Feb 25, 2026

Would you mind pointing me to the BigQuery docs for labels?

In the interest of using the simplest possible data structure, is there a reason to use a list instead of a named character vector?

JulianUmbhau (Author) commented

> Would you mind pointing me to the BigQuery docs for labels?
>
> In the interest of using the simplest possible data structure, is there a reason to use a list instead of a named character vector?

Here are the docs: https://docs.cloud.google.com/bigquery/docs

And you are correct, the list was unnecessary. I have changed it to a named character vector.

schifferl commented

Adding my support for this feature – it is important for cost allocation in production applications that use R to communicate with BigQuery.

hadley (Member) left a comment

Thanks for working on this! A few comments on the overall approach and implementation below.

I assume you have tested this interactively?

check_bool(print_header)
check_string(billing)

labels <- check_labels(getOption("bigrquery.labels"))
hadley (Member) commented

As a general principle, I don't think it's a good idea to allow global options to change what a computation does, only how it looks. I think it's probably OK in this case because you can frame this as a sort of logging operation, but it should appear in function arguments too.

I also wonder whether it would instead make sense to make it a field in BigQueryConnection. Would that also solve the problem for you, or are you usually calling these low-level functions directly?

JulianUmbhau (Author) commented

I understand.
I currently call the low-level functions directly, but a field in BigQueryConnection is also a good idea.

Would it make sense to do both?

  1. Add a labels field to BigQueryConnection
  2. Add a labels argument to each bq_perform_... function, to set the labels explicitly
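
One way the two proposals could fit together is sketched below. Everything here is hypothetical: `resolve_labels()` and `perform_job_sketch()` are illustrative names, and the precedence order (explicit argument, then connection field, then the global option) is an assumption that the thread does not settle.

```r
# Hypothetical precedence: explicit argument, then connection field,
# then the global option. The PR discussion does not specify this order.
resolve_labels <- function(labels = NULL, con_labels = NULL) {
  if (!is.null(labels)) {
    return(labels)
  }
  if (!is.null(con_labels)) {
    return(con_labels)
  }
  getOption("bigrquery.labels")
}

# Stand-in for a bq_perform_... function that takes an explicit `labels`
# argument; it only returns what would be sent, it does not call BigQuery.
perform_job_sketch <- function(sql, labels = NULL, con_labels = NULL) {
  list(sql = sql, labels = resolve_labels(labels, con_labels))
}

# The explicit argument wins over the connection-level labels.
job <- perform_job_sketch(
  "SELECT 1",
  labels = list(env = "dev"),
  con_labels = list(env = "prod")
)
job$labels$env
```

Keeping the fallback in a single helper would let each job function expose `labels` in its signature while the option remains a default rather than a hidden input.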

hadley (Member) commented

Yes, I think so.

hadley (Member) commented

I'm also happy to do that myself, if you don't have the time.

JulianUmbhau (Author) commented Mar 6, 2026

> Thanks for working on this! A few comments on the overall approach and implementation below.
>
> I assume you have tested this interactively?

Of course, thank you for the comments!

We use this every day, but I have now tested it more thoroughly.

I had to change the character vector back to a list; I hadn't tested that change well enough.
jsonlite::toJSON drops the names of a character vector and serializes it as a JSON array, whereas a named list is serialized as a JSON object. auto_unbox = TRUE is also needed so that each value is written as a plain string rather than wrapped in a one-element array.
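
The serialization behaviour described above can be reproduced with a short standalone sketch. It only uses `jsonlite::toJSON()` and base R; the variable names are illustrative.

```r
library(jsonlite)

labels_chr <- c(env = "prod", team = "analytics")
labels_list <- as.list(labels_chr)

# Named character vector: names are dropped and the result is a JSON array.
chr_json <- as.character(toJSON(labels_chr))
chr_json
# ["prod","analytics"]

# Named list without auto_unbox: a JSON object, but each value is wrapped
# in a one-element array.
boxed_json <- as.character(toJSON(labels_list))
boxed_json
# {"env":["prod"],"team":["analytics"]}

# Named list with auto_unbox = TRUE: a plain string-to-string JSON object.
unboxed_json <- as.character(toJSON(labels_list, auto_unbox = TRUE))
unboxed_json
# {"env":"prod","team":"analytics"}
```

Because `as.list()` turns a named character vector into a named list, one possible compromise would be to accept a named character vector from the user and convert it just before serialization. That is a suggestion, not something the PR does.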

