Fix error response status code when dataset or job already exists #184

Open
leonasdev wants to merge 3 commits into goccy:main from leonasdev:main

@leonasdev

According to the error messages section of the official BigQuery documentation, creating a job, dataset, or table that already exists should return a 409 response.

Currently, only table creation returns the correct 409; dataset and job creation always respond with 500. This PR fixes that.

Additionally, the documentation mentions that "the error also returns when a job's writeDisposition property is set to WRITE_EMPTY and the destination table accessed by the job already exists." I have not implemented this part, but it could be considered as a future improvement.

This is my first PR to this repo, so please let me know if there are any issues with it. Thank you.

thuibr commented Dec 5, 2023

I'm running into this too and was just about to consider contributing and fixing it. This is my workaround:

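    import re
    import google.api_core.exceptions

    def create_dataset(self, dataset, exists_ok=False):
        # The exists_ok flag doesn't seem to work with the emulator (it answers
        # 500 instead of 409), so match the error message instead.
        try:
            return self._client.create_dataset(dataset, retry=None)
        except google.api_core.exceptions.InternalServerError as e:
            # dataset is expected to be a string such as "project.dataset".
            dataset_basename = dataset.split('.')[-1]
            pattern = rf'dataset {re.escape(dataset_basename)} is already created'
            if exists_ok and e.errors and re.match(pattern, e.errors[0]["message"]):
                return None
            raise  # re-raise anything that is not a duplicate-dataset error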

Obviously, it would be better to fix it in the emulator! Let me know if I can help with anything.

thuibr commented Dec 13, 2023

@goccy Hi, just wondering what I can do to help get this PR through?

"sync"
)

var ErrDuplicatedDataset = errors.New("dataset is already created")

Owner

Please use the errDuplicate function in error.go.

Comment thread server/handler.go
}

-func (h *datasetsInsertHandler) Handle(ctx context.Context, r *datasetsInsertRequest) (*bigqueryv2.DatasetListDatasets, error) {
+func (h *datasetsInsertHandler) Handle(ctx context.Context, r *datasetsInsertRequest) (*bigqueryv2.DatasetListDatasets, *ServerError) {

Owner

It does not seem necessary to change the return type to *ServerError.

shollands-sc added a commit to shollands-sc/bigquery-emulator that referenced this pull request Mar 10, 2026
When creating a dataset or job that already exists, the emulator returns
HTTP 500 (InternalServerError) instead of HTTP 409 (Conflict). This
causes the BigQuery Python client to retry with exponential backoff for
up to 600 seconds, since it treats 500 as transient. The exists_ok=True
parameter also fails to suppress the error because it only checks for 409.

This fix follows the existing ErrDuplicatedTable pattern already in the
codebase: sentinel errors in the metadata package, checked with
errors.Is in ServeHTTP, mapped to errDuplicate() for the HTTP response.

Handle() method signatures are unchanged, addressing the feedback on goccy#184.

Fixes goccy#256
Supersedes goccy#184

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>