
BDS doesn't re-try when it can't authenticate with the MQ #148

@simon-20

Brief Description
When the BDS fails to communicate with the MQ, it doesn't retry or handle the exception gracefully. Instead it lets the outer exception-handling code catch it, so the checker loop restarts after a 10-minute wait:

2026-05-17T16:25:04,761Z - bds - INFO - dataset id: a0effbe2-b01f-4128-9453-7a46a6de22e9 - Updated dataset
2026-05-17T16:25:06,782Z - bds - ERROR - Exception in checker service loop. Waiting 10 minutes then restarting. Exception message: Handler failed: Authentication attempt timed-out..
2026-05-17T16:25:07,135Z - bds - ERROR - Full traceback: Traceback (most recent call last):
  File "/opt/venv/lib/python3.12/site-packages/azure/servicebus/_base_handler.py", line 386, in _do_retryable_operation
    return operation(**kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/azure/servicebus/_servicebus_sender.py", line 264, in _open
    while not self._handler.client_ready():
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/azure/servicebus/_pyamqp/client.py", line 405, in client_ready
    if not self.auth_complete():
           ^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/azure/servicebus/_pyamqp/client.py", line 391, in auth_complete
    if self._cbs_authenticator and not self._cbs_authenticator.handle_token():
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/azure/servicebus/_pyamqp/cbs.py", line 278, in handle_token
    raise TimeoutError("Authentication attempt timed-out.")
TimeoutError: Authentication attempt timed-out.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/bulk-data-service/src/bulk_data_service/checker.py", line 33, in checker_service_loop
    checker_run(context, datasets_in_bds)
  File "/bulk-data-service/src/bulk_data_service/checker.py", line 79, in checker_run
    add_or_update_datasets(context, datasets_in_bds, registered_datasets)
  File "/bulk-data-service/src/bulk_data_service/dataset_updater.py", line 61, in add_or_update_datasets
    future.result()
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/bulk-data-service/src/bulk_data_service/dataset_updater.py", line 79, in add_or_update_dataset_batch
    add_or_update_registered_dataset(
  File "/bulk-data-service/src/bulk_data_service/dataset_updater.py", line 181, in add_or_update_registered_dataset
    send_dataset_check_result_message(context, msg_payload, 2)
  File "/bulk-data-service/src/utilities/azure.py", line 185, in send_dataset_check_result_message
    send_message_to_iati_mq(context, topic_name, msg_payload)
  File "/bulk-data-service/src/utilities/azure.py", line 204, in send_message_to_iati_mq
    sender.send_messages(message)
  File "/opt/venv/lib/python3.12/site-packages/azure/servicebus/_servicebus_sender.py", line 452, in send_messages
    batch = self.create_message_batch()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/azure/servicebus/_servicebus_sender.py", line 505, in create_message_batch
    self._open_with_retry()
  File "/opt/venv/lib/python3.12/site-packages/azure/servicebus/_base_handler.py", line 545, in _open_with_retry
    return self._do_retryable_operation(self._open)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/azure/servicebus/_base_handler.py", line 393, in _do_retryable_operation
    last_exception = self._handle_exception(exception)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.12/site-packages/azure/servicebus/_base_handler.py", line 344, in _handle_exception
    raise error
azure.servicebus.exceptions.ServiceBusError: Handler failed: Authentication attempt timed-out..

2026-05-17T16:35:07,293Z - bds - INFO - Checker starting run
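
One possible direction, sketched here rather than taken from the BDS codebase: wrap the send in a small retry helper so that a transient failure is retried a few times before it propagates to the checker loop. The helper name `call_with_retry` and all of its parameters are hypothetical, and which exception types are actually safe to retry is an assumption that would need checking against the Service Bus SDK.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    retryable: tuple[type[BaseException], ...],
    attempts: int = 3,
    initial_delay: float = 1.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying with exponential backoff on `retryable` errors.

    Hypothetical helper for illustration; not part of BDS or the Azure SDK.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retryable:
            if attempt == attempts:
                # Out of attempts: re-raise so the caller can handle it.
                raise
            sleep(delay)
            delay *= backoff

    # The loop always returns or raises, so this line is never reached.
    raise AssertionError("unreachable")
```

In `send_message_to_iati_mq`, the call `sender.send_messages(message)` could then be passed in as `lambda: sender.send_messages(message)` with `retryable=(ServiceBusError,)`, assuming a timed-out authentication is safe to retry on the same sender. If the final attempt still fails, the exception propagates exactly as it does today, so the outer handler remains the last line of defence.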

Labels

bug (Something isn't working)
