IDrive e2 backup provider #144910
Conversation
```python
# NOTE: We use boto3 instead of aiobotocore.AioSession because AioSession
# causes blocking operations inside the event loop.
# Two examples include:
# - os.listdir
# - ssl.SSLContext.load_verify_locations
# These lead to 'Detected blocking call' warnings. Using boto3 inside
# async_add_executor_job avoids these issues by running blocking code off the event loop.
```
Don't other integrations do the same?
Sorry for the late response but I was on holiday 😄
The AWS S3 implementation uses the aiobotocore.AioSession and because I reused that code I ran into this issue. I will raise an issue for the AWS S3 integration describing the issue, close?
I believe there are also ways to load something async in the executor, but I am not 100% sure on that
Don't know either, my suggestion would be to leave it like this.
With all the similarities (and the open discussion about how to avoid it), I don't think we should use different libraries.
Something like
```python
client = hass.async_add_executor_job(
    session.create_client(
        service_name="s3",
        endpoint_url=data.get(CONF_ENDPOINT_URL),
        aws_secret_access_key=data[CONF_SECRET_ACCESS_KEY],
        aws_access_key_id=data[CONF_ACCESS_KEY_ID],
    )
)
await client.__aenter__()  # pylint: disable-next=unnecessary-dunder-call
```
should get rid of the blocking call warning (and ideally it gets fixed in the library afterwards).
Nice, glad to see there’s already an issue and that it’s getting some attention again, very happy you jumped in there.
I totally understand the concern about ending up with two different S3 approaches, and I’m aligned on aiming for a single, non-blocking pattern (ideally via aiobotocore once the blocking bits are sorted out).
From my side I’m happy to:
- keep an eye on the aiobotocore issue and its timeline, and
- prepare a switch for this integration to the agreed aiobotocore pattern if it looks like the fix will land soon.
If you and the team feel it makes more sense to hold this PR until that’s clearer, I’m OK with that; if not, I’m also fine to get this in now and follow up with a switch once the upstream situation is resolved.
I’ve updated this PR to use aiobotocore, so it aligns with the existing aws_s3 backup integration approach in Core.
For next steps, I’m happy to follow whatever the Home Assistant maintainers prefer. We can either merge this as-is, or hold off until there’s more clarity on the upstream preload proposal (aio-libs/aiobotocore#1462). That PR still needs review (@bdraco / @thehesiod / @webknjaz).
Please let me know how you’d like to proceed.
@patrickvorgers we're waiting on feedback on aio-libs/aiobotocore#1451 to see which strategy people prefer
@thehesiod, I saw that the "warmup" functionality is now included in the latest release, and that partly resolves the issue (see "Unblocking creation of SSL context"). I have tested that with the idrive_e2 integration.
The preference is to have aiobotocore completely async, but as far as I understand it, that is not possible in the short term. Regarding the possibility of specifying an executor as described in your PR: would that enable us to resolve the "Unblocking creation of SSL context" issue, and how would that work?
Maybe @zweckj can also help with the test cases as suggested by @bdraco.
```python
async def async_setup_entry(hass: HomeAssistant, entry: IDriveE2ConfigEntry) -> bool:
    """Set up IDrive e2 from a config entry."""
    data = cast(dict, entry.data)
```
It is there for stricter type checker hints. If needed I can also, for instance, directly use entry.data[CONF_ENDPOINT_URL]. The original AWS S3 integration also uses this construct.
I think that is fine to use without casting
Removed the cast
```python
except ParamValidationError as err:
    if "Invalid bucket name" in str(err):
        raise ConfigEntryError(
            translation_domain=DOMAIN,
            translation_key="invalid_bucket_name",
        ) from err
```
But if this is not the case, what should we do then?
Basically we could request the user to recreate the bucket. During configuration of the integration the bucket existed (the user selected the bucket in the config flow), but now it doesn't anymore.
So instead of the current message "Invalid bucket name", we could change the message to: "Bucket %s doesn't exist, please recreate the bucket."
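The error mapping discussed here could look roughly like the sketch below. The exception classes are stand-ins for the real Home Assistant and botocore types, and the translation keys (`bucket_not_found`, `invalid_credentials`) are hypothetical names, not the PR's actual code:

```python
class ConfigEntryError(Exception):
    """Stand-in for homeassistant.exceptions.ConfigEntryError."""

    def __init__(self, *, translation_domain=None, translation_key=None):
        super().__init__(translation_key)
        self.translation_key = translation_key


class ClientError(Exception):
    """Stand-in for botocore.exceptions.ClientError."""

    def __init__(self, error_response, operation_name):
        super().__init__(operation_name)
        self.response = error_response


def map_head_bucket_error(err: ClientError) -> ConfigEntryError:
    # A 404 means the bucket selected during the config flow no longer
    # exists, so tell the user to recreate it instead of showing a
    # generic credentials error.
    if err.response["Error"]["Code"] == "404":
        return ConfigEntryError(
            translation_domain="idrive_e2", translation_key="bucket_not_found"
        )
    return ConfigEntryError(
        translation_domain="idrive_e2", translation_key="invalid_credentials"
    )


err = ClientError({"Error": {"Code": "404", "Message": "Not Found"}}, "head_bucket")
print(map_head_bucket_error(err).translation_key)
```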
```python
async def async_unload_entry(hass: HomeAssistant, entry: IDriveE2ConfigEntry) -> bool:
    """Unload a config entry."""
    client: boto3.client = entry.runtime_data
```
You don't need to retype the runtime data, as that is already part of the type slot in IDriveE2ConfigEntry.
Changed it into: client = entry.runtime_data
```python
def _build_user_step_schema(
    access_key: str | None = None, secret_key: str | None = None
) -> vol.Schema:
    """Build the user step schema."""
    if access_key is None or secret_key is None:
        return STEP_USER_DATA_SCHEMA

    # Return a schema with prefilled values
    return vol.Schema(
        {
            vol.Required(
                CONF_ACCESS_KEY_ID,
                default=access_key,
            ): cv.string,
            vol.Required(
                CONF_SECRET_ACCESS_KEY,
                default=secret_key,
            ): cv.string,
        }
    )
```
There's a function for this in the ConfigFlow, something like self.add_suggested_values_to_schema or something along those lines
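For illustration only, the idea behind reusing one schema with suggested values can be sketched with plain dicts. `add_suggested_values` below is a hypothetical helper; Home Assistant's real `ConfigFlow.add_suggested_values_to_schema` operates on voluptuous schemas:

```python
def add_suggested_values(schema_fields, suggested):
    # Hypothetical helper mirroring the idea: one reusable schema
    # definition, with previously entered values overlaid as
    # suggestions instead of hand-building a second prefilled schema.
    return {
        field: ({"suggested_value": suggested[field]} if field in suggested else {})
        for field in schema_fields
    }


fields = ["access_key_id", "secret_access_key"]
prefilled = add_suggested_values(fields, {"access_key_id": "AKIA-EXAMPLE"})
print(prefilled)
```

This removes the need for the `_build_user_step_schema` branch that rebuilds the schema by hand.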
```python
resp = await http.post(
    GET_REGION_ENDPOINT_URL,
    json={"access_key": user_input[CONF_ACCESS_KEY_ID]},
)
resp.raise_for_status()
payload = await resp.json()
if payload.get("resp_code") == GET_REGION_INVALID_CREDENTIALS_CODE:
    _raise_invalid_credentials(resp)
```
This is device- or service-specific code and should live in a library hosted on PyPI.
Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍
```python
# Delete both the backup file and its metadata file
await self._hass.async_add_executor_job(
    functools.partial(
        self._client.delete_object, Bucket=self._bucket, Key=tar_filename
    )
)
await self._hass.async_add_executor_job(
    functools.partial(
        self._client.delete_object, Bucket=self._bucket, Key=metadata_filename
    )
)
```
Switching from the executor to the event loop is expensive, so ideally we don't switch often, can we group these?
Yes can be grouped into one delete_objects call
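A sketch of the grouped variant, assuming a boto3-style client (mocked here). S3's `delete_objects` accepts up to 1000 keys per request, so both keys fit comfortably in one call and one executor hop:

```python
import asyncio
from functools import partial
from unittest.mock import MagicMock

client = MagicMock()  # stand-in for the boto3 S3 client


async def delete_backup(bucket: str, tar_filename: str, metadata_filename: str) -> None:
    loop = asyncio.get_running_loop()
    # One executor hop and one S3 round trip instead of two of each.
    await loop.run_in_executor(
        None,
        partial(
            client.delete_objects,
            Bucket=bucket,
            Delete={"Objects": [{"Key": tar_filename}, {"Key": metadata_filename}]},
        ),
    )


asyncio.run(delete_backup("backups", "b.tar", "b.metadata.json"))
print(client.delete_objects.call_count)
```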
```python
return self.async_show_form(
    step_id="user",
    data_schema=self.add_suggested_values_to_schema(STEP_USER_DATA_SCHEMA, {}),
```
We don't have to call add_suggested_values_to_schema if the dict we fill it with is empty.
Agreed, it is an unnecessary call.
```python
try:
    # Information should be available from the previous step
    assert self._endpoint_url is not None
    assert self._access_key is not None
    assert self._secret_key is not None
    # List buckets using the provided credentials
    buckets = await self.hass.async_add_executor_job(
        _list_buckets,
        self._endpoint_url,
        self._access_key,
        self._secret_key,
    )
except ClientError:
```
Only have things in the try block that can raise
Moved the assertions outside of the try block.
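A minimal sketch of the resulting shape (the names and the stand-in `_list_buckets` are illustrative):

```python
class ClientError(Exception):  # stand-in for botocore.exceptions.ClientError
    pass


def _list_buckets(endpoint_url: str, access_key: str, secret_key: str) -> list[str]:
    # Illustrative stand-in for the blocking call run via async_add_executor_job.
    return ["backups"]


endpoint_url, access_key, secret_key = "https://example.com", "key", "secret"

# The assertions are programmer-error checks on data from the previous
# step; they sit outside the try block so the except clause only guards
# the call that can actually raise ClientError.
assert endpoint_url is not None
assert access_key is not None
assert secret_key is not None
try:
    buckets = _list_buckets(endpoint_url, access_key, secret_key)
except ClientError:
    buckets = []
print(buckets)
```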
```python
# Go back to the user step if there are errors getting buckets
# Prefill the access key and secret key fields with the current values
```
Would it make sense to have the user step responsible for fetching the buckets?
I’d suggest keeping the _list_buckets call in the bucket step rather than the user step. Each step in a config flow should manage its own data and initialization: the user step is responsible for collecting credentials, while the bucket step is responsible for selecting a bucket. Having the bucket step fetch the list keeps responsibilities clear, avoids mixing concerns, and makes the flow easier to maintain.
Sure, but I would argue that validating the input is a part of the concern of the user step. I don't expect the bucket list to change at this point so we can just fetch it once and display it here right?
Hopefully I interpreted your remark/question correctly.
You are correct that the bucket list itself doesn't change between steps. There would be one edge case where the user deletes the bucket in between steps.
The bucket fetch is done based on the provided credentials. In theory it is possible that the provided access key is correct but no buckets are assigned to it. While creating the access key the user is forced to assign one or more buckets to it, but is free to remove that bucket afterwards. In that case you would have an access key with no buckets.
You cannot add buckets to existing access keys, so when there are no buckets associated with the provided access key it is useful for the user to be presented with the credential screen again, so they can provide new access key information. In case of a hiccup in the _list_buckets call, the user can keep the existing credentials and try again.
```json
"bucket": "Bucket name"
},
"data_description": {
    "bucket": "Bucket must be writable by the provided credentials."
```
The data description should explain the data that is needed, not give extra hints on how the user should set up the integration.
Agree and changed the description.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
(Close + reopen to restart CI against current tip of dev)
```python
cm: Any | None = None
client: S3Client | None = None
```
Why are these `| None`? I would consider that we'd know the type of cm, and that we don't have to type it Any.
Good point, cm is gone now so no Any type is needed. client is still `| None` because it's created inside the try and may not exist if create_client().__aenter__() fails; it is used in the exception handling in _async_safe_client_close.
Fixed in this commit
```python
with patch(
    "aiobotocore.session.AioSession.create_client",
    autospec=True,
    return_value=AsyncMock(),
) as create_client:
```
Ideally we patch the library where we use it, so homeassistant.components.idrive_e2..
Okay, I’ll patch AioSession.create_client at the Home Assistant integration import path (homeassistant.components.idrive_e2...) instead of the library path to match where it’s used.
Fixed in this commit
```python
@pytest.fixture
def mock_aiohttp_session() -> Generator[AsyncMock]:
    """Patch async_get_clientsession to return a mocked aiohttp session."""
    mock_session = AsyncMock()
    with patch(
        "homeassistant.components.idrive_e2.config_flow.async_get_clientsession",
        return_value=mock_session,
    ):
        yield mock_session
```
We don't have to mock this one really
Agreed and has been removed.
Fixed in this commit
```python
mock_client.list_buckets.return_value = {
    "Buckets": [{"Name": USER_INPUT[CONF_BUCKET]}]
}
```
Why isn't this the default in the mock? You'll likely use it in more tests
I have moved the default list_buckets return value into the shared mock_client fixture and dropped the repeated assignments in the tests.
Fixed in this commit
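The shape of that change can be sketched with a plain factory standing in for the pytest fixture (the name `make_mock_client` and the `test-bucket` default are illustrative):

```python
import asyncio
from unittest.mock import AsyncMock


def make_mock_client() -> AsyncMock:
    """Factory a shared mock_client fixture could return; defaults live here once."""
    client = AsyncMock()
    client.list_buckets.return_value = {"Buckets": [{"Name": "test-bucket"}]}
    return client


client = make_mock_client()
# Happy-path tests just use the default...
buckets = asyncio.run(client.list_buckets())
print(buckets["Buckets"][0]["Name"])
# ...and only failure tests override it:
client.list_buckets.side_effect = RuntimeError("boom")
```

Centralizing the default also means a test that sets a side_effect doesn't have to restore the return value afterwards.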
```python
mock_client.list_buckets.return_value = {
    "Buckets": [{"Name": USER_INPUT[CONF_BUCKET]}]
}
```
In case you do follow up on the last comment: if this is set as the default, you don't have to set it again when removing a side_effect.
Removed the assignment because of the previous fix.
Fixed in this commit
```python
MockConfigEntry(
    domain=DOMAIN,
    data={
        CONF_BUCKET: USER_INPUT[CONF_BUCKET],
        CONF_ENDPOINT_URL: USER_INPUT[CONF_ENDPOINT_URL],
    },
    unique_id="existing",
).add_to_hass(hass)
```
Why don't you use the mock_config_entry fixture?
I updated the code and now use the mock_config_entry as a template because MockConfigEntry.data is immutable.
Fixed in this commit
```python
# If close() masks the original error, this would raise RuntimeError instead
with pytest.raises(ConfigEntryError):
    await async_setup_entry(hass, mock_config_entry)
```
So we shouldn't call async_setup_entry directly; instead we should use await hass.config_entries.async_setup(), and then we can assert that mock_config_entry.state is SETUP_ERROR.
I switched the test to set up the entry via hass.config_entries.async_setup().
Fixed in this commit
```python
with patch(
    "aiobotocore.session.AioSession.create_client",
    side_effect=exception,
):
```
Patch where we use it, and preferably use the shared mock.
Adopted patch where used
Fixed in this commit
```python
async def test_setup_entry_head_bucket_error(
    hass: HomeAssistant,
    mock_config_entry: MockConfigEntry,
    mock_client: AsyncMock,
) -> None:
    """Test setup_entry error when calling head_bucket."""
    mock_client.head_bucket.side_effect = ClientError(
        error_response={"Error": {"Code": "InvalidAccessKeyId"}},
        operation_name="head_bucket",
    )
    await setup_integration(hass, mock_config_entry)
    assert mock_config_entry.state is ConfigEntryState.SETUP_ERROR


async def test_setup_entry_head_bucket_not_found_error(
    hass: HomeAssistant,
    mock_config_entry: MockConfigEntry,
    mock_client: AsyncMock,
) -> None:
    """Test setup_entry error when head_bucket returns: Not Found."""
    mock_client.head_bucket.side_effect = ClientError(
        error_response={"Error": {"Code": "404", "Message": "Not Found"}},
        operation_name="head_bucket",
    )
    await setup_integration(hass, mock_config_entry)
    assert mock_config_entry.state is ConfigEntryState.SETUP_ERROR
```
Agreed, these head_bucket error tests need to be parametrized.
Fixed in this commit
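With pytest, the two tests would share one body via `@pytest.mark.parametrize` over the error payload. The runnable sketch below shows that shared body with stand-ins; the real test uses the hass fixtures and asserts `ConfigEntryState.SETUP_ERROR`:

```python
import asyncio
from unittest.mock import AsyncMock


class ClientError(Exception):  # stand-in for botocore.exceptions.ClientError
    pass


# The two payloads that would feed @pytest.mark.parametrize.
ERROR_PAYLOADS = [
    {"Error": {"Code": "InvalidAccessKeyId"}},
    {"Error": {"Code": "404", "Message": "Not Found"}},
]


async def run_case(payload: dict) -> str:
    client = AsyncMock()
    client.head_bucket.side_effect = ClientError(payload["Error"]["Code"])
    try:
        await client.head_bucket(Bucket="backups")
    except ClientError:
        # The integration maps this failure to ConfigEntryState.SETUP_ERROR.
        return "SETUP_ERROR"
    return "LOADED"


states = [asyncio.run(run_case(payload)) for payload in ERROR_PAYLOADS]
print(states)
```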
zweckj left a comment:
It'd be nice to also get the improved buffer handling to the other S3 integrations if you don’t mind
Please DM your Discord user, so we can add you to the appropriate channels, thanks!
Proposed change
Based on earlier feedback that S3 backup providers should be provider-specific, I started developing the IDrive e2 backup provider based on the AWS S3 backup integration. This implementation, for instance, uses a specific config_flow for IDrive e2 which automatically determines the endpoint based on the provided credentials and gives the user a dropdown selection box to select the bucket. The IDrive e2 backup provider has been running for the last month on my production HA machine with no issues.
Type of change
Additional information
Checklist
- The code has been formatted using Ruff (`ruff format homeassistant tests`)
If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
- Updated and included derived files by running: `python3 -m script.hassfest`.
- New or updated dependencies have been added to `requirements_all.txt`. Updated by running `python3 -m script.gen_requirements_all`.
To help with the load of incoming pull requests: