Discover BCM Data Exports CUR 2.0 buckets for IAM policy #110
Discovery now calls bcm-data-exports ListExports/GetExport to find the S3 destination buckets of CUR 2.0 exports and folds every export's bucket into MasterPayerBillingBucketArns. Bucket access is granted type-agnostically (only COST_AND_USAGE_REPORT is ingested), and the discovery Lambda role gains the two bcm-data-exports actions. Author: Erik Peterson <erik@cloudzero.com>
Address review feedback on BCM Data Exports discovery:
- Wrap each get_export call in its own try/except so a transient failure on a single export no longer drops buckets already resolved from other exports; the failing export is logged and skipped.
- Keep list_exports failure as graceful fallback to empty exports.
- Add tests for the multi-page list_exports NextToken pagination path and for per-export get_export error isolation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@greptile
Greptile Summary
This PR extends the discovery Lambda to enumerate BCM Data Exports (CUR 2.0) S3 destination buckets and fold them into

Confidence Score: 4/5
Safe to merge with one fix: a mid-pagination ListExports failure discards all ARNs from earlier pages instead of using them.

The list_data_export_bucket_names function wraps the entire list_exports pagination loop in a single try/except. If the first page succeeds but any subsequent page raises, the except block executes return [], silently discarding all ARNs collected from previous pages. For accounts with enough exports to span multiple pages, this means their export buckets are missing from the IAM policy until the next stack update. Everything else looks correct.

Files needing attention: services/discovery/src/app.py (the list_data_export_bucket_names pagination error handling).
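One way to keep ARNs from earlier pages when a later ListExports page fails is to catch the error inside the pagination loop rather than around it. The sketch below illustrates that idea under stated assumptions and is not the code in this PR: `list_export_arns` is a hypothetical name, `client` is any object exposing `list_exports`, and the broad `except Exception` stands in for whichever botocore exceptions the real handler catches.

```python
import logging

logger = logging.getLogger(__name__)


def list_export_arns(client):
    """Collect ExportArn values page by page, keeping earlier pages if a later one fails."""
    arns = []
    kwargs = {}
    while True:
        try:
            response = client.list_exports(**kwargs)
        except Exception:  # assumption: real code would name the botocore exceptions
            logger.warning(
                'ListExports failed; keeping %d ARNs from earlier pages',
                len(arns),
                exc_info=True,
            )
            break
        arns.extend(export['ExportArn'] for export in response.get('Exports', []))
        next_token = response.get('NextToken')
        if not next_token:
            break
        kwargs = {'NextToken': next_token}
    return arns
```

With this shape, a failure on the first page still yields an empty list, so the existing fallback to classic CUR buckets is unchanged.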
            next_token = response.get('NextToken')
            if not next_token:
                break
        except ClientError:
This `except ClientError` may be too narrow to deliver the intended graceful degradation. Two non-`ClientError` failure modes can bypass it:
- Import-time client construction (line 19): `boto3.client('bcm-data-exports', ...)` runs at module scope. If the Lambda runtime's bundled botocore predates this relatively new service, it raises `UnknownServiceError` (a `botocore.exceptions` error, not `ClientError`) on cold start, before any try/except runs, crashing the whole custom resource.
- Call-time connectivity: e.g. `EndpointConnectionError` is a `BotoCoreError`, not a `ClientError`.

Suggestion: confirm the runtime's botocore version actually supports `bcm-data-exports`, and consider broadening the catch here to also include `BotoCoreError` so unexpected client-side failures still fall back to classic CUR rather than failing the stack.
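A minimal sketch of the suggestion, assuming the client is built lazily inside the guarded block instead of at module scope. `list_exports_or_empty` and `make_client` are hypothetical names, and the `ImportError` fallback exists only so the sketch runs where botocore is not installed.

```python
import logging

try:
    from botocore.exceptions import BotoCoreError, ClientError
except ImportError:  # stand-ins so the sketch runs without botocore installed
    class BotoCoreError(Exception):
        pass

    class ClientError(Exception):
        pass

logger = logging.getLogger(__name__)


def list_exports_or_empty(make_client):
    """Return the first ListExports page's exports, or [] on any botocore failure."""
    try:
        # Built lazily, so a construction error such as UnknownServiceError
        # (a BotoCoreError subclass) is raised inside this try block.
        client = make_client()
        return client.list_exports().get('Exports', [])
    except (BotoCoreError, ClientError):
        logger.warning(
            'BCM Data Exports unavailable; falling back to classic CUR only',
            exc_info=True,
        )
        return []
```

In a Lambda this could be called with a callable such as `lambda: boto3.client('bcm-data-exports')`; passing a factory also makes the failure path easy to exercise in tests.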
    try:
        export = bcm.get_export(ExportArn=export_arn).get('Export', {})
    except ClientError:
        logger.warning(f'Failed to access BCM Data Exports GetExport for {export_arn}', exc_info=True)
Nit: prefer lazy %s logging over an f-string for the message arg; it defers interpolation and matches the other logger.warning(...) calls in this module:
    logger.warning('Failed to access BCM Data Exports GetExport for %s', export_arn, exc_info=True)

    return {'Exports': [{'ExportArn': arn, 'ExportName': arn.split('/')[-1]} for arn in export_arns]}
    def _get_export_response(bucket_name):
Small coverage gap: there's no test for the if bucket: guard in coeffects_bcm_data_exports (app.py:171) — i.e. an export whose DestinationConfigurations.S3Destination.S3Bucket is missing/None. A quick test asserting such an export is silently skipped (and doesn't add an empty/None bucket) would lock in that behavior.
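A test along these lines might look like the sketch below. It is self-contained rather than wired to the repo: `export_bucket_names` is a hypothetical stand-in for the bucket-extraction step, and `StubBcm` is an assumed test double, not the fixture used in this PR.

```python
def export_bucket_names(bcm, export_arns):
    """Resolve each export's S3 destination bucket, skipping exports without one."""
    buckets = []
    for export_arn in export_arns:
        export = bcm.get_export(ExportArn=export_arn).get('Export', {})
        destination = (export.get('DestinationConfigurations') or {}).get('S3Destination') or {}
        bucket = destination.get('S3Bucket')
        if bucket:  # the guard under test: never append an empty or None bucket
            buckets.append(bucket)
    return buckets


class StubBcm:
    """Assumed test double returning canned GetExport responses keyed by ARN."""

    def __init__(self, exports):
        self._exports = exports

    def get_export(self, ExportArn):
        return {'Export': self._exports[ExportArn]}


def test_export_without_bucket_is_skipped():
    stub = StubBcm({
        'arn:with': {'DestinationConfigurations': {'S3Destination': {'S3Bucket': 'cur2-bucket'}}},
        'arn:none': {'DestinationConfigurations': {'S3Destination': {'S3Bucket': None}}},
        'arn:missing': {},
    })
    result = export_bucket_names(stub, ['arn:with', 'arn:none', 'arn:missing'])
    assert result == ['cur2-bucket']
```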
    @@ -0,0 +1,24 @@
    # [1.0.101](https://github.com/Cloudzero/provision-account/compare/1.0.100...1.0.101) (2026-06-15)
The version (1.0.101) and date (2026-06-15) are hardcoded here. The compare-URL format suggests releases may be auto-generated (semantic-release style) — if so, a hand-written release file could conflict with or duplicate the generated one. Worth confirming this is the intended workflow for the repo.
Collapse the 6 nested stacks into one: connected_account.yaml inlines the Discovery and Notification Lambdas and provisions account resources through a single nested AccountResources stack (merged resource_owner + master_payer). Replace the Custom::CostAndUsageReport Lambda with native AWS::S3::Bucket, AWS::S3::BucketPolicy and AWS::CUR::ReportDefinition, gated on us-east-1 with DeletionPolicy: Retain. Remove the deprecated audit and cloudtrail-owner account types (templates, discovery detection, cloudtrail:DescribeTrails); the reactor payload contract is unchanged, emitting null/false for the deprecated fields. Rewrite the discovery and notification Lambdas into clear procedural flows; notification now reads direct resource properties instead of scraping stack outputs. Add a Stage parameter so prod/dev parents differ by one line. Author: Erik Peterson <erik@cloudzero.com>
- CodeQL (py/clear-text-logging-sensitive-data): stop logging the full event / payload / output in the discovery and notification handlers; they carry ExternalId and account-identifying fields. Log only non-sensitive summary fields.
- Broaden the BCM Data Exports list/get catch to BotoCoreError so connectivity errors degrade gracefully (per review feedback).
- Use lazy %s logging for the per-export GetExport warning.
- Add tests for an export with no destination bucket (the if-bucket guard).
- Drop now-unused toolz from discovery/notification requirements.

Author: Erik Peterson <erik@cloudzero.com>
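Logging only non-sensitive summary fields can be sketched with an explicit allow-list. The key names below are assumptions based on the usual shape of a CloudFormation custom resource event, and `summarize_event` is a hypothetical helper, not a function from this repo.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical allow-list: keys assumed safe to log. Anything not named here
# (for example ResourceProperties, which may carry ExternalId) is never logged.
SAFE_EVENT_KEYS = ('RequestType', 'LogicalResourceId')


def summarize_event(event):
    """Return only the allow-listed keys that are present in the event."""
    return {key: event[key] for key in SAFE_EVENT_KEYS if key in event}


def log_event_summary(event):
    """Log the redacted summary instead of the full event."""
    logger.info('Handling custom resource event: %s', summarize_event(event))
```

An allow-list keeps newly added sensitive fields out of the logs by default. A taint-tracking scanner may still flag any read from a dict that also holds a secret, in which case a static log message is the more conservative choice.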
Thanks for the reviews. Addressed the feedback in
- CodeQL: clear-text logging of sensitive data (
- @qiuz-cz:
- @qiuz-cz: lazy
- @qiuz-cz: coverage gap for the
- @qiuz-cz: hardcoded version/date in
- Greptile P2s (per-export error isolation + pagination test): these were resolved by @khill2018's

Also dropped the now-unused
Heads-up @khill2018: this PR now also contains the connected-account stack flattening (6 nested stacks → 1), native
The redacted logs still read tainted values: the notification payload dict carries ExternalId (so any field read is flagged) and the discovery log interpolated the CUR bucket name. Log a static message when posting to the reactor, and log only the (literal) billing report format when selecting a CUR. Author: Erik Peterson <erik@cloudzero.com>
CodeQL's clear-text-logging 'private data' heuristic flags the variable name billing_report_format (contains 'billing'), not its value (a literal like 'aws'). The log carried no real value beyond the format already present in the discovery output, so remove it rather than fight the scanner. Author: Erik Peterson <erik@cloudzero.com>
Summary
Extends the discovery Lambda's CUR bucket discovery to also enumerate BCM Data Exports (CUR 2.0) S3 destination buckets, so the master-payer IAM policy grants `s3:Get*`/`s3:List*` on them alongside classic CUR buckets.
- `coeffects_bcm_data_exports` calls `bcm-data-exports:ListExports` (paginated) then `GetExport` per export to read `DestinationConfigurations.S3Destination.S3Bucket`.
- `get_all_local_cur_bucket_names` unions classic CUR report buckets with the data-export buckets, keeping only locally-owned ones, and feeds them into `MasterPayerBillingBucketArns`.
- `DiscoveryFunction` execution role gains `bcm-data-exports:ListExports` and `bcm-data-exports:GetExport`. (The cross-account role policies already grant `bcm-data-exports:Get*`/`List*`.)
- `AccessDeniedException` from BCM Data Exports logs a warning and falls back to classic CUR buckets.

Behavior notes
- `COST_AND_USAGE_REPORT`-only: CUR 2.0 is not wired into primary-bucket / `BillingReportFormat` selection; the classic CUR still drives what CloudZero ingests.

Testing
- `pytest`: 18 passed, 89% coverage; `flake8` clean. New tests cover: local CUR 2.0 bucket included, all export buckets covered regardless of type, remote bucket excluded, and graceful degradation on access-denied.
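The union step described in the Summary can be sketched as follows. This is an illustration under assumptions, not the repo's `get_all_local_cur_bucket_names`: `billing_bucket_arns` is a hypothetical name, the `is_local` predicate stands in for however the Lambda decides that a bucket is locally owned, and the `aws` partition in the ARN is assumed.

```python
def billing_bucket_arns(classic_buckets, export_buckets, is_local):
    """Union classic CUR and data-export bucket names, keep local ones, return S3 ARNs."""
    names = {
        bucket
        for bucket in [*classic_buckets, *export_buckets]
        if bucket and is_local(bucket)  # drop empty names and remote buckets
    }
    # Sorted so the resulting policy input is stable across runs.
    return [f'arn:aws:s3:::{name}' for name in sorted(names)]
```

Deduplicating through a set means a bucket that backs both a classic CUR report and a CUR 2.0 export appears once in the policy.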