feat: bring back 3p setup infrastructure module at telescope#1062
feat: bring back 3p setup infrastructure module at telescope#1062vittoriasalim wants to merge 1 commit intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This pull request introduces infrastructure-as-code modules for setting up the Telescope test framework's third-party dependencies and automation. The PR adds Terraform modules under modules/terraform/setup/ that automate the creation of Azure and AWS resources, Azure DevOps pipelines, and Kusto database connections required to run Telescope benchmarks.
Changes:
- Adds infrastructure setup module to create Azure/AWS resources, service connections, and IAM users
- Adds pipeline setup module to create and configure Azure DevOps pipelines
- Adds table-data-connections module to create Kusto tables and Event Grid data connections
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 22 comments.
Show a summary per file
| File | Description |
|---|---|
| modules/terraform/setup/infrastructure/main.tf | Main infrastructure provisioning including Azure resources, Kusto cluster, AWS IAM, and service connections |
| modules/terraform/setup/infrastructure/variables.tf | Variable definitions for infrastructure configuration |
| modules/terraform/setup/infrastructure/infrastructure.tfvars | Sample configuration values for infrastructure setup |
| modules/terraform/setup/pipeline/main.tf | Azure DevOps pipeline creation and authorization |
| modules/terraform/setup/pipeline/variables.tf | Variable definitions with validation rules for pipeline configuration |
| modules/terraform/setup/pipeline/pipeline.tfvars | Sample pipeline configuration |
| modules/terraform/setup/table-data-connections/main.tf | Kusto table and Event Grid data connection setup |
| modules/terraform/setup/table-data-connections/variables.tf | Variable definitions for table and data connection setup |
| modules/terraform/setup/table-data-connections/table-data-connections.tfvars | Sample configuration for table setup |
| modules/terraform/setup/table-data-connections/bash/script.sh | Script to determine EventHub namespace creation requirements |
| modules/terraform/setup/table-data-connections/result.json | Sample JSON data for testing table schema generation |
| modules/terraform/setup/README.md | Documentation for setup modules including prerequisites and usage |
| modules/terraform/setup/Makefile | Automation for running setup modules |
| service_endpoint_name = var.github_config.service_connection_name | ||
| description = var.github_config.service_connection_description | ||
| auth_personal { | ||
| personal_access_token = null |
There was a problem hiding this comment.
Setting personal_access_token to null appears incorrect. The GitHub service connection requires a valid PAT for authentication. Based on the README documentation, this should be provided via the environment variable AZDO_GITHUB_SERVICE_CONNECTION_PAT, but the resource definition needs to properly reference it or allow the provider to read it from the environment.
| personal_access_token = null |
| } | ||
|
|
||
| resource "azurerm_kusto_script" "script" { | ||
| name = "kusto-script-${formatdate("MM-DD-YYYY-hh-mm-ss", timestamp())}" |
There was a problem hiding this comment.
The timestamp() function is called directly which will cause the resource name to change on every terraform plan/apply run, leading to unnecessary resource recreation. Consider using the replace_triggered_by lifecycle meta-argument or using a fixed naming scheme to prevent unintended resource churn.
| name = "kusto-script-${formatdate("MM-DD-YYYY-hh-mm-ss", timestamp())}" | |
| name = "kusto-script-${local.kusto_table_name}" |
| - Create a new pipeline from the existing YAML file in Azure DevOps or GitHub | ||
| - Create new variables for the pipeline | ||
| - Attach/Link existing Variable Groups to the pipeline | ||
| - Authroize the pipeline to use the existing Service Connections listed in the tfvars file |
There was a problem hiding this comment.
There is a typo: "Authroize" should be "Authorize".
| - Authroize the pipeline to use the existing Service Connections listed in the tfvars file | |
| - Authorize the pipeline to use the existing Service Connections listed in the tfvars file |
| value = variable.value.value | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The infrastructure module creates critical resources like storage accounts, Kusto clusters, and service connections but does not define any outputs. Consider adding outputs for important resource attributes (e.g., storage_account_name, kusto_cluster_name, resource_group_name, service_connection_ids) that might be needed by downstream modules or for reference after deployment.
| } | |
| } | |
| # Outputs for critical resources | |
| output "aws_service_connection_id" { | |
| description = "ID of the AWS service connection created in Azure DevOps." | |
| value = azuredevops_serviceendpoint_aws.aws_service_connection.id | |
| } | |
| output "aws_service_connection_name" { | |
| description = "Name of the AWS service connection created in Azure DevOps." | |
| value = azuredevops_serviceendpoint_aws.aws_service_connection.service_endpoint_name | |
| } |
| data_format = "JSON" | ||
| mapping_rule_name = "${local.kusto_table_name}_mapping" | ||
| depends_on = [azurerm_eventgrid_event_subscription.event_subscription, azurerm_kusto_script.script, azurerm_eventhub_consumer_group.consumer_group] | ||
| } |
There was a problem hiding this comment.
The table-data-connections module creates tables and data connections but does not define any outputs. Consider adding outputs for important attributes like the table name, data connection name, or eventhub details that might be useful for verification or downstream automation.
| } | |
| } | |
| output "kusto_table_name" { | |
| description = "Name of the Kusto table used by the Event Grid data connection." | |
| value = local.kusto_table_name | |
| } | |
| output "kusto_eventgrid_data_connection_name" { | |
| description = "Name of the Kusto Event Grid data connection." | |
| value = azurerm_kusto_eventgrid_data_connection.evengrid_connection.name | |
| } | |
| output "eventhub_name" { | |
| description = "Name of the Event Hub receiving events." | |
| value = azurerm_eventhub.eventhub.name | |
| } | |
| output "eventhub_id" { | |
| description = "Resource ID of the Event Hub receiving events." | |
| value = azurerm_eventhub.eventhub.id | |
| } |
| - AWS IAM User and Access Key's | ||
| - Azure DevOps Variable Groups | ||
|
|
||
| All the resources are created based on the input tfvars file which is located here [setup.tfvars](./infrastructure/setup.tfvars) |
There was a problem hiding this comment.
The documentation references "setup.tfvars" file, but the actual file is named "infrastructure.tfvars" as seen in the Makefile and the file listing. This inconsistency could confuse users trying to follow the documentation.
| All the resources are created based on the input tfvars file which is located here [setup.tfvars](./infrastructure/setup.tfvars) | |
| All the resources are created based on the input tfvars file which is located here [infrastructure.tfvars](./infrastructure/infrastructure.tfvars) |
| resource_id = data.azuredevops_serviceendpoint_azurerm.service_connection[count.index].id | ||
| type = "endpoint" | ||
| pipeline_id = azuredevops_build_definition.Pipeline.id | ||
| } |
There was a problem hiding this comment.
The pipeline module creates a build definition but does not define any outputs. Consider adding an output for the pipeline ID or URL that might be useful for reference after deployment or for integration with other automation.
| } | |
| } | |
| output "pipeline_id" { | |
| description = "The ID of the Azure DevOps pipeline created by this module." | |
| value = azuredevops_build_definition.Pipeline.id | |
| } | |
| output "pipeline_url" { | |
| description = "The URL of the Azure DevOps pipeline created by this module." | |
| value = azuredevops_build_definition.Pipeline.url | |
| } |
| } | ||
|
|
||
| resource "azurerm_eventhub" "eventhub" { | ||
| name = "adx-eg-${formatdate("MM-DD-YYYY-hh-mm-ss", timestamp())}" |
There was a problem hiding this comment.
The timestamp() function is called directly which will cause the resource name to change on every terraform plan/apply run, leading to unnecessary resource recreation. Consider using the replace_triggered_by lifecycle meta-argument or using a fixed naming scheme to prevent unintended resource churn.
| } | ||
| aws = { | ||
| source = "hashicorp/aws" | ||
| version = "<= 5.38" |
There was a problem hiding this comment.
The AWS provider version constraint (<= 5.38) appears outdated compared to the main AWS terraform module which uses <= 6.27.0. Consider updating to maintain consistency across the codebase and benefit from newer provider features and bug fixes.
| version = "<= 5.38" | |
| version = "<= 6.27.0" |
| # Check if eventhub_instances is empty | ||
| if [ -z "$eventhub_instances" ]; then | ||
| create_eventhub_namespace=true | ||
| eventhub_namespace=null |
There was a problem hiding this comment.
The logic for determining when to create a new EventHub namespace may have a bug. When an empty namespace is found (line 14-17), the script sets create_eventhub_namespace=true and breaks. However, an empty namespace could be reused instead of creating a new one. Consider whether the intended behavior is to reuse an empty namespace or always create a new one when an empty namespace exists.
| # Check if eventhub_instances is empty | |
| if [ -z "$eventhub_instances" ]; then | |
| create_eventhub_namespace=true | |
| eventhub_namespace=null | |
| # Check if eventhub_instances is empty; if so, reuse this namespace | |
| if [ -z "$eventhub_instances" ]; then | |
| create_eventhub_namespace=false |
as title