fix: Improve sparse checkout script reliability and error handling #139

Open
pgabriel-01 wants to merge 23 commits into Azure:Dec2025patch from pgabriel-01:main

@pgabriel-01

PR Summary into Azure/mlops-project-template

Checklist

I have:

  • read and followed the contributing guidelines
  • PR has a meaningful title
  • Summarized the changes
  • PR is ready to merge and NOT ** WORK IN PROGRESS **

Changes

Fixed several issues in the sparse checkout script to improve reliability when bootstrapping new ML projects:

Bug Fixes

  • Fixed missing directories: the sparse checkout now explicitly lists every subdirectory (data, data-science, mlops) so that the complete folder structure is pulled
  • Added directory validation: the script checks that each directory exists before moving it and prints a warning for any missing folder
  • Fixed git reinitialization errors: stderr is suppressed during git init to hide "not a git repository" warnings
  • Improved GitHub repo creation: added the --confirm flag and corrected the repo name format to $github_org_name/$project_name
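
For reviewers who want a concrete picture of the first two fixes, here is a minimal sketch. It is not the script from this PR: the helper names `sparse_clone` and `move_if_present`, their argument order, and the example paths in the usage comment are all assumptions made for illustration. Only standard `git clone` and `git sparse-checkout` commands are used.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not from the PR): clone a repo and check out only
# the listed directories, using cone-mode sparse checkout.
#   $1 = repo URL, $2 = branch, $3 = target directory, $4.. = directories
sparse_clone() {
  local repo_url="$1" branch="$2" target="$3"
  shift 3
  # Clone without file contents and without populating the working tree.
  git clone --filter=blob:none --no-checkout "$repo_url" "$target"
  git -C "$target" sparse-checkout init --cone
  # List every required subdirectory explicitly.
  git -C "$target" sparse-checkout set "$@"
  git -C "$target" checkout "$branch"
}

# Hypothetical helper (not from the PR): move a directory into a
# destination only if it exists; otherwise warn on stderr and carry on.
move_if_present() {
  local src="$1" dest="$2"
  if [ -d "$src" ]; then
    mv "$src" "$dest"/
  else
    echo "WARNING: '$src' not found, skipping" >&2
  fi
}

# Example usage (paths and variables are placeholders, not the PR's values):
#   sparse_clone "$template_url" main template \
#     "$project_type/data" "$project_type/data-science" "$project_type/mlops"
#   for d in data data-science mlops; do
#     move_if_present "template/$project_type/$d" "$project_dir"
#   done
```

Listing each subdirectory by name avoids depending on how a given git version expands a parent directory in cone mode, and the existence check turns a fatal `mv` error into a warning.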

Improvements

  • Added progress messages for better user feedback during execution
  • Separated git commands to individual lines for easier debugging
  • Added error handling to prevent script failure when directories are missing
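
The improvements above can be pictured with a small wrapper like the one below. This is a sketch under assumptions, not code from the PR: `run_step` is a hypothetical helper name, and the message format is invented for illustration. It prints a progress message, runs exactly one command, and reports a failure without aborting the script.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): announce a step, run one command,
# and warn on failure while passing the exit code back to the caller.
run_step() {
  local description="$1"
  shift
  echo "==> $description"
  if "$@"; then
    return 0
  else
    local rc=$?
    echo "WARNING: step failed (exit $rc): $description" >&2
    return "$rc"
  fi
}

# Example usage, one git command per line so a failure points at one step:
#   run_step "Staging files" git add . || true
#   run_step "Creating initial commit" git commit -m "Initial commit" || true
```

Keeping each command on its own line means the warning names the exact step that failed, and the trailing `|| true` is an explicit, per-step decision to continue.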

These changes ensure users get a complete project structure when running the sparse checkout script, eliminating the "missing folders" issue reported on different computers.

fixes #

prgabriel and others added 23 commits November 12, 2025 15:10
… and private endpoints

- Add comprehensive Terraform v4.11 modernization details
- Document Python dependencies upgrade from 3.8 to 3.11
- Add Azure SDK updates (azure-ai-ml 1.1.0 → 1.21.1)
- Document private endpoint implementation for all services
- Update version matrix with all completed changes
- Add Key Vault RBAC migration details
- Include storage security enhancements (TLS 1.2, retention policies)
- Document Container Registry upgrade to Premium SKU
…d compute SKUs

GitHub Actions Guide (deployguide_gha.md):
- Replace service principal secrets with OIDC workload identity federation
- Update authentication setup with federated credentials
- Change from single AZURE_CREDENTIALS secret to three separate secrets (CLIENT_ID, TENANT_ID, SUBSCRIPTION_ID)
- Update compute SKUs from Standard_DS3_v2 to Standard_D4s_v5 (5th gen)
- Update quota requirements to Dsv5 family

Azure DevOps Guide (deployguide_ado.md):
- Add security best practice note about workload identity federation
- Update prerequisites with Terraform v1.9.0+ and Python 3.11 requirements
- Update all compute SKU references to Standard_D4s_v5
- Update compute instance examples with 5th generation SKUs

These changes align with the infrastructure modernization completed in the project templates.
…vider to ~> 4.52.0 in Azure DevOps deployment guide
…uring Terraform variables for GitHub Actions permissions
- Clarify service principal object ID requirement and usage
- Add complete terraform.tfvars configuration example with all parameters
- Document Azure ML workspace soft-delete behavior and workarounds
- Add comprehensive endpoint testing instructions (online and batch)
- Add environment destruction/teardown section with automatic endpoint cleanup
- Include troubleshooting guidance for common issues
- Based on successful end-to-end dev and prod deployments
…etion wait

- Updated from 60-second to 2-minute wait for endpoint deletions
- Adjusted total destroy time estimate from 7-10 minutes to 9-12 minutes
- Added note that online endpoints can take 2-3 minutes to fully delete
- Reflects fix in mlops-templates commit 7f20b6e
…ture, clarity, and additional setup instructions
…figuration and replace manual image with automatic image

2 participants