feat: add Databricks Platform Migration workshop and module #61
Open
devin-ai-integration[bot] wants to merge 4 commits into
Conversation
- New module: `modules/data-engineering/databricks-platform-migration.md` — converts standard PySpark/Airflow/dbt stacks to Databricks Notebooks, Delta Lake, Workflows, and dbt-databricks. Uses the streamify-data-engineering and uc-data-source-migration-legacy-to-modern repos.
- New workshop: `workshops/databricks-migration/README.md` — two tracks (6 labs) for Databricks-user audiences:
  - Track A: Open-source stack → Databricks Lakehouse (parallel sessions)
  - Track B: Legacy data → Databricks Lakehouse
- Updated the data-engineering modules README with new module and repo entries
- Updated the workshops README with the new workshop listing
- Updated catalog repos.md with an enriched streamify-data-engineering entry and a new etl-workflow entry
- Updated upstream-map.yaml with an etl-workflow entry
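The Airflow → Databricks Workflows conversion described above can be sketched as a translation from an Airflow-style dependency map into a Databricks Jobs API 2.1 job payload. The `DAG_TASKS` mapping, job name, and notebook paths below are hypothetical, not the module's actual lab code — treat this as an illustrative sketch of the task/`depends_on` structure:

```python
# Sketch: translate a simple Airflow-style dependency map into a
# Databricks Jobs API 2.1 job payload. All names and paths are hypothetical.

# task -> list of upstream tasks (as an Airflow DAG would encode them)
DAG_TASKS = {
    "extract_events": [],
    "transform_events": ["extract_events"],
    "load_delta": ["transform_events"],
}

def to_workflows_payload(dag_tasks, job_name, notebook_dir):
    """Build a Workflows job spec with one notebook task per DAG task."""
    tasks = []
    for task_key, upstream in dag_tasks.items():
        task = {
            "task_key": task_key,
            "notebook_task": {"notebook_path": f"{notebook_dir}/{task_key}"},
        }
        if upstream:  # Workflows expresses DAG edges via depends_on
            task["depends_on"] = [{"task_key": u} for u in upstream]
        tasks.append(task)
    return {"name": job_name, "tasks": tasks}

payload = to_workflows_payload(DAG_TASKS, "streamify-etl", "/Workspace/streamify")
```

In a real migration this payload would be submitted through the Databricks CLI or Jobs API rather than built by hand, but the shape of the `tasks` list is the crux of the Airflow-to-Workflows mapping.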
🤖 Devin AI Engineer: I'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
… maintenance section
- Move the etl-workflow entry to its correct alphabetical position in upstream-map.yaml (between dotnet-modular-monolith-fe-react and fineract) and in repos.md (between angular-1.x-dashboard and ts-informatica-powercenter)
- Add a Scheduled Maintenance section to the Databricks Migration workshop with 5 recurring O&M prompts (Delta table maintenance, dbt drift detection, notebook code quality, dependency hygiene, data quality monitoring)
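The "Delta table maintenance" prompt above plausibly amounts to running recurring `OPTIMIZE` and `VACUUM` statements per table. A minimal sketch, assuming hypothetical table names and the common 7-day retention window; in a Databricks job each statement would be handed to `spark.sql(...)`, which is left to the caller here:

```python
# Sketch: generate recurring Delta maintenance statements for a set of
# tables. Table names and the 168-hour retention window are assumptions.

def maintenance_statements(tables, retain_hours=168):
    """Return OPTIMIZE + VACUUM statements for each Delta table."""
    stmts = []
    for table in tables:
        stmts.append(f"OPTIMIZE {table}")  # compact small files
        stmts.append(f"VACUUM {table} RETAIN {retain_hours} HOURS")  # drop stale files
    return stmts

stmts = maintenance_statements(["streamify.events", "streamify.sessions"])
```

Scheduling this as a Databricks Workflows job on a cron trigger would match the recurring-O&M framing of the workshop section.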
- Add etl-workflow to the module Repositories section with a full detailed section (anchor, repo link, Step 1–4 instructions)
- Remove the explicit `upstream_url: null` from etl-workflow in upstream-map.yaml to match the pattern of the other original repos
- Update the modules/README.md navigation index: add Databricks Platform Migration and COBOL Copybook rows to the Data Engineering table, and update the module count from 7 to 9
…urce-migration-legacy-to-modern
Summary
Adds a new Databricks Platform Migration workshop and module targeting Databricks-user audiences. The narrative: "Your team has standard data engineering code (PySpark, Airflow, dbt, SQL ETL) — now convert it to Databricks-native equivalents."
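For the dbt-bigquery → dbt-databricks piece of that conversion, the switch largely happens in `profiles.yml`. A hedged sketch of what a dbt-databricks target might look like — the profile name, catalog, host, and paths are placeholders, not values from this repo:

```yaml
# Hypothetical dbt-databricks target profile -- all values are placeholders.
streamify:
  target: dev
  outputs:
    dev:
      type: databricks          # provided by the dbt-databricks adapter
      catalog: main             # Unity Catalog name
      schema: streamify_dev
      host: adb-1234567890123456.7.azuredatabricks.net
      http_path: /sql/1.0/warehouses/abc123
      token: "{{ env_var('DATABRICKS_TOKEN') }}"
```

Model SQL usually ports with few changes; the adapter swap plus Delta-backed materializations is where most of the migration effort concentrates.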
New files
- `modules/data-engineering/databricks-platform-migration.md` — Challenge module covering PySpark → Databricks Notebooks, Airflow → Databricks Workflows, dbt-bigquery → dbt-databricks, and Parquet → Delta Lake.
- `workshops/databricks-migration/README.md` — Two-track workshop (6 labs) plus a scheduled O&M section; uses the streamify-data-engineering repo.
Updated files
- `modules/data-engineering/README.md` — Added new module and repo entries
- `workshops/README.md` — Added Databricks Migration to the workshops table
- `catalog/repos.md` — Enriched the streamify-data-engineering description, added etl-workflow (alphabetically ordered)
- `catalog/upstream-map.yaml` — Added an etl-workflow entry (alphabetically ordered)
Review & Testing Checklist for Human
Notes
Link to Devin session: https://partner-workshops.devinenterprise.com/sessions/e069cd377a4e4fc3b94f08fa1f2d6295
Requested by: @bsmitches