repo_support: add a script that generates cherry pick PRs from main to a release branch, in correct merge order#2680
repo_support: add a script that generates cherry pick PRs from main to a release branch, in correct merge order#2680mattkur wants to merge 4 commits intomicrosoft:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds a Python script to automate the creation of cherry-pick PRs from the main branch to release branches. The script handles the complete workflow of finding merged PRs, cherry-picking them in merge order, and creating corresponding backport PRs on GitHub.
Changes:
- Adds
gen_cherrypick_prs.pyscript that generates cherry-pick PRs for release branches - Supports both manual PR specification and automatic discovery via backport labels
- Includes conflict detection, duplicate backport checking, and interactive confirmation
repo_support/gen_cherrypick_prs.py
Outdated
| queries.append(f'base:{base_branch} "{orig_title}" in:title') | ||
|
|
||
| items: List[dict] = [] | ||
| seen_numbers: set[int] = set() |
There was a problem hiding this comment.
The type hint set[int] uses Python 3.9+ syntax. For better compatibility with older Python versions, consider using Set[int] from the typing module which is already imported on line 41.
repo_support/gen_cherrypick_prs.py
Outdated
| return f"{orange}*IN PROGRESS*{reset}" | ||
| return "*NONE*" | ||
|
|
||
| backport_map: dict[int, List[BackportInfo]] = {} |
There was a problem hiding this comment.
The type hint dict[int, List[BackportInfo]] uses Python 3.9+ syntax. For consistency with the typing imports already present and better compatibility, consider using Dict[int, List[BackportInfo]] from the typing module.
repo_support/gen_cherrypick_prs.py
Outdated
| stderr = getattr(p, "stderr", "") or "" | ||
| stdout = getattr(p, "stdout", "") or "" |
There was a problem hiding this comment.
The getattr() calls with fallback empty strings are unnecessary. When capture=True, subprocess.run with stdout=subprocess.PIPE and stderr=subprocess.PIPE always sets these attributes. The defensive coding here could mask programming errors.
| stderr = getattr(p, "stderr", "") or "" | |
| stdout = getattr(p, "stdout", "") or "" | |
| stderr = p.stderr or "" | |
| stdout = p.stdout or "" |
repo_support/gen_cherrypick_prs.py
Outdated
| seen_numbers.add(n) | ||
| items.append(pr) | ||
|
|
||
| rx = re.compile(rf"\bPR\s*#?{pr_number}\b", re.IGNORECASE) |
There was a problem hiding this comment.
The regex pattern doesn't escape the PR number, which could lead to false matches. For example, if searching for PR #123, the pattern \bPR\s*#?123\b would match "PR #1234" because \b matches between "123" and "4". Consider using a more precise pattern like rf"\bPR\s*#?{pr_number}(?:\b|$)" or ensuring the match is complete.
| rx = re.compile(rf"\bPR\s*#?{pr_number}\b", re.IGNORECASE) | |
| rx = re.compile( | |
| rf"\bPR\s*#?{re.escape(str(pr_number))}(?:\b|$)", re.IGNORECASE | |
| ) |
| if not confirm("Create this PR on GitHub? [y/N] "): | ||
| print("Aborting by user request. No further PRs will be created.") | ||
| return 0 |
There was a problem hiding this comment.
When the user confirms to not create a PR (line 586-588), the script returns 0 (success), but some PRs may still be unprocessed. This could be confusing. Consider returning a different exit code or providing a summary of skipped PRs to make it clear that not all PRs were processed.
| green = "\x1b[32m" | ||
| orange = "\x1b[38;5;208m" | ||
| reset = "\x1b[0m" |
There was a problem hiding this comment.
The script uses ANSI color codes directly in the main function. These may not work correctly on all terminals (especially on Windows without ANSI support). Consider checking if the output is a TTY and/or if color is supported before using color codes, or using a library like colorama for cross-platform color support.
| if not ( | ||
| rx.search(hay) | ||
| or url_fragment in hay | ||
| or (title_lc and title_lc in title.lower()) |
There was a problem hiding this comment.
The title matching on line 160 uses substring matching (title_lc in title.lower()), which could produce false positives. For example, if the original PR title is "Fix bug", it would match any PR with "Fix bug" anywhere in its title like "Fix bug in parser" even if that's a different PR. Consider using more precise matching or requiring the title match to be more substantial (e.g., checking for an exact match or a high similarity threshold).
repo_support/gen_cherrypick_prs.py
Outdated
| "--search", | ||
| query, | ||
| "--limit", | ||
| "50", |
There was a problem hiding this comment.
The gh pr list --limit 50 hardcoded limit in the search query might miss backport PRs if there are more than 50 matching PRs. This could lead to the script incorrectly thinking a PR hasn't been backported when it actually has. Consider increasing this limit or adding pagination to ensure all matching PRs are found.
| "50", | |
| "1000", |
repo_support/gen_cherrypick_prs.py
Outdated
| title = str(pr.get("title") or "") | ||
| body = str(pr.get("body") or "") | ||
| hay = f"{title}\n{body}" | ||
| hay_lc = hay.lower() |
There was a problem hiding this comment.
Variable hay_lc is not used.
| hay_lc = hay.lower() |
|
This looks great. Thanks for doing it!! |
| orange = "\x1b[38;5;208m" | ||
| reset = "\x1b[0m" | ||
|
|
||
| def status_label(state: str) -> str: |
There was a problem hiding this comment.
nit: Move this function outside of main
| prj = gh_pr_view(n, args.repo) | ||
|
|
||
| state = str(prj.get("state", "")).upper() | ||
| if state != "MERGED": |
There was a problem hiding this comment.
You could also have not_completed list being populated in this branch, instead of failing when it comes across an unmerged.
How about making a function that does that takes a list of pr jsons and spits out 2 lists: infos and not_completed. That way behavior remains consistent in both from_backport_label and by_pr_number branches.
| print("\nDone. Created PRs:") | ||
| for n, url in created: | ||
| print(f" #{n}: {url}") | ||
|
|
There was a problem hiding this comment.
Add a checkout back to the branch this script was started from.
|
@gurasinghMS : thank you for the good feedback! I'm not in a position to take it right now. So, let me know if you think this PR (now updated with minor changes) is "good enough" to be valuable. |
…o a release branch, in correct merge order
alandau
left a comment
There was a problem hiding this comment.
The script works for me (with the PAT change), thanks!
| path = url[len("git@github.com:") :] | ||
| elif url.startswith("https://github.com/"): | ||
| path = url[len("https://github.com/") :] | ||
| else: |
There was a problem hiding this comment.
Can we add support for PAT GitHub auth?
| else: | |
| elif m := re.match(r"https://[\w:]+@github.com/", url): | |
| path = url[m.end() :] | |
| else: |
Adds
gen_cherrypick_prs.py, which cherry picks PRs that are completed intomaininto therelease/<foo>branch.The script generates these in merge order, and stops if there are merge conflicts.
You can specify a set of PRs to cherry pick, or the script will try to do any PRs that are missing.
Written with copilot.