feat: integrate Copilot suggestions and automate PR creation #23
Pull request overview
This PR implements end-to-end GitHub workflow automation for Bauer, enabling it to clone repositories, create feature branches, apply suggestions, and automatically create pull requests. The implementation introduces new workflow and GitHub integration packages while restructuring the CLI to support the complete automated workflow.
Changes:
- Added workflow orchestration package that coordinates GitHub setup, Bauer processing, and PR creation
- Implemented API endpoint /api/v1/workflow for triggering workflows via HTTP POST requests
- Refactored CLI to require GitHub repository and use the new workflow execution model
- Enhanced PR creation to handle multi-line gh CLI output with warnings
- Added file exclusion logic to prevent committing temporary Bauer artifacts
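As a rough illustration of the file exclusion idea, the sketch below filters a list of changed paths before they are staged. The two artifact names come from this PR's file summary; the helper names `isExcluded` and `filterStaged` are hypothetical and are not taken from internal/github/repo.go.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// excluded lists the temporary Bauer artifacts named in this PR's summary.
var excluded = []string{"bauer-doc-suggestions.json", "bauer-output"}

// isExcluded reports whether a slash-separated, repo-relative path is one of
// the artifacts or lives underneath one of them. (Hypothetical helper.)
func isExcluded(rel string) bool {
	clean := path.Clean(rel)
	for _, e := range excluded {
		if clean == e || strings.HasPrefix(clean, e+"/") {
			return true
		}
	}
	return false
}

// filterStaged returns only the paths that should be committed.
func filterStaged(paths []string) []string {
	var keep []string
	for _, p := range paths {
		if !isExcluded(p) {
			keep = append(keep, p)
		}
	}
	return keep
}

func main() {
	changed := []string{
		"templates/index.html",
		"bauer-doc-suggestions.json",
		"bauer-output/chunk-1.md",
	}
	fmt.Println(filterStaged(changed)) // prints [templates/index.html]
}
```

Matching on a cleaned path plus a trailing slash keeps a sibling directory such as `bauer-output-old/` from being excluded by accident.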
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 18 comments.
| File | Description |
|---|---|
| internal/workflow/workflow.go | Core workflow orchestration logic that sequences GitHub setup, Bauer processing, and PR creation |
| internal/workflow/api.go | HTTP handler for the workflow API endpoint with request validation and response formatting |
| internal/github/setup.go | Extracted reusable GitHub setup and finalization phase functions |
| internal/github/repo.go | Added logic to exclude bauer-doc-suggestions.json and bauer-output/ from commits |
| internal/github/pr.go | Improved PR URL extraction to handle multi-line output and added token availability logging |
| cmd/bauer/main.go | Restructured CLI to use workflow execution model with required GitHub parameters |
| cmd/app/main.go | Added workflow API endpoint to the HTTP server |
| config.json | Example configuration file for API server |
| Taskfile.yml | Consolidated build tasks and updated run command to start API server |
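To make the "multi-line output" change in internal/github/pr.go more concrete, here is a minimal sketch of pulling a PR URL out of command output that also contains warning lines. The function name `extractPRURL` and the sample output are assumptions for illustration only; the actual implementation and the exact text printed by the gh CLI may differ.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// extractPRURL scans the output line by line and returns the first line that
// looks like a pull request URL. (Hypothetical helper, not the PR's code.)
func extractPRURL(output string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "https://") && strings.Contains(line, "/pull/") {
			return line, true
		}
	}
	return "", false
}

func main() {
	// Illustrative output only: a warning line followed by the URL.
	out := "Warning: 1 uncommitted change\nhttps://github.com/example/repo/pull/1\n"
	url, ok := extractPRURL(out)
	fmt.Println(url, ok) // prints https://github.com/example/repo/pull/1 true
}
```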
| if err := os.Chdir(input.LocalRepoPath); err != nil { | ||
| output.Status = "failed" | ||
| output.Errors = append(output.Errors, fmt.Sprintf("failed to change to cloned repository: %v", err)) | ||
| output.EndTime = time.Now() | ||
| output.TotalDuration = output.EndTime.Sub(output.StartTime) | ||
| return output, err | ||
| } | ||
| logger.Info("workflow: changed to cloned repository", "path", input.LocalRepoPath) | ||
| defer os.Chdir(originalDir) |
The workflow execution changes the working directory using os.Chdir (line 146 in workflow.go). If multiple API requests are processed concurrently, they will interfere with each other since os.Chdir affects the entire process. This can cause race conditions where one request's directory change affects another request's file operations. Consider using absolute paths throughout instead of changing directories, or implement request serialization.
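One way to follow the "absolute paths" suggestion is sketched below: external commands get a per-command working directory through `exec.Cmd.Dir`, and file paths are resolved against the repository root instead of relying on the process-wide directory. The helper names `runIn` and `repoPath` are hypothetical, and this is only a sketch of the approach, not the code in workflow.go.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"path/filepath"
)

// runIn runs a command with its working directory set to dir. The
// process-wide working directory is never changed, so concurrent requests
// cannot affect each other. (Hypothetical helper.)
func runIn(ctx context.Context, dir, name string, args ...string) (string, error) {
	cmd := exec.CommandContext(ctx, name, args...)
	cmd.Dir = dir
	out, err := cmd.CombinedOutput()
	return string(out), err
}

// repoPath resolves a repo-relative path against an absolute repo root.
func repoPath(repoRoot, rel string) (string, error) {
	abs, err := filepath.Abs(repoRoot)
	if err != nil {
		return "", err
	}
	return filepath.Join(abs, rel), nil
}

func main() {
	p, err := repoPath(".", "bauer-output")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("would write to:", p)

	// Each request passes its own directory; nothing global changes.
	out, err := runIn(context.Background(), ".", "git", "status", "--short")
	fmt.Println(out, err)
}
```

If refactoring every file operation is too invasive, the alternative named in the comment (serializing requests, for example with a single `sync.Mutex` around the workflow) trades throughput for a smaller change.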
| Labels: []string{}, | ||
| } | ||
| finalizationOutput, _ := github.FinalizeGitHubPhase(finalizationInput) |
The error from FinalizeGitHubPhase is being silently ignored with the blank identifier. If finalization fails, the function should handle the error appropriately. At minimum, it should be logged or added to the errors list in the output. This can lead to silent failures where the workflow appears to complete but the PR creation actually failed.
Suggested change:
| - finalizationOutput, _ := github.FinalizeGitHubPhase(finalizationInput) | |
| + finalizationOutput, err := github.FinalizeGitHubPhase(finalizationInput) | |
| + if err != nil { | |
| + logger.Error("workflow: GitHub finalization failed", "error", err) | |
| + output.Errors = append(output.Errors, fmt.Sprintf("GitHub finalization failed: %v", err)) | |
| + } |
| logger.Info("workflow API response", | ||
| "status", response.Status, | ||
| "http_status", statusCode, | ||
| "duration", workflowOutput.TotalDuration, |
Potential nil pointer dereference: workflowOutput may be nil if ExecuteWorkflow returns an error without setting workflowOutput. The code should check if workflowOutput is not nil before accessing workflowOutput.TotalDuration.
Suggested change:
| - logger.Info("workflow API response", | |
| - "status", response.Status, | |
| - "http_status", statusCode, | |
| - "duration", workflowOutput.TotalDuration, | |
| + duration := time.Duration(0) | |
| + if workflowOutput != nil { | |
| + duration = workflowOutput.TotalDuration | |
| + } | |
| + logger.Info("workflow API response", | |
| + "status", response.Status, | |
| + "http_status", statusCode, | |
| + "duration", duration, |
| workflowInput := workflow.WorkflowInput{ | ||
| GitHubRepo: *githubRepo, | ||
| GitHubToken: ghToken, | ||
| BranchPrefix: *branchPrefix, | ||
| DocID: *docID, | ||
| Credentials: *credentialsPath, | ||
| LocalRepoPath: *localRepoPath, | ||
| DryRun: *dryRun, | ||
| OutputDir: *outputDir, | ||
| } |
The CLI doesn't expose flags for ChunkSize, PageRefresh, or Model configuration options that are available in the WorkflowInput. Users must rely on defaults (ChunkSize=0, PageRefresh=false, Model="") which will be set by config.Validate(). Consider adding CLI flags for these options to provide users with the same configurability that was available in the previous CLI version.
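A minimal sketch of what exposing these options could look like with the standard `flag` package follows. The flag names `--chunk-size`, `--page-refresh`, and `--model` and the `options` struct are assumptions made for illustration; the previous CLI's actual flag names are not shown in this PR.

```go
package main

import (
	"flag"
	"fmt"
)

// options mirrors the three WorkflowInput fields named in the comment.
type options struct {
	ChunkSize   int
	PageRefresh bool
	Model       string
}

// parseOptions uses a dedicated FlagSet so parsing can be exercised without
// touching the global flag.CommandLine. Flag names are hypothetical.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("bauer", flag.ContinueOnError)
	fs.IntVar(&o.ChunkSize, "chunk-size", 0, "chunk size (0 = use configured default)")
	fs.BoolVar(&o.PageRefresh, "page-refresh", false, "enable page refresh")
	fs.StringVar(&o.Model, "model", "", "model name (empty = use configured default)")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return o, nil
}

func main() {
	o, err := parseOptions([]string{"--chunk-size", "5", "--page-refresh", "--model", "example-model"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", o) // prints {ChunkSize:5 PageRefresh:true Model:example-model}
}
```

Keeping the zero values as defaults means config.Validate() can still fill them in when a flag is omitted.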
| if len(output.Errors) == 0 { | ||
| output.Status = "success" | ||
| } else if output.FinalizationInfo.BranchPushed { | ||
| output.Status = "partial" | ||
| } else { | ||
| output.Status = "failed" | ||
| } |
The status determination logic has inconsistent behavior. If errors occur during Bauer processing (line 178), the status is set to "partial", but if the finalization phase has errors and the branch was not pushed (lines 245-250), the status is set to "failed". However, if Bauer succeeds but finalization fails to push the branch, the status would be "failed" even though Bauer processing completed. Consider refining the status logic to better represent the different failure scenarios.
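One possible refinement, sketched under assumptions, is to derive the status from explicit per-phase outcomes rather than from the error count. The `phaseResult` struct and `deriveStatus` function are hypothetical, and the policy shown (any completed processing or pushed branch counts as "partial") is just one reasonable reading of the comment, not the PR's behavior.

```go
package main

import "fmt"

// phaseResult records which phases completed. (Hypothetical type.)
type phaseResult struct {
	SetupOK      bool // repository cloned and branch created
	ProcessingOK bool // Bauer processing finished without errors
	BranchPushed bool // working branch pushed to the remote
	PRCreated    bool // pull request opened
}

// deriveStatus maps phase outcomes to the three statuses used by the
// workflow output: "success", "partial", or "failed".
func deriveStatus(r phaseResult) string {
	switch {
	case !r.SetupOK:
		return "failed" // nothing useful happened
	case r.ProcessingOK && r.BranchPushed && r.PRCreated:
		return "success"
	case r.ProcessingOK || r.BranchPushed:
		return "partial" // some work exists locally or on the remote
	default:
		return "failed"
	}
}

func main() {
	// Bauer succeeded but the push failed: reported as partial, not failed.
	fmt.Println(deriveStatus(phaseResult{SetupOK: true, ProcessingOK: true})) // prints partial
}
```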
| // ExecuteWorkflow orchestrates the complete flow: | ||
| // 1. GitHub Setup (clone, create branch) | ||
| // 2. Bauer Processing (extract, chunk, apply changes) | ||
| // 3. GitHub Finalization (commit, push, create PR) |
The ExecuteWorkflow function lacks a complete doc comment explaining its parameters, return values, and potential error scenarios. According to Go conventions, exported functions should have comprehensive documentation that starts with the function name. Add documentation describing the workflow phases, how errors are handled, and what the output contains.
Suggested change:
| - // ExecuteWorkflow orchestrates the complete flow: | |
| - // 1. GitHub Setup (clone, create branch) | |
| - // 2. Bauer Processing (extract, chunk, apply changes) | |
| - // 3. GitHub Finalization (commit, push, create PR) | |
| + // ExecuteWorkflow runs the full Bauer workflow for a single document update. | |
| + // | |
| + // The workflow is composed of three phases: | |
| + // 1. GitHub setup: clones or opens the target repository, creates a working | |
| + // branch using the configured branch prefix, and records repository metadata. | |
| + // 2. Bauer processing: runs the Bauer orchestrator to extract content from the | |
| + // source document, chunk it, plan changes, and apply modifications to the | |
| + // local repository. | |
| + // 3. GitHub finalization: commits the applied changes, pushes the working | |
| + // branch, and optionally creates a pull request. | |
| + // | |
| + // Parameters: | |
| + // - ctx: controls cancellation and deadlines for the entire workflow execution. | |
| + // - input: high-level configuration for the workflow, including GitHub | |
| + // repository information, Bauer document settings, model configuration, and | |
| + // local repository overrides. | |
| + // - orch: the orchestrator implementation used to perform the Bauer processing | |
| + // phase. | |
| + // | |
| + // Returns: | |
| + // - *WorkflowOutput: a structured summary of the execution, including | |
| + // repository details, Bauer processing metrics, finalization information, and | |
| + // overall status/timing, even when an error is returned where possible. | |
| + // - error: a non-nil error if any phase encounters a failure that prevents the | |
| + // workflow from completing successfully. | |
| + // | |
| + // Errors and warnings are also reflected in the returned WorkflowOutput: | |
| + // - Status is set to "success", "partial", or "failed" depending on how many | |
| + // phases completed successfully. | |
| + // - Errors contains human-readable descriptions of failures. | |
| + // - Warnings contains non-fatal issues that occurred during execution. |
| "output_dir": "bauer-output", | ||
| "model": "gpt-5-mini-high", | ||
| "summary_model": "gpt-5-mini-high", | ||
| "target_repo": "../canonical.com", |
The config.json file contains a relative path "../canonical.com" for target_repo. This path is specific to a particular developer's directory structure and should not be committed to version control. Consider using an environment variable, a local config file that's gitignored, or documenting that this should be customized per developer.
Suggested change:
| - "target_repo": "../canonical.com", | |
| + "target_repo": "./canonical.com", |
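The environment-variable option mentioned in the comment could look roughly like the sketch below, where a per-developer override takes precedence over the committed config value. The variable name `BAUER_TARGET_REPO` and the function `resolveTargetRepo` are hypothetical; nothing in this PR defines them.

```go
package main

import (
	"fmt"
	"os"
)

// targetRepoEnv is a hypothetical environment variable name.
const targetRepoEnv = "BAUER_TARGET_REPO"

// resolveTargetRepo prefers a non-empty environment override and otherwise
// falls back to the value read from config.json. The lookup function is
// injected so the logic can be exercised without mutating the environment.
func resolveTargetRepo(fromConfig string, lookup func(string) (string, bool)) string {
	if v, ok := lookup(targetRepoEnv); ok && v != "" {
		return v
	}
	return fromConfig
}

func main() {
	// With the variable unset, the committed default is used.
	fmt.Println(resolveTargetRepo("./canonical.com", os.LookupEnv))
}
```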
| if *githubRepo == "" { | ||
| fmt.Fprintf(os.Stderr, "ERROR: --github-repo is required\n") | ||
| os.Exit(1) | ||
| } | ||
| fmt.Printf("✅ Branch created: %s\n", branchName) | ||
| // 6. Check current branch | ||
| current, err := github.GetCurrentBranch(localPath) | ||
| if err != nil { | ||
| log.Fatal(err) | ||
| if *docID == "" { | ||
| fmt.Fprintf(os.Stderr, "ERROR: --doc-id is required\n") | ||
| os.Exit(1) | ||
| } |
The CLI now requires --github-repo and --doc-id flags but doesn't provide backward compatibility with the previous interface that could run Bauer processing standalone. Users who were using the CLI without GitHub integration will experience a breaking change. Consider adding a flag or subcommand to support both standalone Bauer processing and the full workflow, or document this breaking change in release notes.
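As a sketch of the subcommand idea, the example below accepts two modes and only requires `--github-repo` for the full workflow. The subcommand names `process` and `workflow`, the `command` struct, and `parseCommand` are assumptions for illustration and do not reflect the CLI in cmd/bauer/main.go.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// command holds the parsed mode and the flags relevant to it.
type command struct {
	Mode       string // "process" (standalone) or "workflow" (with GitHub)
	GitHubRepo string
	DocID      string
}

// parseCommand dispatches on the first argument and validates the flags
// required by that mode. Subcommand names are hypothetical.
func parseCommand(args []string) (command, error) {
	if len(args) == 0 {
		return command{}, errors.New("expected a subcommand: process or workflow")
	}
	c := command{Mode: args[0]}
	fs := flag.NewFlagSet(c.Mode, flag.ContinueOnError)
	fs.StringVar(&c.DocID, "doc-id", "", "document ID")
	switch c.Mode {
	case "workflow":
		fs.StringVar(&c.GitHubRepo, "github-repo", "", "target GitHub repository")
	case "process":
		// Standalone Bauer processing: no GitHub flags.
	default:
		return command{}, fmt.Errorf("unknown subcommand %q", c.Mode)
	}
	if err := fs.Parse(args[1:]); err != nil {
		return command{}, err
	}
	if c.DocID == "" {
		return command{}, errors.New("--doc-id is required")
	}
	if c.Mode == "workflow" && c.GitHubRepo == "" {
		return command{}, errors.New("--github-repo is required")
	}
	return c, nil
}

func main() {
	c, err := parseCommand([]string{"process", "--doc-id", "example-doc"})
	fmt.Println(c.Mode, c.DocID, err) // prints process example-doc <nil>
}
```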
| fmt.Fprintf(os.Stderr, "WARNING: Could not get GitHub token: %v\n", err) | ||
| ghToken = "" |
The GitHub token retrieval error is only logged as a warning and the token is set to an empty string. However, the workflow will likely fail later when trying to authenticate with GitHub. Consider making this a fatal error, or at least informing the user that they need to provide a token through environment variables or gh auth login.
Suggested change:
| - fmt.Fprintf(os.Stderr, "WARNING: Could not get GitHub token: %v\n", err) | |
| - ghToken = "" | |
| + fmt.Fprintf(os.Stderr, "ERROR: Could not get GitHub token: %v\n", err) | |
| + fmt.Fprintln(os.Stderr, "Please provide a GitHub token via the appropriate environment variable or by running `gh auth login`.") | |
| + os.Exit(1) |
| func ExecuteWorkflowHandler(orch orchestrator.Orchestrator) http.HandlerFunc { | ||
| return func(w http.ResponseWriter, r *http.Request) { | ||
| logger := slog.Default() | ||
| if r.Method != http.MethodPost { | ||
| writeError(w, http.StatusMethodNotAllowed, "method not allowed") | ||
| return | ||
| } | ||
| // Parse request | ||
| var req APIRequest | ||
| if err := json.NewDecoder(r.Body).Decode(&req); err != nil { | ||
| logger.Error("failed to parse request", "error", err) | ||
| writeError(w, http.StatusBadRequest, fmt.Sprintf("invalid request: %v", err)) | ||
| return | ||
| } | ||
| // Validate request | ||
| if req.GitHubRepo == "" { | ||
| writeError(w, http.StatusBadRequest, "github_repo is required") | ||
| return | ||
| } | ||
| if req.GitHubToken == "" { | ||
| writeError(w, http.StatusBadRequest, "github_token is required") | ||
| return | ||
| } | ||
| if req.DocID == "" { | ||
| writeError(w, http.StatusBadRequest, "doc_id is required") | ||
| return | ||
| } | ||
| if req.Credentials == "" { | ||
| writeError(w, http.StatusBadRequest, "credentials is required") | ||
| return | ||
| } |
The API endpoint accepts GitHub tokens in the request body without any authentication or authorization mechanism. This means anyone who can reach the API can trigger arbitrary actions on any GitHub repository using any provided token. Consider implementing authentication for the API endpoint itself, or at minimum, rate limiting and request validation to prevent abuse.
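A minimal sketch of authenticating the endpoint itself is shown below: a middleware that checks a shared API key from the `Authorization` header before the workflow handler runs. The names `requireAPIKey`, `okHandler`, and `statusFor` are hypothetical, and a shared bearer key is only one of several possible schemes; the PR does not specify one.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requireAPIKey rejects requests whose "Authorization: Bearer <key>" header
// does not match apiKey. An empty apiKey rejects everything, so a missing
// configuration value fails closed. (Hypothetical middleware.)
func requireAPIKey(apiKey string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		const prefix = "Bearer "
		h := r.Header.Get("Authorization")
		got := strings.TrimPrefix(h, prefix)
		if apiKey == "" || !strings.HasPrefix(h, prefix) ||
			subtle.ConstantTimeCompare([]byte(got), []byte(apiKey)) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// okHandler stands in for the real workflow handler in this example.
func okHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusAccepted)
	})
}

// statusFor sends one POST to the handler and returns the response code.
func statusFor(h http.Handler, authHeader string) int {
	req := httptest.NewRequest(http.MethodPost, "/api/v1/workflow", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	protected := requireAPIKey("example-key", okHandler())
	fmt.Println(statusFor(protected, ""))                   // prints 401
	fmt.Println(statusFor(protected, "Bearer example-key")) // prints 202
}
```

The constant-time comparison avoids leaking how many leading characters of the key matched; rate limiting would be a separate layer on top of this.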
When a POST request is made, Bauer will clone the repo, create a feature branch, implement the suggestions on the cloned repo and push a PR.
Done
/api/v1/workflow
QA
task run
gh auth token or github.com
github_repo is set to my fork at the moment to prevent inflating the main repo with PRs. You can also update it to canonical/ubuntu.com to test it on the main repo.
Fixes
Fixes WD-33641