Skip to content

Fix weave init#206

Closed
Benzbeeb wants to merge 4 commits into
mainfrom
fix-weave-init
Closed

Fix weave init#206
Benzbeeb wants to merge 4 commits into
mainfrom
fix-weave-init

Conversation

@Benzbeeb
Copy link
Copy Markdown
Collaborator

@Benzbeeb Benzbeeb commented Mar 19, 2026

Description

Closes: #XXXX


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow-up issues.

I have...

  • included the correct type prefix in the PR title, you can find examples of the prefixes below:
  • confirmed ! in the type prefix if API or client breaking change
  • targeted the correct branch
  • provided a link to the relevant issue or specification
  • reviewed "Files changed" and left comments if necessary
  • included the necessary unit and integration tests
  • updated the relevant documentation or specification, including comments for documenting Go code
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

Summary by CodeRabbit

  • Bug Fixes

    • Binary lookup now finds matching files even when they are not marked executable.
    • Archive extraction improved: resolves target paths, prevents path traversal, ensures parent directories exist, handles symbolic links safely, and cleans up correctly on extraction errors.
  • Tests

    • Updated test expectations to reflect the changed binary lookup behavior.

- Improved error handling during file creation by ensuring the target directory exists before creating files.
- Added support for symlink extraction, allowing the creation of symbolic links as specified in the tar.gz archive.
- Streamlined the file handling logic to ensure proper closure of file descriptors, enhancing resource management.
- Modified the FindBinaryDir function to allow non-executable files to be recognized, changing the behavior to return the current directory when a non-executable binary is found.
- Updated the corresponding test case to reflect this change, ensuring that non-executable files are now correctly identified in the binary search process.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 19, 2026

Walkthrough

FindBinaryDir now matches binaries by filename regardless of executable bits; ExtractTarGz canonicalizes destination, prevents path traversal, ensures parent directories, preserves file modes, closes files on copy errors, and adds symlink extraction with root-bound validation.

Changes

Cohort / File(s) Summary
Binary discovery
cosmosutils/binary.go, cosmosutils/binary_test.go
Remove executable-bit check in FindBinaryDir so match is by filename only; test updated to expect non-executable files to be found.
Tar extraction and filesystem safety
io/filesystem.go
Normalize dest (Abs + EvalSymlinks), derive extraction root, reject path-traversing entries, create parent dirs for files/symlinks, use os.OpenFile with header mode, close files on io.Copy failure, and handle tar.TypeSymlink with root-bound validation and os.Symlink creation.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(200,200,255,0.5)
    participant Caller
    end
    rect rgba(200,255,200,0.5)
    participant Extractor
    end
    rect rgba(255,200,200,0.5)
    participant Filesystem
    end
    rect rgba(255,255,200,0.5)
    participant PathValidator
    end

    Caller->>Extractor: Call ExtractTarGz(src, dest)
    Extractor->>Filesystem: Resolve dest -> Abs + EvalSymlinks
    Extractor->>Extractor: Read next tar header
    alt header.Type == TypeReg
        Extractor->>Filesystem: MkdirAll(parent of target)
        Extractor->>Filesystem: OpenFile(target, mode)
        Extractor->>Filesystem: io.Copy(content -> file)
        Extractor->>Filesystem: Close(file) on success or error
    else header.Type == TypeSymlink
        Extractor->>PathValidator: Validate link target within dest root
        PathValidator-->>Extractor: allowed / rejected
        alt allowed
            Extractor->>Filesystem: MkdirAll(parent of symlink)
            Extractor->>Filesystem: os.Symlink(linkname, target)
        else rejected
            Extractor-->>Caller: return error
        end
    else
        Extractor-->>Caller: handle other types / continue
    end
    Extractor-->>Caller: return nil on completion
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I sniffed the paths both near and far,
Followed roots and healed each tar,
No exec bit stops my chase,
I stitch symlinks in their place,
A happy hop — archive unbar!

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The PR title 'Fix weave init' is vague and does not clearly describe the primary changes, which involve enhancing tar.gz extraction with symlink validation and modifying binary file matching logic. Consider using a more descriptive title that reflects the main changes, such as 'Enhance tar.gz extraction with path traversal protection and binary file matching improvements' or break into multiple focused PRs.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix-weave-init
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread io/filesystem.go Dismissed
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
io/filesystem.go (1)

63-92: ⚠️ Potential issue | 🔴 Critical

Block archive entries from escaping dest before creating files or symlinks.

A crafted tar archive can escape the destination directory through two vectors: (1) header.Name with path traversal sequences like ../ that, when joined and resolved by the filesystem, point outside dest, and (2) header.Linkname creating symlinks to targets outside dest, which subsequent entries then write through. Validate the resolved target path is within dest before creating directories, files, or symlinks. For symlinks, also resolve the linkname relative to its directory and confirm it stays within bounds.

🔒 Hardening sketch
+func withinRoot(root, path string) bool {
+	rel, err := filepath.Rel(root, path)
+	return err == nil && rel != ".." && !strings.HasPrefix(rel, ".."+string(os.PathSeparator))
+}
+
 func ExtractTarGz(src string, dest string) error {
 	...
-		target := filepath.Join(dest, header.Name)
+		target := filepath.Clean(filepath.Join(dest, header.Name))
+		if !withinRoot(dest, target) {
+			return fmt.Errorf("archive entry escapes destination: %q", header.Name)
+		}
 		switch header.Typeflag {
 		case tar.TypeReg:
+			resolvedParent, err := filepath.EvalSymlinks(filepath.Dir(target))
+			if err != nil {
+				return err
+			}
+			if !withinRoot(dest, resolvedParent) {
+				return fmt.Errorf("archive entry follows symlink outside destination: %q", header.Name)
+			}
 			...
 		case tar.TypeSymlink:
+			linkTarget := header.Linkname
+			if !filepath.IsAbs(linkTarget) {
+				linkTarget = filepath.Join(filepath.Dir(target), linkTarget)
+			}
+			linkTarget = filepath.Clean(linkTarget)
+			if !withinRoot(dest, linkTarget) {
+				return fmt.Errorf("symlink escapes destination: %q -> %q", header.Name, header.Linkname)
+			}
 			if err := os.Symlink(header.Linkname, target); err != nil {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@io/filesystem.go` around lines 63 - 92, Validate that the resolved path of
each archive entry stays inside dest before creating any dirs/files/symlinks:
for entries using filepath.Join(dest, header.Name) (in the tar.TypeDir and
tar.TypeReg branches) compute cleaned := filepath.Clean(target) and use
filepath.Rel(dest, cleaned) (or equivalent) to ensure the result does not begin
with ".." (return an error if it does) before calling os.MkdirAll or
os.OpenFile; for tar.TypeSymlink, resolve the symlink target relative to the
symlink's parent directory (e.g., filepath.Join(filepath.Dir(target),
header.Linkname) then Clean + Rel against dest) and refuse to create the symlink
if the resolved link target would point outside dest (also refuse creation if
header.Name itself escapes); apply these checks right before each os.MkdirAll,
os.OpenFile, and os.Symlink call in the extraction loop.
🧹 Nitpick comments (1)
cosmosutils/binary.go (1)

262-274: Update FindBinaryDir’s contract to match the new behavior.

The doc comment still promises the directory containing the named executable, but Line 272 now accepts any non-directory entry with that name. Either restore the executability filter or document the broader match so callers do not assume the returned path is already runnable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cosmosutils/binary.go` around lines 262 - 274, The comment for FindBinaryDir
no longer matches its implementation because the filepath.Walk callback accepts
any non-directory entry named binaryName (it no longer checks executability);
update the function to either reintroduce an executability check in the walk
callback (e.g., test file mode for executable bits on info.Mode() before setting
result) or change the doc comment to state it returns the directory of any
matching file name (not necessarily executable). Locate the closure passed to
filepath.Walk inside FindBinaryDir and apply the chosen fix (restore the exec
bit check on info or revise the doc comment) so callers have a correct contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@io/filesystem.go`:
- Around line 63-92: Validate that the resolved path of each archive entry stays
inside dest before creating any dirs/files/symlinks: for entries using
filepath.Join(dest, header.Name) (in the tar.TypeDir and tar.TypeReg branches)
compute cleaned := filepath.Clean(target) and use filepath.Rel(dest, cleaned)
(or equivalent) to ensure the result does not begin with ".." (return an error
if it does) before calling os.MkdirAll or os.OpenFile; for tar.TypeSymlink,
resolve the symlink target relative to the symlink's parent directory (e.g.,
filepath.Join(filepath.Dir(target), header.Linkname) then Clean + Rel against
dest) and refuse to create the symlink if the resolved link target would point
outside dest (also refuse creation if header.Name itself escapes); apply these
checks right before each os.MkdirAll, os.OpenFile, and os.Symlink call in the
extraction loop.

---

Nitpick comments:
In `@cosmosutils/binary.go`:
- Around line 262-274: The comment for FindBinaryDir no longer matches its
implementation because the filepath.Walk callback accepts any non-directory
entry named binaryName (it no longer checks executability); update the function
to either reintroduce an executability check in the walk callback (e.g., test
file mode for executable bits on info.Mode() before setting result) or change
the doc comment to state it returns the directory of any matching file name (not
necessarily executable). Locate the closure passed to filepath.Walk inside
FindBinaryDir and apply the chosen fix (restore the exec bit check on info or
revise the doc comment) so callers have a correct contract.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c3541242-e032-4825-a38b-4d26235842b2

📥 Commits

Reviewing files that changed from the base of the PR and between 42070f1 and 44971fc.

📒 Files selected for processing (3)
  • cosmosutils/binary.go
  • cosmosutils/binary_test.go
  • io/filesystem.go

…lution

- Added absolute path resolution for the extraction destination to improve reliability.
- Implemented symlink validation to ensure extracted links do not escape the designated extraction root, enhancing security.
- Introduced a new helper function, isWithinRoot, to check if a candidate path is within the specified root after resolving symlinks.
coderabbitai[bot]
coderabbitai Bot previously approved these changes Mar 19, 2026
coderabbitai[bot]
coderabbitai Bot previously approved these changes Mar 19, 2026
- Added a destination prefix check to ensure extracted files and symlinks do not escape the specified extraction root, enhancing security.
- Removed the isWithinRoot helper function, streamlining the validation logic for symlinks during extraction.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@io/filesystem.go`:
- Around line 73-85: The current extraction checks only the lexical path
(destPrefix) but does not reject existing symlinked path components, so before
creating directories or opening files (when handling tar.TypeDir and tar.TypeReg
for the variables target and header), walk each path component of target with
os.Lstat (including existing leaf) and if any component is a symlink
(FileMode&os.ModeSymlink != 0) return an error (e.g., "tar entry %q has
symlinked path component") to prevent following symlinks; perform this check
immediately after computing target/validation against destPrefix and before
calling os.MkdirAll or os.OpenFile.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 57f75094-4b9f-428b-9ffa-ea5e5ac96142

📥 Commits

Reviewing files that changed from the base of the PR and between 838b918 and 69c301c.

📒 Files selected for processing (1)
  • io/filesystem.go

Comment thread io/filesystem.go
Comment on lines +73 to +85
if !strings.HasPrefix(target, destPrefix) && target != dest {
return fmt.Errorf("tar entry %q escapes extraction root", header.Name)
}
switch header.Typeflag {
case tar.TypeDir:
if err := os.MkdirAll(target, os.ModePerm); err != nil {
return err
}
case tar.TypeReg:
file, err := os.Create(target)
if err := os.MkdirAll(filepath.Dir(target), os.ModePerm); err != nil {
return err
}
file, err := os.OpenFile(target, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, header.FileInfo().Mode())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

find . -name "filesystem.go" -path "*/io/*" | head -5

Repository: initia-labs/weave

Length of output: 79


🏁 Script executed:

cat -n io/filesystem.go | head -120

Repository: initia-labs/weave

Length of output: 3673


🏁 Script executed:

# Check lines 73-85 specifically and surrounding context
cat -n io/filesystem.go | sed -n '50,110p'

Repository: initia-labs/weave

Length of output: 2055


🌐 Web query:

Go os.MkdirAll os.OpenFile symlink resolution behavior path traversal

💡 Result:

  • os.Open, os.OpenFile, os.Create, os.WriteFile: these use normal OS pathname resolution, meaning symlinks in the path are followed (so a symlink can redirect the operation somewhere else). This is one of the core ways symlink-based path traversal bugs happen. [1][2]
  • os.MkdirAll(path, perm): creates the directory tree for the path you pass it (like mkdir -p) and does not provide any special protection against path traversal / symlink tricks in attacker-controlled paths; it’s just a convenience for recursive directory creation. If path is already a directory, it returns nil. [4]
  • filepath.EvalSymlinks: resolves symlinks in a path, but using it as “resolve then validate then open” is TOCTOU-racy if an attacker can change the filesystem between your check and your use. [1][3]
  • Traversal-resistant alternative (Go 1.24+): use os.Root (via os.OpenRoot) or os.OpenInRoot(dir, name). These APIs reject any .. or symlink traversal that would escape the chosen root directory (portable across platforms). [1][2]

Sources: [1] Go blog: Traversal-resistant file APIs (Damien Neil, Mar 12 2025). [2] Go os package docs (OpenFile, OpenInRoot, Root). [3] Go path/filepath docs (EvalSymlinks). [4] Go os package docs (MkdirAll).


Reject symlinked components in target before extraction.

Line 73 only validates the lexical path. If any parent directory of target already exists as a symlink (e.g., dest/bin/usr/bin), the subsequent os.MkdirAll and os.OpenFile calls on lines 78, 82, and 85 will follow the symlink and write outside the extraction root. Walk every existing component of target with os.Lstat—including the leaf if it already exists—and reject symlinks before extraction. This applies to both directory and regular file extraction.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@io/filesystem.go` around lines 73 - 85, The current extraction checks only
the lexical path (destPrefix) but does not reject existing symlinked path
components, so before creating directories or opening files (when handling
tar.TypeDir and tar.TypeReg for the variables target and header), walk each path
component of target with os.Lstat (including existing leaf) and if any component
is a symlink (FileMode&os.ModeSymlink != 0) return an error (e.g., "tar entry %q
has symlinked path component") to prevent following symlinks; perform this check
immediately after computing target/validation against destPrefix and before
calling os.MkdirAll or os.OpenFile.

@Benzbeeb Benzbeeb closed this Mar 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants