Skip to content

Feature/rule link to files#226

Merged
lizhengfeng101 merged 6 commits into
alibaba:mainfrom
zephyrq-z:feature/rule-link-to-files
Jun 27, 2026
Merged

Feature/rule link to files#226
lizhengfeng101 merged 6 commits into
alibaba:mainfrom
zephyrq-z:feature/rule-link-to-files

Conversation

@zephyrq-z

@zephyrq-z zephyrq-z commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Description

Closes #67.

The rule field in rule.json now supports automatic detection of inline text vs. external file paths. A single-line value with no spaces ending in .md, .txt, or .markdown is treated as a file path — the file content is read and replaces the rule field. All other values (multi-line text, text containing spaces, no extension, other extensions) remain inline.

Differences from the abandoned PR #87

This PR is a redesign and replacement of the closed #87. Key differences:

Dimension PR #87 (abandoned) This PR
Control Top-level use_file_path: true switch, global toggle Auto-detection, no extra field needed
Granularity File-wide mode (all file paths or all inline) Per-entry mixing: file paths and inline can coexist in the same rule.json
Backward compatibility Off by default, requires manual opt-in Fully compatible, existing configs work unchanged
Ease of use Must understand use_file_path semantics Intuitive: use a .md path to load a file

Core Changes

  • internal/config/rules/system_rules.go (+113 lines)

    • Added allowedRuleExts — package-level extension whitelist (.md / .txt / .markdown), shared by all validation functions
    • Added looksLikeFilePath() — heuristic: multi-line or contains spaces → inline; single-line, no spaces, whitelisted extension → file path
    • Added resolveRuleEntries() — iterates entries, auto-detects and loads file content; clears the rule if loading fails. Called by all three loaders: loadGlobalRule, loadProjectRule, and loadRuleFile
    • Added tryReadRuleFile() — absolute paths are used directly; relative paths are resolved against the repository directory with path traversal protection; empty repoDir with a relative path is rejected
    • Added readRuleFileSafe() — safe reading: symlink resolution, extension whitelist, 512 KB size cap
  • internal/config/rules/system_rules_test.go (+408 lines)

    • 24 new tests: basic file loading, multi-line inline, missing file, absolute path, oversized file, relative path, symlink safety, empty rule, .txt/.markdown extensions, subdirectory path, path traversal blocked, empty repoDir (relative and absolute), global rule file resolution, heuristic detection (5), safe reading (4)
  • README (5 languages) — added documentation for file path resolution in the rule field with configuration examples, including path traversal protection and size limit

Configuration Example

{
  "rules": [
    {"path": "**/*.py", "rule": "docs/python-rules.md"},
    {"path": "**/*.go", "rule": "Always check for nil pointers"},
    {"path": "**/*.ts", "rule": "shared.md"}
  ]
}
  • docs/python-rules.md → relative path, file content loaded automatically
  • "Always check for nil pointers" → inline text, kept as-is
  • "shared.md" → relative path, resolved from the project directory; cleared if not found

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring (no functional changes)
  • Documentation update
  • CI / Build / Tooling

How Has This Been Tested?

Unit Tests

internal/config/rules/system_rules_test.go — 24 new test cases:

  • TestResolveRuleEntries_BasicFile.md file content correctly replaces rule field
  • TestResolveRuleEntries_MultiLineInline — multi-line text stays inline
  • TestResolveRuleEntries_MissingFile — missing file emits [WARN], clears the rule
  • TestResolveRuleEntries_AbsolutePath — absolute path resolved directly
  • TestResolveRuleEntries_TooLarge — files over 512 KB rejected, rule cleared
  • TestResolveRuleEntries_RelativePath — relative path resolved from project directory
  • TestResolveRuleEntries_EmptyRule — empty rules skipped
  • TestResolveRuleEntries_SymlinkSafety — symlink target extension checked, clears rule if not whitelisted
  • TestResolveRuleEntries_TxtExtension.txt extension supported
  • TestResolveRuleEntries_MarkdownExtension.markdown extension supported
  • TestResolveRuleEntries_SubdirectoryPath — subdirectory path resolution
  • TestResolveRuleEntries_PathTraversalBlocked../../etc/passwd.md traversal blocked, rule cleared
  • TestResolveRuleEntries_EmptyRepoDirRelative — relative path rejected when repoDir is empty
  • TestResolveRuleEntries_EmptyRepoDirAbsolute — absolute path works when repoDir is empty
  • TestResolveRuleEntries_GlobalRuleFileResolution~/.opencodereview/ rule files resolved
  • TestLooksLikeFilePath_* — 5 heuristic tests (inline content, multi-line, extensions, spaces, no extension)
  • TestReadRuleFileSafe_* — 4 safe read tests (normal file, unsupported extension, oversized, missing)

Manual Testing

test-rule independent validation suite:

# Scenario Result
1 File path + inline mixed python.md content loaded, inline unchanged
2 Inline rules only No file lookup, text used directly
3 Missing file [WARN] + rule cleared
4 .json extension Not in extension whitelist, treated as inline
5 Absolute path /tmp/absolute-rule.md resolved directly
6 Subdirectory path rules/nested/nested.md resolved correctly
7 Regression Normal review unaffected
  • make test passes locally
  • Manual testing (describe below)

Checklist

  • My code follows the project's coding style (go fmt, go vet)
  • I have performed a self-review of my code
  • I have added tests that prove my fix is effective or my feature works
  • New and existing unit tests pass locally with my changes
  • I have updated the documentation accordingly (if applicable)
  • I have signed the CLA

Related Issues

Closes #67
Supersedes #87

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 OpenCodeReview found 3 issue(s) in this PR.

  • ✅ 3 posted as inline comment(s)
  • 📝 0 posted as summary

Comment thread internal/config/rules/system_rules.go
Comment thread internal/config/rules/system_rules.go Outdated
Comment on lines +479 to +482
if filepath.IsAbs(rule) {
candidates = []string{rule}
} else {
candidates = []string{filepath.Join(repoDir, rule)}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security (Path Traversal): There is no validation that the resolved file path stays within the expected base directories (repoDir or ~/.opencodereview). A malicious rule entry like ../../../../etc/passwd.md would pass through filepath.Join(repoDir, rule), resolve via EvalSymlinks, pass the extension check, and be read successfully. Consider adding a containment check after symlink resolution to ensure the final path is under one of the allowed base directories, e.g.:

rel, err := filepath.Rel(allowedBase, resolved)
if err != nil || strings.HasPrefix(rel, "..") {
    return "", fmt.Errorf("path escapes allowed directory")
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rule.json 由项目维护者自己编写,不是不可信输入。维护者写 ../../../../etc/passwd.md
是在读自己的文件,没有提权路径。repoDir 是受信的仓库根目录

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个在readme中进行了补充

Comment thread internal/config/rules/system_rules.go Outdated
fmt.Fprintf(os.Stderr, "[WARN] cannot read rule file %s: %v\n", candidates[1], err)
}
}
fmt.Fprintf(os.Stderr, "[WARN] rule file not found: %s (tried project dir, then ~/.opencodereview)\n", rule)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noisy warnings: When a rule file simply doesn't exist in either location (a normal situation for projects that don't use external rule files), this warning is always printed to stderr. This could produce confusing noise during regular operation. Consider only logging at debug level, or suppressing the "not found" warning entirely when both locations are legitimately optional.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

只有 rule 值以 .md/.txt/.markdown 结尾时才会触发文件查找。用户写 "nonexistent.md"
就明确表达了"加载文件"的意图,文件不存在时给出警告是合理的,帮助发现配置错误

Comment thread internal/config/rules/system_rules.go Outdated
Comment thread internal/config/rules/system_rules.go Outdated
Comment thread README.md
Comment thread internal/config/rules/system_rules.go Outdated
}
}

primary := candidates[0]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: primary is only used for one read and one error log — you could use candidates[0] directly to reduce the local variable count. Minor style preference, not blocking.

Comment thread internal/config/rules/system_rules.go
- rule field auto-detects: .md/.txt/.markdown ending = file path, otherwise inline
- file paths: project-relative first, then as absolute path
- safety: stat before read (512KB cap), extension whitelist, symlink resolution
- tightened heuristic: values with spaces treated as inline to avoid false positives
- guard against empty repoDir to avoid CWD-relative resolution
- 5-language README docs updated with file path usage and first-match-wins behavior
- 15 new unit tests covering all resolution branches

Closes alibaba#67
Supersedes alibaba#87
@zephyrq-z zephyrq-z force-pushed the feature/rule-link-to-files branch from 9cdfe9f to 9d511f4 Compare June 26, 2026 08:27
- Add blank line between matchProjectRuleEntry and allowedRuleExts (Issue 1)
- Remove dead code '|| repoDir == ' in tryReadRuleFile (Issue 2)
- Remove unnecessary warning when repoDir is empty but path is absolute (Issue 3)

@lizhengfeng101 lizhengfeng101 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lizhengfeng101 lizhengfeng101 merged commit b109baf into alibaba:main Jun 27, 2026
2 checks passed
@zephyrq-z

Copy link
Copy Markdown
Contributor Author

Thanks everyone for the feedback and discussion! @vanducng @candy-Tong @paul-yangmy @yuuk

PR #87 has been redesigned and merged as #226. Key improvements over the original approach:

  • No use_file_path switch needed — file paths are auto-detected
  • File paths and inline rules can coexist in the same rule.json

See the PR description for examples and details. Feedback always welcome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants