[chore] Add sanity checks for docstring, license and DCO #10
Merged
6 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| name: DCO Check | ||
|
|
||
| on: | ||
| pull_request: | ||
| types: [opened, synchronize, reopened] | ||
|
|
||
| permissions: | ||
| pull-requests: read | ||
| contents: read | ||
|
|
||
| jobs: | ||
| check-dco: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: Check for Signed-off-by | ||
| uses: actions/github-script@v7 | ||
| with: | ||
| script: | | ||
| const commits = await github.rest.pulls.listCommits({ | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| pull_number: context.issue.number, | ||
| }); | ||
|
|
||
| const regex = /^Signed-off-by: .* <.*@.*>/m; | ||
| let failed = false; | ||
|
|
||
| for (const commit of commits.data) { | ||
| if (!regex.test(commit.commit.message)) { | ||
| console.log(`Commit ${commit.sha} is missing Signed-off-by`); | ||
| failed = true; | ||
| } | ||
| } | ||
|
|
||
| if (failed) { | ||
| core.setFailed('One or more commits are missing the Signed-off-by line.'); | ||
| } | ||
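
The sign-off test in the workflow above can be tried locally before pushing. The following is a minimal Python sketch of the same pattern, not part of the PR: `missing_signoff` and the sample SHAs are hypothetical names used only for illustration, and `re.MULTILINE` is assumed to stand in for the JavaScript `/m` flag.

```python
import re

# Python rendering of the workflow's JavaScript pattern; re.MULTILINE plays
# the role of the /m flag, so ^ matches at the start of any line.
SIGNOFF_RE = re.compile(r"^Signed-off-by: .* <.*@.*>", re.MULTILINE)


def missing_signoff(messages: dict[str, str]) -> list[str]:
    """Return the SHAs (hypothetical keys) whose message lacks a sign-off line."""
    return [sha for sha, msg in messages.items() if not SIGNOFF_RE.search(msg)]


if __name__ == "__main__":
    sample = {
        "abc123": "Fix typo\n\nSigned-off-by: Jane Doe <jane@example.com>",
        "def456": "Add feature without sign-off",
    }
    for sha in missing_signoff(sample):
        print(f"Commit {sha} is missing Signed-off-by")
```

Because the pattern is anchored with `^`, a trailer that is indented or embedded mid-line does not count as a sign-off.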
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,6 +8,7 @@ on: | |
| branches: | ||
| - main | ||
| - dev | ||
| - v0.* | ||
|
|
||
| # Declare permissions just read content. | ||
| permissions: | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
|
|
||
| name: sanity | ||
|
|
||
| on: | ||
| # Trigger the workflow on push or pull request | ||
| push: | ||
| branches: | ||
| - main | ||
| - v0.* | ||
| pull_request: | ||
| branches: | ||
| - main | ||
| - v0.* | ||
| paths: | ||
| - "**/*.py" | ||
| - .github/workflows/sanity.yml | ||
| - "tests/sanity/**" | ||
|
|
||
| # Cancel jobs on the same ref if a new one is triggered | ||
| concurrency: | ||
| group: ${{ github.workflow }}-${{ github.ref }} | ||
| cancel-in-progress: ${{ github.ref != 'refs/heads/main' }} | ||
|
|
||
| # Declare permissions just read content. | ||
| permissions: | ||
| contents: read | ||
|
|
||
| jobs: | ||
| sanity: | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 5 # Increase this timeout value as needed | ||
| strategy: | ||
| matrix: | ||
| python-version: ["3.10"] | ||
| steps: | ||
| - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 | ||
| - name: Set up Python ${{ matrix.python-version }} | ||
| uses: actions/setup-python@0b93645e9fea7318ecaed2b359559ac225c90a2b # v5.3.0 | ||
| with: | ||
| python-version: ${{ matrix.python-version }} | ||
| - name: Install dependencies | ||
| run: | | ||
| python -m pip install --upgrade pip | ||
| python -m pip install build | ||
| python -m build --wheel | ||
| pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu | ||
| pip install dist/*.whl | ||
| - name: Run license test | ||
| run: | | ||
| python3 tests/sanity/check_license.py --directories . | ||
| - name: Check docstrings for specified files | ||
| run: | | ||
| python3 tests/sanity/check_docstrings.py | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,170 @@ | ||
| # Copyright 2025 Bytedance Ltd. and/or its affiliates | ||
| # Copyright 2025 Huawei Technologies Co., Ltd. All Rights Reserved. | ||
| # Copyright 2025 The TransferQueue Team | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| """ | ||
| Python script to check docstrings for functions and classes in specified files. | ||
| Checks that every public function and class has proper docstring documentation. | ||
| """ | ||
|
|
||
| import ast | ||
| import os | ||
| import sys | ||
|
|
||
|
|
||
| class DocstringChecker(ast.NodeVisitor): | ||
| """AST visitor to check for missing docstrings in functions and classes.""" | ||
|
|
||
| def __init__(self, filename: str): | ||
| self.filename = filename | ||
| self.missing_docstrings: list[tuple[str, str, int]] = [] | ||
| self.current_class = None | ||
| self.function_nesting_level = 0 | ||
|
|
||
| def visit_FunctionDef(self, node: ast.FunctionDef): | ||
| """Visit function definitions and check for docstrings.""" | ||
| if not node.name.startswith("_") and self.function_nesting_level == 0: | ||
| if not self._has_docstring(node): | ||
| func_name = f"{self.current_class}.{node.name}" if self.current_class else node.name | ||
| self.missing_docstrings.append((func_name, self.filename, node.lineno)) | ||
|
|
||
| self.function_nesting_level += 1 | ||
| self.generic_visit(node) | ||
| self.function_nesting_level -= 1 | ||
|
|
||
| def visit_AsyncFunctionDef(self, node: ast.AsyncFunctionDef): | ||
| """Visit async function definitions and check for docstrings.""" | ||
| if not node.name.startswith("_") and self.function_nesting_level == 0: | ||
| if not self._has_docstring(node): | ||
| func_name = f"{self.current_class}.{node.name}" if self.current_class else node.name | ||
| self.missing_docstrings.append((func_name, self.filename, node.lineno)) | ||
|
|
||
| self.function_nesting_level += 1 | ||
| self.generic_visit(node) | ||
| self.function_nesting_level -= 1 | ||
|
|
||
| def visit_ClassDef(self, node: ast.ClassDef): | ||
| """Visit class definitions and check for docstrings.""" | ||
| if not node.name.startswith("_"): | ||
| if not self._has_docstring(node): | ||
| self.missing_docstrings.append((node.name, self.filename, node.lineno)) | ||
|
|
||
| old_class = self.current_class | ||
| self.current_class = node.name | ||
| self.generic_visit(node) | ||
| self.current_class = old_class | ||
|
|
||
| def _has_docstring(self, node) -> bool: | ||
| """Check if a node has a docstring.""" | ||
| return ast.get_docstring(node) is not None | ||
|
|
||
|
|
||
| def check_file_docstrings(filepath: str) -> list[tuple[str, str, int]]: | ||
| """Check docstrings in a single file.""" | ||
| try: | ||
| with open(filepath, encoding="utf-8") as f: | ||
| content = f.read() | ||
|
|
||
| tree = ast.parse(content, filename=filepath) | ||
| checker = DocstringChecker(filepath) | ||
| checker.visit(tree) | ||
| return checker.missing_docstrings | ||
|
|
||
| except Exception as e: | ||
|
|
||
| print(f"Error processing {filepath}: {e}") | ||
| return [] | ||
|
|
||
|
|
||
| def get_python_files_in_transfer_queue(repo_path: str) -> list[str]: | ||
| """Get all Python files in the transfer_queue directory.""" | ||
| transfer_queue_path = os.path.join(repo_path, "transfer_queue") | ||
| if not os.path.exists(transfer_queue_path): | ||
| print(f"Warning: transfer_queue directory {transfer_queue_path} does not exist!") | ||
| return [] | ||
|
|
||
| python_files = [] | ||
| for root, _, files in os.walk(transfer_queue_path): | ||
| for file in files: | ||
| if file.endswith(".py"): | ||
| python_files.append(os.path.join(root, file)) | ||
|
|
||
| return sorted(python_files) | ||
|
|
||
|
|
||
| def main(): | ||
| """Main function to check docstrings in transfer_queue Python files.""" | ||
|
|
||
| script_dir = os.path.dirname(os.path.abspath(__file__)) | ||
| repo_path = os.path.dirname(os.path.dirname(script_dir)) | ||
|
|
||
| if not os.path.exists(repo_path): | ||
| print(f"Repository path {repo_path} does not exist!") | ||
| sys.exit(1) | ||
|
|
||
| os.chdir(repo_path) | ||
|
|
||
| files_to_check = get_python_files_in_transfer_queue(repo_path) | ||
|
|
||
| if not files_to_check: | ||
| print("No Python files found in transfer_queue directory!") | ||
| sys.exit(1) | ||
|
|
||
| all_missing_docstrings = [] | ||
|
|
||
| print("Checking docstrings in transfer_queue Python files...") | ||
| print(f"Found {len(files_to_check)} Python files to check") | ||
| print("=" * 60) | ||
|
|
||
| for file_path in files_to_check: | ||
| if not os.path.exists(file_path): | ||
| print(f"Warning: File {file_path} does not exist!") | ||
| continue | ||
|
|
||
| print(f"Checking {file_path}...") | ||
| missing = check_file_docstrings(file_path) | ||
| all_missing_docstrings.extend(missing) | ||
|
|
||
| if missing: | ||
| print(f" Found {len(missing)} missing docstrings") | ||
| else: | ||
| print(" All functions and classes have docstrings [OK]") | ||
|
|
||
| print("=" * 60) | ||
|
|
||
| if all_missing_docstrings: | ||
| print(f"\nSUMMARY: Found {len(all_missing_docstrings)} functions/classes missing docstrings:") | ||
| print("-" * 60) | ||
|
|
||
| by_file = {} | ||
| for name, filepath, lineno in all_missing_docstrings: | ||
| if filepath not in by_file: | ||
| by_file[filepath] = [] | ||
| by_file[filepath].append((name, lineno)) | ||
|
|
||
| for filepath in sorted(by_file.keys()): | ||
| print(f"\n{filepath}:") | ||
| for name, lineno in sorted(by_file[filepath], key=lambda x: x[1]): | ||
| print(f" - {name} (line {lineno})") | ||
|
|
||
| print(f"\nTotal missing docstrings: {len(all_missing_docstrings)}") | ||
|
|
||
| raise Exception(f"Found {len(all_missing_docstrings)} functions/classes without proper docstrings!") | ||
|
|
||
|
|
||
| else: | ||
| print("\n[OK] All functions and classes have proper docstrings!") | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| main() | ||
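
To see which definitions a check like this flags, the rule can be exercised on an in-memory snippet. This is a simplified sketch, not the script above: `find_missing` and `SAMPLE` are hypothetical names, and it only inspects statements directly inside module and class bodies, whereas the `NodeVisitor` in the diff walks the whole tree.

```python
import ast


def find_missing(source: str) -> list[tuple[str, int]]:
    """Return (name, line) for public functions/classes without a docstring."""
    missing: list[tuple[str, int]] = []

    def walk(body: list[ast.stmt], prefix: str) -> None:
        for node in body:
            if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                continue
            if not node.name.startswith("_") and ast.get_docstring(node) is None:
                # Methods are reported as Class.method; classes by bare name.
                name = node.name if isinstance(node, ast.ClassDef) else prefix + node.name
                missing.append((name, node.lineno))
            # Descend into classes only, so functions nested in functions are skipped.
            if isinstance(node, ast.ClassDef):
                walk(node.body, node.name + ".")

    walk(ast.parse(source).body, "")
    return missing


SAMPLE = '''
def documented():
    """Has a docstring."""

def undocumented():
    pass

def _private():
    pass

class Widget:
    """Documented class."""

    def run(self):
        def helper():
            pass
        return helper
'''

if __name__ == "__main__":
    for name, lineno in find_missing(SAMPLE):
        print(f"  - {name} (line {lineno})")
```

In this sample only `undocumented` and `Widget.run` are reported: `_private` is excluded by the leading underscore and `helper` because it is nested inside another function.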
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,72 @@ | ||
| # Copyright 2024 Bytedance Ltd. and/or its affiliates | ||
|
|
||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| from argparse import ArgumentParser | ||
| from pathlib import Path | ||
| from typing import Iterable | ||
|
|
||
| # Add license headers below | ||
| license_head_huawei = "Copyright 2025 Huawei Technologies Co., Ltd. All Rights Reserved." | ||
| license_head_tq = "Copyright 2025 The TransferQueue Team" | ||
|
|
||
| license_headers = [ | ||
|
|
||
| license_head_huawei, | ||
| license_head_tq, | ||
| ] | ||
|
|
||
|
|
||
| def get_py_files(path_arg: Path) -> Iterable[Path]: | ||
| """Get Python files under a directory. If already a Python file, return it. | ||
|
|
||
| Args: | ||
| path_arg (Path): path to scan for .py files | ||
|
|
||
| Returns: | ||
| Iterable[Path]: list of .py files | ||
| """ | ||
| if path_arg.is_dir(): | ||
| return path_arg.glob("**/*.py") | ||
| elif path_arg.is_file() and path_arg.suffix == ".py": | ||
| return [path_arg] | ||
| return [] | ||
|
|
||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| parser = ArgumentParser() | ||
| parser.add_argument( | ||
| "--directories", | ||
| "-d", | ||
| required=True, | ||
| type=Path, | ||
| nargs="+", | ||
| help="List of directories to check for license headers", | ||
| ) | ||
| args = parser.parse_args() | ||
|
|
||
| # Collect all Python files from specified directories | ||
| pathlist = set(path for path_arg in args.directories for path in get_py_files(path_arg)) | ||
|
|
||
| for path in pathlist: | ||
| # because path is object not string | ||
| path_in_str = str(path.absolute()) | ||
| print(path_in_str) | ||
| with open(path_in_str, encoding="utf-8") as f: | ||
| file_content = f.read() | ||
|
|
||
| has_license = False | ||
| for lh in license_headers: | ||
| if lh in file_content: | ||
|
|
||
| has_license = True | ||
| break | ||
| assert has_license, f"file {path_in_str} does not contain license" | ||
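
The header test above is a plain substring match against the whole file. As a rough illustration, here is a sketch that applies the same match but collects every offending file instead of stopping at the first failed assertion. `files_without_license` is a hypothetical helper, not part of the PR; the two header strings are copied from the script.

```python
import tempfile
from pathlib import Path

# Header strings copied from the script in the diff.
HEADERS = [
    "Copyright 2025 Huawei Technologies Co., Ltd. All Rights Reserved.",
    "Copyright 2025 The TransferQueue Team",
]


def files_without_license(root: Path, headers: list[str]) -> list[Path]:
    """Return the .py files under root that contain none of the header strings."""
    return sorted(
        path
        for path in root.glob("**/*.py")
        if not any(h in path.read_text(encoding="utf-8") for h in headers)
    )


if __name__ == "__main__":
    # Build a throwaway tree with one compliant and one non-compliant file.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "ok.py").write_text(f"# {HEADERS[1]}\n", encoding="utf-8")
        (root / "bad.py").write_text("print('hi')\n", encoding="utf-8")
        for path in files_without_license(root, HEADERS):
            print(f"file {path} does not contain license")
```

Reporting all failures in one pass is a design choice of this sketch; the merged script fails on the first file it finds without a header.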
The workflow is missing a permissions section. Following security best practices, GitHub Actions workflows should explicitly define permissions. Consider adding a permissions section that specifies only the minimum required permissions. For this DCO check workflow, you likely need "pull-requests: read" and "contents: read".