Description
Currently, sphinx-codelinks can optionally use the .gitignore file at the source root (src_dir) to exclude files during source discovery. Several enhancements would make this feature more robust and flexible.
Current Behavior
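When gitignore = true is set in the configuration, the tool uses the .gitignore file at the source root to filter out files. This covers basic cases, but nested .gitignore files, the global gitignore, and .git/info/exclude are not considered, and patterns cannot be supplied from the configuration itself.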
Requested Enhancements
- Support nested .gitignore files: Git supports .gitignore files in subdirectories, which apply to files within that directory and its children. Currently, only the root .gitignore is considered.
- Support .gitignore patterns in configuration: Allow users to specify gitignore-style patterns directly in the codelinks.toml configuration without needing an actual .gitignore file. This is useful for:
  - CI/CD environments where the .gitignore might not be available
  - Excluding files that shouldn't be in .gitignore but should be excluded from traceability
  - Bazel builds where the source tree structure differs from the git repository
- Support global gitignore: Respect the global gitignore file (the path configured via core.excludesFile, commonly ~/.gitignore_global; when unset, Git falls back to $XDG_CONFIG_HOME/git/ignore or ~/.config/git/ignore)
- Support .git/info/exclude: This file contains patterns specific to the local repository that aren't shared via .gitignore
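To make the four sources above concrete, here is a minimal sketch of how the ignore files that apply to a single source file could be collected. The function names (global_excludes_path, ignore_sources) are hypothetical and not part of sphinx-codelinks; the ordering follows the precedence described in the gitignore documentation (global file lowest, then .git/info/exclude, then .gitignore files from the root down to the file's directory). It only locates files and does not parse patterns.

```python
from __future__ import annotations

from pathlib import Path
from typing import Mapping


def global_excludes_path(
    configured: str | None, env: Mapping[str, str], home: Path
) -> Path:
    """Resolve the global ignore file.

    `configured` is the value of core.excludesFile (None if unset).
    When unset, Git uses $XDG_CONFIG_HOME/git/ignore, or
    $HOME/.config/git/ignore if XDG_CONFIG_HOME is unset or empty.
    """
    if configured:
        # Git expands a leading "~/" in path-valued config options.
        if configured.startswith("~/"):
            return home / configured[2:]
        return Path(configured)
    xdg = env.get("XDG_CONFIG_HOME") or str(home / ".config")
    return Path(xdg) / "git" / "ignore"


def ignore_sources(
    repo_root: Path, file_path: Path, global_file: Path | None = None
) -> list[Path]:
    """Return the existing ignore files that apply to file_path.

    The list is ordered from lowest to highest precedence.
    Raises ValueError if file_path is not located under repo_root.
    """
    sources: list[Path] = []
    if global_file is not None and global_file.is_file():
        sources.append(global_file)

    info_exclude = repo_root / ".git" / "info" / "exclude"
    if info_exclude.is_file():
        sources.append(info_exclude)

    # Walk from the repository root down to the directory containing the file.
    directory = repo_root
    directories = [directory]
    for part in file_path.parent.relative_to(repo_root).parts:
        directory = directory / part
        directories.append(directory)

    for directory in directories:
        candidate = directory / ".gitignore"
        if candidate.is_file():
            sources.append(candidate)
    return sources
```

A real implementation would still need to evaluate each file's patterns relative to the directory that contains it, so that a nested .gitignore only affects its own subtree.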
Proposed Configuration
[codelinks.projects.my_project.source_discover]
src_dir = "./src"
# Existing option - use .gitignore at src_dir root
gitignore = true
# NEW: Use nested .gitignore files in subdirectories
nested_gitignore = true
# NEW: Additional gitignore-style patterns (applied after .gitignore)
gitignore_patterns = [
    "**/__pycache__/",
    "**/node_modules/",
    "*.generated.py",
    "vendor/**",
]
# NEW: Path to a custom gitignore file (useful for Bazel builds)
# Open question: is this an actual use case? Not required for the initial implementation.
gitignore_file = "/path/to/custom/.gitignore"
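As a rough illustration of how the proposed options might be read and validated, the sketch below maps an already-parsed source_discover table (for example, the dict produced by a TOML parser) onto a small dataclass. GitignoreOptions and from_table are hypothetical names, and the defaults shown are assumptions made for the example, not the project's actual defaults.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Mapping


@dataclass
class GitignoreOptions:
    """Hypothetical container for the gitignore-related options proposed above."""

    gitignore: bool = True  # assumed default, not confirmed by the project
    nested_gitignore: bool = False  # assumed default
    gitignore_patterns: list[str] = field(default_factory=list)
    gitignore_file: str | None = None

    @classmethod
    def from_table(cls, table: Mapping[str, Any]) -> "GitignoreOptions":
        """Build options from a parsed source_discover table, ignoring other keys."""
        gitignore = table.get("gitignore", True)
        nested = table.get("nested_gitignore", False)
        patterns = table.get("gitignore_patterns", [])
        custom_file = table.get("gitignore_file")

        if not isinstance(gitignore, bool) or not isinstance(nested, bool):
            raise TypeError("gitignore and nested_gitignore must be booleans")
        # Reject a bare string, which would otherwise be iterated character by character.
        if not isinstance(patterns, list) or not all(
            isinstance(p, str) for p in patterns
        ):
            raise TypeError("gitignore_patterns must be a list of strings")
        if custom_file is not None and not isinstance(custom_file, str):
            raise TypeError("gitignore_file must be a string path")

        return cls(
            gitignore=gitignore,
            nested_gitignore=nested,
            gitignore_patterns=list(patterns),
            gitignore_file=custom_file,
        )
```

Validating the types up front keeps a misconfigured gitignore_patterns value from silently turning into a set of single-character patterns.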
Use Cases
- Monorepo with multiple projects: Each project directory has its own .gitignore with project-specific patterns
- Generated code exclusion: Exclude auto-generated files that match certain patterns without modifying .gitignore
- [optional] Bazel/Buck builds: Specify a custom gitignore file path or inline patterns when the build sandbox doesn't have access to the original .gitignore
- CI/CD pipelines: Exclude test fixtures, mock data, or vendored dependencies from traceability analysis
Technical Details
File: src/sphinx_codelinks/source_discover/source_discover.py
The current implementation uses the gitignore_parser library to parse the .gitignore file:
from pathlib import Path

from gitignore_parser import parse_gitignore

# Excerpt (runs inside a method, hence `self`): only the .gitignore at the src_dir root is checked
if self.src_discover_config.gitignore:
    gitignore_path = Path(self.src_discover_config.src_dir) / ".gitignore"
    if gitignore_path.exists():
        self.gitignore_matcher = parse_gitignore(gitignore_path)
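Because the current code parses a file on disk, one low-effort way to support inline patterns might be to write them to a gitignore-format file and reuse the same parsing path. The helper below is a sketch under that assumption; write_patterns_file and the file name inline.gitignore are hypothetical and use only the standard library.

```python
from __future__ import annotations

import tempfile
from pathlib import Path
from typing import Iterable


def write_patterns_file(
    patterns: Iterable[str], directory: Path | None = None
) -> Path:
    """Write gitignore-style patterns, one per line, and return the file path.

    Empty entries are skipped. If no directory is given, a fresh temporary
    directory is created; the caller is responsible for cleaning it up.
    """
    lines = []
    for pattern in patterns:
        if "\n" in pattern:
            raise ValueError(f"pattern must be a single line: {pattern!r}")
        if pattern:
            lines.append(pattern)

    target_dir = Path(directory) if directory is not None else Path(tempfile.mkdtemp())
    path = target_dir / "inline.gitignore"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

The returned path could then be handed to parse_gitignore, presumably with its base_dir argument set to src_dir so that the patterns are evaluated relative to the source root rather than to the temporary directory.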
Acceptance Criteria
- gitignore_patterns configuration option for inline patterns
- gitignore_file configuration option for custom gitignore file path
- nested_gitignore option to recursively apply .gitignore files in subdirectories
- gitignore = true/false option
Related
- Current gitignore implementation: src/sphinx_codelinks/source_discover/source_discover.py
- Configuration: src/sphinx_codelinks/source_discover/config.py
- Tests: tests/test_source_discover.py
Labels
enhancement
source-discover
configuration
Priority
Medium - This is a quality-of-life improvement that would benefit users with complex repository structures or non-standard build systems.