diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..b33552e
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,67 @@
+# Contributing to Revenant
+
+Thanks for your interest in improving Revenant. Contributions of all sizes are welcome:
+bug reports, fixes, new secret sources, and documentation.
+
+## Requirements
+
+- Go 1.22 or newer
+- [TruffleHog](https://github.com/trufflesecurity/trufflehog) on your `PATH` (used for
+  detection and verification, and by the opt-in integration test)
+
+## Building and testing
+
+```bash
+go build ./...
+go test ./...
+go vet ./...
+gofmt -l cmd/ internal/   # should print nothing
+```
+
+CI runs `gofmt`, `go vet`, and `go test -race` on every pull request, so run them locally first.
+
+The integration test that shells out to a real TruffleHog binary is build-tagged and excluded
+from the default run. Run it with:
+
+```bash
+go test -tags=integration ./internal/detect/
+```
+
+## Pull requests
+
+1. Create a feature branch off `main`.
+2. Keep each change focused and covered by tests. The codebase follows TDD; add a failing test
+   first, then the implementation.
+3. Make sure the full suite passes and the tree is gofmt-clean.
+4. Open a pull request describing the change and how you tested it.
+
+## Project layout
+
+Each package under `internal/` has one clear responsibility and a small interface:
+
+- `target`, `enumerate` resolve what to scan (repos, members, gists).
+- `discover` finds deleted and force-pushed commits.
+- `fetch` retrieves commit and file contents.
+- `detect`, `validate` wrap TruffleHog for detection and live verification.
+- `dork` searches GitHub by code-search dork.
+- `correlate` deduplicates and ranks findings; `report` renders them.
+- `scan` ties per-repo sources together; `keyintel` analyzes verified keys.
+- `parallel` is the bounded worker pool; `githubclient` is the rate-limit-aware HTTP client.
+
+New scan sources generally implement an existing interface (`scan.RepoScanner`,
+`discover.Discoverer`, `keyintel.Analyzer`) and flow into the same correlate and report stages.
+
+## Test fixtures and credentials
+
+Use only synthetic or revoked credentials in test fixtures. Never commit a live secret.
+
+## Responsible use
+
+Revenant is for authorized security testing and assets you own or are permitted to test. If a
+contribution involves scanning real targets, follow responsible disclosure and the target's
+terms of service.
+
+## License
+
+By contributing, you agree that your contributions are licensed under the project's
+[GNU GPL v3](LICENSE).
diff --git a/README.md b/README.md
index cc3be10..be7d0fd 100644
--- a/README.md
+++ b/README.md
@@ -158,7 +158,8 @@ target -> discover -> fetch -> detect -> validate -> correlate -> report
   built yet (the parser exists behind an interface for it).
 - Repos, gists, and dork hits scan concurrently (default 8 workers; tune with `--concurrency`).
   Brute-force probing within a single repo is still serial, since it is a rate-limited, opt-in
-  fallback.
+  fallback. On a single token, high concurrency can cause GitHub to throttle and occasionally
+  skip requests; pass several `--tokens` or lower `--concurrency` for maximum completeness.
 - Brute-force is a slow, opt-in fallback. It is rate-limited by GitHub and capped per repo. Prefer
   a token and the activity tier.
 - Live-key intelligence covers GitHub tokens. Other key types are not analyzed yet.