This is part of my Polyglot Code tools - for the main documentation, see https://polyglot.korny.info
Binary releases are working again - see https://github.com/kornysietsma/polyglot-code-scanner/releases.
However, there are no binaries for M1 Macs - GitHub Actions doesn't yet support M1 Macs for free, so you'll have to build binaries yourself for now.
On a Mac you also need to run xattr -d com.apple.quarantine polyglot-code-scanner-x86_64-macos to remove the quarantine attribute that macOS adds to all downloaded binaries.
This application scans source code directories, identifying a range of code metrics and other data, and storing the results in a JSON file for later visualisation by the Polyglot Code Explorer.
See also https://polyglot.korny.info/tools/scanner/howto for more detailed instructions for building binary releases, and running the scanner.
To compile and run from source, you'll need to install Rust and Cargo. Then, from a copy of this project, you can build a release binary with:

    cargo build --release

The binary will be built in the target/release directory.
You can also just run it from the source directory with cargo run polyglot_code_scanner -- (other command line arguments) - the scanner itself will run slower this way, as the code is unoptimised and carries more debug information, but the build is a lot faster, which suits development.
See https://polyglot.korny.info for the main documentation for this project.
You can get up-to-date command-line help by running:

    polyglot_code_scanner -h

The scanner does not currently support scanning git repositories that use SHA-256 object hashes (created with git init --object-format=sha256). This is due to libgit2 not yet supporting SHA-256 repositories. When libgit2 adds this support, the scanner can be updated to use it.
Files that git ignores via .gitignore are not scanned.
You can also manually add .polyglot_code_scanner_ignore files anywhere in the codebase, to list extra files to be ignored - the syntax is the same as .gitignore's.
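
As a rough illustration, a .polyglot_code_scanner_ignore file might look like the fragment below. The entries are hypothetical and only use basic .gitignore pattern syntax; adjust them to whatever you want the scanner to skip.

```gitignore
# Hypothetical entries, written in .gitignore pattern syntax.
# Skip any directory named "generated" at or below this directory.
generated/
# Skip minified JavaScript files.
*.min.js
# Skip one specific directory, relative to the location of this ignore file.
/docs/archive/
```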
Run polyglot_code_scanner -h for the full list of options; these are just the main ones:

    USAGE:
        polyglot_code_scanner [OPTIONS] --name <NAME> [ROOT]

    ARGS:
        <ROOT>    Root directory, current dir if not present

    OPTIONS:
        -h, --help
                Print help information

        -n, --name <NAME>
                project name - identifies the selected data for display and state storage

            --id <ID>
                data file ID - used to identify unique data files for browser storage, generates a UUID
                if not specified

        -o, --output <OUTPUT>
                Output file, stdout if not present, or not used if sending to web server

            --no-git
                Do not scan for git repositories

            --years <GIT_YEARS>
                how many years of git history to parse - default only scan the last 3 years (from now,
                not git head) [default: 3]

            --prune-inactive-years <PRUNE_YEARS>
                Prune (remove) git repositories that haven't been active in the specified number of
                years. For example, `--prune-inactive-years 1` will remove all git repositories with
                no commits in the last year. Non-git content is always preserved.

        -c, --coupling
                include temporal coupling data

        -V, --version
                Print version information

The --prune-inactive-years flag lets you filter out dormant repositories from your analysis. This is useful for understanding active vs stale code in large workspaces.

    # Fetch 5 years of history, but only show repos active in the last year
    polyglot_code_scanner --name myproject --years 5 --prune-inactive-years 1 /path/to/workspace

- Git roots are atomic: If a repository has any commits within the window, the entire repository is kept, including old files
- Non-git content preserved: Directories without git history are always kept (even if --prune-inactive-years is specified)
- Date calculation: The cutoff is based on the latest commit date in each repository, compared against the current date
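
The three pruning rules above can be sketched in a few lines of Rust. This is only an illustration of the decision, not the scanner's actual code: the function names `keep_root` and `prune_roots` are made up for this example, and a year is approximated as 365 days, which is an assumption.

```rust
use std::time::{Duration, SystemTime};

/// Approximate length of a year in seconds (365 days). An assumption of this sketch.
const SECONDS_PER_YEAR: u64 = 365 * 24 * 60 * 60;

/// Hypothetical helper: decide whether one root directory stays in the output.
/// `latest_commit` is `None` for content that has no git history.
fn keep_root(latest_commit: Option<SystemTime>, now: SystemTime, prune_years: u64) -> bool {
    let window = Duration::from_secs(prune_years.saturating_mul(SECONDS_PER_YEAR));
    // The cutoff is measured back from the current date, not from the git head.
    let cutoff = now.checked_sub(window);
    match (latest_commit, cutoff) {
        // Non-git content is always preserved.
        (None, _) => true,
        // The window reaches further back than can be represented: keep everything.
        (Some(_), None) => true,
        // Git roots are atomic: one commit inside the window keeps the whole repository.
        (Some(latest), Some(cutoff)) => latest >= cutoff,
    }
}

/// Hypothetical helper: return the names of the roots that survive pruning.
fn prune_roots<'a>(
    roots: &[(&'a str, Option<SystemTime>)],
    now: SystemTime,
    prune_years: u64,
) -> Vec<&'a str> {
    roots
        .iter()
        .filter(|(_, latest)| keep_root(*latest, now, prune_years))
        .map(|(name, _)| *name)
        .collect()
}
```

The decision is made once per root rather than once per file, so a repository with a recent commit keeps all of its old files, and a directory without git history never reaches the date comparison at all.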
The --years and --prune-inactive-years parameters are independent:
- --years N: Controls how far back to fetch commit history for metrics and analysis
- --prune-inactive-years N: Controls what to keep in the final output
This separation lets you do things like:

    # Analyze 10 years of history, but drop repositories with no commits in the last year
    polyglot_code_scanner --name myproject --years 10 --prune-inactive-years 1 /path/to/workspace

To run a single named test from the command line:

    cargo test -- --nocapture renames_and_deletes_applied_across_history

The --nocapture flag tells Rust's test harness not to capture stdout/stderr, so you can add println! and eprintln! statements to help you debug.
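
For example, a test that prints diagnostics might look like the sketch below. The helper `file_extension` and the test name are hypothetical, and exist only so the test has something to check; they are not part of the scanner.

```rust
/// Hypothetical helper, present only so that the test has something to check.
fn file_extension(path: &str) -> Option<&str> {
    path.rsplit_once('.').map(|(_, ext)| ext)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn extension_is_detected() {
        let ext = file_extension("src/main.rs");
        // For a passing test, this output is hidden unless you run with `-- --nocapture`.
        println!("detected extension: {:?}", ext);
        eprintln!("stderr is captured by default as well");
        assert_eq!(ext, Some("rs"));
    }
}
```

Running cargo test -- --nocapture extension_is_detected would show both printed lines alongside the normal test report.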
To remove some extra noise and blank lines, pipe the output through grep:

    cargo test -- --nocapture renames_and_deletes_applied_across_history | grep -v "running 0 tests" | grep -v "0 passed" | grep -v -e '^\s*$'

Rust tests don't install a logger - normally you explicitly install loggers in your main, which tests don't use.
To install a logger using the fern crate, add the following to tests:

    use test_shared::*;

then

    install_test_logger();

This sets up a simple logger which sends logs to stdout - make sure you also use the --nocapture parameter mentioned earlier.
If you want better assertions, your tests need to explicitly use the pretty_assertions crate:

    use pretty_assertions::assert_eq;

Releasing uses cargo-release.
The basic process is:
- update the top CHANGELOG.md entry (under 'unreleased')
- commit and push changes
- release

    cargo release --dry-run

or for a minor change, e.g. 0.1.3 to 0.2.0:

    cargo release minor --dry-run

Copyright © 2019-2022 Kornelis Sietsma
Licensed under the Apache License, Version 2.0 - see LICENSE.txt for details