A Rust command-line tool for automatic validation and correction of file encodings in projects, designed to run as a pre-commit hook via Husky.
- Automatic encoding detection: Detects each file's current encoding using chardetng
- Smart conversion: Rewrites a file only when its encoding differs from the target, avoiding needless changes
- Flexible configuration: Supports configuration file with glob patterns
- Optimized performance: Parallel processing using Rayon
- Git integration: Processes only staged files or the entire project
- Interactive interface: Prompts for a default encoding when no configuration file exists
- Respects .gitignore: Automatically ignores files listed in .gitignore
- Rust 1.70+ (install via rustup)
- Git (for pre-commit integration)
- Node.js and npm (optional, for Husky integration)
# Clone or copy the project
git clone <your-repository>
cd rs-precommit-fix-encode
# Run the installation script
chmod +x install.sh
./install.sh

The installation script will:
- Compile the project in release mode
- Automatically configure Husky (if Node.js project detected)
- Create the pre-commit hook
# Compile
cargo build --release
# Binary will be at: ./target/release/rsfe

Create an rsfe.conf file in your project root. Format:
# Comments start with #
<glob-pattern> <ENCODING>
# Example:
** UTF-8 # Default for all files
**/*.js UTF-8 # JavaScript files
**/*.py UTF-8 # Python files
legacy/** AUTO # Auto-detection for legacy folder
old-data/**/*.txt WINDOWS-1252 # Specific encoding
- UTF-8 (recommended default)
- UTF-16LE (UTF-16 Little Endian)
- UTF-16BE (UTF-16 Big Endian)
- ISO-8859-1 (Latin-1)
- WINDOWS-1252 (CP1252)
- AUTO (automatic detection)
- Any encoding supported by the encoding_rs library
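The configuration format described above (one glob pattern and one encoding per line, with `#` starting a comment) can be read with a few lines of standard-library Rust. The following is a minimal sketch, not the tool's actual parser: the `Rule` struct and the `parse_rule` and `parse_config` functions are hypothetical names, and it assumes that patterns never contain `#` or whitespace.

```rust
/// One rule from rsfe.conf: a glob pattern and the encoding it maps to.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq)]
struct Rule {
    pattern: String,
    encoding: String,
}

/// Parses a single config line.
/// Returns None for blank lines, comment-only lines, and lines that
/// do not contain exactly two whitespace-separated fields.
fn parse_rule(line: &str) -> Option<Rule> {
    // Drop everything from the first '#'. Assumption: patterns never contain '#'.
    let content = line.split('#').next().unwrap_or("").trim();
    let mut fields = content.split_whitespace();
    let pattern = fields.next()?;
    let encoding = fields.next()?;
    if fields.next().is_some() {
        return None; // more than two fields: treat as malformed
    }
    Some(Rule {
        pattern: pattern.to_string(),
        // Normalize case so "utf-8" and "UTF-8" compare equal.
        encoding: encoding.to_ascii_uppercase(),
    })
}

/// Parses a whole config file, silently skipping lines that yield no rule.
fn parse_config(text: &str) -> Vec<Rule> {
    text.lines().filter_map(parse_rule).collect()
}
```

The sketch deliberately stops at parsing. How overlapping patterns such as `**` and `legacy/**` are prioritized is not stated here, so rule matching is left out.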
Copy the example file:
cp rsfe.conf.example rsfe.conf

Edit as needed for your project.
# Process entire project
./target/release/rsfe
# If no rsfe.conf exists, you'll be prompted to choose the default encoding

# Stage files
git add .
# Run rsfe (will process only staged files)
./target/release/rsfe

If you installed via install.sh, the hook is already configured. Each commit will:
- Run rsfe on staged files
- Convert encodings if necessary
- Re-stage modified files
- Proceed with the commit
# Normal git usage
git add .
git commit -m "My message"
# rsfe will run automatically

If not using Husky, add to .git/hooks/pre-commit:
#!/bin/sh
# Run rsfe
./target/release/rsfe
# If it fails, cancel the commit
if [ $? -ne 0 ]; then
  echo "❌ rsfe failed. Fix errors before committing."
  exit 1
fi
# Re-stage modified files
git add -u
exit 0

Make the hook executable:
chmod +x .git/hooks/pre-commit

- Load configuration: Reads rsfe.conf or prompts for default encoding
- Collect files:
  - If in git repository with staged files: process only staged files
  - Otherwise: process entire project
- Filter files: Ignore binaries and respect .gitignore
- Process in parallel: Uses Rayon for multi-threaded processing
- For each file:
  - Determine target encoding based on rules
  - Detect current encoding
  - Check if conversion is needed
  - Convert if necessary
- Report results: Show how many files were converted
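The "Collect files" step can be illustrated by asking git for the staged paths. This is a rough sketch under stated assumptions, not the code in src/main.rs: the function names `parse_staged_output` and `staged_files` are hypothetical, and the choice of `git diff --cached --name-only --diff-filter=ACMR -z` (added, copied, modified, and renamed files, NUL-separated) is an assumption about how staged files might be listed.

```rust
use std::path::PathBuf;
use std::process::Command;

/// Splits NUL-separated `git diff -z` output into paths.
/// Non-UTF-8 path bytes are replaced lossily, which is a simplification.
fn parse_staged_output(output: &[u8]) -> Vec<PathBuf> {
    output
        .split(|byte| *byte == 0)
        .filter(|entry| !entry.is_empty())
        .map(|entry| PathBuf::from(String::from_utf8_lossy(entry).into_owned()))
        .collect()
}

/// Returns the staged paths, or None when git cannot be run or exits
/// with an error (for example, outside a repository).
fn staged_files() -> Option<Vec<PathBuf>> {
    let output = Command::new("git")
        .args(["diff", "--cached", "--name-only", "--diff-filter=ACMR", "-z"])
        .output()
        .ok()?;
    if !output.status.success() {
        return None;
    }
    Some(parse_staged_output(&output.stdout))
}
```

A caller could fall back to walking the whole project when `staged_files()` returns `None` or an empty list, which mirrors the "Otherwise: process entire project" branch.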
RSFE only converts when truly necessary, by checking:
- If there are errors when decoding with the target encoding
- If re-encoding produces different content than the original
This avoids unnecessary commits of files already in the correct encoding.
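The two checks can be sketched for a UTF-8 target using only the standard library. This is an illustration, not RSFE's implementation, which relies on encoding_rs. The names `decode_utf8` and `needs_conversion` are hypothetical, and the sketch assumes the decoder strips a leading UTF-8 byte order mark, so a BOM-prefixed file fails the round-trip comparison.

```rust
/// UTF-8 byte order mark.
const UTF8_BOM: &[u8] = &[0xEF, 0xBB, 0xBF];

/// Decodes bytes as UTF-8, dropping a leading BOM if present.
/// Returns None when the bytes are not valid UTF-8.
/// Assumption: BOM stripping is part of decoding in this sketch.
fn decode_utf8(bytes: &[u8]) -> Option<&str> {
    let body = bytes.strip_prefix(UTF8_BOM).unwrap_or(bytes);
    std::str::from_utf8(body).ok()
}

/// Returns true when the file content would have to be rewritten.
fn needs_conversion(bytes: &[u8]) -> bool {
    match decode_utf8(bytes) {
        // Check 1: decoding with the target encoding produced errors.
        None => true,
        // Check 2: re-encoding yields different bytes than the original.
        Some(text) => text.as_bytes() != bytes,
    }
}
```

With these definitions, a plain UTF-8 file is left untouched, while a Latin-1 file containing `é` (byte `0xE9`) or a UTF-8 file with a BOM is flagged for conversion.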
RSFE is optimized for performance:
- Parallel processing: Uses all CPU cores via Rayon
- Smart reading: Automatically ignores binary files
- Detection cache: chardetng is efficient for encoding detection
- Conditional conversion: Only converts when truly necessary
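To show the shape of the parallel step without pulling in a dependency, the sketch below uses `std::thread::scope` as a stand-in for Rayon. It is not how RSFE is written: with Rayon the same idea is usually expressed as a parallel iterator. The function name `count_converted` and the `process` callback (which returns true when a file was converted) are hypothetical.

```rust
use std::thread;

/// Applies `process` to every item, spread over up to `workers` threads,
/// and returns how many calls returned true.
fn count_converted<T, F>(items: &[T], workers: usize, process: F) -> usize
where
    T: Sync,
    F: Fn(&T) -> bool + Sync,
{
    if items.is_empty() {
        return 0;
    }
    let workers = workers.max(1);
    // Ceiling division so every item lands in exactly one chunk.
    let chunk_size = (items.len() + workers - 1) / workers;
    let process = &process;
    thread::scope(|scope| {
        // Spawn one scoped thread per chunk; each counts its own conversions.
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().filter(|item| process(*item)).count())
            })
            .collect();
        // Join all workers and add up the per-chunk counts.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<usize>()
    })
}
```

Static chunking like this is simpler than Rayon's work stealing but balances poorly when a few files are much larger than the rest, which is one reason a work-stealing pool suits this kind of workload.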
Typical benchmark (hardware dependent):
- ~10,000 files processed in ~2-5 seconds
- Each file that actually needs conversion adds ~1-2 ms
By default, RSFE ignores:
- node_modules
- target
- dist
- build
- .git
- .idea
- .vscode
- Any directory in .gitignore
- Images: .png, .jpg, .jpeg, .gif
- Documents: .pdf
- Compressed files: .zip, .tar, .gz
- Executables: .exe, .dll, .so, .dylib
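The default ignore list above can be expressed as a small path filter. This is a sketch only: the constant and function names are hypothetical, it covers just the built-in directories and extensions (not .gitignore handling), and the case-insensitive extension match is an assumption.

```rust
use std::path::{Component, Path};

/// Directory names skipped by default (taken from the list above).
const IGNORED_DIRS: [&str; 7] = [
    "node_modules", "target", "dist", "build", ".git", ".idea", ".vscode",
];

/// File extensions treated as binary (taken from the list above).
const BINARY_EXTENSIONS: [&str; 12] = [
    "png", "jpg", "jpeg", "gif", "pdf", "zip", "tar", "gz", "exe", "dll", "so", "dylib",
];

/// Returns true when a file path should be skipped.
fn is_ignored(path: &Path) -> bool {
    // Only the parent directories are checked, so a regular file that
    // happens to be named "target" is not skipped.
    let in_ignored_dir = path.parent().map_or(false, |dir| {
        dir.components().any(|component| match component {
            Component::Normal(name) => name
                .to_str()
                .map_or(false, |name| IGNORED_DIRS.contains(&name)),
            _ => false,
        })
    });
    // Assumption: extensions are compared case-insensitively.
    let has_binary_extension = path
        .extension()
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| {
            BINARY_EXTENSIONS.contains(&ext.to_ascii_lowercase().as_str())
        });
    in_ignored_dir || has_binary_extension
}
```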
# rsfe.conf
** UTF-8
**/*.js UTF-8
**/*.ts UTF-8
**/*.jsx UTF-8
**/*.tsx UTF-8
**/*.json UTF-8
# rsfe.conf
** UTF-8
**/*.py UTF-8
**/*.md UTF-8
requirements.txt UTF-8
# rsfe.conf
** UTF-8
legacy/** AUTO # Auto-detect
docs/old/*.txt WINDOWS-1252
Install Rust via rustup:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Check Rust version:
rustc --version # Should be 1.70+

Update if necessary:
rustup update

Add the extension to the ignore list at src/main.rs:251.
Check the encoding_rs documentation for supported encodings.
rs-precommit-fix-encode/
├── src/
│ └── main.rs # Main code
├── Cargo.toml # Dependencies
├── rsfe.conf.example # Configuration example
├── install.sh # Installation script
├── .husky/
│ └── pre-commit # Husky hook
├── README.md # English documentation
└── LEIAME.md # Portuguese documentation
cargo run
cargo test
cargo build --release --target x86_64-unknown-linux-musl # Static Linux binary

- Fork the project
- Create a branch for your feature (git checkout -b feature/MyFeature)
- Commit your changes (git commit -m 'Add MyFeature')
- Push to the branch (git push origin feature/MyFeature)
- Open a Pull Request
MIT License - see the LICENSE file for details.
Created to ensure encoding consistency in projects and avoid special character issues in commits.
- encoding_rs - Encoding library
- chardetng - Encoding detection
- Rayon - Parallelism
- Husky - Git hooks