A clean, efficient, and user-friendly duplicate file finder and remover application built for Windows. Scan directories, identify duplicate files by content hash, and safely remove them with an intuitive graphical interface.
- About This App
- Why It Was Created
- Why Windows-Only
- Features
- Specifications
- Requirements
- Installation
- How to Use
- Screenshots
- Security Policy
- Contributing
- License
- Tags & Keywords
- Acknowledgments
Duplicate File Remover is a desktop application that helps users reclaim disk space by identifying and removing duplicate files. Unlike simple filename-based duplicate finders, this tool uses cryptographic hash algorithms (MD5, SHA1, SHA256) to compare actual file contents, so files are reported as duplicates only when their contents produce the same hash. Accidental false matches are extremely unlikely, though no hash comparison can guarantee literally 100% accuracy.
- Content-Based Detection: Uses file hashing to find true duplicates, not just files with the same name
- Safe Deletion: Option to move files to Recycle Bin instead of permanent deletion
- Smart Selection: Automatically select duplicates while keeping the original file
- Real-Time Progress: Visual progress bar with time estimation during scans
- Customizable Filters: Exclude patterns, file extensions, hidden/system files
- Human-Readable Sizes: Enter sizes as "100MB", "1.5GB" instead of raw bytes
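The human-readable size input in the last highlight could be parsed along the following lines. This is a minimal sketch, not the app's actual code: the function name `parse_size` is hypothetical, and the use of binary multiples (1 KB = 1024 bytes) is an assumption.

```python
import re

# Assumption: binary multiples (1 KB = 1024 bytes). The app may use decimal units.
_UNITS = {
    "B": 1,
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
    "PB": 1024 ** 5,
}
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMGTP]?B)?\s*$", re.IGNORECASE)


def parse_size(text: str) -> int:
    """Convert input such as '100MB', '1.5GB', or '2048' to a byte count."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"Unrecognized size: {text!r}")
    number, unit = match.groups()
    # A bare number is treated as raw bytes.
    return int(float(number) * _UNITS[(unit or "B").upper()])
```

Rejecting unrecognized input with `ValueError` lets a settings dialog show an error instead of silently falling back to a default.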
This application was born out of a common frustration: managing storage space filled with duplicate files accumulated over years of backups, downloads, and file transfers. Existing solutions were either:
- Too expensive - Many professional duplicate finders require paid licenses
- Too complicated - Command-line tools with steep learning curves
- Too limited - Basic tools that only compare filenames
- Too risky - Tools that permanently delete files without Recycle Bin option
Duplicate File Remover bridges this gap by providing a free, open-source, user-friendly solution that balances power with simplicity. It's designed for everyday users who want to clean up their drives without risking accidental data loss.
- Cleaning up photo libraries with duplicate images
- Organizing music collections with duplicate tracks
- Freeing space on external drives and USB sticks
- Consolidating backup folders
- Preparing drives for migration or archival
This application is specifically designed for Windows for the following reasons:
- Windows Explorer Integration
  - Uses the `explorer /select,` command to open folders with files highlighted
  - Native Windows file operations for best compatibility
- Windows API Integration
  - Uses `ctypes` and `ctypes.windll.kernel32` for:
    - Detecting hidden files (FILE_ATTRIBUTE_HIDDEN)
    - Detecting system files (FILE_ATTRIBUTE_SYSTEM)
    - Extended path support (`\\?\` prefix for long paths)
    - Direct file deletion via Windows API for special characters
- Recycle Bin Support
  - Integrates with Windows Recycle Bin via the `send2trash` library
  - Uses Windows-specific APIs for safe file removal
- Path Handling
  - Handles Windows path conventions (backslashes, drive letters)
  - Supports UNC paths (`\\server\share`)
  - Extended path support for paths > 260 characters
- High DPI Awareness
  - Uses `SetProcessDpiAwareness` for crisp display on modern displays
While the current version is Windows-only, the core duplicate-finding logic is platform-agnostic. A cross-platform version could be developed by abstracting the Windows-specific components.
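As an illustration of the Windows API integration listed above, hidden and system files can be detected by reading a file's attribute bitmask through `kernel32`. This is a sketch under stated assumptions, not the app's actual code: the helper names `read_attributes` and `should_skip` are hypothetical, while `GetFileAttributesW` and the attribute constants are part of the Win32 API.

```python
import ctypes
import sys

# Win32 attribute flags and the failure sentinel returned by GetFileAttributesW.
FILE_ATTRIBUTE_HIDDEN = 0x2
FILE_ATTRIBUTE_SYSTEM = 0x4
INVALID_FILE_ATTRIBUTES = 0xFFFFFFFF


def read_attributes(path: str) -> int:
    """Return the Win32 attribute bitmask, or INVALID_FILE_ATTRIBUTES on failure."""
    if sys.platform != "win32":
        return INVALID_FILE_ATTRIBUTES  # the API only exists on Windows
    get_attrs = ctypes.windll.kernel32.GetFileAttributesW
    get_attrs.argtypes = [ctypes.c_wchar_p]
    get_attrs.restype = ctypes.c_uint32  # DWORD, so the sentinel compares equal
    return get_attrs(path)


def should_skip(attrs: int, skip_hidden: bool, skip_system: bool) -> bool:
    """Decide whether a file is excluded by the hidden/system settings."""
    if attrs == INVALID_FILE_ATTRIBUTES:
        return False  # attributes unknown: leave the decision to other filters
    if skip_hidden and attrs & FILE_ATTRIBUTE_HIDDEN:
        return True
    if skip_system and attrs & FILE_ATTRIBUTE_SYSTEM:
        return True
    return False
```

Keeping the bitmask test separate from the API call is also one way to isolate the Windows-specific part if a cross-platform version is attempted.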
- Fast Scanning - Two-phase scanning (size grouping → hash comparison)
- Multiple Hash Algorithms - MD5, SHA1, SHA256 support
- Safe Deletion - Move to Recycle Bin or permanent delete
- Batch Operations - Delete multiple files at once
- Smart Selection - Auto-select duplicates, keep originals
- File Size Limits - Set minimum/maximum file size filters
- Extension Filters - Include only specific file types
- Pattern Exclusion - Exclude files matching patterns (e.g., `*.tmp`)
- Hidden/System File Handling - Option to skip hidden/system files
- Time Estimation - Realistic ETA based on moving average
- Detailed Logging - Timestamped operation log
- Modern Interface - Clean, intuitive tkinter GUI
- Progress Tracking - Visual progress bar with time remaining
- Sortable Results - Treeview with sortable columns
- Open Location - Jump directly to file location in Explorer
- Settings Persistence - Save preferences between sessions
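The two-phase scan from the first feature above (size grouping, then hash comparison) can be sketched as follows. This is an illustrative outline rather than the app's implementation: the function names are hypothetical, and the 64 KB chunk size is taken from the specifications table.

```python
import hashlib
import os
from collections import defaultdict

CHUNK_SIZE = 64 * 1024  # 65,536 bytes per read


def hash_file(path, algorithm="md5"):
    """Hash a file in fixed-size chunks so memory use stays flat."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, algorithm="md5"):
    """Return groups (lists of paths) whose contents hash identically."""
    # Phase 1: group by size. A file with a unique size cannot have a duplicate.
    by_size = defaultdict(list)
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable entry: skip it

    # Phase 2: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            try:
                by_hash[(size, hash_file(path, algorithm))].append(path)
            except OSError:
                continue  # locked or vanished file: skip it
    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

The size pass is cheap because it reads only file metadata, so the expensive content reads are limited to files that could plausibly match.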
| Specification | Details |
|---|---|
| Language | Python 3.7+ |
| GUI Framework | tkinter (built-in) |
| Hash Algorithms | MD5, SHA1, SHA256 |
| Chunk Size | 64 KB (65,536 bytes) for hash calculation |
| Supported File Sizes | Unlimited (tested up to several GB) |
| Settings Storage | JSON file (duplicate_remover_settings.json) |
| Log Format | In-app timestamped log |
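For the JSON settings storage listed in the table above, loading and saving might look like this. It is a sketch only: the key names in `DEFAULTS` are hypothetical (the real file layout is not documented here), although the default values mirror the defaults described in this README.

```python
import json
import os

SETTINGS_FILE = "duplicate_remover_settings.json"

# Hypothetical keys; values follow the documented defaults.
DEFAULTS = {
    "hash_algorithm": "md5",
    "use_recycle_bin": True,
    "skip_system_files": True,
}


def load_settings(path=SETTINGS_FILE):
    """Return stored settings merged over defaults; fall back on any error."""
    settings = dict(DEFAULTS)
    try:
        with open(path, "r", encoding="utf-8") as handle:
            stored = json.load(handle)
    except (OSError, json.JSONDecodeError):
        return settings  # missing or corrupted file: use defaults
    if isinstance(stored, dict):
        settings.update(stored)
    return settings


def save_settings(settings, path=SETTINGS_FILE):
    """Write to a temporary file first so a crash cannot truncate the settings."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(settings, handle, indent=2)
    os.replace(tmp_path, path)
```

Merging stored values over defaults means settings added in a later version still get sensible values when an older file is loaded.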
| Metric | Performance |
|---|---|
| Scan Speed | ~100-500 files/second (depends on file size) |
| Hash Speed | ~50-200 MB/second (depends on storage speed) |
| Memory Usage | Low - uses streaming hash calculation |
| CPU Usage | Minimal - single-threaded with batch processing |
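Because throughput varies with file size and storage speed, as the ranges above suggest, a time estimate is steadier when based on recent batches instead of the overall average. The following is one possible moving-average estimator; the class name, method names, and window size are assumptions, not the app's actual design.

```python
from collections import deque


class EtaEstimator:
    """Estimate remaining time from the most recent processing batches."""

    def __init__(self, window=20):
        # Each sample is (seconds_spent, items_processed) for one batch.
        self._samples = deque(maxlen=window)

    def record(self, seconds, items):
        """Store one finished batch; empty or instantaneous batches are ignored."""
        if seconds > 0 and items > 0:
            self._samples.append((seconds, items))

    def remaining_seconds(self, items_left):
        """Return the estimated seconds left, or None before any data exists."""
        total_time = sum(seconds for seconds, _ in self._samples)
        total_items = sum(items for _, items in self._samples)
        if total_items == 0:
            return None
        return items_left * total_time / total_items
```

A bounded `deque` drops the oldest batch automatically, so the estimate follows the current speed without growing in memory.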
| Feature | Support |
|---|---|
| Long Paths | Yes (>260 characters with \\?\ prefix) |
| Special Characters | Yes (parentheses, spaces, Unicode) |
| Network Drives | Yes (UNC paths supported) |
| External Drives | Yes (USB, external HDD/SSD) |
| Hidden Files | Optional (configurable) |
| System Files | Optional (configurable, default: skip) |
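The long-path support in the table above relies on the Windows extended-length prefix. A minimal sketch of adding that prefix is shown below; the function name `to_extended_path` is hypothetical, and the input is assumed to be an absolute, already normalized path, since Windows does not resolve `.` or `..` in extended-length paths.

```python
EXTENDED_PREFIX = "\\\\?\\"        # renders as \\?\ followed by the path
UNC_PREFIX = "\\\\?\\UNC\\"        # extended form used for network shares


def to_extended_path(path: str) -> str:
    """Prefix an absolute Windows path so it may exceed 260 characters."""
    path = path.replace("/", "\\")  # extended paths require backslashes
    if path.startswith(EXTENDED_PREFIX):
        return path                 # already prefixed
    if path.startswith("\\\\"):
        # UNC share: drop the two leading backslashes and add the UNC prefix.
        return UNC_PREFIX + path[2:]
    return EXTENDED_PREFIX + path
```

For example, a drive path such as `C:\data\file.txt` gains the plain prefix, while a share path such as `\\server\share\file.txt` is rewritten to start with `\\?\UNC\server\share`.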
| Requirement | Minimum | Recommended |
|---|---|---|
| OS | Windows 7 SP1 | Windows 10/11 |
| Python | 3.7 | 3.10+ |
| RAM | 512 MB | 2 GB |
| Disk Space | 50 MB | 100 MB |
| Display | 1024x768 | 1920x1080 |
- tkinter (built-in)
- threading (built-in)
- hashlib (built-in)
- os, sys (built-in)
- json (built-in)
- fnmatch (built-in)
- subprocess (built-in)
- ctypes (built-in)
- send2trash (optional, for Recycle Bin support)
| Package | Purpose | Install Command |
|---|---|---|
| send2trash | Move files to Recycle Bin | `pip install send2trash` |
- Clone the repository: `git clone https://github.com/MAliXCS/DuplicateFileRemover.git`, then `cd DuplicateFileRemover`
- Install the optional dependency: `pip install send2trash`
- Run the application: `python DuplicateFileRemover_v1.3.py`
A standalone .exe file will be available in the Releases section for users who don't have Python installed.
- Click "Browse" to select the folder you want to scan
- Or type/paste the path directly in the text field
- Click "Scan" to begin the duplicate detection process
- The progress bar will show scan progress with estimated time remaining
- You can click "Stop" at any time to cancel the scan
- Duplicate files are displayed in the results table
- Files marked in red are the duplicates; one file in each group is left unmarked as the copy to keep
- The Group column shows which files belong together
- "Select All" - Select all files
- "Deselect All" - Clear selection
- "Duplicates" - Auto-select only duplicate files (recommended)
- Or manually select files by clicking them (Ctrl+Click for multiple)
- "Delete Selected" - Move selected files to Recycle Bin or permanently delete
- "Open Location" - Open the folder containing the selected file in Explorer
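The "Open Location" action can be sketched with the `explorer /select,` command named earlier in this README. This is a hedged illustration, not the app's code: both function names are hypothetical, and opening each folder only once reflects the behavior described in the changelog.

```python
import ntpath
import subprocess
import sys


def first_file_per_folder(paths):
    """Keep the first selected file for each parent folder (case-insensitive)."""
    chosen = {}
    for path in paths:
        normalized = ntpath.normpath(path)
        folder = ntpath.normcase(ntpath.dirname(normalized))
        chosen.setdefault(folder, normalized)
    return list(chosen.values())


def open_locations(paths):
    """Open one Explorer window per folder with a file highlighted."""
    for target in first_file_per_folder(paths):
        if sys.platform == "win32":
            # Explorer's exit code is not a reliable success signal, so it is ignored.
            subprocess.Popen(["explorer", "/select,", target])
```

`ntpath` is used instead of `os.path` so the folder grouping follows Windows path rules regardless of where the helper is exercised.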
Click "Settings" to customize the application:
- MD5 (default) - Fastest; sufficient for finding accidental duplicates
- SHA1 - Lower collision risk than MD5, slightly slower
- SHA256 - Strongest collision resistance of the three, slowest
- Enter sizes like `100MB`, `1.5GB`, or raw bytes
- Min Size - Skip files smaller than this
- Max Size - Skip files larger than this (0 = unlimited)
- Keep oldest file (default) - Keeps the original file
- Keep newest file - Keeps the most recent version
- Enter extensions like `.jpg`, `.png`, `.mp3`
- Leave empty to include all file types
- Enter patterns to exclude (one per line)
- Examples: `*.tmp`, `*.log`, `Thumbs.db`
- Skip hidden files - Ignore files with hidden attribute
- Skip system files - Ignore Windows system files (recommended)
- Move deleted files to Recycle Bin - Safer deletion method
- Auto-select duplicates after scan - Automatically select duplicates when scan completes
- Start with a test folder - Try the app on a small folder first to understand how it works
- Use "Open Location" - Always verify files before deleting by opening their location
- Enable Recycle Bin - Keep "Move to Recycle Bin" enabled for safety
- Scan external drives - Great for cleaning up USB drives and external storage
- Use size limits - For large drives, set min size (e.g., 1MB) to skip small files
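The filter settings described above (extensions, exclusion patterns, and size limits) can be combined into a single check per file. The sketch below is an assumption about how such a check might look; the function name and parameters are hypothetical, and matching is made case-insensitive because Windows file names are.

```python
import fnmatch
import os


def passes_filters(path, size, extensions=(), exclude_patterns=(),
                   min_size=0, max_size=0):
    """Return True if a file should be included in the scan."""
    name = os.path.basename(path).lower()
    # Exclusion patterns such as *.tmp or Thumbs.db win over everything else.
    if any(fnmatch.fnmatch(name, pattern.lower()) for pattern in exclude_patterns):
        return False
    # An empty extension list means every file type is included.
    if extensions:
        allowed = {ext.lower() for ext in extensions}
        if os.path.splitext(name)[1] not in allowed:
            return False
    if size < min_size:
        return False
    # A max_size of 0 means unlimited.
    if max_size and size > max_size:
        return False
    return True
```

Running the cheap name and size checks before any hashing keeps filtered-out files from ever being read.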
Duplicate File Remover is designed with safety as a priority:
- Read-Only Scanning - The scanning process only reads files to calculate hashes
- No Data Transmission - All operations are local; no data is sent to any server
- Recycle Bin Default - Files are moved to Recycle Bin by default, not permanently deleted
- Preview Before Delete - You can review all files before deletion
- Open Location Feature - Verify files in Explorer before deleting
- Special Character Support - Properly handles files with special characters in names
- Long Path Support - Uses Windows extended path syntax for paths > 260 characters
- Permission Handling - Gracefully handles files without delete permissions
- Locked File Detection - Skips files that are in use by other applications
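The deletion safeguards above (Recycle Bin by default, graceful handling of permission errors and locked files) could be combined roughly as follows. This is a sketch, not the app's implementation: `remove_file` and its return shape are hypothetical, while `send2trash.send2trash` is the real entry point of the optional dependency.

```python
import os

try:
    from send2trash import send2trash  # optional third-party dependency
except ImportError:
    send2trash = None


def remove_file(path, use_recycle_bin=True, trash=send2trash):
    """Return (ok, message) instead of raising on ordinary filesystem errors."""
    try:
        if use_recycle_bin:
            if trash is None:
                # Never fall back to permanent deletion silently.
                return False, "send2trash is not installed; file left in place"
            trash(path)
            return True, "moved to Recycle Bin"
        os.remove(path)
        return True, "permanently deleted"
    except PermissionError:
        return False, "skipped: permission denied or file in use"
    except OSError as error:
        return False, f"skipped: {error}"
```

Returning a status tuple lets a batch delete continue past a locked file and report every skipped file in the log afterwards.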
If you discover a security vulnerability, please report it responsibly:
- Do NOT open a public issue
- Email: `security@yourdomain.com` (replace with actual contact)
- Include:
  - Description of the vulnerability
  - Steps to reproduce
  - Potential impact
  - Suggested fix (if any)
We will respond within 48 hours and work to resolve the issue promptly.
- Always backup important data before using any file deletion tool
- Verify results using "Open Location" before deleting
- Use Recycle Bin option for an extra safety net
- Run as regular user - No administrator privileges required for normal use
Contributions are welcome! Here's how you can help:
- Check if the issue already exists
- Create a new issue with:
  - Clear description
  - Steps to reproduce
  - Expected vs actual behavior
  - Windows version and Python version
  - Screenshots (if applicable)
- Open a new issue with the "Feature Request" label
- Describe the feature and its use case
- Explain why it would be valuable
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
# Clone your fork
git clone https://github.com/YOUR_USERNAME/DuplicateFileRemover.git
cd DuplicateFileRemover
# Install dependencies
pip install send2trash
# Run tests
python -m pytest tests/
This project is licensed under the MIT License - see the LICENSE file for details.
MIT License
Copyright (c) 2024 MAliXCS
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
duplicate-files file-manager disk-cleanup storage-management duplicate-finder file-organizer windows-app python tkinter hash-based md5 sha256 recycle-bin file-deletion disk-space cleanup-tool utility desktop-application gui open-source
- send2trash - For safe Recycle Bin integration
- Python tkinter team - For the excellent GUI framework
- Contributors - Thank you to everyone who has contributed to this project
- Users - Thank you for using and providing feedback on this tool
- Author: MAliXCS
- Instagram: @x404ctl
- GitHub: @MAliXCS
- Issues: GitHub Issues
- Fixed a bug where "Open File Location" opened folders twice
- Added human-readable file size input support (B, KB, MB, GB, TB, PB)
- Fixed Select Duplicates to work reliably on all duplicate files
- Fixed Open Location opening multiple folders when files from same directory are selected
- Now opens each unique directory only once
- Fixed Settings window scrolling and layout glitches
- Improved Open Location to accurately open selected file's folder
- Enhanced real-time time estimation with moving average
- Optimized scanning performance with batch processing
- Fixed all edge cases for special character file handling
- Zero GUI glitches with proper widget management
- Initial release
- Basic duplicate file detection
- MD5 hash comparison
- Delete and Recycle Bin support
Made with ❤️ by MAliXCS



