A comprehensive bash script for automating the archival of Google Workspace users who have been moved to a "FormerEmployees" organizational unit.
IMPORTANT: This script uses GAM ONLY for read-only user discovery (listing users in the target OU). GAM makes no modifications to your Workspace. Even so, review the GAM commands in the script and confirm they are read-only before running it. Trust but verify :)
- Automated discovery of users in specified organizational units (GAM used read-only to query user lists)
- Sequential backup using GYB (Got Your Back)
- Real-time progress monitoring - Shows backup progress every 500 messages with percentage updates
- Automatic compression of backups into tar.gz archives
- Resume capability (skips already archived users)
- Comprehensive logging with timestamps
- Detailed reports generation
- Error handling with retry logic for rate limits
- Dry-run mode for testing
- Single user mode for targeted backups
- Solarized color-coded console output for improved readability
- Progress tracking and statistics
- Enhanced security with input validation and audit logging
- OU path validation to prevent command injection
- Email address format validation
- Path validation before file deletion
This script DOES NOT install or configure GAM or GYB. You must install and configure these tools separately BEFORE using this script. The script assumes you have working GAM and GYB installations with proper OAuth credentials already configured.
- Bash 4.0+
  bash --version
- GAM (Google Apps Manager) - PRE-INSTALLED & CONFIGURED
  - Installation guide: https://github.com/GAM-team/GAM
  - Must be fully configured with OAuth credentials for your domain
  - Configuration stored in ~/.gam/ (or custom GAMCFGDIR)
  - Test your installation:
    gam version
    gam info domain
  - If GAM is not configured, this script will fail
- GYB (Got Your Back) - PRE-INSTALLED & CONFIGURED
  - Installation guide: https://github.com/GAM-team/got-your-back
  - Must be fully configured with service account and OAuth scopes
  - Configuration typically in ~/.gyb/ or alongside GAM config
  - Test your installation:
    gyb --version
  - Required OAuth scopes for GYB:
    - https://www.googleapis.com/auth/gmail.readonly
    - https://www.googleapis.com/auth/gmail.modify
    - https://mail.google.com/
- Standard Unix Utilities
  - tar, gzip, bc, tee (typically pre-installed)
By default, GAM stores its configuration in ~/.gam/ including:
- oauth2.txt - OAuth credentials
- oauth2service.json - Service account file
- client_secrets.json - API credentials
You can customize this location using the GAMCFGDIR environment variable:
export GAMCFGDIR="/opt/gam-config"
Security Best Practice:
# Restrict access to GAM/GYB configuration
chmod 700 ~/.gam
chmod 600 ~/.gam/*
Before using this script, verify GAM and GYB work correctly:
# Test GAM can list users
gam print users query "orgUnitPath='/FormerEmployees'" fields primaryEmail
# Test GYB can estimate a backup (without actually backing up)
gyb --email test@yourdomain.com --action estimate
If either command fails, you need to configure GAM/GYB before using this script.
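Before running the manual tests, it can help to confirm that the required tools are on PATH at all. The following is a minimal sketch of such a preflight check, not the script's actual implementation; `require_commands` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report any required commands missing from PATH.
# Returns 0 if every command is found, 1 otherwise.
require_commands() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "Missing required command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example with the tools this README lists as prerequisites:
# require_commands gam gyb tar gzip bc tee || exit 1
```

This only checks that the binaries can be found; it does not prove that OAuth credentials are configured, so the GAM and GYB test commands are still needed.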
- Verify GAM and GYB are installed and configured (see Prerequisites above)
- Clone or download this repository:
  git clone https://github.com/ericgjerde/archiving-gmail-users.git
  cd archiving-gmail-users
- Ensure the script is executable:
  chmod +x archive-workspace-users.sh
- Test the script with dry-run mode:
  ./archive-workspace-users.sh --dry-run
That's it! The script will use your existing GAM/GYB configuration automatically.
Archive all users in the default "FormerEmployees" OU:
./archive-workspace-users.sh
Options:
--dry-run        Show what would be done without executing
--user <email>   Archive only the specified user
--ou <path>      Use custom organizational unit path
--help           Display help message
- Dry Run (Test Mode)
  ./archive-workspace-users.sh --dry-run
  Shows what would happen without making changes.
- Archive a Specific User
  ./archive-workspace-users.sh --user john.doe@example.com
- Use Custom Organizational Unit
  ./archive-workspace-users.sh --ou "/SuspendedUsers"
- Combine Options
  ./archive-workspace-users.sh --dry-run --ou "/FormerEmployees/2024"
The script creates and manages the following directories:
.
├── archives/ # Final compressed backups (.tar.gz files)
├── temp/ # Working directory during backup (auto-cleaned)
├── logs/ # Detailed execution logs
└── reports/ # Human-readable summary reports
- Location: archives/
- Naming: user@domain.com_YYYYMMDD_HHMMSS.tar.gz
- Permissions: 600 (read/write for owner only)
- Contents: Complete email backup from GYB
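The naming and permission scheme for archives can be sketched as follows. This is an illustration under assumptions, not the script's own code: `compress_backup` and its arguments are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compress a backup directory into
# <archive_dir>/<email>_YYYYMMDD_HHMMSS.tar.gz and restrict it to the owner.
compress_backup() {
    local email="$1" source_dir="$2" archive_dir="$3"
    local timestamp archive_path
    timestamp="$(date +%Y%m%d_%H%M%S)"
    archive_path="${archive_dir}/${email}_${timestamp}.tar.gz"

    mkdir -p "$archive_dir"
    # -C keeps paths inside the archive relative to the backup directory.
    tar -czf "$archive_path" -C "$source_dir" . || return 1
    chmod 600 "$archive_path"
    echo "$archive_path"
}
```

A call such as `compress_backup user@domain.com temp/user@domain.com archives` would print the path of the new archive.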
- Location: logs/
- Naming: archive_YYYYMMDD_HHMMSS.log
- Contents:
  - Timestamped events
  - GYB command output
  - Error messages
  - Processing details
- Location: reports/
- Naming: archive_report_YYYYMMDD_HHMMSS.txt
- Contents:
  - Execution summary
  - List of archives created with sizes
  - Success/failure statistics
  - Reference to detailed log
======================================
Google Workspace Archive Report
======================================
Timestamp: 2024-10-16 14:30:45
Status: COMPLETED
OU: /FormerEmployees
Archives Created:
--------------------------------------
john.doe@example.com_20241016_143045.tar.gz - 2.34 GB
jane.smith@example.com_20241016_144512.tar.gz - 1.87 GB
Summary:
Total Users Processed: 2
Successful: 2
Failed: 0
Skipped (Already Archived): 0
Full log: /path/to/logs/archive_20241016_143045.log
======================================
The script automatically skips users who already have archives, allowing you to:
- Resume interrupted operations
- Re-run the script safely without duplicating work
- Process new users added to the OU
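One way to implement this kind of skip check is to look for an existing archive whose name starts with the user's email address. The sketch below assumes the `<email>_YYYYMMDD_HHMMSS.tar.gz` naming used for archives; `already_archived` is a hypothetical helper, and the script's actual check may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if an archive already exists for this user.
# Assumes the <email>_YYYYMMDD_HHMMSS.tar.gz naming described in this README.
already_archived() {
    local email="$1" archive_dir="${2:-archives}"
    local candidate
    for candidate in "$archive_dir/${email}_"*.tar.gz; do
        # An unmatched glob stays literal, so confirm the file exists.
        [ -f "$candidate" ] && return 0
    done
    return 1
}
```

A processing loop could then call `already_archived "$email" && continue` before starting a backup.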
During GYB backups, the script displays real-time progress updates:
[INFO] Found 5925 message(s) to backup for user@example.com
[INFO] Progress: 500/5925 messages (8%)
[INFO] Progress: 1000/5925 messages (16%)
[INFO] Progress: 1500/5925 messages (25%)
...
[INFO] Progress: 5710/5925 messages (96%)
[SUCCESS] GYB backup completed for: user@example.com
Features:
- Shows total message count when backup starts
- Progress updates every 500 messages
- Displays both count and percentage
- No silent periods during long backups
- Background monitoring doesn't interfere with GYB
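The percentages in the sample output are consistent with truncating integer division. A minimal sketch of the calculation, with `format_progress` as a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: format a progress line using integer arithmetic.
# Bash truncates the division, so 500 of 5925 is reported as 8%.
format_progress() {
    local done_count="$1" total="$2" percent=0
    # Guard against division by zero for empty mailboxes.
    if [ "$total" -gt 0 ]; then
        percent=$(( done_count * 100 / total ))
    fi
    echo "[INFO] Progress: ${done_count}/${total} messages (${percent}%)"
}

format_progress 500 5925
format_progress 1500 5925
```

How the script obtains the running message count from GYB is not shown here; this covers only the formatting step.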
- Default delay: 0 seconds (no delay between users for maximum speed)
- Retry logic: Automatically retries on rate limit errors
- Max retries: 3 attempts per user
- Backoff: 60 second delay on rate limit detection
- Configurable: Set the DELAY_BETWEEN_USERS environment variable to add delays if needed
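The retry behavior can be sketched as a small wrapper. This is a simplified illustration: it retries on any failure, whereas the script is described as retrying on rate limit errors specifically. `run_with_retry`, `MAX_RETRIES`, and `RETRY_DELAY` are hypothetical names; only the values 3 and 60 come from this README.

```bash
#!/usr/bin/env bash
# Values taken from this README; the variable names are hypothetical.
MAX_RETRIES="${MAX_RETRIES:-3}"
RETRY_DELAY="${RETRY_DELAY:-60}"

# Hypothetical helper: run a command up to MAX_RETRIES times,
# sleeping RETRY_DELAY seconds between failed attempts.
run_with_retry() {
    local attempt=1
    while [ "$attempt" -le "$MAX_RETRIES" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -lt "$MAX_RETRIES" ]; then
            echo "Attempt ${attempt}/${MAX_RETRIES} failed; retrying in ${RETRY_DELAY}s" >&2
            sleep "$RETRY_DELAY"
        fi
        attempt=$(( attempt + 1 ))
    done
    return 1
}
```

For example, `run_with_retry gyb --email user@example.com --action estimate` would attempt the command up to three times.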
- Strict error checking (set -euo pipefail)
- Graceful handling of Ctrl+C interruptions
- Continues processing remaining users if one fails
- Detailed error logging for troubleshooting
- Separate success/failure tracking
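Combining strict mode with a loop that keeps going after a failure takes some care. The sketch below shows one way to do it; `process_users` and the counter names are hypothetical, and the script's own loop may be structured differently.

```bash
#!/usr/bin/env bash
set -euo pipefail

SUCCESS_COUNT=0
FAILED_COUNT=0

# Hypothetical helper: apply a command to each user read from stdin,
# continuing past failures and counting both outcomes.
process_users() {
    local handler="$1" email
    while IFS= read -r email; do
        [ -n "$email" ] || continue
        # Testing the command in an `if` keeps `set -e` from aborting the run.
        # </dev/null stops the handler from consuming the user list.
        if "$handler" "$email" </dev/null; then
            # $(( )) assignment is used instead of (( count++ )), which
            # returns a non-zero status when the old value is 0.
            SUCCESS_COUNT=$(( SUCCESS_COUNT + 1 ))
        else
            FAILED_COUNT=$(( FAILED_COUNT + 1 ))
        fi
    done
}

# Feed the loop with a redirect, not a pipe: `cat list | process_users ...`
# would update the counters in a subshell and lose them.
# process_users backup_one_user < user_list.txt
```

The two comments about `(( count++ ))` and subshells describe general bash behavior under `set -e` and pipelines, which is the class of problem the counters have to avoid.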
The script includes multiple layers of security protection:
Input Validation:
- OU path validation to prevent command injection attacks
- Email address format validation (regex)
- Path validation before rm -rf operations
- Rejects dangerous characters (quotes, semicolons, pipes, backticks, etc.)
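The checks above could look roughly like the following. These are hypothetical validators written for illustration; the allowed character set and the regular expressions are assumptions, and the script's actual patterns may be stricter or looser.

```bash
#!/usr/bin/env bash
# Hypothetical validators; the script's actual patterns may differ.

# Accept only absolute OU paths built from letters, digits, '/', '_', '.', '-'.
# Quotes, semicolons, pipes, backticks, and spaces all fail the match.
validate_ou_path() {
    local pattern='^/[A-Za-z0-9/_.-]*$'
    [[ "$1" =~ $pattern ]]
}

# Loose format check for an email address (not full RFC 5322 validation).
validate_email() {
    local pattern='^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
    [[ "$1" =~ $pattern ]]
}

# Refuse to delete anything that is not strictly inside the given base directory.
safe_remove() {
    local target="$1" base="$2"
    case "$target" in
        *..*)       echo "Refusing to delete: $target" >&2; return 1 ;;
        "$base"/?*) rm -rf -- "$target" ;;
        *)          echo "Refusing to delete: $target" >&2; return 1 ;;
    esac
}
```

Allow-listing known-safe characters is generally easier to reason about than trying to enumerate every dangerous one.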
Access Controls:
- Restrictive permissions on directories (700 - owner-only)
- Restrictive permissions on archive files (600)
- Secure handling of temporary files with automatic cleanup
- No sensitive data in logs (no passwords or tokens)
Confirmation Requirements:
- Requires typing exact OU path to confirm bulk operations
- Single-user mode requires yes/no confirmation
- Dry-run mode available for safe testing
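A type-to-confirm prompt for bulk operations might be written as below. This is a sketch only; `confirm_ou` is a hypothetical name and the prompt wording is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: require the operator to retype the exact OU path.
# Reads one line from stdin; returns 0 only on an exact match.
confirm_ou() {
    local expected="$1" typed=""
    printf 'Type the OU path (%s) to confirm: ' "$expected" >&2
    IFS= read -r typed || true
    [ "$typed" = "$expected" ]
}

# Example:
# confirm_ou "/FormerEmployees" || { echo "Aborted." >&2; exit 1; }
```

Requiring the full path rather than a plain "y" makes it harder to confirm a bulk run against the wrong OU by reflex.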
Audit Trail:
- Logs execution user (USER), hostname (HOSTNAME)
- Logs working directory and GAM version
- Comprehensive timestamped logging of all operations
- Separate tracking of success/failure/skipped operations
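Timestamped logging with an audit header can be sketched as follows. The function names, log line format, and `LOG_FILE` variable are assumptions for illustration, and the GAM version line is omitted so that the example runs without GAM installed.

```bash
#!/usr/bin/env bash
# Hypothetical log helper: timestamped, levelled lines to console and a file.
LOG_FILE="${LOG_FILE:-logs/archive_$(date +%Y%m%d_%H%M%S).log}"

log() {
    local level="$1"; shift
    mkdir -p "$(dirname "$LOG_FILE")"
    printf '[%s] [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$*" \
        | tee -a "$LOG_FILE"
}

# Hypothetical audit header recorded at the start of a run.
log_audit_header() {
    log INFO "Executed by: ${USER:-unknown} on ${HOSTNAME:-unknown}"
    log INFO "Working directory: $(pwd)"
}
```

Using `tee -a` sends each line to both the terminal and the log file, so console output and the audit trail stay in step.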
Credential Management:
- Uses existing GAM/GYB OAuth configuration (not managed by this script)
- Credentials stored in ~/.gam/ with proper permissions (700)
- Service account files managed by GAM/GYB, not this script
GAM Read-Only Usage:
- GAM is used ONLY for read-only operations (querying users in target OU)
- The only GAM command used:
  gam print users query "orgUnitPath='...'" fields primaryEmail,name.fullName
- No modifications are made to your Google Workspace via GAM
- Always review GAM commands in any script before execution
- Small mailbox (< 1 GB): 5-15 minutes
- Medium mailbox (1-10 GB): 15-45 minutes
- Large mailbox (> 10 GB): 45-120+ minutes
Total time depends on:
- Number of users
- Mailbox sizes
- Network speed
- Google API rate limits
Ensure adequate disk space:
- Temporary storage: ~2x largest expected mailbox
- Archive storage: ~1x total of all mailboxes
- Recommended minimum: 50 GB free space
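Since the script itself no longer checks disk space, an admin may want to check it by hand before a large run. The following is a minimal sketch assuming POSIX `df` output; `free_space_gb` and `MIN_FREE_GB` are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print free space, in whole GB, on the filesystem
# that holds the given directory. `df -Pk` gives POSIX-format output in KB,
# with the available space in the fourth column.
free_space_gb() {
    local dir="${1:-.}"
    df -Pk "$dir" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Hypothetical threshold matching the 50 GB recommendation above.
MIN_FREE_GB=50
if [ "$(free_space_gb .)" -lt "$MIN_FREE_GB" ]; then
    echo "Warning: less than ${MIN_FREE_GB} GB free in $(pwd)" >&2
fi
```

Run it from the directory that will hold archives/ and temp/, since those may sit on a different filesystem from your home directory.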
- Stable internet connection required throughout
- Recommended: 10+ Mbps download/upload speeds
- Avoid running on metered connections
- "GAM is not properly configured or accessible"
  - GAM is not installed or not in PATH
  - Run gam version to verify installation
  - Check the GAMCFGDIR environment variable if using a custom location
  - Verify that OAuth credentials exist in ~/.gam/oauth2.txt
  - Re-run GAM initial setup if needed
- "GYB is not properly configured or accessible"
  - GYB is not installed or not in PATH
  - Run gyb --version to verify installation
  - Verify service account configuration in GYB
  - Check that ~/.gam/oauth2service.json exists
  - Ensure service account has required OAuth scopes
- "Failed to retrieve user list from GAM"
  - Verify organizational unit path is correct
  - Test: gam print orgs to see available OUs
  - Test: gam print users query "orgUnitPath='/FormerEmployees'"
  - Check GAM authentication: gam info domain
- "GYB backup failed"
  - Check detailed GYB logs in logs/gyb_*.log
  - Test GYB manually: gyb --email test@domain.com --action estimate
  - Verify service account has domain-wide delegation
  - Ensure all required OAuth scopes are granted
- Rate limit errors
  - Script will automatically retry (up to 3 times)
  - May need to spread processing over multiple days
Enable verbose output:
bash -x ./archive-workspace-users.sh --dry-run
View real-time progress:
tail -f logs/archive_*.log
Search for errors:
grep ERROR logs/archive_*.log
- Review the Script: Always review GAM commands in the script to verify read-only operations
- Verify GAM/GYB First: Ensure they work before running this script
- Test First: Always run with --dry-run first
- Single User Test: Test with --user on a small mailbox
- Monitor Initial Runs: Watch logs during first few executions
- Verify Backups: Periodically test archive restoration
- Version Control: Track script changes in git
- Secure GAM Config: Protect the ~/.gam/ directory (chmod 700)
- Keep Tools Updated: Regularly update GAM and GYB
- GAM/GYB OAuth credentials are stored in ~/.gam/ (or GAMCFGDIR)
- Ensure proper permissions on GAM config directory:
  chmod 700 ~/.gam
  chmod 600 ~/.gam/*
- Service account files managed by GAM/GYB, not this script
- Consider encrypting archives for sensitive data
- Implement access controls on archive directory
- Regularly rotate service account keys in Google Cloud Console
- Audit access to archived data
- Follow your organization's data retention policies
- Never commit the ~/.gam/ directory to version control
For issues or questions:
- Check the troubleshooting section
- Review detailed logs in logs/
- Test with --dry-run mode
- Verify all prerequisites are met
This script is provided as-is for Google Workspace administration purposes.
- 1.2.0 - Performance and UX improvements (Current)
  - Added real-time progress monitoring during GYB backups
  - Shows message count and percentage every 500 messages
  - Eliminates silent periods during long backups
  - Background polling avoids pipeline buffering issues
  - Removed rate limiting delay between users (0 seconds default for maximum speed)
  - Switched to Solarized color palette for improved terminal readability
  - Cyan INFO messages instead of hard-to-read dark blue
  - Fixed TOTAL_USERS display showing "of 0" (subshell variable scope issue)
  - Fixed arithmetic increment errors with set -euo pipefail
  - Enhanced user feedback throughout backup process
  - Users without Gmail licenses are gracefully skipped with clear messaging
- 1.1.0 - Security hardening release
  - CRITICAL FIX: Fixed subshell counter bug (reports now show correct statistics)
  - Added OU path validation to prevent command injection
  - Added email address format validation
  - Added path validation before file deletions
  - Enhanced confirmation (requires typing exact OU path)
  - Added single-user mode confirmation
  - Added audit trail logging (user, host, working directory)
  - Tightened directory permissions (750 → 700)
  - Automatic cleanup of temporary user list files
  - Removed disk space check (admin responsibility)
  - Changed default OU paths to remove spaces (/FormerEmployees, /SuspendedUsers)
- 1.0.0 - Initial release with full feature set
  - User discovery and confirmation
  - Sequential processing with rate limiting
  - Comprehensive logging and reporting
  - Resume capability
  - Dry-run mode
  - Single user mode
  - Error handling and retry logic