This guide covers installation and running your first collection.
- Python 3.9+
- Access to cloud environment (AWS, Azure, GCP, or M365)
- Appropriate permissions (see Required Permissions)
curl -sL https://github.com/LFigg/cca-cloudshell/archive/refs/heads/main.tar.gz | tar xz
cd cca-cloudshell-main
./setup.shgit clone https://github.com/LFigg/cca-cloudshell.git
cd cca-cloudshell
./setup.sh# Install Python dependencies
pip install -r requirements.txt
# For specific clouds only:
pip install boto3 # AWS
pip install azure-identity azure-mgmt-compute azure-mgmt-storage # Azure (partial)
pip install google-cloud-compute google-cloud-storage # GCP (partial)
pip install msgraph-sdk azure-identity # M365The easiest way to run collection is using the unified entry point:
# Auto-detect credentials and run
python3 collect.py
# Setup wizard for first-time users
python3 collect.py --setup
# Specify cloud directly
python3 collect.py --cloud aws
python3 collect.py --cloud azure
python3 collect.py --cloud gcp
python3 collect.py --cloud m365The unified collector will:
- Auto-detect which cloud credentials are configured
- Verify your credentials and permissions
- Run the appropriate collector(s) with sensible defaults (cost and change rate collection enabled)
- Prompt only for optional configuration
When using the interactive menu, you'll be prompted to include data protection cost collection after selecting a cloud platform:
Data protection cost collection analyzes AWS Backup, EBS snapshot,
and other backup-related costs from AWS Cost Explorer.
Also collect data protection costs? [Y/n]:
Cost collection is enabled by default - just press Enter to confirm, or type n to skip.
You can also run collectors directly:
# In AWS CloudShell (credentials automatic)
python3 aws_collect.py
# Local with AWS CLI configured
aws configure # if not already done
python3 aws_collect.py# In Azure Cloud Shell (credentials automatic)
python3 azure_collect.py
# Local with Azure CLI
az login
python3 azure_collect.py# In Google Cloud Shell (credentials automatic)
python3 gcp_collect.py
# Local with gcloud CLI
gcloud auth application-default login
python3 gcp_collect.py --project my-project-id# Set credentials (see M365 Collector docs for app registration)
export MS365_TENANT_ID="your-tenant-id"
export MS365_CLIENT_ID="your-client-id"
export MS365_CLIENT_SECRET="your-client-secret"
python3 m365_collect.pyFor repeated runs or complex configurations, use a YAML config file:
# Generate a sample config
python3 collect.py --generate-config aws > cca-config.yaml
# Edit cca-config.yaml, then run with it
python3 collect.py --config cca-config.yamlSee config-examples/ for sample configurations.
Config files support environment variable substitution:
aws:
role_arn: ${CCA_ROLE_ARN} # Required
external_id: ${CCA_EXTERNAL_ID:-} # Optional with defaultUse the setup scripts in setup/ to configure permissions:
# AWS - Deploy IAM role via CloudFormation
./setup/setup-aws-permissions.sh
# Azure - Assign Reader role to subscriptions
./setup/setup-azure-permissions.sh
# GCP - Grant Viewer role to projects
./setup/setup-gcp-permissions.shSee Required Permissions for details on what access is needed.
Each collector generates:
| File | Description |
|---|---|
cca_<cloud>_inv_<time>.json |
Full resource inventory |
cca_<cloud>_sum_<time>.json |
Aggregated summary |
cca_log_<time>.log |
Collection log for troubleshooting |
Collectors show real-time progress with:
- Spinner animation during collection
- Progress bar showing region/subscription progress
- Resource counts as they're discovered
- Summary table at completion
When output is piped (non-TTY), plain text progress messages are shown instead.
- Config Examples - Sample YAML configurations
- AWS Collector - Multi-account, regions, all options
- Azure Collector - Subscriptions, resource types
- GCP Collector - Projects, regions, resources
- M365 Collector - App registration, permissions
- Cost Collector - Backup/snapshot spending
- Output Formats - JSON schema, CSV fields
- Required Permissions - IAM policies for each cloud
- Setup Scripts - Automated permission configuration
After collection, generate reports for analysis:
# Protection status report
python scripts/generate_protection_report.py cca_aws_inv_*.json protection_report.xlsx
# Comprehensive assessment report (multi-tab Excel)
python scripts/generate_assessment_report.py cca_*_inv_*.json assessment.xlsx
# Include cost data in assessment
python scripts/generate_assessment_report.py cca_aws_inv_*.json --cost cca_cost_*.json -o assessment.xlsxBy default, resource IDs are redacted in output files for privacy. Use these flags to control what's included:
# Include full resource IDs/ARNs in output
python3 aws_collect.py --include-resource-ids
python3 azure_collect.py --include-resource-ids
python3 gcp_collect.py --include-resource-ids
# Azure - include individual recovery points (verbose, can be slow)
python3 azure_collect.py --include-recovery-points