Cloud Console Automation Agent – submit natural-language cloud console tasks, review an automatically generated plan, confirm login, and watch a mock (or real) browser-use agent execute the run while capturing evidence.
- Overview
- Architecture
- Features
- Prerequisites
- Quick Start
- Project Structure
- Development
- Documentation
- Testing
- Deployment
- Troubleshooting
- Contributing
- License
Cloud Console Manager is an intelligent desktop application that allows you to control Google Cloud Platform through natural language commands. Instead of manually clicking through the GCP Console interface, simply tell the agent what you want to do:
- "Create a new VM instance named test-vm in us-central1"
- "List all Cloud Storage buckets"
- "Delete unused VM instances"
- "Set up a load balancer for my production VMs"
The application embeds a real Chrome browser showing the GCP Console, while an AI agent (powered by Google's Gemini) autonomously performs the requested actions. You can watch it work in real-time, pause execution, or manually intervene when needed.
- ✅ Natural Language Control: Use plain English instead of complex CLI commands
- ✅ Visual Feedback: Watch the agent work in an embedded browser
- ✅ Safe & Controllable: Pause/resume execution, manual intervention
- ✅ Full History: Review past tasks with screenshots and logs
- ✅ Session Persistence: Automatic login cookie management
Cloud Console Manager is a three-tier desktop solution:
- Django orchestration backend (/backend) exposes the REST/WebSocket APIs, task queue, governance/audit subsystems, and agent lifecycle controls.
- Next.js + Electron wizard (/frontend) renders the 4-step Submit → Review → Login → Run UI plus role selection, approvals, comments, and evidence cards.
- Browser-Use automation stack (bundled under /browser_use) drives Playwright/Chromium (or a mock agent when AGENT_USE_MOCK=true) to perform user commands.
Key platform capabilities now include:
- Role-aware sessions with per-role permissions and filtered task views.
- Natural-language task submission with auto-generated plans, dry-run previews, and policy-driven approvals.
- Governance primitives: audit logs, comments, evidence bundles, CSV exports, and monitoring metrics.
The components interact as follows:
┌─────────────────────────────────────────────────────────────────┐
│ ELECTRON DESKTOP APP │
│ ┌───────────────────────────┬───────────────────────────────┐ │
│ │ Embedded Browser │ Chat Interface │ │
│ │ (GCP Console) │ - Send commands │ │
│ │ - Real Chrome │ - Agent thinking │ │
│ │ - User can click │ - Action logs │ │
│ │ - Agent controlled │ - Play/Pause control │ │
│ └───────────────────────────┴───────────────────────────────┘ │
└──────────────────────┬──────────────────────────────────────────┘
│ REST API + WebSocket
┌──────────────────────▼──────────────────────────────────────────┐
│ DJANGO BACKEND │
│ - Agent lifecycle management │
│ - Task queuing & execution │
│ - WebSocket real-time updates │
│ - Session & cookie persistence │
│ - Chat history & screenshot storage │
└──────────────────────┬──────────────────────────────────────────┘
│
┌──────────────────────▼──────────────────────────────────────────┐
│ BROWSER-USE AGENT FRAMEWORK │
│ - AI-powered browser automation │
│ - Gemini 2.5 Flash LLM integration │
│ - Chrome DevTools Protocol (CDP) │
│ - DOM analysis & action execution │
└──────────────────────────────────────────────────────────────────┘
Backend:
- Django 5.x (REST API & WebSocket)
- Django Channels (WebSocket communication)
- SQLite (data persistence)
- browser-use (AI agent framework)
- Google Vertex AI (Gemini LLM)
Frontend:
- Electron 28+ (desktop framework)
- Next.js 14+ (React framework)
- TypeScript (type safety)
- Tailwind CSS (styling)
- Zustand (state management)
- Socket.io (WebSocket client)
Agent:
- browser-use library (async Python)
- cdp-use (Chrome DevTools Protocol)
- Gemini 2.5 Flash (LLM)
- Natural language command processing
- Intelligent GCP Console navigation
- Automatic action execution with retries
- Context-aware decision making
- Native desktop application (Windows, macOS, Linux)
- Embedded browser view of GCP Console
- Real-time agent thinking and action display
- Chat-style command interface
- Play/Pause: Stop agent execution anytime
- Manual Intervention: Click in browser when needed
- Task Queue: Submit multiple commands
- Action Review: See every step the agent takes
- Complete task history
- Screenshot timeline for each task
- Detailed execution logs
- Success/failure tracking
- Encrypted cookie storage
- Automatic GCP login management
- Local data storage (no cloud sync)
- Session persistence across restarts
- Operating System: Windows 10+, macOS 11+, or Linux (Ubuntu 20.04+)
- Display: Minimum 1200x700 resolution
- Memory: 4GB RAM minimum, 8GB recommended
- Storage: 2GB free space
Backend:
- Python 3.11 or higher
- Chrome/Chromium browser installed
- pip or uv package manager
Frontend:
- Node.js 18 or higher
- npm or yarn package manager
Google Cloud:
- GCP project with Vertex AI enabled
- Service account with appropriate permissions
- Service account credentials JSON file
Prerequisite: Install Docker Desktop (or Docker Engine) with Docker Compose Plugin enabled.
- From the repo root run docker compose up --build (or make dev to use the Makefile wrapper).
- Open http://localhost:3000 for the Next.js wizard and http://localhost:8000/api/tasks/ for the Django API.
- Containers default to mock agent mode (AGENT_USE_MOCK=true) and persist SQLite + media data inside the backend-data volume.
- Stop or inspect with make stop / make logs.
- Run backend integration tests in-container anytime via make test-backend; run frontend unit tests with make test-frontend.
Switching to a real agent: edit docker-compose.yml (or create docker-compose.override.yml) to set AGENT_USE_MOCK=false, provide your AGENT_MODEL, Vertex/Gemini credentials, and any Playwright-specific env like BROWSER_CDP_PORT. The backend image already includes Chromium + Playwright dependencies—just ensure the required API keys are mounted/available before docker compose up.
- Clone & bootstrap
git clone https://github.com/dpraj007/CC_Manager-.git
cd CC_Manager-
uv venv --python 3.11
source .venv/bin/activate # Windows: .venv\Scripts\activate
uv pip install -r backend/requirements.txt
uv pip install -e .
cp backend/.env.example backend/.env
# Optional (only for real browser automation): uvx playwright install chromium
- Run the Django API
cd backend
python manage.py migrate
python manage.py runserver 0.0.0.0:8000
Leave the terminal running. You can confirm readiness anytime with:
curl http://localhost:8000/health
- Run the Next.js wizard
cd frontend
npm install
# Optionally point at a different backend:
# export NEXT_PUBLIC_API_URL=http://localhost:8000
npm run dev
Visit http://localhost:3000 and walk through Submit → Review → Login → Run. The UI pings /health before enabling actions, so if the backend is down you’ll see a “Retry connection” banner instead of silent failures. AGENT_USE_MOCK=true is set in backend/.env by default, so mock evidence flows out of the box; flip it to false and supply real credentials when you are ready for a full browser automation run.
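If you script the manual start-up, the readiness check can be automated. The following is a minimal sketch, assuming that /health answers with HTTP 200 once the backend is ready (the README only states that the endpoint exists). The helper names http_ok and wait_for_backend are hypothetical and not part of the project.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200 (assumed readiness signal)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP error.
        return False


def wait_for_backend(
    url: str = "http://localhost:8000/health",
    attempts: int = 10,
    delay: float = 1.0,
    probe: Callable[[str], bool] = http_ok,
) -> bool:
    """Probe `url` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_for_backend() before launching the frontend mirrors what the UI does when it pings /health. The probe parameter exists only so the retry loop can be exercised without a running server.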
- First run script
  - Enter a natural-language command on the Submit step (for example, “List all VM instances in my project”).
  - Confirm the command on the Review step or go back to edit it.
  - Manually log into the embedded Google Cloud Console, then use the Login step to check/confirm authentication. By default AGENT_USE_MOCK=true, so local runs use a safe mock agent; set it to false in backend/.env when you are ready for the full browser-use automation.
  - Click Run task to trigger execution. The Run step polls /api/tasks/{id}/ for status transitions and automatically fetches /api/screenshots/?task_id=... once the task completes.
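The polling behavior of the Run step can be reproduced from a script. This is a rough sketch rather than the frontend's actual code: it assumes the task endpoint returns JSON with a status field whose terminal values are completed and failed, and that task IDs are integers. The function names and the max_polls limit are illustrative.

```python
import json
import time
import urllib.request
from typing import Any, Callable

TERMINAL_STATES = {"completed", "failed"}


def get_json(url: str) -> Any:
    """GET `url` and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def run_and_collect(
    task_id: int,
    base_url: str = "http://localhost:8000",
    interval: float = 2.0,
    max_polls: int = 150,
    fetch: Callable[[str], Any] = get_json,
) -> tuple[str, Any]:
    """Poll the task until it reaches a terminal state, then fetch its screenshots."""
    task_url = f"{base_url}/api/tasks/{task_id}/"
    for _ in range(max_polls):
        status = fetch(task_url)["status"]  # "status" is an assumed field name
        if status in TERMINAL_STATES:
            break
        time.sleep(interval)
    else:
        raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")

    screenshots = None
    if status == "completed":
        # Screenshots are only fetched for completed tasks, as described above.
        screenshots = fetch(f"{base_url}/api/screenshots/?task_id={task_id}")
    return status, screenshots
```

The fetch parameter lets you swap in a stub for tests or a client that adds authentication headers if your deployment requires them.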
- Start backend
source .venv/bin/activate
python backend/manage.py runserver
Ensure backend/.env has AGENT_USE_MOCK=true so the flow does not require real cloud credentials.
- Start frontend
cd frontend
npm run dev
Visit http://localhost:3000.
The production agent now parses a small, safe catalog of cloud insights through a planner → structured executor → evidence pipeline. Commands outside this catalog remain “unsupported” and the Run button stays disabled. See docs/capabilities.md for the full capability matrix.
| Capability | Natural-language cues | Requirements | Returned Data |
|---|---|---|---|
| list_vms | “how many vm instances”, “list the GCE instances” | include project <id> (zone/region optional) | VM rows with status, zone, machine type, IPs |
| list_projects | “list projects”, “show all projects” | none | Project id, display name, state |
| list_service_accounts | “service accounts in project X” | include project <id> | account display name, email, disabled flag |
| check_bigquery_api | “is BigQuery API enabled for X” | include project <id> | API enablement state + timestamp |
Every successful run persists a structured JSON payload (task.result_payload) and the UI renders it in the Run + History views alongside the action timeline and screenshots. Mock mode emits believable data for each capability so demos feel identical to live mode.
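To make the catalog idea concrete, here is a small illustrative matcher. It is not the project's planner: the cue phrases are copied from the table above, while the CATALOG layout, the Plan dataclass, and the project <id> regex are assumptions made for this sketch. Commands that match no cue come back as unsupported, and a matched capability is only runnable when its project requirement is met.

```python
import re
from dataclasses import dataclass, field

# Cue phrases taken from the capability table; the structure is hypothetical.
CATALOG = {
    "list_vms": {"cues": ("how many vm instances", "list the gce instances"), "needs_project": True},
    "list_projects": {"cues": ("list projects", "show all projects"), "needs_project": False},
    "list_service_accounts": {"cues": ("service accounts in project",), "needs_project": True},
    "check_bigquery_api": {"cues": ("is bigquery api enabled",), "needs_project": True},
}

# Assumed convention: the project is given as "project <id>".
PROJECT_RE = re.compile(r"\bproject\s+([a-z][a-z0-9-]*)", re.IGNORECASE)


@dataclass
class Plan:
    capability: str  # a catalog key, or "unsupported"
    params: dict = field(default_factory=dict)
    runnable: bool = False  # would the Run button be enabled?


def plan_command(command: str) -> Plan:
    """Map a natural-language command to a catalog capability, if any."""
    text = command.lower()
    for name, spec in CATALOG.items():
        if any(cue in text for cue in spec["cues"]):
            match = PROJECT_RE.search(command)
            params = {"project": match.group(1)} if match else {}
            runnable = bool(match) or not spec["needs_project"]
            return Plan(name, params, runnable)
    return Plan("unsupported")
```

Plain substring matching keeps the sketch short. A real planner would need to handle paraphrases and extract optional parameters such as zone or region.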
- Provide Google Cloud credentials and agent env vars:
AGENT_USE_MOCK=false
AGENT_MODEL=ChatBrowserUse
GOOGLE_CLOUD_PROJECT=your-project
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_APPLICATION_CREDENTIALS=/secrets/gcp-creds.json
BROWSER_CDP_PORT=9222
SECRET_REDACT_HINTS=password,secret,api_key
- Install Playwright in the backend image (python -m playwright install chromium); this is already baked into the provided Dockerfile.
- When using Docker Compose, mount the service-account JSON plus persistent browser artifacts:
services:
  backend:
    volumes:
      - backend-data:/data
      - ./secrets/gcp-creds.json:/secrets/gcp-creds.json:ro
      - ./storage/screenshots:/data/storage/screenshots
      - ./storage/profiles:/data/storage/profiles
- From the Next.js wizard, submit one of the supported tasks. The Review step shows the recognized plan type + parameters, and the Run step streams structured action logs, metrics, and real screenshots captured from Chromium.
Every real run logs planner decisions, task state transitions, and agent actions via apps.governance.audit_logger. Sensitive values (keywords listed in SECRET_REDACT_HINTS or registered secrets) are automatically redacted before persisting logs, comments, or action metadata.
- Submit – Type a natural-language instruction (e.g., “Create a mock VM named demo”), optionally add a business summary, and choose whether to run as a dry-run preview.
- Review – Inspect the generated plan steps, see whether policy requires approval, and (if you are an Admin) approve sensitive tasks.
- Login – Switch to the embedded browser window, complete the manual Google login (in mock mode you can simply click “I’ve logged in”). The wizard polls /api/sessions/login-status/ and surfaces the session ID + login flag.
- Run – Click “Run task”. The frontend calls POST /api/tasks/{id}/execute/, then polls /api/tasks/{id}/ until it transitions from pending → running → completed/failed. When completed, it fetches /api/tasks/{id}/evidence/ to render screenshots, comments, and logs.
- Collaborate & Report – Add review comments, refresh audit logs/metrics, and download the CSV report for business stakeholders.
# All commands from repo root
docker compose up --build # or: make dev
# backend → http://localhost:8000
# frontend → http://localhost:3000
The compose file (and Makefile) default to mock mode, mount sqlite/screenshots into the backend-data volume, and wire the frontend to the backend via NEXT_PUBLIC_API_URL=http://backend:8000. Stop the stack with docker compose down, tail logs with make logs, and run backend tests in-container via make test-backend. When you are ready for live automation, override AGENT_USE_MOCK=false and provide the relevant Vertex/Gemini credentials through docker-compose.override.yml or environment variables.
The project includes example scripts for direct browser automation without the full desktop application. The main example script is gcp_manual_login_persistent_new17.py:
Prerequisites:
- Ensure you have installed dependencies using uv sync --all-extras (or uv sync if you prefer)
- Credentials Setup:
  - The actual Google Cloud service account credentials are stored in the JSON file (e.g., nice-script-404403-9d12fbf6a127.json in the project root)
  - The .env file does NOT store credentials; it only contains a path reference to the JSON file
  - Configure your .env file to point to the credentials JSON file:
    GOOGLE_CLOUD_PROJECT=your-project-id
    GOOGLE_CLOUD_LOCATION=us-central1
    GOOGLE_APPLICATION_CREDENTIALS=/home/ubuntu/SE/All_project/BROWSER_USE_TEST/cloud_console_automation_agent/nice-script-404403-9d12fbf6a127.json
  - Important: Update GOOGLE_APPLICATION_CREDENTIALS with the absolute path to your nice-script-404403-9d12fbf6a127.json file. The credentials themselves remain in the JSON file, not in .env.
Run the example script:
# From the project root directory
uv run python examples/gcp_manual_login_persistent_new17.py
What the script does:
- Opens a browser and navigates to GCP Console sign-in page
- Waits for you to manually log in (handles 2FA, security checks)
- After login, accepts natural language commands interactively
- Executes GCP console tasks using AI agents (Gemini 2.5 Flash)
- Maintains a persistent browser session across multiple commands
- Captures screenshots and logs all actions in the runs/ directory
Example usage:
$ uv run python examples/gcp_manual_login_persistent_new17.py
[Agent navigates to GCP console...]
Log in manually, then press Enter …
[Complete login in browser]
<Enter>
Next GCP action (or 'exit'): List all VM instances
[Agent executes task...]
Next GCP action (or 'exit'): Create a new storage bucket
[Agent executes task...]
Next GCP action (or 'exit'): exit
Note: Use uv run to ensure you're using the correct Python environment with all dependencies installed. Alternatively, you can activate the virtual environment first:
source .venv/bin/activate # On Windows: .venv\Scripts\activate
python examples/gcp_manual_login_persistent_new17.py
For detailed documentation about this script, see examples/README_gcp_manual_login_persistent_new17.md.
CC_Manager-/
├── backend/ # Django backend server
│ ├── config/ # Django configuration
│ │ ├── settings/ # Environment-specific settings
│ │ ├── urls.py # URL routing
│ │ └── asgi.py # WebSocket support
│ ├── apps/ # Django applications
│ │ ├── agents/ # Agent lifecycle management
│ │ ├── tasks/ # Task execution & queuing
│ │ ├── chat/ # Chat history storage
│ │ ├── sessions/ # Session & cookie management
│ │ └── screenshots/ # Screenshot capture & storage
│ ├── core/ # Shared utilities
│ ├── storage/ # File storage (cookies, screenshots)
│ ├── tests/ # Backend tests
│ ├── manage.py # Django management
│ └── requirements.txt # Python dependencies
│
├── frontend/ # Electron + Next.js frontend
│ ├── main/ # Electron main process
│ │ ├── index.ts # Application entry point
│ │ ├── window.ts # Window management
│ │ ├── browser-view.ts # Embedded browser controller
│ │ ├── backend-launcher.ts # Backend process manager
│ │ └── ipc-handlers.ts # IPC communication
│ ├── renderer/ # Next.js renderer process
│ │ ├── app/ # Next.js App Router pages
│ │ ├── components/ # React components
│ │ │ ├── browser/ # Browser view components
│ │ │ ├── chat/ # Chat interface
│ │ │ ├── controls/ # Play/pause controls
│ │ │ └── history/ # Task history
│ │ ├── hooks/ # Custom React hooks
│ │ ├── stores/ # Zustand state stores
│ │ ├── services/ # API & WebSocket clients
│ │ └── types/ # TypeScript types
│ ├── package.json # Node dependencies
│ └── electron-builder.yml # Build configuration
│
├── browser_use/ # browser-use agent framework
│ ├── agent/ # Agent orchestration
│ ├── browser/ # Browser session management
│ ├── dom/ # DOM extraction & analysis
│ ├── llm/ # LLM integration layer
│ └── tools/ # Action registry
│
├── docs/ # Documentation
│ ├── backend_design_spec.md # Backend architecture
│ ├── frontend_design_spec.md # Frontend architecture
│ └── testing_plan.md # Testing strategy
│
├── tests/ # Integration tests
├── examples/ # Usage examples
├── docker/ # Docker configuration
│
├── README.md # This file
├── CLAUDE.md # Instructions for Claude Code
├── LICENSE # MIT License
└── pyproject.toml # Python project config
The running list of specification epics / stories and their implementation state lives in docs/user_story_coverage.md. Update that file whenever a story’s status changes or when the missing specification artifacts are committed.
cd backend
# Activate the virtual environment (Quick Start creates it as .venv in the repo root)
source ../.venv/bin/activate
# Run development server with auto-reload
uvicorn config.asgi:application --reload --port 8000
# Run tests
python manage.py test
# Create database migrations
python manage.py makemigrations
python manage.py migrate
# Access admin panel
python manage.py createsuperuser
# Visit http://localhost:8000/admin/
cd frontend
# Start Next.js dev server (hot reload)
npm run dev:next
# Start Electron (restart on main process changes)
npm run dev:electron
# Run concurrently (both at once)
npm run dev
# Build for production
npm run build
npm run electron:build
# Type checking
npm run type-check
# Linting
npm run lint
- Backend changes: Edit Python files, server auto-reloads
- Frontend UI: Edit renderer/components, hot reload active
- Electron main: Edit main/ files, restart Electron app
- Database changes: Create migrations, apply with migrate
- API changes: Update both backend endpoints and frontend services
Detailed documentation is available in the /docs directory:
- Backend Design Specification: Complete backend architecture, API endpoints, database models, WebSocket protocol
- Frontend Design Specification: Frontend architecture, component hierarchy, state management, UI design
- Testing Plan: Comprehensive testing strategy and test cases
Additional resources:
- CLAUDE.md: Instructions for working with this codebase using Claude Code
- Examples: Usage examples and code samples
- Browser-Use Docs: browser-use framework documentation
cd backend
# Run all tests
python manage.py test
# Run specific app tests
python manage.py test apps.agents
python manage.py test apps.tasks
# Run with coverage
pip install coverage
coverage run --source='.' manage.py test
coverage report
cd frontend
# Run unit tests (when implemented)
npm test
# Run E2E tests (when implemented)
npm run test:e2e
# Type checking
npm run type-check
# From repository root
cd tests/
# Run integration test suite (when implemented)
pytest -v
Build for current platform:
cd frontend
npm run build
npm run electron:build
Platform-specific builds:
npm run electron:build:mac # macOS (.dmg)
npm run electron:build:win # Windows (.exe)
npm run electron:build:linux # Linux (.AppImage, .deb)
Output location:
- Windows: frontend/dist/win-unpacked/ and .exe installer
- macOS: frontend/dist/mac/ and .dmg disk image
- Linux: frontend/dist/linux-unpacked/ and .AppImage
The built application is self-contained:
- Includes Django backend bundled
- Includes Python runtime
- Includes Node.js/Electron runtime
- Requires only Chrome/Chromium on target system
Installation:
- Download platform-specific installer
- Run installer (may require admin/sudo)
- Launch "Cloud Console Manager"
- Configure Google Cloud credentials on first run
- Log into GCP Console
- Start automating!
Backend won't start:
- Check Python version: python3 --version (needs 3.11+)
- Verify dependencies: pip install -r requirements.txt
- Check port 8000 not in use: lsof -i :8000 (macOS/Linux)
- Review backend logs for errors
Agent execution fails:
- Verify Google Cloud credentials in .env
- Ensure Vertex AI API is enabled in GCP project
- Check service account has necessary permissions
- Confirm AGENT_MODEL is valid (default: gemini-2.5-flash)
Database errors:
- Delete db.sqlite3 and run migrations again
- Check file permissions on database file
- For SQLite locked errors, ensure only one backend instance running
Electron won't start:
- Check Node.js version: node --version (needs 18+)
- Clear node_modules and reinstall: rm -rf node_modules && npm install
- Check backend is running on port 8000
- Review Electron main process logs in terminal
Browser view not showing:
- Ensure Chrome/Chromium installed on system
- Check IPC communication logs
- Verify browser-view bounds calculation
- Try restarting application
WebSocket connection fails:
- Confirm backend WebSocket endpoint: ws://localhost:8000/ws/agent/
- Check firewall not blocking WebSocket connections
- Verify NEXT_PUBLIC_WS_URL in .env.local
- Review browser console for errors
"Permission denied" errors in GCP:
- Service account needs proper IAM roles
- Check project-level permissions
- Some operations require Owner/Editor role
Browser automation not working:
- GCP Console UI may have changed (agents adapt but may need updates)
- Clear browser profile: rm -rf backend/storage/profiles/
- Check CDP port not in use: lsof -i :9222
Login session expires:
- Manually log in again when prompted
- Cookie encryption key changed (check DJANGO_SECRET_KEY)
- Clear stored cookies: rm -rf backend/storage/cookies/
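The port checks in this section rely on lsof, which is only available on macOS/Linux. A portable alternative is a short Python probe. This is a minimal sketch using only the standard library; port_in_use is a hypothetical helper name, and it only detects listeners reachable on the given host (127.0.0.1 by default).

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

Checking ports 8000 (backend), 3000 (frontend), and 9222 (CDP) before starting the stack shows which of them are already taken. Unlike lsof, the probe cannot tell you which process owns the port.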
Contributions are welcome! Here's how to get started:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes:
  - Backend: Follow Django best practices, add tests
  - Frontend: Follow React/TypeScript conventions, add types
  - Documentation: Update relevant .md files
- Test thoroughly:
  - Run backend tests: python manage.py test
  - Run frontend type check: npm run type-check
  - Test manually in application
- Commit changes: git commit -m "Add amazing feature"
- Push to branch: git push origin feature/amazing-feature
- Open Pull Request
Python (Backend):
- Use tabs for indentation (not spaces)
- Type hints with modern syntax (str | None, not Optional[str])
- Async/await for concurrent operations
- Django conventions for models/views/serializers
TypeScript (Frontend):
- Strict mode enabled
- Explicit types for function parameters/returns
- React hooks for state/effects
- Zustand for global state
See CLAUDE.md for detailed development guidelines.
- ✅ Basic agent execution
- ✅ Embedded browser view
- ✅ Chat interface
- ✅ Task history
- ✅ Cookie persistence
- Multi-cloud support (AWS, Azure)
- Task templates/presets
- Video recording of sessions
- Export chat/history to PDF
- Keyboard shortcuts
- Multi-user support
- Cloud sync of task history
- Browser-Use Cloud integration
- Custom agent prompts
- Plugin system
This project is licensed under the MIT License - see the LICENSE file for details.
- browser-use: AI browser automation framework
- Django: Python web framework
- Electron: Desktop application framework
- Next.js: React framework
- Google Cloud: Vertex AI and Gemini LLM
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: dpraj007@example.com
Made with ❤️ for GCP automation