Cloud Console Manager

Cloud Console Automation Agent – submit natural-language cloud console tasks, review an automatically generated plan, confirm login, and watch a mock (or real) browser-use agent execute the run while capturing evidence.

📋 Table of Contents

Overview
Architecture
Features
Prerequisites
Quick Start
Project Structure
Development
Documentation
Testing
Deployment
Troubleshooting
Contributing
License

Overview

Cloud Console Manager is an intelligent desktop application that allows you to control Google Cloud Platform through natural language commands. Instead of manually clicking through the GCP Console interface, simply tell the agent what you want to do:

"Create a new VM instance named test-vm in us-central1"
"List all Cloud Storage buckets"
"Delete unused VM instances"
"Set up a load balancer for my production VMs"

The application embeds a real Chrome browser showing the GCP Console, while an AI agent (powered by Google's Gemini) autonomously performs the requested actions. You can watch it work in real-time, pause execution, or manually intervene when needed.

Key Benefits

✅ Natural Language Control: Use plain English instead of complex CLI commands
✅ Visual Feedback: Watch the agent work in an embedded browser
✅ Safe & Controllable: Pause/resume execution, manual intervention
✅ Full History: Review past tasks with screenshots and logs
✅ Session Persistence: Automatic login cookie management

Architecture

Cloud Console Manager is a three-tier desktop solution:

Django orchestration backend (/backend) exposes the REST/WebSocket APIs, task queue, governance/audit subsystems, and agent lifecycle controls.
Next.js + Electron wizard (/frontend) renders the 4-step Submit → Review → Login → Run UI plus role selection, approvals, comments, and evidence cards.
Browser-Use automation stack (bundled under /browser_use) drives Playwright/Chromium (or a mock agent when AGENT_USE_MOCK=true) to perform user commands.

Key platform capabilities now include:

Role-aware sessions with per-role permissions and filtered task views.
Natural-language task submission with auto-generated plans, dry-run previews, and policy-driven approvals.
Governance primitives: audit logs, comments, evidence bundles, CSV exports, and monitoring metrics.

The components interact as follows:

┌─────────────────────────────────────────────────────────────────┐
│                    ELECTRON DESKTOP APP                          │
│  ┌───────────────────────────┬───────────────────────────────┐  │
│  │   Embedded Browser        │   Chat Interface              │  │
│  │   (GCP Console)           │   - Send commands             │  │
│  │   - Real Chrome           │   - Agent thinking            │  │
│  │   - User can click        │   - Action logs               │  │
│  │   - Agent controlled      │   - Play/Pause control        │  │
│  └───────────────────────────┴───────────────────────────────┘  │
└──────────────────────┬──────────────────────────────────────────┘
                       │ REST API + WebSocket
┌──────────────────────▼──────────────────────────────────────────┐
│                    DJANGO BACKEND                                │
│  - Agent lifecycle management                                    │
│  - Task queuing & execution                                      │
│  - WebSocket real-time updates                                   │
│  - Session & cookie persistence                                  │
│  - Chat history & screenshot storage                             │
└──────────────────────┬──────────────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────────────┐
│                BROWSER-USE AGENT FRAMEWORK                       │
│  - AI-powered browser automation                                 │
│  - Gemini 2.5 Flash LLM integration                              │
│  - Chrome DevTools Protocol (CDP)                                │
│  - DOM analysis & action execution                               │
└──────────────────────────────────────────────────────────────────┘

Technology Stack

Backend:

Django 5.x (REST API & WebSocket)
Django Channels (WebSocket communication)
SQLite (data persistence)
browser-use (AI agent framework)
Google Vertex AI (Gemini LLM)

Frontend:

Electron 28+ (desktop framework)
Next.js 14+ (React framework)
TypeScript (type safety)
Tailwind CSS (styling)
Zustand (state management)
Socket.io (WebSocket client)

Agent:

browser-use library (async Python)
cdp-use (Chrome DevTools Protocol)
Gemini 2.5 Flash (LLM)

Features

🤖 AI-Powered Automation

Natural language command processing
Intelligent GCP Console navigation
Automatic action execution with retries
Context-aware decision making

🖥️ Desktop Experience

Native desktop application (Windows, macOS, Linux)
Embedded browser view of GCP Console
Real-time agent thinking and action display
Chat-style command interface

🎮 User Control

Play/Pause: Stop agent execution anytime
Manual Intervention: Click in browser when needed
Task Queue: Submit multiple commands
Action Review: See every step the agent takes

📊 History & Monitoring

Complete task history
Screenshot timeline for each task
Detailed execution logs
Success/failure tracking

🔐 Security & Persistence

Encrypted cookie storage
Automatic GCP login management
Local data storage (no cloud sync)
Session persistence across restarts

Prerequisites

System Requirements

Operating System: Windows 10+, macOS 11+, or Linux (Ubuntu 20.04+)
Display: Minimum 1200x700 resolution
Memory: 4GB RAM minimum, 8GB recommended
Storage: 2GB free space

Software Requirements

Backend:

Python 3.11 or higher
Chrome/Chromium browser installed
pip or uv package manager

Frontend:

Node.js 18 or higher
npm or yarn package manager

Google Cloud:

GCP project with Vertex AI enabled
Service account with appropriate permissions
Service account credentials JSON file

Containerized Quickstart

Prerequisite: Install Docker Desktop (or Docker Engine) with the Docker Compose plugin.

From the repo root run docker compose up --build (or make dev to use the Makefile wrapper).
Open http://localhost:3000 for the Next.js wizard and http://localhost:8000/api/tasks/ for the Django API.
Containers default to mock agent mode (AGENT_USE_MOCK=true) and persist SQLite + media data inside the backend-data volume.
Stop or inspect with make stop / make logs.
Run backend integration tests in-container anytime via make test-backend; run frontend unit tests with make test-frontend.

Switching to a real agent: edit docker-compose.yml (or create docker-compose.override.yml) to set AGENT_USE_MOCK=false, provide your AGENT_MODEL, Vertex/Gemini credentials, and any Playwright-specific env like BROWSER_CDP_PORT. The backend image already includes Chromium + Playwright dependencies—just ensure the required API keys are mounted/available before docker compose up.

Quick Start

Local Development (Mock Mode by Default)

Clone & bootstrap

git clone https://github.com/dpraj007/CC_Manager-.git
cd CC_Manager-
uv venv --python 3.11
source .venv/bin/activate  # Windows: .venv\Scripts\activate
uv pip install -r backend/requirements.txt
uv pip install -e .
cp backend/.env.example backend/.env
# Optional (only for real browser automation): uvx playwright install chromium

Run the Django API

cd backend
python manage.py migrate
python manage.py runserver 0.0.0.0:8000

Leave the terminal running. You can confirm readiness anytime with:

curl http://localhost:8000/health

Run the Next.js wizard

cd frontend
npm install
# Optionally point at a different backend:
# export NEXT_PUBLIC_API_URL=http://localhost:8000
npm run dev

Visit http://localhost:3000 and walk through Submit → Review → Login → Run. The UI pings /health before enabling actions, so if the backend is down you’ll see a “Retry connection” banner instead of silent failures. backend/.env sets AGENT_USE_MOCK=true by default, so mock evidence flows out of the box; set it to false and supply real credentials when you are ready for a full browser automation run.
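
The same readiness check can be scripted, for example before launching the frontend from a helper script. The following is a minimal sketch rather than code from this repository: `http_probe` and `wait_for_backend` are hypothetical helper names, and the retry count and delay are arbitrary assumptions. Only the /health URL comes from this README.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True when `url` answers with a 2xx status, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # HTTPError (4xx/5xx) is a URLError subclass; timeouts surface as OSError.
        return False


def wait_for_backend(
    probe: Callable[[], bool],
    attempts: int = 10,
    delay_seconds: float = 1.0,
) -> bool:
    """Call `probe` until it succeeds or `attempts` is exhausted."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

A typical call would be `wait_for_backend(lambda: http_probe("http://localhost:8000/health"))`. Passing the probe as a callable keeps the retry loop testable without a running server.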

First run script
Enter a natural-language command on the Submit step (for example, “List all VM instances in my project”).
Confirm the command on the Review step or go back to edit it.
Manually log into the embedded Google Cloud Console, then use the Login step to check and confirm authentication. In mock mode (the default) no real cloud credentials are needed; set AGENT_USE_MOCK=false in backend/.env when you are ready for the full browser-use automation.
Click Run task to trigger execution. The Run step polls /api/tasks/{id}/ for status transitions and automatically fetches /api/screenshots/?task_id=... once the task completes.
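
The polling behaviour of the Run step can be approximated from a script. This is an illustrative sketch, not the frontend's actual implementation: `poll_task` is a hypothetical helper, the interval and poll limit are assumptions, and the terminal state names (`completed`, `failed`) are taken from the Run description later in this README. The status fetcher is injected so the loop does not depend on any particular HTTP client.

```python
import time
from typing import Callable

# Terminal states as described in this README's Run step; other states may exist.
TERMINAL_STATES = {"completed", "failed"}


def poll_task(
    fetch_status: Callable[[], str],
    interval_seconds: float = 2.0,
    max_polls: int = 150,
) -> list[str]:
    """Poll until a terminal state is seen; return the distinct states in order."""
    seen: list[str] = []
    for _ in range(max_polls):
        status = fetch_status()
        # Record a state only when it differs from the previous one.
        if not seen or seen[-1] != status:
            seen.append(status)
        if status in TERMINAL_STATES:
            return seen
        time.sleep(interval_seconds)
    last = seen[-1] if seen else "unknown"
    raise TimeoutError(f"task still {last!r} after {max_polls} polls")
```

In practice `fetch_status` would issue a GET against /api/tasks/{id}/ and return the status field from the response; the exact field name is not documented here, so check the API response before relying on it.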

Demo Script (Submit → Review → Login → Run)

Start backend
```
source .venv/bin/activate
python backend/manage.py runserver
```
Ensure backend/.env has AGENT_USE_MOCK=true so the flow does not require real cloud credentials.
Start frontend
```
cd frontend
npm run dev
```
Visit http://localhost:3000.

Supported Real Automations (AGENT_USE_MOCK=false)

The production agent now handles a small, safe catalog of cloud insight queries through a planner → structured executor → evidence pipeline. Commands outside this catalog are marked “unsupported” and the Run button stays disabled. See docs/capabilities.md for the full capability matrix.

| Capability | Natural-language cues | Requirements | Returned data |
| --- | --- | --- | --- |
| `list_vms` | “how many vm instances”, “list the GCE instances” | include `project <id>` (zone/region optional) | VM rows with status, zone, machine type, IPs |
| `list_projects` | “list projects”, “show all projects” | none | Project id, display name, state |
| `list_service_accounts` | “service accounts in project X” | include `project <id>` | Account display name, email, disabled flag |
| `check_bigquery_api` | “is BigQuery API enabled for X” | include `project <id>` | API enablement state + timestamp |

Every successful run persists a structured JSON payload (task.result_payload) and the UI renders it in the Run + History views alongside the action timeline and screenshots. Mock mode emits believable data for each capability so demos feel identical to live mode.

Enabling the real agent

Provide Google Cloud credentials and agent env vars:

AGENT_USE_MOCK=false
AGENT_MODEL=ChatBrowserUse
GOOGLE_CLOUD_PROJECT=your-project
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_APPLICATION_CREDENTIALS=/secrets/gcp-creds.json
BROWSER_CDP_PORT=9222
SECRET_REDACT_HINTS=password,secret,api_key

Install Playwright in the backend image (python -m playwright install chromium) – already baked into the provided Dockerfile.

When using Docker Compose, mount the service-account JSON plus persistent browser artifacts:

services:
  backend:
    volumes:
      - backend-data:/data
      - ./secrets/gcp-creds.json:/secrets/gcp-creds.json:ro
      - ./storage/screenshots:/data/storage/screenshots
      - ./storage/profiles:/data/storage/profiles

From the Next.js wizard, submit one of the supported tasks. The Review step shows the recognized plan type + parameters, and the Run step streams structured action logs, metrics, and real screenshots captured from Chromium.
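
Before switching off mock mode it can help to check that the environment variables listed above are actually set. The sketch below is an assumption-laden illustration, not the backend's own validation: the helper names are hypothetical, treating every listed variable as required is a guess, and how the backend parses AGENT_USE_MOCK is not documented here.

```python
from collections.abc import Mapping

# Settings this README lists for live runs; treating all of them as required is an assumption.
REAL_AGENT_SETTINGS = (
    "AGENT_MODEL",
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def mock_enabled(env: Mapping[str, str]) -> bool:
    """Mock mode is the documented default, so only an explicit 'false' disables it."""
    return env.get("AGENT_USE_MOCK", "true").strip().lower() != "false"


def missing_real_agent_settings(env: Mapping[str, str]) -> list[str]:
    """Return the names of live-run settings that are unset or blank."""
    if mock_enabled(env):
        return []  # nothing is required in mock mode
    return [name for name in REAL_AGENT_SETTINGS if not env.get(name, "").strip()]
```

Calling `missing_real_agent_settings(os.environ)` at startup returns an empty list when the configuration looks complete, and the names of the missing variables otherwise.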

Every real run logs planner decisions, task state transitions, and agent actions via apps.governance.audit_logger. Sensitive values (keywords listed in SECRET_REDACT_HINTS or registered secrets) are automatically redacted before persisting logs, comments, or action metadata.

Submit – Type a natural-language instruction (e.g., “Create a mock VM named demo”), optionally add a business summary, and choose whether to run as a dry-run preview.
Review – Inspect the generated plan steps, see whether policy requires approval, and (if you are an Admin) approve sensitive tasks.
Login – Switch to the embedded browser window, complete the manual Google login (in mock mode you can simply click “I’ve logged in”). The wizard polls /api/sessions/login-status/ and surfaces the session ID + login flag.
Run – Click “Run task”. The frontend calls POST /api/tasks/{id}/execute/, then polls /api/tasks/{id}/ until it transitions from pending → running → completed/failed. When completed, it fetches /api/tasks/{id}/evidence/ to render screenshots, comments, and logs.
Collaborate & Report – Add review comments, refresh audit logs/metrics, and download the CSV report for business stakeholders.
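
The task lifecycle described in the Run step (pending → running → completed/failed) can be written down as a small transition table, which is handy when asserting on recorded status histories in tests. This is a sketch only: the table encodes just the transitions named above, and the backend may allow others (for example a direct pending → failed) that this README does not mention.

```python
# Transitions named in the Run step description; the backend may permit more.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "pending": {"running"},
    "running": {"completed", "failed"},
    "completed": set(),
    "failed": set(),
}


def is_valid_history(states: list[str]) -> bool:
    """True when the sequence starts at 'pending' and every step is an allowed transition."""
    if not states or states[0] != "pending":
        return False
    return all(
        later in ALLOWED_TRANSITIONS.get(earlier, set())
        for earlier, later in zip(states, states[1:])
    )
```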

Containerized Development

# All commands from repo root
docker compose up --build             # or: make dev
# backend → http://localhost:8000
# frontend → http://localhost:3000

The compose file (and Makefile) default to mock mode, mount sqlite/screenshots into the backend-data volume, and wire the frontend to the backend via NEXT_PUBLIC_API_URL=http://backend:8000. Stop the stack with docker compose down, tail logs with make logs, and run backend tests in-container via make test-backend. When you are ready for live automation, override AGENT_USE_MOCK=false and provide the relevant Vertex/Gemini credentials through docker-compose.override.yml or environment variables.

Running Example Scripts (Optional)

The project includes example scripts for direct browser automation without the full desktop application. The main example script is gcp_manual_login_persistent_new17.py:

Prerequisites:

Ensure you have installed dependencies using uv sync --all-extras (or uv sync if you prefer)
Credentials Setup:
- The actual Google Cloud service account credentials are stored in the JSON file (e.g., nice-script-404403-9d12fbf6a127.json in the project root)
- The .env file does NOT store credentials - it only contains a path reference to the JSON file
- Configure your .env file to point to the credentials JSON file:
```
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_APPLICATION_CREDENTIALS=/home/ubuntu/SE/All_project/BROWSER_USE_TEST/cloud_console_automation_agent/nice-script-404403-9d12fbf6a127.json
```
- Important: Update GOOGLE_APPLICATION_CREDENTIALS with the absolute path to your nice-script-404403-9d12fbf6a127.json file. The credentials themselves remain in the JSON file, not in .env.

Run the example script:

# From the project root directory
uv run python examples/gcp_manual_login_persistent_new17.py

What the script does:

Opens a browser and navigates to GCP Console sign-in page
Waits for you to manually log in (handles 2FA, security checks)
After login, accepts natural language commands interactively
Executes GCP console tasks using AI agents (Gemini 2.5 Flash)
Maintains a persistent browser session across multiple commands
Captures screenshots and logs all actions in runs/ directory

Example usage:

$ uv run python examples/gcp_manual_login_persistent_new17.py
[Agent navigates to GCP console...]

Log in manually, then press Enter …
[Complete login in browser]
<Enter>

Next GCP action (or 'exit'): List all VM instances
[Agent executes task...]

Next GCP action (or 'exit'): Create a new storage bucket
[Agent executes task...]

Next GCP action (or 'exit'): exit

Note: Use uv run to ensure you're using the correct Python environment with all dependencies installed. Alternatively, you can activate the virtual environment first:

source .venv/bin/activate  # On Windows: .venv\Scripts\activate
python examples/gcp_manual_login_persistent_new17.py

For detailed documentation about this script, see examples/README_gcp_manual_login_persistent_new17.md.

Project Structure

CC_Manager-/
├── backend/                        # Django backend server
│   ├── config/                    # Django configuration
│   │   ├── settings/             # Environment-specific settings
│   │   ├── urls.py               # URL routing
│   │   └── asgi.py               # WebSocket support
│   ├── apps/                     # Django applications
│   │   ├── agents/               # Agent lifecycle management
│   │   ├── tasks/                # Task execution & queuing
│   │   ├── chat/                 # Chat history storage
│   │   ├── sessions/             # Session & cookie management
│   │   └── screenshots/          # Screenshot capture & storage
│   ├── core/                     # Shared utilities
│   ├── storage/                  # File storage (cookies, screenshots)
│   ├── tests/                    # Backend tests
│   ├── manage.py                 # Django management
│   └── requirements.txt          # Python dependencies
│
├── frontend/                      # Electron + Next.js frontend
│   ├── main/                     # Electron main process
│   │   ├── index.ts              # Application entry point
│   │   ├── window.ts             # Window management
│   │   ├── browser-view.ts       # Embedded browser controller
│   │   ├── backend-launcher.ts   # Backend process manager
│   │   └── ipc-handlers.ts       # IPC communication
│   ├── renderer/                 # Next.js renderer process
│   │   ├── app/                  # Next.js App Router pages
│   │   ├── components/           # React components
│   │   │   ├── browser/          # Browser view components
│   │   │   ├── chat/             # Chat interface
│   │   │   ├── controls/         # Play/pause controls
│   │   │   └── history/          # Task history
│   │   ├── hooks/                # Custom React hooks
│   │   ├── stores/               # Zustand state stores
│   │   ├── services/             # API & WebSocket clients
│   │   └── types/                # TypeScript types
│   ├── package.json              # Node dependencies
│   └── electron-builder.yml      # Build configuration
│
├── browser_use/                   # browser-use agent framework
│   ├── agent/                    # Agent orchestration
│   ├── browser/                  # Browser session management
│   ├── dom/                      # DOM extraction & analysis
│   ├── llm/                      # LLM integration layer
│   └── tools/                    # Action registry
│
├── docs/                          # Documentation
│   ├── backend_design_spec.md    # Backend architecture
│   ├── frontend_design_spec.md   # Frontend architecture
│   └── testing_plan.md           # Testing strategy
│
├── tests/                         # Integration tests
├── examples/                      # Usage examples
├── docker/                        # Docker configuration
│
├── README.md                      # This file
├── CLAUDE.md                      # Instructions for Claude Code
├── LICENSE                        # MIT License
└── pyproject.toml                # Python project config

User Story Coverage

The running list of specification epics / stories and their implementation state lives in docs/user_story_coverage.md. Update that file whenever a story’s status changes or when the missing specification artifacts are committed.

Development

Backend Development

cd backend

# Activate the virtual environment created at the repo root during Quick Start
source ../.venv/bin/activate

# Run development server with auto-reload
uvicorn config.asgi:application --reload --port 8000

# Run tests
python manage.py test

# Create database migrations
python manage.py makemigrations
python manage.py migrate

# Access admin panel
python manage.py createsuperuser
# Visit http://localhost:8000/admin/

Frontend Development

cd frontend

# Start Next.js dev server (hot reload)
npm run dev:next

# Start Electron (restart on main process changes)
npm run dev:electron

# Run concurrently (both at once)
npm run dev

# Build for production
npm run build
npm run electron:build

# Type checking
npm run type-check

# Linting
npm run lint

Development Workflow

Backend changes: Edit Python files, server auto-reloads
Frontend UI: Edit renderer/ components, hot reload active
Electron main: Edit main/ files, restart Electron app
Database changes: Create migrations, apply with migrate
API changes: Update both backend endpoints and frontend services

Documentation

Detailed documentation is available in the /docs directory:

Backend Design Specification: Complete backend architecture, API endpoints, database models, WebSocket protocol
Frontend Design Specification: Frontend architecture, component hierarchy, state management, UI design
Testing Plan: Comprehensive testing strategy and test cases

Additional resources:

CLAUDE.md: Instructions for working with this codebase using Claude Code
Examples: Usage examples and code samples
Browser-Use Docs: browser-use framework documentation

Testing

Backend Tests

cd backend

# Run all tests
python manage.py test

# Run specific app tests
python manage.py test apps.agents
python manage.py test apps.tasks

# Run with coverage
pip install coverage
coverage run --source='.' manage.py test
coverage report

Frontend Tests

cd frontend

# Run unit tests (when implemented)
npm test

# Run E2E tests (when implemented)
npm run test:e2e

# Type checking
npm run type-check

Integration Tests

# From repository root
cd tests/

# Run integration test suite (when implemented)
pytest -v

Deployment

Desktop Application Build

Build for current platform:

cd frontend
npm run build
npm run electron:build

Platform-specific builds:

npm run electron:build:mac     # macOS (.dmg)
npm run electron:build:win     # Windows (.exe)
npm run electron:build:linux   # Linux (.AppImage, .deb)

Output location:

Windows: frontend/dist/win-unpacked/ and .exe installer
macOS: frontend/dist/mac/ and .dmg disk image
Linux: frontend/dist/linux-unpacked/ and .AppImage

Distribution

The built application is self-contained:

Includes Django backend bundled
Includes Python runtime
Includes Node.js/Electron runtime
Requires only Chrome/Chromium on target system

Installation:

Download platform-specific installer
Run installer (may require admin/sudo)
Launch "Cloud Console Manager"
Configure Google Cloud credentials on first run
Log into GCP Console
Start automating!

Troubleshooting

Backend Issues

Backend won't start:

Check Python version: python3 --version (needs 3.11+)
Verify dependencies: pip install -r requirements.txt
Check port 8000 not in use: lsof -i :8000 (macOS/Linux)
Review backend logs for errors
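
`lsof` is only available on macOS and Linux. As a cross-platform alternative, a port check can be done with a few lines of Python. This is a minimal sketch with a hypothetical helper name; it only detects listeners that accept TCP connections on the given host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True when something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return probe.connect_ex((host, port)) == 0
```

For the defaults used in this README, `port_in_use(8000)` checks the Django API port and `port_in_use(9222)` checks the CDP port.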

Agent execution fails:

Verify Google Cloud credentials in .env
Ensure Vertex AI API is enabled in GCP project
Check service account has necessary permissions
Confirm AGENT_MODEL is valid (default: gemini-2.5-flash)

Database errors:

Delete db.sqlite3 and run migrations again (this discards all locally stored data)
Check file permissions on the database file
For SQLite “database is locked” errors, ensure only one backend instance is running

Frontend Issues

Electron won't start:

Check Node.js version: node --version (needs 18+)
Clear node_modules and reinstall: rm -rf node_modules && npm install
Check backend is running on port 8000
Review Electron main process logs in terminal

Browser view not showing:

Ensure Chrome/Chromium installed on system
Check IPC communication logs
Verify browser-view bounds calculation
Try restarting application

WebSocket connection fails:

Confirm backend WebSocket endpoint: ws://localhost:8000/ws/agent/
Check firewall not blocking WebSocket connections
Verify NEXT_PUBLIC_WS_URL in .env.local
Review browser console for errors

Common Issues

"Permission denied" errors in GCP:

Service account needs proper IAM roles
Check project-level permissions
Some operations require Owner/Editor role

Browser automation not working:

GCP Console UI may have changed (agents adapt but may need updates)
Clear browser profile: rm -rf backend/storage/profiles/
Check CDP port not in use: lsof -i :9222

Login session expires:

Manually log in again when prompted
Cookie encryption key changed (check DJANGO_SECRET_KEY)
Clear stored cookies: rm -rf backend/storage/cookies/

Contributing

Contributions are welcome! Here's how to get started:

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Make your changes:
- Backend: Follow Django best practices, add tests
- Frontend: Follow React/TypeScript conventions, add types
- Documentation: Update relevant .md files
Test thoroughly:
- Run backend tests: python manage.py test
- Run frontend type check: npm run type-check
- Test manually in application
Commit changes: git commit -m "Add amazing feature"
Push to branch: git push origin feature/amazing-feature
Open Pull Request

Code Style

Python (Backend):

Use tabs for indentation (not spaces)
Type hints with modern syntax (str | None, not Optional[str])
Async/await for concurrent operations
Django conventions for models/views/serializers

TypeScript (Frontend):

Strict mode enabled
Explicit types for function parameters/returns
React hooks for state/effects
Zustand for global state

See CLAUDE.md for detailed development guidelines.

Roadmap

v1.0 (Current)

✅ Basic agent execution
✅ Embedded browser view
✅ Chat interface
✅ Task history
✅ Cookie persistence

v1.1 (Planned)

v2.0 (Future)

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

browser-use: AI browser automation framework
Django: Python web framework
Electron: Desktop application framework
Next.js: React framework
Google Cloud: Vertex AI and Gemini LLM

Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Email: dpraj007@example.com

Made with ❤️ for GCP automation

⭐ Star this repo | 🐛 Report Bug | 💡 Request Feature
