| 1 | +# Nebius Provider |
| 2 | + |
| 3 | +This directory contains the Nebius provider implementation for the compute package. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +The Nebius provider implements the `CloudClient` interface defined in `pkg/v1` to provide access to Nebius AI Cloud infrastructure. The implementation follows the official Nebius API documentation at https://github.com/nebius/api and uses the Nebius Go SDK. |
| 8 | + |
| 9 | +## Supported Features |
| 10 | + |
| 11 | +According to the Nebius API documentation, the API exposes endpoints for the following features, which this provider treats as **SUPPORTED**: |
| 12 | + |
| 13 | +### Instance Management |
| 14 | +- ✅ **Create Instance**: `InstanceService.Create` in compute/v1/instance_service.proto |
| 15 | +- ✅ **Get Instance**: `InstanceService.Get` and `InstanceService.GetByName` |
| 16 | +- ✅ **List Instances**: `InstanceService.List` with pagination support |
| 17 | +- ✅ **Terminate Instance**: `InstanceService.Delete` |
| 18 | +- ✅ **Stop Instance**: `InstanceService.Stop` |
| 19 | +- ✅ **Start Instance**: `InstanceService.Start` |
| 20 | +- ✅ **Update Instance**: `InstanceService.Update` |
| 21 | + |
| 22 | +### GPU Cluster Management |
| 23 | +- ✅ **Create GPU Cluster**: `GpuClusterService.Create` in compute/v1/gpu_cluster_service.proto |
| 24 | +- ✅ **Get GPU Cluster**: `GpuClusterService.Get` and `GpuClusterService.GetByName` |
| 25 | +- ✅ **List GPU Clusters**: `GpuClusterService.List` with pagination support |
| 26 | +- ✅ **Delete GPU Cluster**: `GpuClusterService.Delete` |
| 27 | +- ✅ **Update GPU Cluster**: `GpuClusterService.Update` |
| 28 | + |
| 29 | +### Machine Images |
| 30 | +- ✅ **Get Images**: `ImageService.Get`, `ImageService.GetByName`, `ImageService.GetLatestByFamily` |
| 31 | +- ✅ **List Images**: `ImageService.List` with filtering support |
| 32 | + |
| 33 | +### Quota Management |
| 34 | +- ✅ **Get Quotas**: `QuotaAllowanceService` in quotas/v1/quota_allowance_service.proto |
| 35 | + |
| 36 | +## Unsupported Features |
| 37 | + |
| 38 | +The following features are **NOT SUPPORTED** (no corresponding API endpoint was found in the Nebius API documentation): |
| 39 | + |
| 40 | +### Instance Operations |
| 41 | +- ❌ **Reboot Instance**: No reboot endpoint found in instance_service.proto |
| 42 | +- ❌ **Instance Tags**: No dedicated tagging service found |
| 43 | +- ❌ **Change Instance Type**: No instance type modification endpoint |
| 44 | + |
| 45 | +### Volume Management |
| 46 | +- ❌ **Resize Instance Volume**: Volume resizing not clearly documented |
| 47 | + |
| 48 | +### Location Management |
| 49 | +- ❌ **Get Locations**: No location listing service found |
| 50 | + |
| 51 | +### Firewall Management |
| 52 | +- ❌ **Firewall Rules**: Network security handled through VPC service, not instance-level firewall rules |
| 53 | + |
| 54 | +## Implementation Approach |
| 55 | + |
| 56 | +This implementation embeds `NotImplCloudClient` to cover unsupported features: |
| 57 | +- Supported features are currently TODO stubs that reference the corresponding Nebius API service |
| 58 | +- Unsupported features return `ErrNotImplemented` (handled by the embedded `NotImplCloudClient`) |
| 59 | +- Full CloudClient interface compliance is maintained |
| 60 | + |
| 61 | +## Nebius API |
| 62 | + |
| 63 | +The provider integrates with the Nebius AI Cloud API: |
| 64 | +- Endpoint format: `{service-name}.api.nebius.cloud:443` (gRPC) |
| 65 | +- Authentication: Service account based (JWT tokens) |
| 66 | +- SDK: `github.com/nebius/gosdk` |
| 67 | +- Documentation: https://github.com/nebius/api |
| 68 | +- API Type: Locational (location-specific endpoints) |
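As a small illustration of the endpoint format listed above, the helper below builds a gRPC target from a service name. `ServiceEndpoint` is a hypothetical helper written for this example, and `compute` is an assumed service name. In practice the Nebius Go SDK is expected to resolve endpoints itself.

```go
package main

import (
	"fmt"
	"strings"
)

// baseAddress is taken from the endpoint format quoted in this README.
const baseAddress = "api.nebius.cloud:443"

// ServiceEndpoint is a hypothetical helper that fills in the
// {service-name} placeholder and rejects obviously malformed names.
func ServiceEndpoint(service string) (string, error) {
	service = strings.TrimSpace(service)
	if service == "" || strings.ContainsAny(service, " /:") {
		return "", fmt.Errorf("invalid service name %q", service)
	}
	return service + "." + baseAddress, nil
}

func main() {
	ep, err := ServiceEndpoint("compute") // "compute" is an assumed name
	if err != nil {
		panic(err)
	}
	fmt.Println(ep)
}
```

Run as written, this prints `compute.api.nebius.cloud:443`.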
| 69 | + |
| 70 | +## Key Features |
| 71 | + |
| 72 | +Nebius AI Cloud characteristics relevant to this provider: |
| 73 | +- GPU instances and GPU clusters for AI/ML workloads |
| 74 | +- Comprehensive compute, storage, and networking services |
| 75 | +- gRPC-based API with strong typing |
| 76 | +- Service account authentication with JWT tokens |
| 77 | +- Location-specific API endpoints |
| 78 | +- Advanced operations tracking and idempotency |
| 79 | +- Integration with VPC, IAM, billing, and quota services |
| 80 | +- Container registry and managed services |
| 81 | + |
| 82 | +## TODO |
| 83 | + |
| 84 | +- [ ] Implement actual API integration for supported features |
| 85 | +- [ ] Add proper service account authentication handling |
| 86 | +- [ ] Add comprehensive error handling and retry logic |
| 87 | +- [ ] Add logging and monitoring |
| 88 | +- [ ] Add comprehensive testing |
| 89 | +- [ ] Investigate VPC integration for networking features |