diff --git a/backend/README.md b/backend/README.md index 13abf1a..b1d568d 100644 --- a/backend/README.md +++ b/backend/README.md @@ -17,23 +17,21 @@ FastAPI-based backend for the InterXAI interview automation platform. Handles al 11. [Code Quality](#code-quality) 12. [Docker](#docker) - ## Tech Stack -| Technology | Purpose | -|---|---| -| **FastAPI** | Async REST API framework | -| **SQLAlchemy 2.0** | Async ORM | -| **Alembic** | Schema migrations | -| **TaskIQ + Redis** | Async background job queue | -| **LangChain + LiteLLM** | LLM orchestration | -| **Groq** | LLM inference provider | -| **PyPDF2** | Resume PDF text extraction | -| **Supabase** | File storage (resume PDFs) | -| **PyJWT + bcrypt** | Authentication | -| **Pydantic v2** | Request/response validation and settings | -| **uv** | Python package manager | - +| Technology | Purpose | +| ----------------------- | ---------------------------------------- | +| **FastAPI** | Async REST API framework | +| **SQLAlchemy 2.0** | Async ORM | +| **Alembic** | Schema migrations | +| **TaskIQ + Redis** | Async background job queue | +| **LangChain + LiteLLM** | LLM orchestration | +| **Groq** | LLM inference provider | +| **PyPDF2** | Resume PDF text extraction | +| **Supabase** | File storage (resume PDFs) | +| **PyJWT + bcrypt** | Authentication | +| **Pydantic v2** | Request/response validation and settings | +| **uv** | Python package manager | ## Project Structure @@ -73,7 +71,7 @@ backend/ │ │ │ ├── utils/ # Concrete implementations │ │ ├── authorization.py # JWT auth dependencies (get_current_user, etc.) -│ │ ├── supabase.py # SupabaseStorageProvider +│ │ ├── supabase.py and vercel_blob.py # SupabaseStorageProvider and VercelBlobStorageProvider │ │ └── ... 
# BcryptHasher, JwtEncrypter, PDF extractor │ │ │ ├── ai/ # LLM agents and prompts @@ -100,7 +98,6 @@ backend/ └── mypy.ini # Mypy type checker configuration ``` - ## Setup & Installation This project uses [`uv`](https://github.com/astral-sh/uv) for dependency management. @@ -113,7 +110,6 @@ curl -LsSf https://astral.sh/uv/install.sh | sh uv sync --dev ``` - ## Configuration All settings are loaded from a `.env` file via `pydantic-settings`. Create `backend/.env`: @@ -156,7 +152,6 @@ from app.config import settings print(settings.DATABASE_URL) ``` - ## Running the Server ```bash @@ -168,10 +163,10 @@ uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4 ``` Interactive API docs are served automatically: + - **Swagger UI**: `http://localhost:8000/docs` - **ReDoc**: `http://localhost:8000/redoc` - ## Background Jobs (TaskIQ) InterXAI uses [TaskIQ](https://taskiq-python.github.io/) with a Redis broker for asynchronous resume processing. The worker runs independently from the API server. @@ -200,7 +195,6 @@ When a candidate applies for an interview (`POST /applications/{interview_id}`): The broker uses `taskiq-redis` and supports optional SSL for production Redis connections. The broker is started and stopped via FastAPI's `lifespan` context manager in `main.py`. - ## Database Migrations Migrations are managed with [Alembic](https://alembic.sqlalchemy.org/). @@ -221,7 +215,6 @@ uv run alembic history --verbose > **Note:** Alembic uses a sync connection even for async SQLAlchemy setups. This is configured in `alembic/env.py`. 
- ## Architecture Deep Dive ### Dependency Injection Pattern @@ -240,12 +233,12 @@ async def get_interview( Auth guards are composable dependencies: -| Dependency | Purpose | -|---|---| -| `get_current_user()` | Validates JWT, returns authenticated user | -| `verify_ownership()` | Ensures the user owns the requested resource | -| `verify_org_ownership()` | Ensures the org owns the requested resource | -| `is_organization()` | Restricts route to organization accounts only | +| Dependency | Purpose | +| ------------------------ | --------------------------------------------- | +| `get_current_user()` | Validates JWT, returns authenticated user | +| `verify_ownership()` | Ensures the user owns the requested resource | +| `verify_org_ownership()` | Ensures the org owns the requested resource | +| `is_organization()` | Restricts route to organization accounts only | ### Interface / Implementation Pattern @@ -299,44 +292,42 @@ StorageException → 502 AIError → 500 ``` - ## API Endpoints ### Users (`/users`) -| Method | Path | Auth | Description | -|---|---|---|---| -| `POST` | `/users/signup` | — | Register a new candidate account | -| `POST` | `/users/login` | — | Authenticate, receive JWT | -| `GET` | `/users/{user_id}` | User | Get user profile | -| `PUT` | `/users/{user_id}` | User | Update profile | -| `DELETE` | `/users/{user_id}` | User | Delete account | +| Method | Path | Auth | Description | +| -------- | ------------------ | ---- | -------------------------------- | +| `POST` | `/users/signup` | — | Register a new candidate account | +| `POST` | `/users/login` | — | Authenticate, receive JWT | +| `GET` | `/users/{user_id}` | User | Get user profile | +| `PUT` | `/users/{user_id}` | User | Update profile | +| `DELETE` | `/users/{user_id}` | User | Delete account | ### Organizations (`/organizations`) -| Method | Path | Auth | Description | -|---|---|---|---| -| `POST` | `/organizations/signup` | — | Register a new organization | -| `GET` | 
`/organizations/{org_id}` | Org | Get organization details | -| `PUT` | `/organizations/{org_id}` | Org | Update organization | -| `DELETE` | `/organizations/{org_id}` | Org | Delete organization | +| Method | Path | Auth | Description | +| -------- | ------------------------- | ---- | --------------------------- | +| `POST` | `/organizations/signup` | — | Register a new organization | +| `GET` | `/organizations/{org_id}` | Org | Get organization details | +| `PUT` | `/organizations/{org_id}` | Org | Update organization | +| `DELETE` | `/organizations/{org_id}` | Org | Delete organization | ### Interviews (`/interviews`) -| Method | Path | Auth | Description | -|---|---|---|---| -| `POST` | `/interviews/` | Org | Create a new interview | -| `GET` | `/interviews/` | Any | List (orgs see own, users see open) | -| `GET` | `/interviews/applied` | User | List interviews the user has applied to | -| `GET` | `/interviews/{interview_id}` | Org | Get full interview details | +| Method | Path | Auth | Description | +| ------ | ---------------------------- | ---- | --------------------------------------- | +| `POST` | `/interviews/` | Org | Create a new interview | +| `GET` | `/interviews/` | Any | List (orgs see own, users see open) | +| `GET` | `/interviews/applied` | User | List interviews the user has applied to | +| `GET` | `/interviews/{interview_id}` | Org | Get full interview details | ### Applications (`/applications`) -| Method | Path | Auth | Description | -|---|---|---|---| -| `POST` | `/applications/{interview_id}` | User | Apply with a resume PDF | -| `GET` | `/applications/{interview_id}` | Org | List all applications for an interview | - +| Method | Path | Auth | Description | +| ------ | ------------------------------ | ---- | -------------------------------------- | +| `POST` | `/applications/{interview_id}` | User | Apply with a resume PDF | +| `GET` | `/applications/{interview_id}` | Org | List all applications for an interview | ## AI Pipeline @@ -361,6 
+352,7 @@ class ResumeEvaluatorResponse(BaseModel): ``` The agent: + 1. Renders a `ChatPromptTemplate` with the request data 2. Calls `LiteLLMProvider.generate()` → Groq API 3. Parses the JSON response with LangChain's `JsonOutputParser` @@ -370,7 +362,6 @@ The agent: Wraps `langchain_litellm.ChatLiteLLM` and maps provider-specific exceptions to the custom `AIError` hierarchy, keeping the rest of the application decoupled from the LLM provider. - ## Code Quality All checks are run from the `backend/` directory. @@ -392,15 +383,16 @@ uv run mypy . ### Configuration **`ruff.toml`** + - Line length: `100` - Enabled rule sets: `E, W, F, I, N, UP, B, C4, SIM, ARG, PTH` - Excluded: `alembic/versions/` **`mypy.ini`** + - Strict mode enabled - `untyped-decorator` disabled for `app/background/celery/` (Celery decorator limitation) - ## Docker ### Building Images @@ -431,8 +423,8 @@ docker-compose logs -f taskiq_worker **Services started by Docker Compose:** -| Service | Port | Description | -|---|---|---| -| `api` | `8000` | FastAPI application server | -| `taskiq_worker` | — | Background job worker | -| `redis` | `6379` | TaskIQ broker and result backend | +| Service | Port | Description | +| --------------- | ------ | -------------------------------- | +| `api` | `8000` | FastAPI application server | +| `taskiq_worker` | — | Background job worker | +| `redis` | `6379` | TaskIQ broker and result backend | diff --git a/backend/app/background/celery/celery.py b/backend/app/background/celery/celery.py index 3c69122..c17c9bc 100644 --- a/backend/app/background/celery/celery.py +++ b/backend/app/background/celery/celery.py @@ -12,8 +12,8 @@ from app.logger import get_logger from app.models.application import Application from app.models.interview import CustomInterview +from app.utils.default_providers import default_storage_provider from app.utils.pdf import extract_pdf_content -from app.utils.supabase_provider import SupabaseStorageProvider logger = get_logger(__name__) @@ 
-45,7 +45,7 @@ def process_resume_task(file_bytes_b64: str, file_name: str, application_id: int """ logger.info("Received resume processing job for application %d", application_id) file_bytes = base64.b64decode(file_bytes_b64) - provider = SupabaseStorageProvider() + provider = default_storage_provider() async def process_and_evaluate() -> None: from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine diff --git a/backend/app/config.py b/backend/app/config.py index 4e8ac27..d796edb 100644 --- a/backend/app/config.py +++ b/backend/app/config.py @@ -26,6 +26,9 @@ class Settings(BaseSettings): SUPABASE_KEY: str = "" SUPABASE_BUCKET_NAME: str = "resumes" + # Vercel Blob + BLOB_READ_WRITE_TOKEN: str = "" + # Providers STORAGE_PROVIDER: str = "supabase" BACKGROUND_WORKER: str = "taskiq" diff --git a/backend/app/utils/vercel_blob.py b/backend/app/utils/vercel_blob.py new file mode 100644 index 0000000..1fafb9c --- /dev/null +++ b/backend/app/utils/vercel_blob.py @@ -0,0 +1,67 @@ +import asyncio +from typing import cast + +import httpx +from vercel_blob import delete, put + +from app.exceptions.storage import ( + StorageDeleteError, + StorageDownloadError, + StorageUploadError, +) +from app.interfaces.storage_proivder import StorageProviderInterface +from app.logger import get_logger + +logger = get_logger(__name__) + + +class VercelBlobStorageProvider(StorageProviderInterface): + async def upload(self, file: bytes, file_name: str) -> str: + try: + response = await asyncio.to_thread( + put, + file_name, + file, + { + "access": "public", + }, + ) + + return cast(str, response["url"]) + + except Exception as e: + logger.error( + "Vercel upload failed: %s", + str(e), + exc_info=True, + ) + raise StorageUploadError(f"Failed to upload file to storage: {str(e)}") from e + + async def delete(self, file_name: str) -> None: + try: + await asyncio.to_thread(delete, file_name) + + except Exception as e: + logger.error( + "Vercel delete failed: %s", + str(e), + 
exc_info=True, + ) + raise StorageDeleteError(f"Failed to delete file from storage: {str(e)}") from e + + async def download(self, file_name: str) -> bytes: + try: + async with httpx.AsyncClient() as client: + response = await client.get(file_name) + + response.raise_for_status() + + return response.content + + except Exception as e: + logger.error( + "Vercel download failed: %s", + str(e), + exc_info=True, + ) + raise StorageDownloadError(f"Failed to download file from storage: {str(e)}") from e diff --git a/backend/pyproject.toml b/backend/pyproject.toml index 6f94e71..3544381 100644 --- a/backend/pyproject.toml +++ b/backend/pyproject.toml @@ -23,6 +23,8 @@ dependencies = [ "taskiq-redis>=0.5.0", "python-multipart>=0.0.27", "supabase>=2.30.0", + "vercel-blob>=0.4.2", + "httpx>=0.28.1", ] [dependency-groups] diff --git a/backend/tests/test_vercel_blob_provider.py b/backend/tests/test_vercel_blob_provider.py new file mode 100644 index 0000000..40bb6e3 --- /dev/null +++ b/backend/tests/test_vercel_blob_provider.py @@ -0,0 +1,111 @@ +import asyncio +from unittest.mock import Mock, patch + +import pytest + +from app.exceptions.storage import ( + StorageDeleteError, + StorageDownloadError, + StorageUploadError, +) +from app.utils.vercel_blob import VercelBlobStorageProvider + + +def test_upload_success() -> None: + provider = VercelBlobStorageProvider() + + with patch("app.utils.vercel_blob.put") as mock_put: + mock_put.return_value = { + "url": "https://example.com/test.pdf", + } + + result = asyncio.run( + provider.upload( + b"test-data", + "test.pdf", + ) + ) + + assert result == "https://example.com/test.pdf" + + +def test_upload_failure() -> None: + provider = VercelBlobStorageProvider() + + with patch("app.utils.vercel_blob.put") as mock_put: + mock_put.side_effect = Exception("Upload failed") + + with pytest.raises(StorageUploadError): + asyncio.run( + provider.upload( + b"test-data", + "test.pdf", + ) + ) + + +def test_download_success() -> None: + provider = 
VercelBlobStorageProvider() + + mock_response = Mock() + mock_response.content = b"file-content" + mock_response.raise_for_status.return_value = None + + with patch( + "httpx.AsyncClient.get", + return_value=mock_response, + ): + result = asyncio.run( + provider.download( + "https://example.com/test.pdf", + ) + ) + + assert result == b"file-content" + + +def test_download_failure() -> None: + provider = VercelBlobStorageProvider() + + with ( + patch( + "httpx.AsyncClient.get", + side_effect=Exception("Download failed"), + ), + pytest.raises(StorageDownloadError), + ): + asyncio.run( + provider.download( + "https://example.com/test.pdf", + ) + ) + + +def test_delete_success() -> None: + provider = VercelBlobStorageProvider() + + with patch("app.utils.vercel_blob.delete") as mock_delete: + asyncio.run( + provider.delete( + "https://example.com/test.pdf", + ) + ) + + mock_delete.assert_called_once() + + +def test_delete_failure() -> None: + provider = VercelBlobStorageProvider() + + with ( + patch( + "app.utils.vercel_blob.delete", + side_effect=Exception("Delete failed"), + ), + pytest.raises(StorageDeleteError), + ): + asyncio.run( + provider.delete( + "https://example.com/test.pdf", + ) + ) diff --git a/backend/uv.lock b/backend/uv.lock index a0aed5f..666bd3c 100644 --- a/backend/uv.lock +++ b/backend/uv.lock @@ -223,6 +223,7 @@ dependencies = [ { name = "bcrypt" }, { name = "fastapi" }, { name = "greenlet" }, + { name = "httpx" }, { name = "langchain-litellm" }, { name = "pydantic", extra = ["email"] }, { name = "pydantic-settings" }, @@ -235,6 +236,7 @@ dependencies = [ { name = "taskiq" }, { name = "taskiq-redis" }, { name = "uvicorn", extra = ["standard"] }, + { name = "vercel-blob" }, ] [package.dev-dependencies] @@ -251,6 +253,7 @@ requires-dist = [ { name = "bcrypt", specifier = ">=4.0.0" }, { name = "fastapi", specifier = ">=0.115.0" }, { name = "greenlet", specifier = ">=3.4.0" }, + { name = "httpx", specifier = ">=0.28.1" }, { name = "langchain-litellm", 
specifier = ">=0.6.4" }, { name = "pydantic", extras = ["email"], specifier = ">=2.13.0" }, { name = "pydantic-settings", specifier = ">=2.13.1" }, @@ -263,6 +266,7 @@ requires-dist = [ { name = "taskiq", specifier = ">=0.11.0" }, { name = "taskiq-redis", specifier = ">=0.5.0" }, { name = "uvicorn", extras = ["standard"], specifier = ">=0.30.0" }, + { name = "vercel-blob", specifier = ">=0.4.2" }, ] [package.metadata.requires-dev] @@ -2821,6 +2825,21 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/e4/16/c1fd27e9549f3c4baf1dc9c20c456cd2f822dbf8de9f463824b0c0357e06/uvloop-0.22.1-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:6cde23eeda1a25c75b2e07d39970f3374105d5eafbaab2a4482be82f272d5a5e", size = 4296730, upload-time = "2025-10-16T22:17:00.744Z" }, ] +[[package]] +name = "vercel-blob" +version = "0.4.2" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "certifi" }, + { name = "requests" }, + { name = "tqdm" }, + { name = "urllib3" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/42/bd/d299ee6ef69db7b3b6bd636e592ea0f5d87d65f0a9a69f68db4818117265/vercel_blob-0.4.2.tar.gz", hash = "sha256:1c8e24c618cb62d7ddaa91d6a01b94bd5e8856ed0cfc7525fb6bfe064e2790c6", size = 15281, upload-time = "2025-06-07T06:16:06.891Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/5d/19/fadf607014ab74305e1eb91027215401d65b29af7e1f67cb4d1c17785796/vercel_blob-0.4.2-py3-none-any.whl", hash = "sha256:4150fc596b198489529275de46e577353a200fb292a39b9aecb76f76425fc05d", size = 15655, upload-time = "2025-06-07T06:16:05.343Z" }, +] + [[package]] name = "watchfiles" version = "1.1.1"
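
Review note: the diff switches the resume task from `SupabaseStorageProvider()` to `default_storage_provider()`, but `app/utils/default_providers.py` itself is not part of the hunks shown here. The sketch below is one plausible shape for that selection logic, not the project's actual implementation. The interface is reduced to `upload` for brevity, the two `Fake*` classes are hypothetical stand-ins for the real providers, and the `"vercel_blob"` key is an assumed value for `STORAGE_PROVIDER` (only the `"supabase"` default appears in `config.py`).

```python
import asyncio
from abc import ABC, abstractmethod
from collections.abc import Callable


class StorageProviderInterface(ABC):
    """Stand-in for the app's storage interface (only `upload`, for brevity)."""

    @abstractmethod
    async def upload(self, file: bytes, file_name: str) -> str: ...


class FakeSupabaseProvider(StorageProviderInterface):
    """Hypothetical placeholder for SupabaseStorageProvider."""

    async def upload(self, file: bytes, file_name: str) -> str:
        return f"supabase://resumes/{file_name}"


class FakeVercelBlobProvider(StorageProviderInterface):
    """Hypothetical placeholder for VercelBlobStorageProvider."""

    async def upload(self, file: bytes, file_name: str) -> str:
        return f"https://blob.example.invalid/{file_name}"


# Registry keyed by the STORAGE_PROVIDER setting value.
# "vercel_blob" is an assumed key; the diff only confirms "supabase".
_PROVIDERS: dict[str, Callable[[], StorageProviderInterface]] = {
    "supabase": FakeSupabaseProvider,
    "vercel_blob": FakeVercelBlobProvider,
}


def default_storage_provider(name: str = "supabase") -> StorageProviderInterface:
    """Return a new provider instance for the configured name.

    In the real app the name would presumably come from
    `settings.STORAGE_PROVIDER`; it is a parameter here so the sketch
    runs without the project's config module.
    """
    key = name.strip().lower()
    try:
        factory = _PROVIDERS[key]
    except KeyError:
        known = ", ".join(sorted(_PROVIDERS))
        raise ValueError(f"Unknown storage provider {name!r}; expected one of: {known}") from None
    return factory()


if __name__ == "__main__":
    provider = default_storage_provider("vercel_blob")
    print(asyncio.run(provider.upload(b"%PDF-", "resume.pdf")))
```

Failing fast on an unknown provider name keeps a misconfigured `.env` from silently falling back to Supabase. Whether the real factory raises or falls back is not visible in this diff.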