Successfully implemented asynchronous CV embedding using RabbitMQ, BGE-base model, and Pinecone VectorDB. CVs are now automatically chunked, embedded, and stored for semantic search.
The SCORE you see (e.g., 0.7031, 0.5069) is a cosine similarity score from Pinecone vector search:
- Range: roughly 0.0 to 1.0 for text embeddings (cosine similarity can technically be negative, but unrelated text rarely scores below 0.0)
- Meaning: How similar a CV chunk is to your search query
- 0.7+: Very relevant match
- 0.5-0.7: Moderate match
- <0.5: Weak match
Example: If you search for "Python developer", chunks with Python experience will have scores like 0.85, while unrelated chunks might be 0.30.
Note: This is different from the CV scoring system (0-100) used in the /api/score endpoint.
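For context, here is a minimal sketch of where these scores come from when querying the index directly (the query text and API-key handling are illustrative assumptions, not the project's exact code):

```python
from pinecone import Pinecone
from sentence_transformers import SentenceTransformer

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")        # assumption: key passed directly
index = pc.Index("tailorcv-cv-chunks")
model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Embed the search query and fetch the 5 most similar CV chunks.
query_vector = model.encode("Python developer", normalize_embeddings=True).tolist()
results = index.query(vector=query_vector, top_k=5, include_metadata=True)

for match in results.matches:
    # match.score is the cosine similarity discussed above (e.g., 0.7031).
    print(match.id, round(match.score, 4), match.metadata.get("section"))
```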
┌─────────────┐
│ Frontend │ User uploads CV (text or PDF)
└──────┬──────┘
│ POST /api/upload_cv_text or /api/upload_cv_pdf
▼
┌─────────────────┐
│ API Gateway │ Extracts PDF text (if PDF), forwards to services
│ (Port 8000) │
└──────┬───────────┘
│
├─────────────────────────────────────────┐
│ │
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ GeminiService │ │ StoringService │
│ (Port 8002) │ │ (Port 8001) │
│ │ │ │
│ Structures CV │ │ Stores in MongoDB│
│ using Gemini AI │ │ Calculates hash │
│ │ │ (cv_id) │
└──────┬───────────┘ └──────┬───────────┘
│ │
│ Returns structured_sections │
│ │
└──────────────────┬────────────────────┘
│
▼
┌───────────────────────┐
│ StoringService │
│ Publishes Event │
└───────────┬───────────┘
│
│ publish_cv_event(cv_id)
│
▼
┌───────────────────────┐
│ RabbitMQ Queue │
│ cv_embedding_queue │
│ │
│ Message: │
│ {"cv_id": "..."} │
│ │
│ 📈 SPIKE APPEARS │
│ (Unacked: 1) │
└───────────┬───────────┘
│
│ Consumer picks up message
│
▼
┌───────────────────────┐
│ VectorService │
│ (Port 8003) │
│ │
│ 1. Fetch CV from │
│ StoringService │
│ 2. Chunk sections │
│ 3. Embed chunks │
│ 4. Upload to │
│ Pinecone │
│ 5. ACK message │
└───────────┬───────────┘
│
│ Message acknowledged
│
▼
┌───────────────────────┐
│ RabbitMQ Queue │
│ │
│ 📉 SPIKE DISAPPEARS │
│ (Ready: 0) │
└───────────────────────┘
│
▼
┌───────────────────────┐
│ Pinecone VectorDB │
│ tailorcv-cv-chunks │
│ │
│ ✅ 20 chunks stored │
│ (768-dim vectors) │
└───────────────────────┘
| Time | Event | RabbitMQ Status |
|---|---|---|
| T0 | CV uploaded | Queue: 0 messages |
| T1 | StoringService publishes | Queue: 1 message (Ready) 📈 spike appears |
| T2 | VectorService receives | Queue: 1 message (Unacked) |
| T3 | VectorService processing (fetching, chunking, embedding) | Queue: 1 message (Unacked) |
| T4 | VectorService uploads to Pinecone | Queue: 1 message (Unacked) |
| T5 | VectorService ACKs message | Queue: 0 messages 📉 spike disappears |
- Spike Appears: When StoringService publishes `cv_id` to RabbitMQ
- Spike Stays: While VectorService is processing (message is "Unacked")
- Spike Disappears: When VectorService sends `basic_ack()` after a successful upload
- Ready: Messages waiting to be consumed
- Unacked: Messages being processed (spike visible)
- Total: Ready + Unacked
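For reference, a minimal sketch of the publisher side (`publish_cv_event` from the diagram), assuming pika, a local RabbitMQ broker, and persistent messages so the event survives a broker restart:

```python
import json
import pika

def publish_cv_event(cv_id: str) -> None:
    # Assumption: RabbitMQ reachable on localhost with default credentials.
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="cv_embedding_queue", durable=True)
    channel.basic_publish(
        exchange="",
        routing_key="cv_embedding_queue",
        body=json.dumps({"cv_id": cv_id}),
        properties=pika.BasicProperties(delivery_mode=2),  # persistent message
    )
    connection.close()
```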
Strategy: Semantic, type-based chunking for optimal embedding quality.
| Section Type | Chunking Strategy | Example |
|---|---|---|
| Experience | Each bullet point = 1 chunk | "Led team of 5 developers" → separate chunk |
| Projects | Each bullet point = 1 chunk | "Built REST API with FastAPI" → separate chunk |
| Summary | Entire text = 1 chunk | Full summary paragraph |
| Skills | All skills combined = 1 chunk | "Python, Java, Docker, Kubernetes" |
| Education | Each degree = 1 chunk | "BSc Computer Science - Concordia" |
| Leadership | Each role = 1 chunk | "Mentor - Co-op - Concordia University" |
| Certifications | Each cert = 1 chunk | "AWS Certified Solutions Architect" |
def chunk_structured_sections(structured_sections, cv_id):
    """
    Intelligently chunks CV sections for embedding.

    Returns: List of chunks with:
      - cv_id: CV identifier
      - section: Section name (experience, projects, etc.)
      - text: Chunk text content
      - metadata: Additional context (company, title, dates, etc.)
    """

Result: Your CV created 20 chunks:
- 9 experience chunks (one per bullet)
- 8 project chunks (one per bullet)
- 3 other chunks (summary, skills, education/leadership)
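To illustrate the table above, a simplified sketch of two of the chunking branches; the field names (`bullets`, `company`, `title`, `skills`) are assumptions about the shape of `structured_sections`:

```python
def chunk_experience_and_skills(structured_sections: dict, cv_id: str) -> list[dict]:
    """Sketch of the experience and skills branches only; other sections follow the table above."""
    chunks = []
    for entry in structured_sections.get("experience", []):
        for bullet in entry.get("bullets", []):          # each bullet point = 1 chunk
            chunks.append({
                "cv_id": cv_id,
                "section": "experience",
                "text": bullet,
                "metadata": {"company": entry.get("company"), "title": entry.get("title")},
            })
    skills = structured_sections.get("skills", [])
    if skills:                                           # all skills combined = 1 chunk
        chunks.append({"cv_id": cv_id, "section": "skills",
                       "text": ", ".join(skills), "metadata": {}})
    return chunks
```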
Model: BAAI/bge-base-en-v1.5
- Dimension: 768 (reduced from 1024 to avoid memory issues)
- Type: Sentence Transformer
- Purpose: Converts text chunks into 768-dimensional vectors
Why BGE-base instead of BGE-large?
- BGE-large (1024 dim) caused "paging file too small" errors on Windows
- BGE-base (768 dim) uses less memory and works reliably
- Still provides excellent semantic search quality
Loading Strategy:
- Model loaded once on startup (cached globally)
- Reused for all embedding operations
- Prevents repeated downloads
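A minimal sketch of this load-once pattern with sentence-transformers (the module-level cache variable and helper names are assumptions):

```python
from sentence_transformers import SentenceTransformer

_model = None  # module-level cache so the model is loaded only once per process

def get_embedder() -> SentenceTransformer:
    global _model
    if _model is None:
        _model = SentenceTransformer("BAAI/bge-base-en-v1.5")  # downloaded once, then cached on disk
    return _model

def embed_texts(texts: list[str]) -> list[list[float]]:
    # Normalized embeddings make cosine similarity a simple dot product.
    return get_embedder().encode(texts, normalize_embeddings=True).tolist()
```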
Index: tailorcv-cv-chunks
- Dimension: 768 (matches BGE-base)
- Metric: Cosine similarity
- Cloud: AWS (us-east-1)
- Type: Serverless
Auto-Index Management:
- Detects if index exists
- Checks dimension compatibility
- Auto-deletes and recreates if dimension mismatch (only if empty)
- Prevents dimension errors
Upsert Strategy:
- Batch upsert (100 vectors per request)
- Unique IDs: `{cv_id}_{section}_{chunk_index}`
- Metadata includes: `cv_id`, `section`, `text`, and section-specific fields
Example Vector IDs:
8a5b9213..._experience_0
8a5b9213..._projects_11
8a5b9213..._leadership_19
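A sketch of the index bootstrap and batched upsert described above, using the pinecone client (`upsert_chunks` and its arguments are assumed helper names):

```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
INDEX_NAME = "tailorcv-cv-chunks"

# Create the serverless index on first run if it does not exist yet.
if INDEX_NAME not in pc.list_indexes().names():
    pc.create_index(
        name=INDEX_NAME,
        dimension=768,                                   # must match BGE-base
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
index = pc.Index(INDEX_NAME)

def upsert_chunks(chunks: list[dict], embeddings: list[list[float]]) -> None:
    vectors = [
        {
            "id": f"{c['cv_id']}_{c['section']}_{i}",    # e.g. 8a5b9213..._experience_0
            "values": emb,
            "metadata": {
                "cv_id": c["cv_id"],
                "section": c["section"],
                "text": c["text"],
                # Pinecone rejects null metadata values, so drop empty fields.
                **{k: v for k, v in c["metadata"].items() if v is not None},
            },
        }
        for i, (c, emb) in enumerate(zip(chunks, embeddings))
    ]
    for start in range(0, len(vectors), 100):            # batch upsert, 100 vectors per request
        index.upsert(vectors=vectors[start:start + 100])
```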
Queue: cv_embedding_queue
- Durable: Yes (survives RabbitMQ restart)
- Prefetch: 1 (process one message at a time)
Error Handling:
- Memory/Paging Errors: NOT requeued (prevents infinite loop)
- Network Errors: Requeued for retry
- Other Errors: Requeued for retry
Consumer Flow:
import json

def callback(ch, method, properties, body):
    cv_id = json.loads(body)["cv_id"]                   # 1. Parse cv_id from message
    try:
        process_cv_for_embedding(cv_id)                 # 2. Fetch, chunk, embed, upload
        ch.basic_ack(delivery_tag=method.delivery_tag)  # 3. Success: message removed
    except (MemoryError, OSError):                      # 4. Memory/paging error: discard
        ch.basic_nack(delivery_tag=method.delivery_tag, requeue=False)
    except Exception:                                   # 5. Any other error: retry later
        ch.basic_nack(delivery_tag=method.delivery_tag, requeue=True)

Key Fix: Prevents infinite loops by detecting memory errors and NOT requeuing them.
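The surrounding consumer setup can be sketched as follows (connection details are assumptions; the callback is the one above):

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="cv_embedding_queue", durable=True)   # survives RabbitMQ restarts
channel.basic_qos(prefetch_count=1)                               # process one message at a time
channel.basic_consume(queue="cv_embedding_queue", on_message_callback=callback)
channel.start_consuming()                                         # blocks; typically run in a background thread
```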
Main Function: process_cv_for_embedding(cv_id)
Flow:
- Fetch CV from StoringService (`GET /internal/get_cv/{cv_id}`)
- Extract `structured_sections` from the CV document
- Chunk sections using the intelligent chunking algorithm
- Embed chunks using BGE-base model
- Upload embedded chunks to Pinecone
Error Handling:
- Raises exceptions on failure
- Consumer handles retry logic
- Logs all steps for debugging
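Put together, a sketch of the main function (`STORING_SERVICE_URL`, `embed_texts`, and `upsert_chunks` are the assumed helpers from the sketches above):

```python
import requests

STORING_SERVICE_URL = "http://localhost:8001"    # assumption: StoringService on port 8001

def process_cv_for_embedding(cv_id: str) -> None:
    # 1. Fetch the CV document from StoringService
    resp = requests.get(f"{STORING_SERVICE_URL}/internal/get_cv/{cv_id}", timeout=30)
    resp.raise_for_status()                      # raise on failure so the consumer can nack/requeue
    # 2. Extract structured_sections and 3. chunk them
    sections = resp.json()["structured_sections"]
    chunks = chunk_structured_sections(sections, cv_id)
    # 4. Embed chunks and 5. upload to Pinecone
    embeddings = embed_texts([c["text"] for c in chunks])
    upsert_chunks(chunks, embeddings)
```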
Before the CV is published (queue idle):
- Ready: 0
- Unacked: 0
- Total: 0
- Graph: Flat line at 0

While the message is being processed:
- Ready: 0
- Unacked: 1 (message being processed)
- Total: 1
- Graph: Red spike at 1.0

After the message is acknowledged:
- Ready: 0
- Unacked: 0
- Total: 0
- Graph: Returns to 0 (spike disappears)
✅ CV uploaded via frontend
✅ Structured by GeminiService
✅ Stored in MongoDB
✅ Published to RabbitMQ
✅ Consumed by VectorService
✅ Chunked into 20 semantic units
✅ Embedded using BGE-base (768-dim)
✅ Uploaded to Pinecone
✅ RabbitMQ message acknowledged
✅ Queue empty
- Index: `tailorcv-cv-chunks`
- Dimension: 768 ✅
- Record Count: 20 ✅
- Chunks Visible: Yes ✅
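These numbers can also be confirmed programmatically (a quick check, assuming the same Pinecone client setup as above):

```python
stats = index.describe_index_stats()
print(stats.total_vector_count)   # expected: 20
print(stats.dimension)            # expected: 768
```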
- `vector_service/app/embedder.py` - Chunking and embedding logic
- `vector_service/app/pinecone_client.py` - Pinecone integration
- `vector_service/app/mq_consumer.py` - RabbitMQ consumer
- `vector_service/app/service.py` - Service orchestration
- `storing_service/app/events.py` - Added RabbitMQ publisher
- `storing_service/app/service.py` - Added publish call after CV storage
- `vector_service/app/main.py` - Added RabbitMQ consumer startup
- `vector_service/requirements.txt` - Added dependencies (pinecone, sentence-transformers, pika)
Intelligent chunking:
- Granularity: Each bullet point is searchable independently
- Context: Metadata preserves company, title, dates
- Quality: Better embedding quality than large paragraphs

BGE-base (768-dim):
- Memory: Fits in system memory (no paging file errors)
- Quality: Still excellent for semantic search
- Speed: Faster than BGE-large

Asynchronous pipeline (RabbitMQ):
- Non-blocking: CV upload returns immediately
- Scalable: Can process multiple CVs in parallel
- Resilient: Failed CVs don't block new uploads

Pinecone (serverless):
- Managed: No infrastructure to maintain
- Fast: Sub-millisecond search
- Scalable: Handles millions of vectors
- Similarity Search: Implement the `search_top_k_cvs` endpoint
- Tailored Bullets: Generate job-specific bullet points using similar chunks
- Batch Upload: Process 5000 CVs dataset
- Redis Caching: Cache latest CV for faster retrieval
Phase 3 Status: ✅ COMPLETE
- ✅ Asynchronous embedding pipeline working
- ✅ Intelligent chunking implemented
- ✅ BGE-base embedding integrated
- ✅ Pinecone storage functional
- ✅ RabbitMQ consumer with error handling
- ✅ End-to-end flow tested and verified
Your CV is now searchable in Pinecone! 🎉