GEMINI.md - RubyPKG Instructional Context

Project Overview

RubyPKG is a high-performance, neuro-symbolic Retrieval-Augmented Generation (RAG) system designed for personal knowledge management. It bridges the gap between unstructured Markdown notes and rigid vector search by combining symbolic linguistic analysis with neural contextualization.

Core Technologies

  • Language: Ruby 3.3+ (leveraging the Prism parser and fiber-based concurrency)
  • Database: PostgreSQL 14+ with pgvector for semantic search
  • Cache: Redis 6+ via Ohm for idempotent processing and metadata caching
  • NLP: ruby-spacy (symbolic extraction) or PragmaticSegmenter (fallback)
  • AI Integration: ruby_llm (v1.12+) supporting OpenRouter (neural context) and Ollama (local embeddings)
  • Architecture: Orchestration-based hybrid pipeline with Circuit Breakers for resilient API usage
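
The circuit-breaker idea named under Architecture can be sketched in plain Ruby. This is a minimal illustration, not RubyPKG's actual implementation: the class name CircuitBreaker, the failure_threshold and reset_timeout parameters, and the injectable clock are all assumptions made for this example.

```ruby
# Hypothetical sketch of a circuit breaker; names and defaults are
# illustrative assumptions, not taken from the RubyPKG codebase.
class CircuitBreaker
  # Raised when a call is rejected because the circuit is open.
  class OpenError < StandardError; end

  def initialize(failure_threshold: 3, reset_timeout: 30,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @failure_threshold = failure_threshold
    @reset_timeout = reset_timeout
    @clock = clock
    @failures = 0
    @opened_at = nil
  end

  # :closed    -> calls pass through
  # :open      -> calls are rejected without touching the API
  # :half_open -> the timeout elapsed, so one trial call is allowed
  def state
    return :closed if @opened_at.nil?

    (@clock.call - @opened_at) >= @reset_timeout ? :half_open : :open
  end

  def call
    raise OpenError, "circuit open; skipping call" if state == :open

    begin
      result = yield
    rescue StandardError
      record_failure
      raise
    end
    reset
    result
  end

  private

  # Opens (or re-opens) the circuit once the threshold is reached.
  def record_failure
    @failures += 1
    @opened_at = @clock.call if @failures >= @failure_threshold
  end

  def reset
    @failures = 0
    @opened_at = nil
  end
end
```

A caller would wrap each outbound request in breaker.call { ... } and rescue CircuitBreaker::OpenError to fall back or skip the note. Injecting the clock keeps the timeout behaviour testable without sleeping.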

Building and Running

Prerequisites

  • PostgreSQL with the pgvector (vector) extension installed.
  • Redis server running locally or accessible via network.
  • Ollama (optional) for local embedding generation (default: nomic-embed-text).

Key Commands

  • Install Dependencies: bundle install
  • Database Setup: ./bin/setup_database (Initializes schema, extensions, and tables)
  • Index Knowledge: ./bin/ruby-pkg index -v -p /path/to/notebook
  • Query Knowledge: ./bin/ruby-pkg query "your natural language question"
  • Consolidate Tags: ./bin/ruby-pkg consolidate --csv data/tags.csv
  • Run Tests: bundle exec rspec
  • Linting: bundle exec rubocop

Development Conventions

Coding Style

  • Standards: Adheres to Standard Ruby / RuboCop conventions.
  • Concurrency: Uses async and fiber_concurrency for non-blocking I/O (Database and API calls).
  • Metadata: Prefers jsonb for complex symbolic metadata and text[] for tags in PostgreSQL.
  • Safety: Uses atomic "write-then-rename" strategies for notebook file modifications.
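
The "write-then-rename" strategy from the Safety item can be sketched with the Ruby standard library alone. The helper name atomic_write is hypothetical and the project's own implementation may differ; the sketch assumes the target directory is writable and that the temporary file lives on the same filesystem as the target, which is what makes the rename atomic on POSIX systems.

```ruby
require "tempfile"

# Hypothetical helper illustrating write-then-rename; not RubyPKG's own code.
# Readers of `path` see either the old content or the new content, never a
# partially written file.
def atomic_write(path, content)
  dir = File.dirname(path)
  # Create the temp file next to the target so the rename stays on one filesystem.
  temp = Tempfile.create([File.basename(path), ".tmp"], dir)
  temp_path = temp.path
  begin
    temp.write(content)
    temp.flush
    temp.fsync # push the data to disk before the rename makes it visible
    temp.close
    # Tempfile creates files with mode 0600; keep the original note's mode.
    File.chmod(File.stat(path).mode & 0o7777, temp_path) if File.exist?(path)
    File.rename(temp_path, path)
  rescue StandardError
    temp.close unless temp.closed?
    File.unlink(temp_path) if File.exist?(temp_path)
    raise
  end
  path
end
```

If the write fails midway, the temporary file is removed and the original note is left untouched.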

Integration Patterns

  • LLM Contexts: Avoid global RubyLLM overrides. Always use local RubyLLM.context objects within processor instances to respect environment-specific endpoints (e.g., tinybot).
  • Schema Strictness: When using Structured Outputs with ruby_llm, always specify additionalProperties: false in JSON schemas for Azure/OpenAI compliance.
  • Recursive Error Handling: LLM provider errors should be unwrapped from nested proxy responses (OpenRouter/Azure) before logging.
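
Two of the patterns above can be illustrated without calling any provider. The sketch below shows a strict JSON schema as a plain Ruby hash and a recursive helper that digs the innermost message out of a nested error payload. The schema fields, the constant name TAG_SCHEMA, the helper name unwrap_error_message, and the assumed payload shape (an "error" key, optionally carrying a JSON-encoded string under "metadata" => "raw") are all assumptions for this example; real proxy responses may be shaped differently.

```ruby
require "json"

# Illustrative strict schema; the "tags" field is a hypothetical example.
# additionalProperties: false rejects any key not listed under properties.
TAG_SCHEMA = {
  type: "object",
  properties: {
    tags: { type: "array", items: { type: "string" } }
  },
  required: ["tags"],
  additionalProperties: false
}.freeze

# Returns the innermost error message found in a nested payload, or nil.
# Assumption: wrappers nest the upstream error under "error", or as a
# JSON-encoded string under "metadata" => "raw".
def unwrap_error_message(payload)
  case payload
  when Hash
    metadata = payload["metadata"]
    inner = payload["error"] || (metadata["raw"] if metadata.is_a?(Hash))
    deeper = inner.nil? ? nil : unwrap_error_message(inner)
    deeper || payload["message"]
  when String
    parsed = begin
      JSON.parse(payload)
    rescue JSON::ParserError
      nil
    end
    # A string that is not a JSON object is treated as the message itself.
    parsed.is_a?(Hash) ? (unwrap_error_message(parsed) || payload) : payload
  end
end
```

Logging the unwrapped message instead of the outer wrapper makes it easier to tell a proxy failure from an upstream rate limit or validation error.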

Git Guidelines

  • Commit Messages: Follow Conventional Commits (e.g., feat(indexer): ..., fix(db): ...).
  • Branching: Primary development occurs on the development branch.
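
A commit header in the style shown above can be checked with a small pattern. This is a sketch only: the constant and method names are hypothetical, and the list of types follows the commonly used Angular-style set, which is an assumption; the Conventional Commits specification itself permits types beyond feat and fix.

```ruby
# Illustrative pattern for a header of the form: type(scope)!: description
# The allowed types are an assumed, commonly used set.
CONVENTIONAL_COMMIT = /
  \A
  (?<type>feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)
  (?:\((?<scope>[^()\s]+)\))?   # optional scope, e.g. (indexer)
  (?<breaking>!)?               # optional breaking-change marker
  :\x20
  (?<description>\S.*)
  \z
/x

# Returns true when the first line of a commit message matches the pattern.
def conventional_commit?(header)
  CONVENTIONAL_COMMIT.match?(header)
end
```

Such a check could run from a commit-msg hook or a CI step against the first line of each message.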

Directory Structure

  • bin/: CLI executables (ruby-pkg, setup_database).
  • lib/ruby_pkg/: Core logic and orchestration.
  • lib/ruby_pkg/models/: Redis/Ohm models for caching.
  • db/: SQL schema and migration definitions.
  • docs/: Comprehensive VitePress-ready technical documentation.
  • config/: YAML-based configuration management (tty-config).
  • spec/: RSpec test suite.