RubyPKG is a high-performance, neuro-symbolic Retrieval-Augmented Generation (RAG) system specifically designed for personal knowledge management. It bridges the gap between unstructured markdown notes and rigid vector search by combining symbolic linguistic analysis with neural contextualization.

- Language: Ruby 3.3+ (leveraging Prism parser and Fiber Concurrency)
- Database: PostgreSQL 14+ with `pgvector` for semantic search
- Cache: Redis 6+ via `Ohm` for idempotent processing and metadata caching
- NLP: `ruby-spacy` (symbolic extraction) or `PragmaticSegmenter` (fallback)
- AI Integration: `ruby_llm` (v1.12+) supporting OpenRouter (neural context) and Ollama (local embeddings)
- Architecture: Orchestration-based hybrid pipeline with Circuit Breakers for resilient API usage
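
The circuit-breaker idea mentioned in the architecture bullet can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not RubyPKG's actual implementation: the `CircuitBreaker` class, its `failure_threshold` and `reset_timeout` parameters, and the injectable `clock` are all hypothetical names introduced here.

```ruby
# Hypothetical circuit breaker sketch; names and defaults are assumptions.
class CircuitBreaker
  # Raised when a call is rejected because the circuit is open.
  class OpenError < StandardError; end

  attr_reader :state

  def initialize(failure_threshold: 3, reset_timeout: 30,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @failure_threshold = failure_threshold
    @reset_timeout = reset_timeout
    @clock = clock
    @failures = 0
    @state = :closed
    @opened_at = nil
  end

  # Runs the block unless the circuit is open and the timeout has not elapsed.
  def call
    if @state == :open
      raise OpenError, "circuit open" if @clock.call - @opened_at < @reset_timeout

      @state = :half_open # timeout elapsed: allow one trial call
    end

    result = yield
    reset
    result
  rescue OpenError
    raise # rejections are not counted as failures
  rescue StandardError
    record_failure
    raise
  end

  private

  def reset
    @failures = 0
    @state = :closed
  end

  # Opens the circuit after enough failures, or immediately if a trial call fails.
  def record_failure
    @failures += 1
    return unless @state == :half_open || @failures >= @failure_threshold

    @state = :open
    @opened_at = @clock.call
  end
end
```

A caller would wrap each outbound API request in `breaker.call { ... }`, so that repeated provider failures stop generating new requests until the timeout has passed.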

- PostgreSQL with `vector` extension installed.
- Redis server running locally or accessible via network.
- Ollama (optional) for local embedding generation (default: `nomic-embed-text`).

- Install Dependencies: `bundle install`
- Database Setup: `./bin/setup_database` (Initializes schema, extensions, and tables)
- Index Knowledge: `./bin/ruby-pkg index -v -p /path/to/notebook`
- Query Knowledge: `./bin/ruby-pkg query "your natural language question"`
- Consolidate Tags: `./bin/ruby-pkg consolidate --csv data/tags.csv`
- Run Tests: `bundle exec rspec`
- Linting: `bundle exec rubocop`

- Standards: Adheres to Standard Ruby / RuboCop conventions.
- Concurrency: Uses `async` and `fiber_concurrency` for non-blocking I/O (Database and API calls).
- Metadata: Prefers `jsonb` for complex symbolic metadata and `text[]` for tags in PostgreSQL.
- Safety: Uses atomic "write-then-rename" strategies for notebook file modifications.
- LLM Contexts: Avoid global `RubyLLM` overrides. Always use local `RubyLLM.context` objects within processor instances to respect environment-specific endpoints (e.g., `tinybot`).
- Schema Strictness: When using Structured Outputs with `ruby_llm`, always specify `additionalProperties: false` in JSON schemas for Azure/OpenAI compliance.
- Recursive Error Handling: LLM provider errors should be unwrapped from nested proxy responses (OpenRouter/Azure) before logging.
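
The "write-then-rename" strategy from the Safety bullet can be illustrated with the Ruby standard library. This is a sketch only: the `atomic_write` helper is a hypothetical name, and the project's real code may differ. It writes to a temporary file in the same directory as the target and then renames it over the target, so a reader sees either the old or the new content.

```ruby
require "tempfile"

# Hypothetical helper illustrating write-then-rename; not RubyPKG's actual code.
def atomic_write(path, content)
  dir = File.dirname(path)
  # Create the temp file next to the target so the rename stays on one filesystem.
  temp = Tempfile.create([".#{File.basename(path)}", ".tmp"], dir)
  begin
    temp.write(content)
    temp.flush
    temp.fsync # push the data to disk before the rename makes it visible
    temp.close
    File.rename(temp.path, path) # atomic replacement on POSIX filesystems
  rescue StandardError
    # Clean up the partial temp file and leave the original untouched.
    temp.close unless temp.closed?
    File.unlink(temp.path) if File.exist?(temp.path)
    raise
  end
end
```

Note that the renamed file takes on the temporary file's permissions, so a real implementation would likely copy the original file's mode before renaming.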
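
Unwrapping nested provider errors, as described in the Recursive Error Handling bullet, might look like the following sketch. The payload shape is an assumption made for illustration: an outer `error` object whose `metadata.raw` field (or `message`) holds the upstream provider's error as a JSON string. The `unwrap_llm_error` helper and its `max_depth` parameter are hypothetical and not part of `ruby_llm`.

```ruby
require "json"

# Returns a Hash if the text is a JSON object, otherwise nil.
def try_parse_json_object(text)
  parsed = JSON.parse(text)
  parsed.is_a?(Hash) ? parsed : nil
rescue JSON::ParserError
  nil
end

# Hypothetical helper: digs through nested error payloads and returns the
# innermost message. The payload shape is assumed, not taken from a spec.
def unwrap_llm_error(payload, depth = 0, max_depth: 5)
  if payload.is_a?(String)
    # A string may itself be a serialized error object from the upstream provider.
    nested = depth < max_depth ? try_parse_json_object(payload) : nil
    return nested ? unwrap_llm_error(nested, depth + 1, max_depth: max_depth) : payload
  end
  return payload.to_s unless payload.is_a?(Hash)

  error = payload["error"] || payload
  return unwrap_llm_error(error, depth + 1, max_depth: max_depth) if error.is_a?(String)
  return error.to_s unless error.is_a?(Hash)

  # Prefer the raw upstream error when the proxy attached one.
  metadata = error["metadata"]
  raw = metadata["raw"] if metadata.is_a?(Hash)
  return unwrap_llm_error(raw, depth + 1, max_depth: max_depth) if raw && depth < max_depth

  message = error["message"]
  message.nil? ? error.to_json : unwrap_llm_error(message, depth + 1, max_depth: max_depth)
end
```

The depth limit guards against pathological payloads; logging the returned string alongside the original payload keeps the full context available for debugging.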

- Commit Messages: Follow Conventional Commits (e.g., `feat(indexer): ...`, `fix(db): ...`).
- Branching: Primary development occurs on the `development` branch.

- `bin/`: CLI executables (`ruby-pkg`, `setup_database`).
- `lib/ruby_pkg/`: Core logic and orchestration.
- `lib/ruby_pkg/models/`: Redis/Ohm models for caching.
- `db/`: SQL schema and migration definitions.
- `docs/`: Comprehensive VitePress-ready technical documentation.
- `config/`: YAML-based configuration management (`tty-config`).
- `spec/`: RSpec test suite.