Skip to content

ansible/aap-rag-content

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

531 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

aap-rag-content

Overview

This repository contains files for building a RAG (Retrieval-Augmented Generation) vector DB, which is used with the Ansible Lightspeed Intelligent Assistant (ALIA).

The source documents saved here were extracted from the document archive built for Red Hat Offline Knowledge Portal (OKP), which provides off-line access to Red Hat product documentation. Copies of markdown files are stored in this repository.

For generating vector DB, we use the aap_rag_content Python package located in the src/ directory, which provides document processing and vector database generation capabilities. The custom_processor_aap.py script uses this package to process AAP documentation and build the vector database.

Input files for Vector DBs

Input files that are used as the source of vector DBs are stored in the following directories:

  • /red_hat_content This directory stores the Markdown files extracted from the Mimir archive.
  • /aap-product-docs-plaintext This directory has Markdown files and their metadata files that are generated from the raw files under /red_hat_content. Metadata files are stored in the .metadata subdirectory with the same file name as its source and .json file extension.
  • /additional_docs This directory contains additional Markdown files that are used as the input for vector DB. Its .metadata subdirectory contains metadata JSON files.

Development Setup

Before running tests or working with the codebase, set up your development environment:

make setup

This command will:

  • Create a virtual environment (.venv/)
  • Install all project dependencies
  • Install development tools (pytest, black, mypy, ruff, etc.)

The project uses uv for dependency management, which provides fast and reliable package installation.

Running Unit Tests

To run the unit tests for the aap_rag_content package:

make unit-test

This will execute all tests in the tests/ directory using pytest with verbose output. The test suite includes tests for:

  • Metadata processing
  • Document processing
  • Vector database generation
  • Utility functions
  • Vector query round-trip validation
  • Solution guide query verification (requires a complete vector DB; see below)

Vector Query Tests

The test_vector_query.py file contains two test classes:

  • TestVectorQueryRoundTrip verifies basic FAISS operations (index creation, semantic relevance, result count, embedding dimensions) using the real sentence-transformers/all-mpnet-base-v2 model from the embeddings_model/ directory. These tests build a small in-memory index from sample documents and run automatically whenever the embedding model files are present.

  • TestSolutionGuidesQuery validates that the vector DB built by custom_processor_aap.py returns chunks from the correct solution guide documents. It loads the FAISS index from the built faiss_store.db and checks that:

    • A query mentioning "IBM Instana" returns chunks from the Instana solution guide
    • A query mentioning "AIOps with ServiceNow" returns chunks from the ServiceNow guide
    • A query mentioning "Red Hat AI Inference Server" returns chunks from the RHAIIS guide

    These tests require a vector DB that includes solution guide content. In the Konflux build, this happens automatically (tests run after custom_processor_aap.py). To run them locally, decompress the pre-built DB first:

    gzip -dk rag/llama_stack_vector_db/faiss_store.db.gz
    

    Then run only the solution guide tests:

    PYTHONPATH=src:$PYTHONPATH uv run pytest tests/tests/test_vector_query.py::TestSolutionGuidesQuery -v
    

    If the DB does not contain solution guide chunks, these tests skip automatically with a descriptive message.

Build a container image

Following command builds a container image, which includes the generated vector DB.

make build-image-aap

Push a container image to quay.io

podman login quay.io
podman push aap-rag-content quay.io/ansible/aap-rag-content

About

No description or website provided.

Topics

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors