
How it works

Transcript Download and Processing Flow

```mermaid
flowchart TD
    A([Start]):::start --> B[Create or get video collection in ChromaDB]
    B --> C{Check if transcript exists}
    C -->|Exists| D[Skip processing]
    C -->|Does not exist| F[Retrieve video metadata]
    F --> G[Download transcript]
    G --> H[Grammatically correct the transcript using LLM]
    H --> I[Chunk transcript into pieces of 768-1136 characters, keeping paragraphs intact]
    I --> L[Store embeddings in Vector Store]
    D --> N
    L --> N([End download process]):::finish

    classDef start fill:#4CAF50,stroke:#fff,stroke-width:2px;
    classDef finish fill:#FFB74D,stroke:#fff,stroke-width:2px;
```

This flowchart outlines the process of downloading and processing video transcripts in the system. The steps are as follows:

  1. Start: The process begins when a transcript download is requested for a video.
  2. Create or get video collection in ChromaDB: A collection representing the video is either created or retrieved from ChromaDB.
  3. Check if transcript exists: The system checks whether a transcript already exists for the video by counting the stored elements in the video's collection.
    • If the transcript exists, the system skips further processing.
    • If the transcript does not exist, the system proceeds to retrieve the video metadata and then download the transcript.
  4. Download transcript: The transcript is downloaded from the video source.
  5. Grammatically correct the transcript using LLM: The transcript is then corrected for grammar and formatting using a large language model (LLM).
  6. Chunk transcript: The transcript is divided into chunks, each containing 768-1136 characters, while ensuring that paragraphs remain intact.
  7. Store embeddings in Vector Store: The embeddings of the processed transcript are stored in the video's collection in ChromaDB for future use.
  8. End download process: The download and processing cycle ends.
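The paragraph-preserving chunking in step 6 could be implemented roughly as follows. The function name and its greedy packing strategy are illustrative assumptions, not the project's actual code: paragraphs are accumulated until a chunk reaches the lower bound, and a chunk is flushed early rather than let the next paragraph push it past the upper bound.

```python
def chunk_transcript(text: str, min_size: int = 768, max_size: int = 1136) -> list[str]:
    """Pack whole paragraphs into chunks of roughly min_size-max_size characters.

    Paragraphs are never split; a single paragraph longer than max_size
    becomes its own (oversized) chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) > max_size and current:
            # Adding this paragraph would overshoot: flush what we have.
            chunks.append(current)
            current = para
        else:
            current = candidate
        if len(current) >= min_size:
            chunks.append(current)
            current = ""
    if current:
        chunks.append(current)
    return chunks


demo = "\n\n".join(["Lorem ipsum. " * 30] * 4)  # four ~390-character paragraphs
print([len(c) for c in chunk_transcript(demo)])  # each chunk holds two whole paragraphs
```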

Chatting Flow

```mermaid
flowchart TD
    A([Start]):::start --> G[User enters a query]
    G --> H{Ask LLM to classify user query: Does it require context retrieval?}
    H -- Yes --> I[Extract key terms from user input using LLM]
    subgraph RetrievalPhase["Retrieval Phase"]
    I --> J[Generate embedding from key terms]
    J --> K[Retrieve relevant transcript chunks from vector store using embedding]
    end
    subgraph ContextBuilding["Context Building Phase"]
    K --> L[Create empty prompt]
    L --> M[Add video metadata as markdown]
    M --> N[Add first paragraph of transcript]
    N --> O[Add each relevant paragraph from retrieval]
    end
    H -- No --> P[Create empty prompt]
    O --> Q[Add user query to prompt]
    P --> Q
    Q --> V{Should chat history be used in inference?}
    V -- Yes --> W[Append prompt to existing chat history]
    V -- No --> X[Create new chat history and append prompt to it]
    X --> B
    W --> B[Generate response using LLM]
    B --> S{Update chat history?}
    S -- Yes --> T[Update chat history to equal the LLM response]
    S -- No --> U[Return LLM response]
    T --> U
    U --> Z([End]):::finish
    classDef start fill:#4CAF50,stroke:#fff,stroke-width:2px;
    classDef finish fill:#FFB74D,stroke:#fff,stroke-width:2px;
```

The chatting flow presented above details how a user's query is processed and answered by the system. The steps are as follows:

  1. Start: The process begins when a user enters a query.

  2. Query Classification: The system uses an LLM to decide whether the user's query requires context retrieval.

  3. Context Retrieval And Building:

    • If the query needs context, the system extracts key terms from the user's input using the LLM.
    • It then generates an embedding from these key terms.
    • Using this embedding, it retrieves relevant transcript chunks from the vector store.
    • An empty prompt is created.
    • Video metadata is added as markdown.
    • The first paragraph of the transcript is included.
    • Each relevant paragraph retrieved earlier is added sequentially to build up the context.
  4. No Context Needed:

    • If the query doesn't need context, an empty prompt is directly created.
  5. Appending User Query To The Prompt:

    • In both cases, the user's query is added to the prompt.
  6. Using (Or Not) The Current Chat Message History:

    • The system checks whether the chat history should be used in the inference.
    • If yes, the constructed prompt is appended to the existing chat message history.
    • If not, a new empty chat history is created, and the constructed prompt is appended as a user message.
  7. Inference

    • The LLM generates a response based on the resulting chat history from the previous step.
  8. Chat History Update:

    • The system checks if the chat history should be updated with the new query and response.
    • If yes, the chat history is updated to be equal to the LLM response.
    • If no, the chat history is left untouched and the LLM response is returned.
  9. End: The process concludes with the final response being provided to the user.
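Steps 3 through 6 of this flow can be sketched as below. All names here (`build_prompt`, `prepare_history`) are illustrative assumptions; the actual LLM call and vector-store retrieval are left out.

```python
def build_prompt(query: str, needs_context: bool = False, metadata_md: str = "",
                 first_paragraph: str = "", retrieved: tuple = ()) -> str:
    """Assemble the inference prompt following the flow above."""
    parts = []
    if needs_context:
        parts.append(metadata_md)      # video metadata as markdown
        parts.append(first_paragraph)  # first transcript paragraph
        parts.extend(retrieved)        # relevant retrieved paragraphs
    parts.append(query)                # the user query is always appended
    return "\n\n".join(p for p in parts if p)


def prepare_history(prompt: str, history=None, use_history: bool = True) -> list:
    """Return the message list handed to the LLM for inference."""
    # Reuse the existing history, or start a fresh one.
    messages = list(history) if (use_history and history) else []
    messages.append({"role": "user", "content": prompt})
    return messages


prompt = build_prompt("What is this video about?", needs_context=True,
                      metadata_md="# Video title", first_paragraph="Intro paragraph.",
                      retrieved=("Relevant chunk A",))
messages = prepare_history(prompt, history=[{"role": "assistant", "content": "Hello!"}])
print(len(messages))  # 2: prior assistant turn plus the new user prompt
```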