flowchart TD
A([Start]):::start --> B[Create or get video collection in ChromaDB]
B --> C{Check if transcript exists}
C -->|Exists| D[Skip processing]
C -->|Does not exist| F[Retrieve video metadata]
F --> G[Download transcript]
G --> H[Grammatically correct the transcript using LLM]
H --> I[Chunk transcript into pieces of 768-1136 characters, keeping paragraphs intact]
I --> L[Store embeddings in Vector Store]
D --> N
L --> N([End download process]):::finish
classDef start fill:#4CAF50,stroke:#fff,stroke-width:2px;
classDef finish fill:#FFB74D,stroke:#fff,stroke-width:2px;
This flowchart outlines the process of downloading and processing video transcripts in the system. The steps are as follows:
- Start: The process begins when the download of a video transcript is initiated.
- Create or get video collection in ChromaDB: A collection representing the video is either created or retrieved from ChromaDB.
- Check if transcript exists: The system checks whether a transcript already exists for the video by counting the elements stored in the video's collection.
- If the transcript exists, the system skips further processing.
- If the transcript does not exist, the system proceeds to retrieve the video metadata and then download the transcript.
- Download transcript: The transcript is downloaded from the video source.
- Grammatically correct the transcript using LLM: The transcript is then corrected for grammar and formatting using a large language model (LLM).
- Chunk transcript: The transcript is divided into chunks, each containing 768-1136 characters, while ensuring that paragraphs remain intact.
- Store embeddings in Vector Store: The embeddings of the processed transcript are stored in the video's collection in ChromaDB for future use.
- End download process: The download and processing cycle ends.
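The chunking step above can be sketched in Python. This is a minimal illustration, not the project's actual implementation: it assumes paragraphs are separated by blank lines and greedily packs whole paragraphs up to the 1136-character upper bound, so paragraphs are never split across chunks.

```python
def chunk_transcript(text: str, max_size: int = 1136) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_size
    characters; a single paragraph longer than max_size becomes its
    own oversized chunk, since paragraphs are never split."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_size:
            current = candidate          # paragraph still fits in this chunk
        else:
            if current:
                chunks.append(current)   # flush the finished chunk
            current = para               # start the next chunk with this paragraph
    if current:
        chunks.append(current)
    return chunks
```

With typical paragraph lengths, the resulting chunks land in roughly the 768-1136 character range shown in the diagram.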
flowchart TD
A([Start]):::start --> G[User enters a query]
G --> H{Ask LLM to classify user query: Does it require context retrieval?}
H -- Yes --> I[Extract key terms from user input using LLM]
subgraph RetrievalPhase["Retrieval Phase"]
I --> J[Generate embedding from key terms]
J --> K[Retrieve relevant transcript chunks from vector store using embedding]
end
subgraph ContextBuilding["Context Building Phase"]
K --> L[Create empty prompt]
L --> M[Add video metadata as markdown]
M --> N[Add first paragraph of transcript]
N --> O[Add each relevant paragraph from retrieval]
end
H -- No --> P[Create empty prompt]
O --> Q[Add user query to prompt]
P --> Q
Q --> V{Should chat history be used in inference?}
V -- Yes --> W[Append prompt to existing chat history]
V -- No --> X[Create new chat history and append prompt to it]
X --> B
W --> B[Generate response using LLM]
B --> S{Update chat history?}
S -- Yes --> T[Set chat history equal to the LLM response]
S -- No --> U[Return LLM response]
T --> U
U --> Z([End]):::finish
classDef start fill:#4CAF50,stroke:#fff,stroke-width:2px;
classDef finish fill:#FFB74D,stroke:#fff,stroke-width:2px;
The chat flow presented above details how a user's query is processed and answered by the system. The steps are as follows:
- Start: The process begins when a user enters a query.
- Query Classification: The system uses an LLM to determine whether the user's query requires context retrieval.
- Context Retrieval and Building:
  - If the query needs context, the system extracts key terms from the user's input using the LLM.
  - It then generates an embedding from these key terms.
  - Using this embedding, it retrieves relevant transcript chunks from the vector store.
  - An empty prompt is created.
  - Video metadata is added as markdown.
  - The first paragraph of the transcript is included.
  - Each relevant paragraph retrieved earlier is added sequentially to build up the context.
- No Context Needed:
  - If the query does not need context, an empty prompt is created directly.
- Appending the User Query to the Prompt:
  - In both cases, the user's query is added to the prompt.
- Using (or Not) the Current Chat Message History:
  - The system checks whether the chat history should be used in the inference.
  - If yes, the constructed prompt is appended to the existing chat message history.
  - If not, a new empty chat history is created, and the constructed prompt is appended to it as a user message.
- Inference:
  - The LLM generates a response based on the chat history resulting from the previous step.
- Chat History Update:
  - The system checks whether the chat history should be updated with the new query and response.
  - If yes, the chat history is set equal to the LLM response.
  - If no, the chat history is left untouched and the LLM response is returned.
- End: The process concludes with the final response being provided to the user.
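The prompt-building and chat-history branches above can be sketched as follows. This is an illustrative sketch, not the system's actual code: the message format, the markdown layout of the metadata, and the assumption that the LLM callable returns the full updated message list (so the history can be set equal to it) are all hypothetical.

```python
def build_prompt(query, metadata=None, first_paragraph=None, retrieved=None):
    """Assemble the prompt: optional video metadata as markdown, the
    transcript's first paragraph, each retrieved paragraph, then the query."""
    parts = []
    if metadata:
        # Hypothetical markdown layout for the metadata section.
        parts.append("## Video metadata\n" +
                     "\n".join(f"- {k}: {v}" for k, v in metadata.items()))
    if first_paragraph:
        parts.append(first_paragraph)
    parts.extend(retrieved or [])
    parts.append(query)
    return "\n\n".join(parts)


def run_inference(prompt, llm, history=None, use_history=True, update_history=True):
    """Append the prompt to the existing history (or a fresh one), call the
    LLM, and optionally set the history equal to the LLM response."""
    messages = list(history) if (use_history and history is not None) else []
    messages.append({"role": "user", "content": prompt})
    response = llm(messages)  # assumed to return the updated message list
    if update_history and history is not None:
        history[:] = response  # history is set equal to the LLM response
    return response
```

A stub LLM that simply appends an assistant message is enough to exercise both functions end to end.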