- All LLM/VLM/CLIP models are served as APIs with caching enabled, because loading an LLM/VLM/CLIP is expensive and we never modify the models themselves.
- LLM functions in `utils_llm.py`, VLM functions in `utils_vlm.py`, CLIP functions in `utils_clip.py`, and others in `utils_general.py`.
- Write unit tests to understand major functions.
- Set up the OpenAI API key: `export OPENAI_API_KEY='[your key]'`
- Pip install environments: `pip install vllm`
- Configure global variables in `global_vars.py`
- Run `python -m vllm.entrypoints.openai.api_server --model HuggingFaceM4/Idefics3-8B-Llama3 --port 8080 --max_model_len 5000`
- Run `python -m serve.utils_llm` or `python -m serve.utils_vlm` to test.
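Once the vLLM server above is running, it exposes an OpenAI-compatible endpoint. The sketch below queries `/v1/chat/completions` on port 8080 with only the standard library; the prompt and `max_tokens` value are illustrative assumptions, not values from this repo.

```python
import json
import urllib.request

URL = "http://localhost:8080/v1/chat/completions"

# Standard OpenAI-style chat payload targeting the model served above.
payload = {
    "model": "HuggingFaceM4/Idefics3-8B-Llama3",
    "messages": [{"role": "user", "content": "Describe a cat in one sentence."}],
    "max_tokens": 64,
}

def query_server(url: str = URL) -> str:
    """POST the chat request and return the first completion's text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    try:
        print(query_server())
    except OSError:
        print("Server not reachable; start the vLLM server first.")
```

The same request shape works with the official `openai` client by pointing its `base_url` at `http://localhost:8080/v1`.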
- Pip install environments: `pip install open-clip-torch flask`
- Configure global variables in `global_vars.py`
- Run `python serve/clip_server.py`
- Run `python -m serve.utils_clip` to test.
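For intuition, a Flask CLIP server like `serve/clip_server.py` can be sketched as below. The `/embed` route name and the dummy `embed_texts` function are assumptions for illustration only; a real server would encode the texts with an `open_clip` model instead of returning placeholder vectors.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def embed_texts(texts):
    """Placeholder for open_clip text encoding; returns fixed-size vectors."""
    return [[float(len(t)), 0.0, 1.0] for t in texts]

@app.route("/embed", methods=["POST"])
def embed():
    # Expects a JSON body like {"texts": ["a photo of a cat", ...]}.
    texts = request.get_json()["texts"]
    return jsonify({"embeddings": embed_texts(texts)})

if __name__ == "__main__":
    app.run(port=7070)
```

Serving CLIP behind HTTP this way means the model is loaded once at startup, matching the cache-and-serve design described at the top of this section.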