- All LLMs/VLMs/CLIPs are served as APIs with caching enabled, because loading an LLM/VLM/CLIP is expensive and we never modify the models (a sketch of the caching idea follows this list).
- LLM functions are in `utils_llm.py`, VLM functions in `utils_vlm.py`, CLIP functions in `utils_clip.py`, and other helpers in `utils_general.py`.
- Write unit tests to understand the major functions.
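The repo's real cache lives inside the `utils_*` modules and may be implemented differently; the snippet below is only a minimal sketch of the idea, with a hypothetical `cached_call` helper and `cache/` directory, to show why repeated identical requests stay cheap.

```python
# Minimal sketch of request caching (illustrative only; the real helpers are
# in utils_llm.py / utils_vlm.py / utils_clip.py and may differ).
import hashlib
import json
import os

CACHE_DIR = "cache"  # hypothetical on-disk cache location


def cached_call(request, compute_fn):
    """Return the cached result for an identical previous request,
    otherwise compute it once and persist it to disk."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    key = hashlib.sha256(json.dumps(request, sort_keys=True).encode()).hexdigest()
    path = os.path.join(CACHE_DIR, f"{key}.json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    result = compute_fn()
    with open(path, "w") as f:
        json.dump(result, f)
    return result
```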
- Set up the OpenAI API key: `export OPENAI_API_KEY='[your key]'`
- Pip install dependencies: `pip install vllm`
- Configure global variables in `global_vars.py`
- Run `python -m vllm.entrypoints.openai.api_server --model lmsys/vicuna-7b-v1.5`
- Run `python -m serve.utils_llm` to test the LLM (a minimal query example follows this list).
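As a quick sanity check independent of the repo code, you can also hit the vLLM server's OpenAI-compatible endpoint directly. The sketch below assumes the default port (8000) and uses an arbitrary example prompt.

```python
# Smoke-test the local vLLM OpenAI-compatible server (default port 8000;
# adjust the URL if you started the server with a different --port).
import requests

resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "lmsys/vicuna-7b-v1.5",
        "prompt": "List two visual differences between cats and dogs.",
        "max_tokens": 64,
        "temperature": 0.0,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["text"])
```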
- Pip install dependencies: `pip install open-clip-torch flask`
- Configure global variables in `global_vars.py`
- Run `python serve/clip_server.py`
- Run `python -m serve.utils_clip` to test CLIP (a bare `open_clip` example follows this list).
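`clip_server.py` wraps CLIP behind a Flask API whose routes are defined in the repo; independently of that server, the sketch below shows the bare `open_clip` calls it builds on. The model name, pretrained checkpoint, and image path here are assumptions for illustration; the served configuration comes from `global_vars.py`.

```python
# Bare open_clip usage (illustrative; the served model/checkpoint is set in
# global_vars.py and may differ from the ones assumed here).
import open_clip
import torch
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # placeholder image
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    image_feat = model.encode_image(image)
    text_feat = model.encode_text(text)
    image_feat /= image_feat.norm(dim=-1, keepdim=True)
    text_feat /= text_feat.norm(dim=-1, keepdim=True)
    print((100.0 * image_feat @ text_feat.T).softmax(dim=-1))
```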
- Install dependencies:
  - BLIP: `pip install salesforce-lavis`
  - LLaVA: `git clone git@github.com:haotian-liu/LLaVA.git; cd LLaVA; pip install -e .`
- Configure global variables in `global_vars.py`
- Run `python serve/vlm_server_[vlm].py`. Loading the VLM takes a while, especially the first time, when the weights are downloaded. (Note: concurrency is disabled because it surprisingly leads to worse GPU utilization.)
- Run `python -m serve.utils_vlm` to test the VLM (a BLIP captioning example follows this list).
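For reference, the BLIP path relies on `salesforce-lavis`; the sketch below is plain LAVIS captioning, which is presumably what `vlm_server_blip.py` wraps. The image path is a placeholder, and the actually served model type is configured in `global_vars.py`.

```python
# Plain LAVIS BLIP captioning (illustrative; the server's actual model type
# and request interface are defined in the repo).
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
model, vis_processors, _ = load_model_and_preprocess(
    name="blip_caption", model_type="base_coco", is_eval=True, device=device
)
raw_image = Image.open("example.jpg").convert("RGB")  # placeholder image path
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
print(model.generate({"image": image}))  # -> ["a caption describing the image"]
```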