diff --git a/.gitignore b/.gitignore
index dab1498..ad140d6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1 +1,3 @@
-!*.png
\ No newline at end of file
+!*.png
+__pycache__/
+*.pyc
\ No newline at end of file
diff --git a/.runpod/hub.json b/.runpod/hub.json
index 74f2740..dfe1e0c 100644
--- a/.runpod/hub.json
+++ b/.runpod/hub.json
@@ -323,6 +323,17 @@
           "advanced": true
         }
       },
+      {
+        "key": "IS_EMBEDDING",
+        "input": {
+          "name": "Is Embedding Model",
+          "type": "boolean",
+          "description": "Set to true if using an embedding model",
+          "default": false,
+          "required": false,
+          "advanced": true
+        }
+      },
       {
         "key": "SKIP_TOKENIZER_INIT",
         "input": {
diff --git a/README.md b/README.md
index 54f301e..4308cb1 100644
--- a/README.md
+++ b/README.md
@@ -39,6 +39,7 @@ All behaviour is controlled through environment variables:
 | `FILE_STORAGE_PATH` | Directory for storing uploaded/generated files | "sglang_storage" | |
 | `DATA_PARALLEL_SIZE` | Data parallelism size | 1 | |
 | `LOAD_BALANCE_METHOD` | Load balancing strategy | "round_robin" | "round_robin", "shortest_queue" |
+| `IS_EMBEDDING` | Set to true for embedding models | false | boolean (true or false) |
 | `SKIP_TOKENIZER_INIT` | Skip tokenizer init | false | boolean (true or false) |
 | `TRUST_REMOTE_CODE` | Allow custom models from Hub | false | boolean (true or false) |
 | `LOG_REQUESTS` | Log inputs and outputs of requests | false | boolean (true or false) |
diff --git a/engine.py b/engine.py
index a07578b..c0a7168 100644
--- a/engine.py
+++ b/engine.py
@@ -64,6 +64,7 @@ def start_server(self):
 
         # Boolean flags
         boolean_flags = [
+            "IS_EMBEDDING",
             "SKIP_TOKENIZER_INIT",
             "TRUST_REMOTE_CODE",
             "LOG_REQUESTS",
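
Review note: the engine.py hunk only shows `IS_EMBEDDING` being added to `boolean_flags`; the code that consumes the list is outside the diff. The following is a minimal sketch of how such a list might be turned into server switches, assuming each truthy environment variable becomes a lowercased, hyphenated `--flag`. The helper name `boolean_flag_args`, the `TRUTHY` set, and the parsing rules are hypothetical and are not taken from engine.py.

```python
import os

# Hypothetical copy of the flags visible in the diff; the real list may be longer.
BOOLEAN_FLAGS = [
    "IS_EMBEDDING",
    "SKIP_TOKENIZER_INIT",
    "TRUST_REMOTE_CODE",
    "LOG_REQUESTS",
]

# Assumed set of values treated as "true"; engine.py may use a different rule.
TRUTHY = {"true", "1", "yes"}


def boolean_flag_args(env=None, flags=BOOLEAN_FLAGS):
    """Return a CLI switch for every flag whose environment value is truthy."""
    env = os.environ if env is None else env
    args = []
    for name in flags:
        value = str(env.get(name, "")).strip().lower()
        if value in TRUTHY:
            # Example mapping: IS_EMBEDDING -> --is-embedding
            args.append("--" + name.lower().replace("_", "-"))
    return args


if __name__ == "__main__":
    # Only IS_EMBEDDING is truthy here, so a single switch is printed.
    print(boolean_flag_args({"IS_EMBEDDING": "true", "LOG_REQUESTS": "false"}))
```

Under these assumptions, leaving `IS_EMBEDDING` unset or set to `false` adds nothing to the command line, which matches the `false` default documented in the README and hub.json hunks.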