vLLM deployment issue #35

CUDA_VISIBLE_DEVICES=3 python3 -m vllm.entrypoints.openai.api_server \
 --served-model-name gelab-zero-4b-preview \
 --allowed-local-media-path / \
 --mm-encoder-tp-mode data \
 --mm_processor_cache_type shm \
 --mm_processor_kwargs "{\"max_pixels\":5000000}" \
 --max-model-len 25480 \
 --chat-template-content-format string \
 --limit-mm-per-prompt "{\"image\":10}" \
 --model /models/stepfun-ai/GELab-Zero-4B-preview/ \
 --port 8533

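With the deployment configuration above and vllm==0.12.0, the model's clicks land at inaccurate coordinates. Could you provide the official deployment script?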

Step 3 took: 0.89 seconds
Executing command: adb -s NMUC210285 shell input tap 1013.04 84.00000000000001
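Step 3/400 done. Action: OrderedDict({'cot': "I see that the current screen is the Xiaohongshu home page. My task is to search for 'Forbidden City travel guide'. To start the search, I first need to tap the search icon at the top. I notice a magnifying-glass icon in the top-right corner of the screen, which is the entry point for search. Therefore, I decide to tap this icon.", 'explain': 'I need to tap the search icon to open the search page so that I can enter my search keywords.', 'action': 'CLICK', 'point': [938, 35], 'summary': "The task goal is to open Xiaohongshu and search for 'Forbidden City travel guide'. Xiaohongshu has been opened successfully; the next step is to tap the search button to open the search page.", 'action_type': 'CLICK'})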
LLM gelab-zero-4b-preview inference time: 0.82 seconds
llm ask id: chatcmpl-b1a249c48c9a6746
Step 4 took: 0.85 seconds
Executing command: adb -s NMUC210285 shell input tap 1013.04 76.8
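Step 4/400 done. Action: OrderedDict({'cot': "I see that the current screen is the Xiaohongshu home page. According to the task goal, I need to search for 'Forbidden City travel guide'. The previous tap on the search button did not succeed, because the screen did not change. Therefore, I need to try tapping the search icon in the top-right corner again to open the search page.", 'explain': 'I need to tap the search icon to open the search page so that I can enter the search text.', 'action': 'CLICK', 'point': [938, 32], 'summary': "The task goal is to open Xiaohongshu and search for 'Forbidden City travel guide'. Xiaohongshu has been opened successfully, but the previous tap on the search button failed. I will now try tapping the search button again.", 'action_type': 'CLICK'})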
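
For anyone trying to reproduce this, the numbers in the log above are consistent with the model's `point` being treated as a 0-1000 normalized coordinate and scaled to a 1080x2400 screen (938 becomes 1013.04, 35 becomes 84.0, 32 becomes 76.8). This is only inferred from the logged values, not taken from the project's code: the helper name `scale_point`, the normalization constant 1000, and the 1080x2400 screen size are all assumptions. A minimal sketch of that arithmetic:

```python
# Sketch only: reproduces the point -> tap mapping seen in the log.
# ASSUMPTIONS (inferred from the log, not confirmed by the project):
#   - the model emits points on a 0-1000 normalized grid
#   - the device screen is 1080 x 2400 pixels

def scale_point(point, screen_w, screen_h, norm=1000):
    """Scale a normalized [x, y] point to pixel coordinates."""
    x, y = point
    return x / norm * screen_w, y / norm * screen_h


# Points taken from steps 3 and 4 of the log.
step3_x, step3_y = scale_point([938, 35], 1080, 2400)
step4_x, step4_y = scale_point([938, 32], 1080, 2400)

print(f"adb shell input tap {step3_x:.2f} {step3_y:.2f}")
print(f"adb shell input tap {step4_x:.2f} {step4_y:.2f}")
```

If this reading is right, the scaling step itself matches the executed `adb` commands, so the inaccuracy would more likely come from the point the model predicts than from the conversion to pixels.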

