Qwen-Omni Realtime API - Invalid session.finish event, session failure on ws.close(), and VAD token consumption inquiry

### Context / Background

I am using the **Qwen3.5-Omni-Flash-Realtime** model over a WebSocket connection. I reviewed the sample implementation provided in your repository at:
`https://github.com/aliyun/alibabacloud-bailian-speech-demo/blob/master/samples/conversation/omni/python/run_server_vad.py`

### Problem 1: Invalid `session.finish` event & Missing Graceful Shutdown

To shut down the WebSocket connection gracefully, I tried sending a `session.finish` event, following the event format used in the SDK and the Realtime API specification:

```json
{
    "event_id": "...",
    "type": "session.finish"
}
```

However, the server returns an error stating that this is an **invalid event type**.

Looking closely at your sample code (`run_server_vad.py`), it only calls `conversation.close()`, which internally just executes `ws.close()`. This terminates the WebSocket connection abruptly, without any application-level handshake or session termination event.
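
To make the expected behavior concrete, here is a minimal sketch of the shutdown sequence I have in mind: send a finish event first, then close the socket with WebSocket status code 1000 (normal closure). The helper names `build_event` and `shutdown` are hypothetical, `ws` stands for any client object exposing `send(str)` and `close(code, reason)`, and the event type is a parameter because the server currently rejects `session.finish`.

```python
import json
import uuid


def build_event(event_type: str) -> str:
    """Serialize a client event with a unique event_id (hypothetical helper)."""
    return json.dumps({"event_id": f"event_{uuid.uuid4().hex}", "type": event_type})


def shutdown(ws, finish_event_type: str = "session.finish") -> None:
    """Send a finish event, then close with status 1000 (normal closure).

    Assumption: `ws` exposes send(str) and close(code, reason). Whether the
    server accepts `finish_event_type` is exactly the open question here.
    """
    try:
        ws.send(build_event(finish_event_type))
    finally:
        # Always close the socket, even if sending the finish event fails.
        ws.close(1000, "client finished")
```

If a different event is the protocol-compliant way to end a session, only the `finish_event_type` argument would need to change.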

### Problem 2: Dashboard showing "Failed" sessions & Token usage tracking

Because the connection is abruptly killed via `ws.close()`, every single one of my Realtime API sessions is marked as **"Failed"** in my Qwen Cloud Console / DashScope Dashboard. This makes it extremely difficult to track my exact token usage and session history, or to get accurate analytics.

<img width="851" height="527" alt="Image" src="https://github.com/user-attachments/assets/c8c27fd5-5043-4b29-ac24-953268221dbd" />

<img width="452" height="685" alt="Image" src="https://github.com/user-attachments/assets/78feb200-61c3-4826-be89-1304775da247" />

* Is the rejection of `session.finish` expected behavior, or is there another protocol-compliant event to gracefully close a session?
* Is there a bug on the server side that records abrupt `ws.close()` actions as session failures?

---

### Question 3: VAD Token Consumption

Additionally, I have a question regarding the Voice Activity Detection (VAD) mode.
When VAD is enabled, **does the server consume any tokens during the idle state** (i.e., when the client is streaming background silence/noise before the actual speech or user message is triggered/recognized)?

### Environment

* Model: Qwen3.5-Omni-Flash-Realtime

Any clarification or guidance on how to properly perform a graceful shutdown and understand VAD token mechanics would be greatly appreciated. Thank you!

