decoding speed



I wonder how to decode 10 s of audio in 70 ms, as you mention in: "The SenseVoice-Small model utilizes a non-autoregressive end-to-end framework, leading to exceptionally low inference latency. It requires only 70ms to process 10 seconds of audio, which is 15 times faster than Whisper-Large."

It took me 200 ms to decode 5 s of audio on a GPU.
However, I am not using ONNX or quantization. Is that why decoding takes longer than the figure you state?
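
To put the two numbers on the same footing, here is a minimal sketch that converts both to a real-time factor (latency divided by audio duration) and times a call after a few warm-up runs. The names `real_time_factor`, `time_inference`, `infer`, and `sync` are hypothetical and not part of SenseVoice; `infer` stands for whatever call runs the model on one clip. With PyTorch on a GPU, I assume `sync` would be `torch.cuda.synchronize`, so that asynchronous kernels are included in the measurement.

```python
import time
from statistics import median


def real_time_factor(latency_s, audio_s):
    """Processing time divided by audio duration (lower is faster)."""
    return latency_s / audio_s


def time_inference(infer, n_warmup=3, n_runs=10, sync=None):
    """Median wall-clock latency of infer() in seconds, measured after warm-up.

    infer: hypothetical zero-argument callable that runs the model once.
    sync:  optional callable that blocks until device work has finished.
    """
    # Warm-up runs absorb one-time costs such as lazy initialization.
    for _ in range(n_warmup):
        infer()
    if sync is not None:
        sync()

    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        infer()
        if sync is not None:
            sync()  # wait for pending device work before stopping the clock
        samples.append(time.perf_counter() - start)
    return median(samples)


# Figures from this issue: 70 ms for 10 s (claimed), 200 ms for 5 s (observed).
claimed = real_time_factor(0.070, 10.0)
observed = real_time_factor(0.200, 5.0)
print(f"claimed RTF:  {claimed:.4f}")
print(f"observed RTF: {observed:.4f}")
print(f"gap:          {observed / claimed:.1f}x")
```

By this measure my run is roughly 5.7 times slower per second of audio than the quoted figure. My 200 ms may include warm-up overhead, which this sketch would exclude.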

decoding speed #280
