PatchouliTIS/FireredASR-vLLM

vLLM

Easy, fast, and cheap LLM serving for everyone

| Documentation | Blog | Paper | Twitter/X | User Forum | Developer Slack |


Summary

Based on vLLM-0.10.0

This repo is a specialized adaptation of vLLM, tailored to the original FireRedASR-LLM model architecture and input parameters, and it contains many hard-coded elements. Significant work remains before it can be merged into the main vLLM branch:

  • Modify the FireRedASR-LLM model files to match vLLM's standard loading procedure
  • Modify the input format to support raw feature data
  • Remove the separate fireredasr directory in vllm/model_executor/models

Getting Started

  1. Run tools/merge_lora_weights.py from the FireRedASR-LLM-L directory to produce the complete Qwen2-7B LLM with the LoRA weights merged in.

  2. Run tools/save_tokenizer.py to save the tokenizer for the Qwen2-7B model.

  3. In the FireRedASR-LLM-L directory, point the Qwen2-7B-Instruct symbolic link at Qwen2-7B-Instruct-LoRA.

  4. Copy tools/fireredasr_config_template.json into the FireRedASR-LLM-L directory as FireRedASR-LLM-L/config.json.

  5. Install vLLM from source:

    Visit the official documentation to learn more.

    Recommended environment:

    • flash-attn==2.8.3
    • torch==2.7.1
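
Steps 3 and 4 above are plain file operations, so they can be scripted. The following is a minimal sketch, not part of this repo: the function name `prepare_model_dir` and its arguments are hypothetical, and it assumes the merged model from step 1 is a directory named Qwen2-7B-Instruct-LoRA that you pass in explicitly.

```python
import os
import shutil
from pathlib import Path


def prepare_model_dir(model_dir, merged_llm_dir, config_template):
    """Hypothetical helper for steps 3 and 4 of Getting Started.

    model_dir       -- the FireRedASR-LLM-L directory
    merged_llm_dir  -- the LoRA-merged model directory (assumed to come from step 1)
    config_template -- path to tools/fireredasr_config_template.json
    """
    model_dir = Path(model_dir)
    link = model_dir / "Qwen2-7B-Instruct"

    # Step 3: make the Qwen2-7B-Instruct link point at the merged model.
    if link.is_symlink() or link.exists():
        if link.is_dir() and not link.is_symlink():
            # Refuse to delete a real directory; the user should move it aside.
            raise FileExistsError(f"{link} is a real directory, not a symlink")
        link.unlink()
    os.symlink(Path(merged_llm_dir).resolve(), link, target_is_directory=True)

    # Step 4: copy the config template into place as config.json.
    config_path = model_dir / "config.json"
    shutil.copyfile(config_template, config_path)
    return link, config_path
```

The helper refuses to remove a real Qwen2-7B-Instruct directory, so an original checkpoint is never deleted by accident; only an existing symlink or file is replaced.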

Simple Example

See examples/fireredasr_vllm_example.py.

Sampling Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| max_tokens | min(2048, len(audio)) | Maximum number of tokens to generate (adjust to the actual length of the audio file) |
| min_tokens | 0 | Minimum number of tokens to generate |
| temperature | 0.1 | Sampling temperature |
| top_p | 1.0 | Top-p (nucleus) sampling |
| repetition_penalty | 1.05 | Penalty for repeating tokens |
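
As an illustration of the defaults listed above, the sketch below builds them as a plain dictionary. The function name `default_sampling_kwargs` is hypothetical, and the unit of `audio_length` is an assumption: the table only says `len(audio)`, so the sketch treats it as an opaque positive integer. The keys match keyword arguments accepted by `vllm.SamplingParams`, so the result could be passed as `SamplingParams(**kwargs)`; check examples/fireredasr_vllm_example.py for how the repo actually constructs them.

```python
def default_sampling_kwargs(audio_length, max_cap=2048):
    """Return the defaults from the Sampling Parameters section as a dict.

    audio_length -- stands in for len(audio); its unit is an assumption here
    max_cap      -- upper bound on max_tokens (2048 in the table)
    """
    if audio_length <= 0:
        raise ValueError("audio_length must be a positive integer")
    return {
        # Cap generation at the shorter of the audio length and max_cap.
        "max_tokens": min(max_cap, audio_length),
        "min_tokens": 0,
        "temperature": 0.1,
        "top_p": 1.0,
        "repetition_penalty": 1.05,
    }
```

For example, `default_sampling_kwargs(500)` yields `max_tokens` of 500, while any length above 2048 is capped at 2048.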

About

Naive implementation of FireredASR in vLLM.
