Hello!
I'm trying to use your pre-trained model with this command:
CUDA_VISIBLE_DEVICES=4,5,6,7 python inference.py -i -m llama-2-7b-chat --eval_name concat_recur
However, generation stops unexpectedly when I input the query:
help me list popular songs written by Taylor Swift.
The result is shown as follows:

It stops generating further content and outputs </s> instead.
Are there any other settings I missed?
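For context, the behavior looks like the standard EOS check in a decoding loop: generation halts as soon as the model emits the end-of-sequence token, which llama-2 tokenizers decode as `</s>` (token id 2). Here is a toy sketch of that logic (my own illustration with a fake token stream, not the repo's actual `inference.py` code):

```python
# Toy sketch of an autoregressive decoding loop: generation stops either when
# the token budget runs out or when eos_token_id appears (the </s> case above).
def generate(step_fn, eos_token_id, max_new_tokens):
    """Collect tokens from step_fn until EOS or the token budget is exhausted."""
    tokens = []
    for _ in range(max_new_tokens):
        tok = step_fn()
        if tok == eos_token_id:  # model emitted </s>; stop here
            break
        tokens.append(tok)
    return tokens

# Fake "model" that emits three tokens, then EOS (id 2, llama-2's </s>).
stream = iter([101, 102, 103, 2, 104])
out = generate(lambda: next(stream), eos_token_id=2, max_new_tokens=8)
print(out)  # tokens after </s> are never produced
```

So if the stop is premature, it is usually either the EOS check firing early or a low `max_new_tokens` (or equivalent) limit in the generation config, which is why I'm asking whether a setting is missing.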