To prevent exploits that try to overload the servers by generating endless responses, we should set a limit on the number of output tokens.
If users request it, we could add a "keep generating" button, depending on complexity and @robin-lecomte's opinion on this.
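
To make the proposal concrete, here is a rough sketch of what the cap could look like. Everything in it is an assumption for illustration: the names `generate_capped`, `GenerationResult`, and `MAX_OUTPUT_TOKENS`, the default of 512, and the idea that the model exposes its output as an iterable of tokens. It is not how the current code works.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, List

# Hypothetical default; the real value would need to be agreed on.
MAX_OUTPUT_TOKENS = 512


@dataclass
class GenerationResult:
    tokens: List[str]
    # True when the cap was reached. The UI could use this to decide
    # whether to offer a "keep generating" button. It is also True if the
    # response happened to end exactly at the cap.
    hit_cap: bool


def generate_capped(
    token_stream: Iterable[str], max_tokens: int = MAX_OUTPUT_TOKENS
) -> GenerationResult:
    """Consume at most max_tokens tokens from a (possibly endless) stream."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    # islice stops pulling from the stream once max_tokens items are taken,
    # so an endless stream cannot keep the server busy.
    tokens = list(islice(iter(token_stream), max_tokens))
    return GenerationResult(tokens=tokens, hit_cap=len(tokens) == max_tokens)
```

A "keep generating" action could then call the same function again with the previous output appended to the prompt, so each click is bounded by the same limit.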