GPU memory usage keeps growing as the number of inference iterations increases. Are there any ways to optimize this? <img width="1770" height="654" alt="Image" src="https://github.com/user-attachments/assets/92cd2c44-c8f5-4a04-ba0e-a8d4133dd562" />
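For context, a common cause of this pattern (assuming a PyTorch-based inference loop; the model and data names below are illustrative, not from my actual code) is that each iteration's outputs keep autograd history or stay resident on the GPU, so memory accumulates across iterations. A minimal sketch of the usual mitigations:

```python
import torch

# Illustrative model and inputs (placeholders, not my real setup)
model = torch.nn.Linear(4, 2)
model.eval()
batches = [torch.randn(1, 4) for _ in range(3)]

outputs = []
# inference_mode disables autograd bookkeeping, so activations
# and graph references are not retained between iterations
with torch.inference_mode():
    for x in batches:
        y = model(x)
        # Move results off the device (and off the graph) before
        # keeping them, so old iterations' tensors can be freed
        outputs.append(y.cpu())

# Optionally release cached blocks back to the driver; this does
# not free live tensors, only the allocator's cached memory
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```

If memory still grows with these in place, the usual next step is to check for Python-side references (e.g., logging raw GPU tensors into a list) that keep old iterations alive.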