Releases: dvruette/gidd
v1.1.0
- Improved consistency with the paper (regarding the mixing distribution and ELBO weights)
- Added pre-tokenization and document splitting to prevent over-sampling of short documents. This improves consistency with the literature and should prevent overfitting in long training runs.
- Optimized VRAM usage and runtime. For the lowest VRAM usage, run with torch.compile enabled and set PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True.
- Bugfixes
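
The memory recommendation above can be applied from Python roughly as follows. This is a minimal sketch, not the repository's own setup code: the helper name `compile_model` is hypothetical, and the repository may expose its own option for enabling compilation. The only details taken from the release notes are the environment variable, its value, and the use of torch.compile.

```python
import os

# The CUDA caching allocator reads this variable when it initializes,
# so set it before torch is imported. setdefault keeps any value the
# user has already exported in the shell.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")


def compile_model(model):
    """Hypothetical helper: wrap a model with torch.compile (PyTorch >= 2.0)."""
    import torch  # imported lazily so the variable above is already in place

    return torch.compile(model)
```

The same effect can be had by exporting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True in the shell before launching training, which avoids any dependence on import order.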
v1.0.0
Initial release