
[Bug Fix] Fix Train-Inference Mismatch#61

Open
Jayce-Ping wants to merge 4 commits into main from Fix_train-inference_mismatch

Conversation

@Jayce-Ping
Collaborator

There were small mismatches of `next_latents_mean`, `next_latents`, and `log_prob` between training and inference. The causes are:
1. Different precision of `next_latents` was used for the `log_prob` computation: during inference it was effectively `float32`, while during training `next_latents` was passed directly into `scheduler.step` as `bfloat16`. Casting `next_latents` to the actual `input_dtype` fixes the issue.
2. A small precision difference between `timesteps` and `sigmas`. Passing `next_t` to the `forward` function during sampling fixes it.
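The dtype mismatch in cause 1 can be illustrated with a minimal sketch in plain Python. It is not the repository's code: `to_bfloat16` is a hand-rolled stand-in for the tensor cast (truncating a float32 bit pattern to bfloat16 precision), and `gaussian_log_prob` is a hypothetical stand-in for the scheduler's per-step log-probability. The point is only that evaluating the same Gaussian log-density on a `bfloat16`-rounded sample versus the `float32` sample gives different values, and that casting both sides to the same dtype removes the mismatch:

```python
import math
import struct


def to_bfloat16(x: float) -> float:
    """Round a float to bfloat16 precision by keeping only the top 16 bits
    of its IEEE-754 float32 bit pattern (sign, exponent, 7 mantissa bits)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


def gaussian_log_prob(x: float, mean: float, std: float) -> float:
    """Log-density of N(mean, std^2) at x; stand-in for the per-step log_prob."""
    return (
        -((x - mean) ** 2) / (2 * std**2)
        - math.log(std)
        - 0.5 * math.log(2 * math.pi)
    )


# Hypothetical values standing in for one element of next_latents.
next_latent_f32 = 0.1234567  # inference side: effectively float32
mean, std = 0.12, 0.05

# Training side: the same sample arrives rounded to bfloat16.
next_latent_bf16 = to_bfloat16(next_latent_f32)

lp_inference = gaussian_log_prob(next_latent_f32, mean, std)
lp_train_bug = gaussian_log_prob(next_latent_bf16, mean, std)

# The fix: cast next_latents to the dtype actually used at inference
# before computing log_prob, so both sides see identical inputs.
lp_train_fixed = gaussian_log_prob(next_latent_f32, mean, std)

print(lp_inference - lp_train_bug)   # small but nonzero mismatch
print(lp_train_fixed == lp_inference)
```

The same logic applies elementwise to the latent tensors; in the actual code the cast is a tensor `dtype` conversion rather than a bit-twiddling helper.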
