Flow identity initialization #72

@francesco-vaselli

Description

Hello and thanks for the work on the package,

I am doing some tests with the identity initialization for rational quadratic splines.
When using the new identity init implemented in #65 and passing x = torch.tensor([1, 1e-2, 1e-6, 1e-8, 1e2], dtype=torch.float32) as input, the following is the inverse produced by the untrained network (which should be initialized as the identity):

# in spline def: enable_identity_init=True

# transform back
flow.transform_to_noise(x.view(-1,1))
tensor([[  1.7013],
        [  1.3796],
        [  1.3739],
        [  1.3739],
        [100.0000]], grad_fn=<AddmmBackward0>)

If instead I manually set the weights of the final layer of the transform network to 0 (as done in the normflows package), I get the identity as expected:

# in spline def: enable_identity_init=False

# in the model def
if init_identity:
    torch.nn.init.constant_(autoregressive_net.final_layer.weight, 0.0)
    torch.nn.init.constant_(
        autoregressive_net.final_layer.bias,
        np.log(np.exp(1 - min_derivative) - 1),
    )

# stuff

# transform back
flow.transform_to_noise(x.view(-1,1))
tensor([[1.0000e+00],
        [1.0000e-02],
        [1.0000e-06],
        [1.0000e-08],
        [1.0000e+02]], grad_fn=<AddmmBackward0>)

I was wondering whether you could help me figure out this difference in behavior.
If a fix seems potentially useful, I would be more than glad to work on a pull request.
Best regards,
Francesco

Metadata

Labels: enhancement (New feature or request)