W2V2 LayerNorm location #1

@CaptainPrice2023

Description

Hi, thanks for sharing this work! I have a question about the adapter location in W2V2.

The W2V2 transformer encoder applies LayerNorm after attention. After adding an adapter, should the adapter computation be performed after the LayerNorm layer instead of before it?

hidden_states = self.dropout(hidden_states)
hidden_states = attn_residual + hidden_states
# adapter input taken from the pre-LayerNorm residual
if args.adapter:
    adapt_h = self.adapter(hidden_states)
hidden_states = self.layer_norm(hidden_states)
hidden_states = hidden_states + self.feed_forward(hidden_states)
if args.adapter:
    hidden_states = hidden_states + adapt_h
hidden_states = self.final_layer_norm(hidden_states)
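To make the two placements being compared concrete, here is a minimal, self-contained sketch. All functions are toy stand-ins (not the real W2V2 modules); the point is only to show how taking the adapter input before versus after `layer_norm` changes the result.

```python
# Toy stand-ins for the real modules (hypothetical, scalar-valued
# so the effect of the ordering is easy to trace by hand).

def layer_norm(x):      # stand-in for self.layer_norm
    return x / 2.0

def feed_forward(x):    # stand-in for self.feed_forward
    return x + 1.0

def adapter(x):         # stand-in for self.adapter (bottleneck module)
    return 0.1 * x

def block_adapter_before_ln(h):
    """Adapter input taken BEFORE layer_norm, as in the quoted snippet."""
    adapt_h = adapter(h)              # sees the raw residual stream
    h = layer_norm(h)
    h = h + feed_forward(h)
    h = h + adapt_h                   # adapter output added back
    return h

def block_adapter_after_ln(h):
    """Adapter input taken AFTER layer_norm, the alternative in question."""
    h = layer_norm(h)
    adapt_h = adapter(h)              # sees the normalized activations
    h = h + feed_forward(h)
    h = h + adapt_h
    return h

h0 = 4.0
print(block_adapter_before_ln(h0))    # 5.4
print(block_adapter_after_ln(h0))     # 5.2
```

The two orderings are not equivalent: in the first, the adapter sees the unnormalized residual, so its output scale is tied to the residual stream; in the second, it sees LayerNorm-standardized activations, which is the placement most adapter papers assume for post-LN transformer blocks.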
