nullptr dereference in WASM code path #81

@jelmervdl

Description

Bug description

When trying to run the WASM-compiled version of bergamot-translator, some models (specifically, the student.base models from browsermt/students) produce invalid output: typically a bunch of repetitions of a single word or character per sentence, not unlike this bug report.

In an attempt to figure out what was going on, I basically compiled the WASM code path into a native app so I could run it through lldb. This caught a nullptr dereference inside intgemm, ultimately caused by the nullptr that is passed where a const float* input_bias is expected on this line:

int8PrepareBias((const int8_t *)b->data(), scale_a, 0.0 /*zero_point_a*/, scale_b, 0.0 /*zero_point_b*/, rows(b), cols(b), nullptr/*input_bias*/, val_->data());

This then translates/binds/magics into

  float unquant_factor = (-1) * ((127.0f / scale_A) * (127.0f / scale_B)) / (127.0f);
  intgemm::Int8Shift::PrepareBias(
      input_B_prepared,
      width,
      cols_B,
      intgemm::callbacks::UnquantizeAndAddBiasAndWrite(unquant_factor, input_bias, output));
}

That nullptr then ends up here as config.bias_addr:

template <> class CallbackImpl<CPUType::CPU_NAME, UnquantizeAndAddBiasAndWrite> {
  ...
    auto result = kernels::unquantize(input, mult_reg);
    result = kernels::add_bias(result, config.bias_addr, info.col_idx);
    kernels::write(result, config.output_addr, info.row_idx * info.cols + info.col_idx);
  ...
};

And I'm not sure which of these implementations of kernels::add_bias it ends up at, but none of them seem happy with a nullptr.
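As far as I can tell, the add_bias kernels all boil down to a SIMD load from bias_addr followed by an add, which is why a null pointer crashes immediately. A rough sketch of the shape of the AVX2 variant (paraphrased, not the actual intgemm source; the function name is mine):

#include <cstddef>
#include <immintrin.h>

// Sketch only: the kernel unconditionally loads a vector of bias values from
// bias_addr + offset, so bias_addr == nullptr dereferences address `offset`.
static inline __m256 add_bias_sketch(__m256 unquantized, const float* bias_addr, std::size_t offset) {
  __m256 bias_term = _mm256_loadu_ps(bias_addr + offset);  // crashes when bias_addr is nullptr
  return _mm256_add_ps(unquantized, bias_term);
}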

As an experiment, I re-implemented the fallback function to handle the nullptr and call the callback without the bias term if the bias was null:

extern "C" void int8PrepareBias(const int8_t* input_B_prepared,
                                        float scale_A,
                                        float zero_point_A,
                                        float scale_B,
                                        float zero_point_B,
                                        Index width,
                                        Index cols_B,
                                        const float* input_bias,
                                        float* output) {
  float unquant_factor = (-1) * ((127.0f / scale_A) * (127.0f / scale_B)) / (127.0f);
  if (input_bias == nullptr) {
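    // No bias to add: use the plain unquantize-and-write callback so the
    // kernels never touch a bias pointer.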
    intgemm::Int8Shift::PrepareBias(
        input_B_prepared,
        width,
        cols_B,
        intgemm::callbacks::UnquantizeAndWrite(unquant_factor, output));
  } else {
    intgemm::Int8Shift::PrepareBias(
        input_B_prepared,
        width,
        cols_B,
        intgemm::callbacks::UnquantizeAndAddBiasAndWrite(unquant_factor, input_bias, output));
  }
}

That fixes both the nullptr dereference error and the broken model output for my non-wasm wasm build.

I'm reporting this as a bug rather than proposing a fix, as I imagine there was some reasoning behind passing a nullptr there.

I'm also surprised that what seems to be buggy code compiled into a (mostly) functioning wasm build that works for the tiny models. The base and tiny11 models don't seem to differ all that much: they appear to have the same layers, just larger in base?

Lastly, I'm not sure how to fix this. The quick hack above won't work since that's the fallback code path. Something similar would need to be added to the intgemm code in Mozilla's tree.
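For what it's worth, one caller-side workaround that avoids touching intgemm at all might be to hand int8PrepareBias a zero-filled bias instead of a nullptr, so the existing UnquantizeAndAddBiasAndWrite path just adds zeros. Untested sketch against the call shown at the top (the zero_bias buffer is my own addition and would need a suitable lifetime plus #include <vector>):

// Hypothetical workaround, not tested: pass a buffer of zeros instead of
// nullptr so the bias add becomes a no-op in every backend.
std::vector<float> zero_bias(cols(b), 0.0f);
int8PrepareBias((const int8_t *)b->data(), scale_a, 0.0 /*zero_point_a*/,
                scale_b, 0.0 /*zero_point_b*/, rows(b), cols(b),
                zero_bias.data() /*input_bias*/, val_->data());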

Reproduce

Tested by building app/bergamot.cpp from bergamot-translator, after patching the cmake files not to pass emcc-specific flags when COMPILE_WASM is defined. I also changed wasm_intgemm_interface.h to remove the compiler attributes, and wasm_intgemm_fallback.cpp to remove all the Fallback bits from the function names so that those functions get called directly.

I also needed to patch the config.intgemm8bitalpha.yml to include:

max-length-break: 128
mini-batch-words: 1024

After that, the following works (or, without the patch to wasm_intgemm_fallback.cpp, gives a useful crash):

> lldb -- app/bergamot --model-config-paths ~/.config/translateLocally/ende.student.base-1647129297/config.intgemm8bitalpha.yml
(lldb) target create "app/bergamot"
Current executable set to '/Users/jelmer/Workspace/statmt/firefox-translations/bergamot-translator/build/app/bergamot' (x86_64).
(lldb) settings set -- target.run-args  "--model-config-paths" "/Users/jelmer/.config/translateLocally/ende.student.base-1647129297/config.intgemm8bitalpha.yml"
(lldb) run
Process 3022 launched: '/Users/jelmer/Workspace/statmt/firefox-translations/bergamot-translator/build/app/bergamot' (x86_64)
Hello world!
Hallo Welt!
Process 3022 exited with status = 0 (0x00000000)

Metadata

Labels

bug (Something isn't working), wasm
