In the current implementation, the emissions and the predictions each subtract their own maximum values before exponentiation. But consider this case:
```
emission[0, 0]   = [0, -1000]
prediction[0, 0] = [-1000, 0]

# current implementation
logNorm[0, 0, 0] = log(exp(emission[0, 0] - maxEs) @ exp(prediction[0, 0] - maxPs)) + maxEs + maxPs
                 = log(exp([0, -1000]) @ exp([-1000, 0]))
                 = log([1, exp(-1000)] @ [exp(-1000), 1])   # exp(-1000) underflows to 0 in FP32
                 = log(0)
                 = -inf

# correct result
logNorm[0, 0, 0] = log(2) - 1000
```
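This failure is easy to reproduce in a few lines; below is a minimal sketch of the same computation, with shapes assumed to be `(B, T, V)` for the emissions and `(B, U, V)` for the predictions, and B = T = U = 1, V = 2 as in the toy case:

```python
import torch

# Toy case from above: B = T = U = 1, V = 2.
emission = torch.tensor([[[0.0, -1000.0]]])    # (B, T, V)
prediction = torch.tensor([[[-1000.0, 0.0]]])  # (B, U, V)

maxEs = emission.max(dim=2, keepdim=True)[0]   # 0
maxPs = prediction.max(dim=2, keepdim=True)[0] # 0
# Subtracting each tensor's own max still leaves exp(-1000), which
# underflows to 0.0 in FP32; the bmm result is then exactly 0 and
# its log is -inf.
logNorm = torch.log(torch.bmm(
    torch.exp(emission - maxEs),
    torch.exp(prediction - maxPs).transpose(1, 2)))
logNorm = logNorm + maxEs + maxPs.transpose(1, 2)
print(logNorm.item())   # -inf, instead of log(2) - 1000
```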
I also tried converting the emission and prediction to FP64 before computing the logNorm, but it still didn't work in my ASR experiment.
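That FP64 also fails is consistent with double-precision limits: the smallest positive (subnormal) double is about 4.9e-324, while exp(-1000) is on the order of 5e-435, so the exponentials underflow to zero in FP64 as well. A quick check in plain Python (whose floats are FP64):

```python
import math
import sys

# Smallest *normal* double; subnormals extend down to ~4.9e-324.
print(sys.float_info.min)   # ~2.2e-308
# exp(-1000) ~ 5e-435 is far below that, so it underflows to 0.0
# even in FP64 -- and the downstream log(0) is still -inf.
print(math.exp(-1000))      # 0.0
```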
The broadcast-sum way is more numerically stable, but consumes O(B*T*U*V) memory:
```python
logNorm = torch.log_softmax(emission.unsqueeze(2) + prediction.unsqueeze(1), dim=-1)
```
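If only the per-(t, u) normalizer is needed, `torch.logsumexp` over the same broadcast sum computes it stably; note this sketch uses `logsumexp` in place of the `log_softmax` above (which instead yields the full normalized log-probabilities), and again assumes shapes `(B, T, V)` and `(B, U, V)`:

```python
import torch

emission = torch.tensor([[[0.0, -1000.0]]])    # (B, T, V)
prediction = torch.tensor([[[-1000.0, 0.0]]])  # (B, U, V)

# Materializes a (B, T, U, V) intermediate -- hence the O(B*T*U*V)
# memory -- but logsumexp shifts by the max of the *joint* sums, so
# nothing underflows.
joint = emission.unsqueeze(2) + prediction.unsqueeze(1)  # (B, T, U, V)
logNorm = torch.logsumexp(joint, dim=-1)                 # (B, T, U)
print(logNorm.item())   # ~ -999.307, i.e. log(2) - 1000
```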
```python
maxEs = emissions.max(dim=2, keepdim=True)[0]
maxPs = predictions.max(dim=2, keepdim=True)[0]
log_norms = torch.log(torch.bmm(
    torch.exp(emissions - maxEs),
    torch.exp((predictions - maxPs)).transpose(1, 2)))
log_norms = log_norms + maxEs + maxPs.transpose(1, 2)
```
(The quoted implementation is from transducer/transducer/torch_binding.py, lines 162 to 167 in e90c6f4.)