Describe the bug
In some cases the verifier rejects a program, and instead of a log with useful information the loader shows only this message:
load program: bad address (hit verifier bug, increase LogSizeStart to fit the log and check dmesg)
However, the verifier log says this:
...
2192: (56) if w0 != 0x0 goto pc+373
The sequence of 8193 jumps is too complex.
verification time 850563 usec
stack depth 64+8+0+8+8+64+8+0+8+0+40+24+280+112+80+40+8+16+0+0+0+0+16+8+8+24+0+8+0+16+0+0+0+0+0+0
processed 336006 insns (limit 1000000) max_states_per_insn 19 total_states 13435 peak_states 2210 mark_read 177
Here is a simple but improper fix:
case errors.Is(err, unix.EFAULT):
    // EFAULT is returned when the kernel hits a verifier bug, and always
    // overrides ENOSPC, defeating the buffer growth strategy. Warn the user
    // that they may need to increase the buffer size manually.
    return nil, fmt.Errorf("load program: %w (hit verifier bug, increase LogSizeStart to fit the log and check dmesg)", err)
     // EFAULT is returned when the kernel hits a verifier bug, and always
     // overrides ENOSPC, defeating the buffer growth strategy. Warn the user
     // that they may need to increase the buffer size manually.
-    return nil, fmt.Errorf("load program: %w (hit verifier bug, increase LogSizeStart to fit the log and check dmesg)", err)
+    return nil, internal.ErrorWithLog("load program (hit verifier bug, increase LogSizeStart to fit the log and check dmesg)", err, logBuf)
 case errors.Is(err, unix.EINVAL):
     if bytes.Contains(tail, coreBadCall) {
It took us several weeks of blind debugging, when even partial information from the verifier log would have helped.
How to reproduce
Version information
main branch
The snippet quoted above is ebpf/prog.go, lines 526 to 530 in 18b5c5b.