#causality-bug

1 experiment

EXP-0052026-04-07

Residual Byte Patching: 3.5x Faster and 0.6 BPB Better — After Catching a Causality Bug in Learned Boundaries

Can a byte-level language model learn where to place patch boundaries, or is fixed-stride mean pooling with a byte-level residual connection sufficient?

Fixed mean pooling + broadcast upsample + byte residual is a strict Pareto improvement over full byte resolution. Varian…
Learned soft boundaries contained a critical causality bug. The Gaussian soft assignment matrix had non-zero weights acr…

#negative-result#byte-level#patching#ssm#causality-bug