> However, as the name “batch-invariant” suggests, the technique is currently limited to handling variations related only to the batch dimension, making it robust to continuous batching and other batch-size–related changes, but not to other forms of nondeterminism like changing the TP sizes or GPU types.

https://arxiv.org/abs/2506.09501
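
To illustrate why changing the TP size is a separate source of nondeterminism from batch size: floating-point addition is not associative, so splitting one reduction across a different number of shards can change the result even when the inputs are identical. The sketch below is a toy model only. `sharded_sum` and the input values are hypothetical names and data chosen for illustration, and it is not how any real inference engine or the linked paper implements reductions.

```python
def sequential_sum(values):
    """Add floats strictly left to right (no compensated summation)."""
    total = 0.0
    for v in values:
        total += v
    return total


def sharded_sum(values, num_shards):
    """Toy stand-in for a reduction split across `num_shards` devices.

    Each shard sums its contiguous slice, then the partial sums are
    combined. Assumes len(values) is divisible by num_shards.
    """
    shard_len = len(values) // num_shards
    partials = [
        sequential_sum(values[i * shard_len:(i + 1) * shard_len])
        for i in range(num_shards)
    ]
    return sequential_sum(partials)


# Same inputs, different shard counts (hypothetical "TP sizes").
values = [1e16, 1.0, -1e16, 1.0]
one_shard = sharded_sum(values, 1)
two_shards = sharded_sum(values, 2)
print(one_shard, two_shards, one_shard == two_shards)
```

The inputs never change here, only the grouping of the additions does, which is the kind of variation a batch-invariant kernel does not address.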


