Does Rust's array bounds checking affect performance? Does Rust's array bounds checking affect performance? arrays arrays

Does Rust's array bounds checking affect performance?


Unfortunately, the cost of a bounds check is not a straightforward thing to estimate. It's certainly not "one cycle per check", or any such easy to guess cost. It will have nonzero impact, but it might be insignificant.

In theory, it would be possible to measure the cost of bounds checking on basic types like Vec by modifying Rust to disable them and running a large-scale ecosystem test. This would give some kind of rule of thumb, but without doing it, it's quite hard to know whether this will be closer to a ten percent or a tenth of a percent overhead.

There are some ways you can do better than timing and guessing, though. These rules of thumb apply mostly to desktop-class hardware; lower end hardware or something that targets a different niche will have different characteristics.

If your indices are derived from the container size, there is a good chance that the compiler might be able to eliminate the bounds checks entirely. At this point the only cost of the bounds checks in a release build is that it intermittently interferes with optimizations, which could, but normally doesn't, impede other optimizations.

If your code is branchy, memory access heavy or otherwise hard to optimise, and the bounds to check are easy to access, there is a good chance that bounds checking will manage to happen mostly in the CPU's spare bandwidth, with branch prediction helping out specifically, in which case the overall cost will be particularly small, especially compared to the cost of the rest of the code.

If your bounds to check are behind several layers of pointers, it is plausible that you will hit issues with memory latency, and will suffer correspondingly. However, it is also plausible that speculation and prediction machinery in the CPU will manage to hide this; this is very context-dependent. If you are taking references to the data inside, rather than dereferencing it at the same time as the bounds check, this risk magnifies.

If your bounds checks are in a tight arithmetic loop that doesn't saturate the core, you aren't likely to hurt throughput directly except by impeding other compiler optimisations. However, impeding other compiler optimisations can be arbitrarily bad, anywhere from no difference to preventing SIMD and causing a factor-10 slowdown.

If your bounds checks are in a tight arithmetic loop that does saturate the core, you take on the above risk and have a direct execution penalty of roughly half a cycle per bounds check.

If your code is large enough to stress the instruction cache, then you need to worry about the impact on code size. This is normally modest, but is particularly hard to measure the runtime impact of.

Peter Cordes adds some further points in comments. First, bounds checks imply loads and stores, so you're going to be running a mixed load which is most likely to bottleneck on issue/rename. Second, even predicted branches executed in parallel take resources from the predictor, which can cause other branches to predict worse.

This might seem intimidating, and it is. That is why it's important to measure and understand your performance at the level that is relevant for you and your code.

It is also the case that since Rust was "born" with bounds checking, it has produced means to reduce their cost, such as pervasive zero-cost references, iterators (which absorb, but don't actually remove, bounds checks), and an unusual set of nice utility functions. If you find yourself hitting a pathological case, Rust also offers unsafe escape hatches.