Below you will find pages that utilize the taxonomy term “Compiler”
How the Compiler Knows Your Load Is Contiguous
The most important backend optimization in SPMD: recognizing contiguous memory access through ChangeType and BinOp chains
Byte Iteration at 32 Lanes: The Decomposed Index Path
How to iterate over a []byte on AVX2 without drowning in index-register pressure
Pattern Matching Outperformed Hand-Written SIMD
How compiler pattern detection on idiomatic Go outperformed explicit cross-lane SIMD builtins in our SPMD proof of concept
Loop Peeling: Where Most of the Speed Comes From
How SSA-level loop peeling enables the all-ones mask fast path, which accounts for roughly 2x of the SPMD benchmark speedup
How SPMD Lives in the Compiler: Lessons from Building It
The mask-stack detour, predicated SSA, and why SPMD has to live at the heart of the compiler