Right now the index looks like:
label 1, offset 1, label 2, offset 2
but it could be like:
label 1, label 2, offset 1, offset 2
This would be more cache-friendly. The final read of the offset will likely touch one additional cache line, but with e.g. 100k labels a binary search already needs about 17 reads (log2 of 100,000 is roughly 16.6), and clustering the labels those reads touch will probably be faster.
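To make the two layouts concrete, here is a minimal sketch in Python. It assumes the index is searched with a plain binary search over sorted labels and models the index as one flat list; all function names are made up for illustration and are not taken from any real index code.

```python
# Illustrative sketch only: models the index as a flat Python list.
# All names here are hypothetical, not from an existing implementation.

def build_interleaved(pairs):
    """Current layout: label 1, offset 1, label 2, offset 2, ..."""
    flat = []
    for label, offset in sorted(pairs):
        flat += [label, offset]
    return flat


def build_split(pairs):
    """Proposed layout: label 1, label 2, ..., offset 1, offset 2, ..."""
    ordered = sorted(pairs)
    return [l for l, _ in ordered] + [o for _, o in ordered]


def find(flat, n, target, label_at, offset_at):
    """Binary search over n sorted labels.

    Returns (offset or None, number of label reads performed).
    """
    lo, hi, reads = 0, n, 0
    while lo < hi:
        mid = (lo + hi) // 2
        reads += 1
        label = flat[label_at(mid)]
        if label == target:
            # One extra read to fetch the offset once the label is found.
            return flat[offset_at(mid)], reads
        if label < target:
            lo = mid + 1
        else:
            hi = mid
    return None, reads


def find_interleaved(flat, target):
    n = len(flat) // 2
    # Label i sits at 2*i, its offset right behind it.
    return find(flat, n, target, lambda i: 2 * i, lambda i: 2 * i + 1)


def find_split(flat, target):
    n = len(flat) // 2
    # Labels occupy the first n slots, offsets the last n.
    return find(flat, n, target, lambda i: i, lambda i: n + i)


pairs = [(f"label{i:06d}", i * 10) for i in range(100_000)]
interleaved = build_interleaved(pairs)
split = build_split(pairs)
print(find_interleaved(interleaved, "label042000"))
print(find_split(split, "label042000"))
```

Both lookups return the same offset after the same number of label reads (at most 17 for 100k labels). The difference is only where those reads land: in the split layout they all fall inside the label half of the index, and the offset read is the single jump outside it. Whether that is actually faster would need to be measured on the real index.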