Speed up pure-Go implementation by ensuring we use ROLQ instructions

The compiler needs some coaxing to emit rotate instructions.
Doing so gives a significant speedup: about 24% less time per op
(32% higher throughput) on the 4KB benchmark below.

name                   old time/op    new time/op    delta
Hashes/xxhash,n=4KB-4     616ns ± 2%     466ns ± 1%  -24.33%  (p=0.008 n=5+5)

name                   old speed      new speed      delta
Hashes/xxhash,n=4KB-4  6.49GB/s ± 2%  8.57GB/s ± 1%  +32.04%  (p=0.008 n=5+5)
1 file changed
tree: 96b475aedd0d80d1aa8df6a6d6a85f3a6699ac86
  1. xxhsum/
  2. LICENSE.txt
  3. README.md
  4. xxhash.go
  5. xxhash_amd64.go
  6. xxhash_amd64.s
  7. xxhash_amd64_test.go
  8. xxhash_other.go
  9. xxhash_test.go
README.md

xxhash

xxhash is a Go implementation of the 64-bit xxHash algorithm, XXH64. This is a high-quality hashing algorithm that is much faster than anything in the Go standard library.

On amd64 there is an even faster assembly implementation that runs at over 10 GB/s on my laptop.