Small, Fast 64 bit hash function

2024-11-17 19:23:00
nrk.neocities.org

ChibiHash: Small, Fast 64 bit hash function

16 Nov 2024

If you need a small and fast 64 bit hash function that can be copy-pasted
easily, then here’s one that I cooked up in an hour or so: chibihash64.h.

Some key features:

Small: ~60 loc in C
Fast: See benchmark table below
Portable: Doesn’t use hardware specific instructions (e.g SSE)
Good Quality: Passes smhasher, so should be good quality (I think)
Unencumbered: Released into the public domain
Free of undefined behavior and gives same result regardless of host system’s endianness.
Non-cryptographic

Here’s some benchmark against other similar hash functions:

Name	Large input (GiB/sec)	Small input (Cycles/Hash)
chibihash64	18.08	49
xxhash64	12.59	50
city64	14.95	35
spooky64	13.83	59

It’s the fastest of the bunch for large string throughput.
For small string (hardcoded special cases for input below or equal 32 bytes.

Design overview

The function starts by declaring some constants and initializing a 32 byte
state.

const uint64_t P1 = UINT64_C(0x2B7E151628AED2A5);
const uint64_t P2 = UINT64_C(0x9E3793492EEDC3F7);
const uint64_t P3 = UINT64_C(0x3243F6A8885A308D);

uint64_t h[4] = { P1, P2, P3, seed };

The constants are nothing special, it’s just e (Euler’s number), pi (π) and
golden ratio written out in hex with the last digit adjusted to be odd (because
multiply by even number isn’t invertable).

for (; l >= 32; l -= 32) {
    for (int i = 0; i > 24));
    }
}

This is the meat of the algorithm.
It processes 8 bytes at a time mixing it in via xor and multiply.
However, multiplication is an “upwards” operation where the low bits affect the
high bits but the high bits don’t get to affect the low ones.
This creates some biases.
So you’d want to mix the high bits into the lower ones to reduce the bias.
This commonly achieved via an xor-shift: x ^= x >> 32;.
A bitwise rotation bringing the high bits down also works,
which is what I opted for.

Initially I was mixing it into h[i] but I noticed that mixing it into the next
index sped things up noticeably.
Probably due breaking up some dependency and allowing for better
instruction-level-parallelism (ILP).

Speaking of ILP, with compiler optimization, the inner loop should get unrolled,
which would open up more avenue for utilizing ILP better.
This is similar principle to how xxhash achieves it’s high speed.

The way it loads the 8 bytes is also important.
The correct way is to load via shift+or:

static inline uint64_t
chibihash64__load64le(const uint8_t *p)
{
    return (uint64_t)p[0]

This is free of any UB, works on any alignment and on any machine regardless of
it’s endianness.
It’s also fast, gcc and clang recognize this pattern and optimize it into
a single mov instruction on x86 targets.

h[0] += ((uint64_t)len > 32);
if (l & 1) {
    h[0] ^= k[0];
    --l, ++k;
}
h[0] *= P2; h[0] ^= h[0] >> 31;

Now we mix in the length along with a byte if the remaining length is odd
(you’ll see why in a minute).

for (int i = 1; l >= 8; l -= 8, k += 8, ++i) {
    h[i] ^= chibihash64__load64le(k);
    h[i] *= P2; h[i] ^= h[i] >> 31;
}

Keep chopping of 8 bytes until there’s less than that remaining.

for (int i = 0; l > 0; l -= 2, k += 2, ++i) {
    h[i] ^= (k[0] | ((uint64_t)k[1] > 31;
}

Now we chop off the last remaining bytes, 2 bytes at a time.
Since we mixed in a byte in case of an odd length already, the remaining length
now must be even, thus it’s safe to read 2 bytes at a time.
This gives a decent dump in small string performance compared to doing it one
byte at a time.

uint64_t x = seed;
x ^= h[0] * ((h[2] >> 32)|1);
x ^= h[1] * ((h[3] >> 32)|1);
x ^= h[2] * ((h[0] >> 32)|1);
x ^= h[3] * ((h[1] >> 32)|1);

Now we combine all the state into a single 64 bit value.
I’m not sure if this is a good way to do it, but it seems to work fine.

x ^= x >> 27; x *= UINT64_C(0x3C79AC492BA7B653);
x ^= x >> 33; x *= UINT64_C(0x1C69B3F74AC4AE35);
x ^= x >> 27;

And finally mix it in well for some good avalancing behavior.
The finisher used here is a direct copy of moremur, which should be a good
finisher with decent speed.

Mathemetical foundation

There are none.
Everything here is “empirically derived”
(I kept making changes until the tests passed).

When NOT to use

The introduction should make it clear on why you’d want to use this.
Here are some reasons to avoid using this:

For cryptographic purposes.
For protecting against collision attacks on hash-tables
(SipHash is the recommended one for this purpose).
When you need very strong probability against collisions: ChibiHash does very
minimal amount of mixing compared to other hashes (e.g xxhash64). And so
chances of collision should in theory be higher.

Conclusion

I started writing this because all the 64 bit hash functions I came across were
either too slow (FNV-1a, one byte at a time processing), or too large spanning
hundreds of lines of code, or non-portable due to using hardware specific
instructions.
Being small and portable, the goal is to be able to use ChibiHash as a good
“default” for non-cryptographic 64-bit hashing needs.

Tags: [ c, hashing ]

Source Link

Support Techcratic

If you find value in Techcratic’s insights and articles, consider supporting us with Bitcoin. Your support helps me, as a solo operator, continue delivering high-quality content while managing all the technical aspects, from server maintenance to blog writing, future updates, and improvements. Support Innovation! Thank you.

Bitcoin Address:

bc1qlszw7elx2qahjwvaryh0tkgg8y68enw30gpvge

Please verify this address before sending funds.

Bitcoin QR Code

Simply scan the QR code below to support Techcratic.

Please read the Privacy and Security Disclaimer on how Techcratic handles your support.

Disclaimer: As an Amazon Associate, Techcratic may earn from qualifying purchases.

Tags: HACKER NEWS

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Small, Fast 64 bit hash function

Design overview

Mathemetical foundation

When NOT to use

Conclusion

Support Techcratic

Bitcoin QR Code

Garmin Dash Cam 47, 1080p and 140-degree FOV, Monitor Your Vehicle While Away w/ New…

Best free board game deal: Target Circle members can buy two board games and get one free

Related Posts

Leave a Reply Cancel reply

Tech News

Tech News

Tech News

Tech News​

Site Links

Tech News