Skip to content

Hash Functions

Hash Function Implementations

This module provides various hash function implementations that can be used as alternatives to Python's built-in hash() function. These are commonly used in hash tables, hash maps, and other data structures requiring hash-based operations.

Hash functions convert input data of arbitrary size to fixed-size values.

Good hash functions should:

  • Be deterministic (same input always produces same output)
  • Distribute values uniformly across the hash space
  • Be fast to compute
  • Minimize collisions

Included algorithms:

  • Polynomial Hash / Polynomial Rolling Hash (for strings)
  • FNV-1a Hash
  • Simple Multiplicative Hash
  • Jenkins One-at-a-Time Hash
  • DJB2 Hash
  • SDBM Hash
  • MurmurHash3 (simplified version)

CompositeHash(hash_functions, combination_method='xor')

Bases: HashFunction

Composite hash function that combines multiple hash functions.

Uses multiple hash functions and combines their results to potentially improve distribution and reduce collisions.

Initialize composite hash function.

Parameters:

Name Type Description Default
hash_functions list[HashFunction]

List of hash functions to combine

required
combination_method str

How to combine results ('xor', 'add', 'multiply')

'xor'

hash(key)

Compute composite hash.

DJB2Hash

Bases: HashFunction

DJB2 hash function by Dan J. Bernstein.

A simple and fast hash function that works well in practice. Formula: hash = hash * 33 + c (where c is each character)

hash(key)

Compute DJB2 hash.

FNVHash(use_64bit=False)

Bases: HashFunction

FNV-1a (Fowler-Noll-Vo) hash function.

A fast, non-cryptographic hash function with good distribution properties. The 1a variant processes bytes in a different order than FNV-1.

Initialize FNV hash function.

Parameters:

Name Type Description Default
use_64bit bool

Whether to use 64-bit or 32-bit variant

False

hash(key)

Compute FNV-1a hash.

HashFunction

Base class for hash functions.

hash(key)

Compute hash value for the given key.

Parameters:

Name Type Description Default
key Any

Key to hash

required

Returns:

Type Description
int

Hash value as integer

JenkinsHash

Bases: HashFunction

Jenkins One-at-a-Time hash function.

A simple but effective hash function by Bob Jenkins. Good for short keys and has reasonable distribution properties.

hash(key)

Compute Jenkins one-at-a-time hash.

MultiplicativeHash(multiplier=2654435761, word_size=32)

Bases: HashFunction

Simple multiplicative hash function.

Uses the formula: hash = (key * A) mod 2^w where A is a carefully chosen constant and w is the word size.

Initialize multiplicative hash function.

Parameters:

Name Type Description Default
multiplier int

Multiplication constant (default is golden ratio * 2^32)

2654435761
word_size int

Word size in bits

32

hash(key)

Compute multiplicative hash.

MurmurHash3(seed=0)

Bases: HashFunction

Simplified MurmurHash3 implementation.

MurmurHash3 is a fast, non-cryptographic hash function with excellent distribution properties. This is a simplified 32-bit version.

Initialize MurmurHash3.

Parameters:

Name Type Description Default
seed int

Seed value for the hash function

0

hash(key)

Compute MurmurHash3.

PolynomialHash(base=31, modulus=2 ** 31 - 1)

Bases: HashFunction

Polynomial rolling hash function.

Uses the formula: hash = (a₁ * p^(n-1) + a₂ * p^(n-2) + ... + aₙ) mod m where p is a prime number and m is a large prime modulus.

This is particularly effective for string hashing and supports rolling hash operations for sliding window algorithms.

Initialize polynomial hash function.

Parameters:

Name Type Description Default
base int

Base value (should be prime, typically 31 or 37)

31
modulus int

Modulus value (should be large prime)

2 ** 31 - 1

hash(key)

Compute polynomial hash.

rolling_hash(old_char, new_char, old_hash, window_size)

Compute rolling hash for sliding window.

Parameters:

Name Type Description Default
old_char str

Character being removed from window

required
new_char str

Character being added to window

required
old_hash int

Previous hash value

required
window_size int

Size of the sliding window

required

Returns:

Type Description
int

New hash value

SDBMHash

Bases: HashFunction

SDBM hash function.

Used in the SDBM database. Simple and effective for many applications. Formula: hash = hash * 65599 + c

hash(key)

Compute SDBM hash.

analyze_hash_distribution(hash_func, keys, num_buckets)

Analyze hash distribution across buckets.

Parameters:

Name Type Description Default
hash_func HashFunction

Hash function to test

required
keys list

List of keys to hash

required
num_buckets int

Number of buckets to distribute across

required

Returns:

Type Description
dict

Dictionary with distribution statistics

benchmark_hash_functions(keys, hash_functions, iterations=1)

Benchmark multiple hash functions for speed.

Parameters:

Name Type Description Default
keys list

List of keys to hash

required
hash_functions list[tuple[str, HashFunction]]

List of (name, hash_function) tuples

required
iterations int

Number of iterations to run

1

Returns:

Type Description
dict

Dictionary with timing results

create_hash_function(algorithm='fnv', **kwargs)

Factory function to create hash function instances.

Available algorithms:

algorithms = {
    'polynomial': PolynomialHash,
    'poly': PolynomialHash,
    'fnv': FNVHash,
    'fnv1a': FNVHash,
    'multiplicative': MultiplicativeHash,
    'mult': MultiplicativeHash,
    'jenkins': JenkinsHash,
    'djb2': DJB2Hash,
    'sdbm': SDBMHash,
    'murmur': MurmurHash3,
    'murmur3': MurmurHash3,
}

Parameters:

Name Type Description Default
algorithm str

Name of the hash algorithm

'fnv'
**kwargs

Additional arguments for the hash function

{}

Returns:

Type Description
HashFunction

Hash function instance

Raises:

Type Description
ValueError

If algorithm is not recognized