Hash Functions
Hash Function Implementations
This module provides various hash function implementations that can be used as alternatives to Python's built-in hash() function. These are commonly used in hash tables, hash maps, and other data structures requiring hash-based operations.
Hash functions convert input data of arbitrary size to fixed-size values.
Good hash functions should:
- Be deterministic (same input always produces same output)
- Distribute values uniformly across the hash space
- Be fast to compute
- Minimize collisions
Included algorithms:
- Polynomial Hash / Polynomial Rolling Hash (for strings)
- FNV-1a Hash
- Simple Multiplicative Hash
- Jenkins One-at-a-Time Hash
- DJB2 Hash
- SDBM Hash
- MurmurHash3 (simplified version)
CompositeHash(hash_functions, combination_method='xor')
Bases: HashFunction
Composite hash function that combines multiple hash functions.
Uses multiple hash functions and combines their results to potentially improve distribution and reduce collisions.
Initialize composite hash function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
hash_functions
|
list[HashFunction]
|
List of hash functions to combine |
required |
combination_method
|
str
|
How to combine results ('xor', 'add', 'multiply') |
'xor'
|
hash(key)
Compute composite hash.
DJB2Hash
Bases: HashFunction
DJB2 hash function by Dan J. Bernstein.
A simple and fast hash function that works well in practice. Formula: hash = hash * 33 + c (where c is each character)
hash(key)
Compute DJB2 hash.
FNVHash(use_64bit=False)
Bases: HashFunction
FNV-1a (Fowler-Noll-Vo) hash function.
A fast, non-cryptographic hash function with good distribution properties. The 1a variant processes bytes in a different order than FNV-1.
Initialize FNV hash function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
use_64bit
|
bool
|
Whether to use 64-bit or 32-bit variant |
False
|
hash(key)
Compute FNV-1a hash.
HashFunction
Base class for hash functions.
hash(key)
Compute hash value for the given key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
key
|
Any
|
Key to hash |
required |
Returns:
| Type | Description |
|---|---|
int
|
Hash value as integer |
JenkinsHash
Bases: HashFunction
Jenkins One-at-a-Time hash function.
A simple but effective hash function by Bob Jenkins. Good for short keys and has reasonable distribution properties.
hash(key)
Compute Jenkins one-at-a-time hash.
MultiplicativeHash(multiplier=2654435761, word_size=32)
Bases: HashFunction
Simple multiplicative hash function.
Uses the formula: hash = (key * A) mod 2^w where A is a carefully chosen constant and w is the word size.
Initialize multiplicative hash function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
multiplier
|
int
|
Multiplication constant (default is golden ratio * 2^32) |
2654435761
|
word_size
|
int
|
Word size in bits |
32
|
hash(key)
Compute multiplicative hash.
MurmurHash3(seed=0)
Bases: HashFunction
Simplified MurmurHash3 implementation.
MurmurHash3 is a fast, non-cryptographic hash function with excellent distribution properties. This is a simplified 32-bit version.
Initialize MurmurHash3.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
seed
|
int
|
Seed value for the hash function |
0
|
hash(key)
Compute MurmurHash3.
PolynomialHash(base=31, modulus=2 ** 31 - 1)
Bases: HashFunction
Polynomial rolling hash function.
Uses the formula: hash = (a₁ * p^(n-1) + a₂ * p^(n-2) + ... + aₙ) mod m where p is a prime number and m is a large prime modulus.
This is particularly effective for string hashing and supports rolling hash operations for sliding window algorithms.
Initialize polynomial hash function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
base
|
int
|
Base value (should be prime, typically 31 or 37) |
31
|
modulus
|
int
|
Modulus value (should be large prime) |
2 ** 31 - 1
|
hash(key)
Compute polynomial hash.
rolling_hash(old_char, new_char, old_hash, window_size)
Compute rolling hash for sliding window.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
old_char
|
str
|
Character being removed from window |
required |
new_char
|
str
|
Character being added to window |
required |
old_hash
|
int
|
Previous hash value |
required |
window_size
|
int
|
Size of the sliding window |
required |
Returns:
| Type | Description |
|---|---|
int
|
New hash value |
SDBMHash
Bases: HashFunction
SDBM hash function.
Used in the SDBM database. Simple and effective for many applications. Formula: hash = hash * 65599 + c
hash(key)
Compute SDBM hash.
analyze_hash_distribution(hash_func, keys, num_buckets)
Analyze hash distribution across buckets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
hash_func
|
HashFunction
|
Hash function to test |
required |
keys
|
list
|
List of keys to hash |
required |
num_buckets
|
int
|
Number of buckets to distribute across |
required |
Returns:
| Type | Description |
|---|---|
dict
|
Dictionary with distribution statistics |
benchmark_hash_functions(keys, hash_functions, iterations=1)
Benchmark multiple hash functions for speed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
keys
|
list
|
List of keys to hash |
required |
hash_functions
|
list[tuple[str, HashFunction]]
|
List of (name, hash_function) tuples |
required |
iterations
|
int
|
Number of iterations to run |
1
|
Returns:
| Type | Description |
|---|---|
dict
|
Dictionary with timing results |
create_hash_function(algorithm='fnv', **kwargs)
Factory function to create hash function instances.
Available algorithms:
algorithms = {
'polynomial': PolynomialHash,
'poly': PolynomialHash,
'fnv': FNVHash,
'fnv1a': FNVHash,
'multiplicative': MultiplicativeHash,
'mult': MultiplicativeHash,
'jenkins': JenkinsHash,
'djb2': DJB2Hash,
'sdbm': SDBMHash,
'murmur': MurmurHash3,
'murmur3': MurmurHash3,
}
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
algorithm
|
str
|
Name of the hash algorithm |
'fnv'
|
**kwargs
|
Additional arguments for the hash function |
{}
|
Returns:
| Type | Description |
|---|---|
HashFunction
|
Hash function instance |
Raises:
| Type | Description |
|---|---|
ValueError
|
If algorithm is not recognized |