Lumiera
0.pre.03
»edit your freedom«
|
Definition at line 49 of file hash-generator-test.cpp.
Private Types | |
typedef boost::hash< string > | BoostStringHasher |
typedef std::map< size_t, string > | StringsTable |
Private Member Functions | |
void | demonstrate_boost_hash_weakness () |
virtual void | run (Arg) |
void | verify_Knuth_workaround () |
|
inlineprivate |
When hashing just the plain string representation of integers, we get collisions already with small numbers below 100000. This is counter-intuitive, as the generated hash values are 17 digits long and could span much wider scale.
This problem is especially dangerous when storing objects keyed by a string-id, which is generated from running numbers.
Definition at line 76 of file hash-generator-test.cpp.
|
inlineprivate |
The boost::hash function is known to perform poorly on strings with common prefix plus running number. The mentioned trick (attributed to Donald Knuth) is spread the input numbers by something below the full domain, best close to the golden ratio; bonus points if this number is also a prime. An additional factor of 2 does not hurt (so in case of 64bit platform).
In our case, it is sufficient to apply this trick to the trailing four digits; without this trick, we get the first collisions after about 20000 running numbers.
Definition at line 124 of file hash-generator-test.cpp.
References lib::idi::anonymous_namespace{entry-id.hpp}::KNUTH_MAGIC, and lib::rani().