Categories: containers, functors | Component type: concept |
A Hash Function is a Unary Function that is used by Hashed Associative Containers: it maps its argument to a result of type size_t. A Hash Function must be deterministic and stateless. That is, the return value must depend only on the argument, and equal arguments must yield equal results.
The performance of a Hashed Associative Container depends crucially on its hash function. It is important for a Hash Function to minimize collisions, where a collision is defined as two different arguments that hash to the same value. It is also important that the distribution of hash values be uniform; that is, the probability that a Hash Function returns any particular value of type size_t should be roughly the same as the probability that it returns any other value. [1]
Result type | The type returned when the Hash Function is called. The result type must be size_t. |
Deterministic function | The return value depends only on the argument, as opposed to the past history of the Hash Function object. The return value is always the same whenever the argument is the same. |
[1] Note that both of these requirements make sense only in the context of some specific distribution of input values. To take a simple example, suppose that the values being hashed are the six strings "aardvark", "trombone", "history", "diamond", "forthright", and "solitude". In this case, one reasonable (and efficient) hash function would simply be the first character of each string. On the other hand, suppose that the values being hashed are "aaa0001", "aaa0010", "aaa0011", "aaa0100", "aaa0101", and "aaa0110". In that case, a different hash function would be more appropriate. This is why Hashed Associative Containers are parameterized by the hash function: no one hash function is best for all applications.