Choosing the right hash algorithm, and in particular tuning its parameters to fit the entropy of the data, the security needs, and the performance requirements, is an important part of configuring an Airlock IAM system.
Secrets that a server has to check should be stored as salted hash values rather than as plaintext. Other approaches exist, such as storing secrets or hashes in HSMs or distributed hash databases, but they are outside the scope of this article.
There are two goals that hashing of secrets can achieve:
A hash function maps strings of arbitrary size to strings of a fixed size. We are interested in so-called cryptographic hash functions, which are designed to be one-way: they are infeasible to invert, meaning there is no practical way to find an input that leads to a given output other than trying all inputs. Ideally, the function should behave like a random function, which in particular implies that it is difficult to find collisions.
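The fixed-size property can be illustrated with a minimal sketch. SHA-256 from Python's standard library is used here only as an example of a cryptographic hash function; the helper name `sha256_hex` is introduced for illustration and is not part of Airlock IAM.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Inputs of very different sizes map to digests of the same size.
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

print(len(short_digest), len(long_digest))  # both are 64 hex characters (256 bits)
```

The digest length is the same whether the input is one byte or one megabyte, and nothing in the digest reveals the input length.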
We distinguish between three classes of these functions:
Goal (2) is fulfilled by any hash function. The rest of this evaluation therefore concentrates on goal (1).
If the secret to protect has so much entropy that it is infeasible even to list all probable values, then the sheer number of possibilities already protects against attacks of type (1). This allows efficient automated handling of technical keys, e.g. those used with OAuth or OIDC. Fast hash algorithms are well suited to this scenario.
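As a hedged sketch of this scenario, the following Python code generates a random 256-bit token, stores only its SHA-256 hash, and verifies a presented token in constant time. The function names and the choice of SHA-256 are assumptions made for illustration; they do not describe how Airlock IAM handles technical keys internally.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def new_token() -> Tuple[str, str]:
    """Create a high-entropy token and the hash value the server would store."""
    token = secrets.token_urlsafe(32)  # 32 random bytes = 256 bits of entropy
    stored_hash = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return token, stored_hash


def verify_token(presented: str, stored_hash: str) -> bool:
    """Hash the presented token and compare it with the stored hash in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

Because the token itself is drawn from a space of 2^256 values, a single fast hash is sufficient here; the cost of an attack comes from the size of the input space, not from the cost of one hash calculation.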
If the secret has too little entropy, then no hashing method will protect that specific secret against an inversion attack (1): an attacker can simply try all possible inputs. In particular, hashing offers no protection for a badly chosen password.
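To make this concrete, the sketch below inverts the salted hash of a four-digit PIN by trying all 10,000 candidates. The PIN format, the salting scheme (salt prepended to the PIN), and the function names are hypothetical and chosen only for illustration.

```python
import hashlib
from typing import Optional


def hash_pin(pin: str, salt: bytes) -> bytes:
    """Salted SHA-256 of a PIN (illustrative scheme, not a recommendation)."""
    return hashlib.sha256(salt + pin.encode("ascii")).digest()


def brute_force_pin(target: bytes, salt: bytes) -> Optional[str]:
    """Try every four-digit PIN until one hashes to the target value."""
    for i in range(10_000):
        candidate = f"{i:04d}"
        if hash_pin(candidate, salt) == target:
            return candidate
    return None
```

The salt does not help in this case: an attacker who has stolen the hash normally also has the salt, and the input space is small enough to enumerate completely. A costlier hash function only multiplies the total effort by a constant factor.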
We now consider secrets of low to intermediate entropy. These are typically passwords and the like. Attacks of type (1) in which all hashes are inverted and published are particularly severe, as users tend to use the same passwords for multiple accounts. To protect against such attacks, a costly hash function has to be tuned so that it requires as much computational effort as is acceptable for the user and the service provider, while making attacks as costly as possible. The authors of Scrypt propose that for password hashing, the parameters should be chosen so that a hash calculation takes approximately 100 ms on a single CPU core [1]. Airlock IAM uses the recommended Scrypt parameters by default. Depending on the application scenario, a higher or lower effort may be acceptable or reasonable.
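The tuning process can be sketched with `hashlib.scrypt` from the Python standard library (available when Python is built against OpenSSL 1.1 or later). The parameter values N = 2^14, r = 8, p = 1 are the ones commonly cited for interactive logins and are used here as an assumption; they are not taken from the Airlock IAM configuration, and the function names are hypothetical.

```python
import hashlib
import hmac
import os
import time
from typing import Optional, Tuple

# Assumed cost parameters for illustration; tune them on the target hardware.
N, R, P = 2**14, 8, 1
MAXMEM = 64 * 1024 * 1024  # allow up to 64 MiB; these parameters need about 16 MiB


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a password, generating a random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=N, r=R, p=P, maxmem=MAXMEM, dklen=32
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


def time_hash_ms() -> float:
    """Measure how long one hash calculation takes on this machine, in milliseconds."""
    start = time.perf_counter()
    hash_password("benchmark-password")
    return (time.perf_counter() - start) * 1000.0
```

Running `time_hash_ms()` on the production hardware gives a measurement to compare against the 100 ms target. If the result is far below the target, N can be doubled (it must remain a power of two) and the measurement repeated; note that the required memory grows with N and r, so `MAXMEM` may need to be raised as well.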
Recommended hashing functions for specific scenarios
Matrix cards typically have low entropy. Inverting a specific value is feasible, regardless of the hashing algorithm. For this reason, a service provider has to lock all matrix cards if the hashed values are stolen. Since matrix cards are not reused for other services, an attacker gains nothing from inverting and publishing the whole database of secrets.
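The effect of low entropy can be estimated with simple arithmetic: the number of candidates multiplied by the cost of one hash calculation gives the worst-case time for an exhaustive search. The sketch below assumes a hypothetical matrix card cell of four decimal digits and a hash cost of 100 ms; both figures are illustrative assumptions, not properties of any particular product.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random string over the given alphabet."""
    return length * math.log2(alphabet_size)


def exhaustive_search_seconds(
    alphabet_size: int, length: int, seconds_per_hash: float
) -> float:
    """Worst-case time to try every candidate on a single core."""
    return (alphabet_size ** length) * seconds_per_hash


# Hypothetical matrix card cell: 4 decimal digits, 100 ms per hash calculation.
cell_bits = entropy_bits(10, 4)                         # roughly 13.3 bits
cell_seconds = exhaustive_search_seconds(10, 4, 0.1)    # roughly 1000 seconds
```

Even with a costly hash function, a value of this size is recovered in well under an hour on a single core, which is why stolen hashes require the cards to be locked rather than relying on the hash function for protection.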
References