dice-group/dice-hash

dice-hash: A Hashing framework

dice-hash provides a framework to generate stable hashes. It provides state-of-the-art hash functions, supports STL containers out of the box, and helps you define stable hashes for your own structs and classes.

🔋 batteries included: dice-hash defines policies to support different hash algorithms. It comes with predefined policies for three state-of-the-art hash functions:

These three, additional, general purpose hash functions are also (optionally) provided

📦 STL out of the box: dice-hash already supports many common STL types: arithmetic types like bool, int, double, etc.; collections like std::unordered_map/set, std::map/set, std::vector, std::tuple, std::pair, std::optional, std::variant, and std::array; and all combinations of them.

🔩 extensible: dice-hash provides helper functions to define hashes for your own classes. Check out usage.

Requirements

  • A C++20 compatible compiler. Code was only tested on x86_64.
  • If you want to use Blake2b, Blake2Xb or LtHash: libsodium, either via conan or as a local system installation (for more details, scroll down to "Usage for general data hashing")

Include it into your projects

CMake

conan

To use it with conan you need to add the repository:

conan remote add dice-group https://conan.dice-research.org/artifactory/api/conan/tentris

To use it, add dice-hash/0.4.9 to the [requires] section of your conan file.

You can now add it to your target with:

target_link_libraries(your_target
        dice-hash::dice-hash
        )

build and run tests

# get it
git clone https://github.com/dice-group/dice-hash.git
cd dice-hash
# build it (conan_provider.cmake lets CMake fetch the dependencies via conan)
wget https://github.com/conan-io/cmake-conan/raw/develop2/conan_provider.cmake -O conan_provider.cmake
mkdir build
cd build
cmake -DBUILD_TESTING=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_PROJECT_TOP_LEVEL_INCLUDES=conan_provider.cmake ..
make -j tests_dice_hash
# run the tests
./test/tests_dice_hash

Note: This example uses conan as the dependency provider; other providers are possible. See https://cmake.org/cmake/help/latest/guide/using-dependencies/index.html#dependency-providers

Usage for C++ container hashing

You need to include a single header:

#include <dice/hash.hpp>

Hashes are already defined for many common types. For those types, you can use DiceHash just like std::hash, which means the hashes are returned as size_t. If you need larger hashes, skip to the section "Usage for general data hashing" below.

dice::hash::DiceHash<int> hash;
hash(42);

basicUsage is a runnable example for this use case.

If you need DiceHash to be able to work on your own types, you can specialize the dice::hash::dice_hash_overload template:

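#include <cstddef>
#include <dice/hash.hpp>

struct YourType{};
namespace dice::hash {
    // Specialization that makes DiceHash work on YourType, for any Policy.
    template <typename Policy>
    struct dice_hash_overload<Policy, YourType> {
        static std::size_t dice_hash(YourType const& x) noexcept {
            return 42; // placeholder: compute the hash from the members of x here
        }
    };
}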

Here is a compilable example.

If you want to combine the hashes of two or more objects, you can use the hash_combine or hash_invertible_combine function. These are part of the Policy; however, they can be called via the DiceHash object. An example can be seen here.
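To illustrate the difference between an order-sensitive and an invertible combine, here is a minimal, self-contained sketch. The names sketch_hash_combine and sketch_hash_invertible_combine are hypothetical stand-ins, and the mixing steps (a Boost-style fold and a plain XOR) are assumptions chosen for illustration; they are not taken from the dice-hash policies.

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical stand-ins for illustration only; NOT the dice-hash implementation.

// Order-sensitive combine: folds the hashes left to right, so swapping
// two inputs generally changes the result.
inline std::size_t sketch_hash_combine(std::initializer_list<std::size_t> hashes) noexcept {
    std::size_t seed = 0;
    for (std::size_t h : hashes) {
        seed ^= h + 0x9e3779b9u + (seed << 6) + (seed >> 2);
    }
    return seed;
}

// Invertible combine: XOR is commutative and its own inverse, so the input
// order does not matter and combining with the same hash again removes it.
inline std::size_t sketch_hash_invertible_combine(std::initializer_list<std::size_t> hashes) noexcept {
    std::size_t result = 0;
    for (std::size_t h : hashes) {
        result ^= h;
    }
    return result;
}
```

An order-sensitive combine suits sequences such as the members of a struct, while an invertible one suits cases where entries have no fixed order or need to be removed from the combined hash later.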

If your own type is a container type, there is an easier and faster way to define its hash. dice-hash provides two type traits, is_ordered_container and is_unordered_container. You just need to set the matching type trait for your own type, and the hash will automatically loop over the entries and hash them.

struct YourOwnOrderedContainer{...};
namespace dice::hash {
    template<> struct is_ordered_container<YourOwnOrderedContainer> : std::true_type {};
}

Now you can use DiceHash with your container.

Note, however, that your container needs to provide begin, end and size functions. One simple example can be found here.
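The opt-in mechanism described above can be sketched without the library. Everything below lives in a hypothetical namespace sketch, and IntBag, hash_value and the mixing step are assumptions made for illustration; only the idea (a trait that defaults to false, and a generic hash that loops over begin/end once the trait is set) mirrors the text.

```cpp
#include <cstddef>
#include <functional>
#include <type_traits>
#include <vector>

namespace sketch {  // hypothetical namespace, not part of dice-hash

// Opt-in trait: false unless specialized (plays the role of is_ordered_container).
template <typename T>
struct is_ordered_container : std::false_type {};

// A minimal user-defined container exposing begin, end and size.
struct IntBag {
    std::vector<int> items;
    auto begin() const noexcept { return items.begin(); }
    auto end() const noexcept { return items.end(); }
    std::size_t size() const noexcept { return items.size(); }
};

// Opting in: the only line the container author has to write.
template <>
struct is_ordered_container<IntBag> : std::true_type {};

// Generic hash: containers that opted in are hashed entry by entry, in
// iteration order; everything else falls back to std::hash.
template <typename T>
std::size_t hash_value(T const &value) noexcept {
    if constexpr (is_ordered_container<T>::value) {
        std::size_t seed = value.size();
        for (auto const &entry : value) {
            seed ^= hash_value(entry) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
        }
        return seed;
    } else {
        return std::hash<T>{}(value);
    }
}

}  // namespace sketch
```

Because the loop only relies on begin, end and size, any type offering those three functions can opt in with a single trait specialization.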

If you want to use DiceHash in a different structure (like std::unordered_map), you will need to set DiceHash as the correct template parameter. This is one example.
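For std::unordered_map, the hasher is the third template parameter. The sketch below shows where it goes, using a hypothetical stand-in functor SketchHash so that it compiles without the library; with dice-hash installed, the assumption is that dice::hash::DiceHash<Key> would take its place. PriceTable and make_prices are likewise made up for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a DiceHash-like functor: default-constructible,
// callable with the key type, returning std::size_t.
template <typename T>
struct SketchHash {
    std::size_t operator()(T const &value) const noexcept {
        return std::hash<T>{}(value);
    }
};

// std::hash has no specialization for std::pair, so a pair key needs a custom hasher.
template <typename A, typename B>
struct SketchHash<std::pair<A, B>> {
    std::size_t operator()(std::pair<A, B> const &p) const noexcept {
        std::size_t seed = SketchHash<A>{}(p.first);
        seed ^= SketchHash<B>{}(p.second) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
        return seed;
    }
};

using Key = std::pair<int, std::string>;

// The hasher is passed as the third template parameter.
// Assumption: with dice-hash this would read dice::hash::DiceHash<Key>.
using PriceTable = std::unordered_map<Key, double, SketchHash<Key>>;

inline PriceTable make_prices() {
    PriceTable prices;
    prices[{1, "apple"}] = 0.5;
    prices[{2, "banana"}] = 0.25;
    return prices;
}
```

The same pattern applies to std::unordered_set, where the hasher is the second template parameter.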

Usage for general data hashing

The hash functions mentioned in this section are enabled/disabled using the feature flag WITH_SODIUM=ON/OFF. Enabling this flag (default behaviour) results in libsodium being required as a dependency. If using conan, libsodium will be fetched using conan, otherwise dice-hash will look for a local system installation.

The hashes mentioned here are not meant to be used in C++ containers as they do not return size_t. They are instead meant as general hashing functions for arbitrary data.

Blake2b - "fast secure hashing" (with output sizes from 16 bytes up to 64 bytes)

"BLAKE2 is a cryptographic hash function faster than MD5, SHA-1, SHA-2, and SHA-3, yet is at least as secure as the latest standard SHA-3."

To use it you need to include

#include <dice/hash/blake2/Blake2b.hpp>

For a usage example, see examples/blake2b.cpp.

Blake2Xb - arbitrary length hashing based on Blake2b

Blake2Xb is a hash function that produces hashes of arbitrary length.

To use it you need to include

#include <dice/hash/blake2/Blake2xb.hpp>

For a usage example, see examples/blake2xb.cpp.

Blake3 - one function, fast everywhere

Blake3 is an evolution of Blake2.

To use it you need to include

#include <dice/hash/blake2/Blake3.hpp>

For a usage example, see examples/blake3.cpp.

LtHash - homomorphic/multiset hashing

LtHash is a multiset/homomorphic hash function, meaning, instead of working on streams of data, it digests individual "objects". This means you can add and remove "objects" to/from an LtHash (object by object) as if it were a multiset and then read the hash that would result from hashing that multiset.

Small non-code example that shows the basic principle:

LtHash({apple}) + LtHash({banana}) - LtHash({peach}) + LtHash({banana}) = LtHash({apple^1, banana^2, peach^-1})
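The algebra behind this principle can be mimicked with a toy multiset hash. The class ToyMultisetHash below is a hypothetical illustration that simply adds and subtracts per-object std::hash values modulo 2^64; it is neither LtHash nor cryptographically secure, and it does not use the dice-hash API.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Toy illustration of multiset hashing; NOT LtHash and NOT secure.
class ToyMultisetHash {
    std::uint64_t state_ = 0;  // running sum of object hashes, wraps modulo 2^64

    static std::uint64_t object_hash(std::string const &object) noexcept {
        return static_cast<std::uint64_t>(std::hash<std::string>{}(object));
    }

public:
    // Adding an object increases its multiplicity by one.
    void add(std::string const &object) noexcept { state_ += object_hash(object); }

    // Removing an object decreases its multiplicity by one (it may go negative).
    void remove(std::string const &object) noexcept { state_ -= object_hash(object); }

    // Merging two hashes corresponds to summing the two multisets.
    void merge(ToyMultisetHash const &other) noexcept { state_ += other.state_; }

    std::uint64_t digest() const noexcept { return state_; }
};
```

Since unsigned addition is commutative and associative, the digest depends only on how often each object was added or removed, not on the order of the operations.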

To use it you need to include

#include <dice/hash/lthash/LtHash.hpp>

For a usage example see examples/ltHash.cpp.