summaryrefslogtreecommitdiff
path: root/triehash/README.md
diff options
context:
space:
mode:
Diffstat (limited to 'triehash/README.md')
-rw-r--r--triehash/README.md58
1 files changed, 58 insertions, 0 deletions
diff --git a/triehash/README.md b/triehash/README.md
new file mode 100644
index 000000000..d6f6e7fc3
--- /dev/null
+++ b/triehash/README.md
@@ -0,0 +1,58 @@
+# Order-preserving minimal perfect hash function generator
+
+Build order-preserving minimal perfect hash functions.
+
+[![codecov](https://codecov.io/gh/julian-klode/triehash/branch/master/graph/badge.svg)](https://codecov.io/gh/julian-klode/triehash)
+[![Build Status](https://travis-ci.org/julian-klode/triehash.svg?branch=master)](https://travis-ci.org/julian-klode/triehash)
+
+## Performance
+
+Performance was evaluated against other hash functions. As an input set, the
+fields of Debian Packages and Sources files was used, and each hash function
+was run 1,000,000 times for each word. The byte count of the words were then
+summed up and divided by the total number of nanoseconds each function ran, so
+all speeds below are given in bytes per nanosecond, AKA gigabyte per second.
+
+arch/function|jak-x230 (amd64)|backup (amd64)|asachi.d.o (arm64)|asachi.d.o (armel)|asachi.d.o (armhf)|plummer.d.o (ppc64el)|eller.d.o (mipsel)
+-------------|----------------|--------------|------------------|------------------|------------------|---------------------|------------------
+Trie | 2.4 | 1.9 | 1.2 | 0.9 | 0.8 | 2.0 | 0.2
+Trie (*) | 2.2 | 1.7 | 0.8 | 0.7 | 0.7 | 1.8 | 0.2
+re2c | 1.7 | 1.3 | 0.9 | 0.9 | 0.7 | 1.6 | 0.2
+re2c (*) | 1.2 | 0.9 | 0.6 | 0.6 | 0.5 | 1.1 | 0.1
+gperf (*) | 0.7 | 0.5 | 0.2 | 0.2 | 0.2 | 0.5 | 0.1
+gperf | 1.3 | 0.9 | 0.3 | 0.3 | 0.2 | 0.4 | 0.1
+djb (*) | 0.7 | 0.5 | 0.3 | 0.3 | 0.3 | 0.5 | 0.1
+djb (**) | 1.0 | 0.7 | 0.4 | 0.5 | 0.5 | 0.6 | 0.2
+djb | 0.9 | 0.7 | 0.5 | 0.5 | 0.5 | 0.7 | 0.2
+apt (*) | 1.2 | 0.9 | 0.7 | 0.7 | 0.7 | 1.1 | 0.2
+apt (**) | 2.3 | 1.7 | 0.7 | 0.9 | 0.8 | 1.9 | 0.2
+
+And transposed:
+
+function/arch |Trie |Trie (*) |re2c |re2c (*) |gperf (*)|gperf |djb (*) |djb (**) |djb |apt (*) |apt (**)
+---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------
+jak-x230 (amd64) | 2.4| 2.2| 1.7| 1.2| 0.7| 1.3| 0.7| 1.0| 0.9| 1.2| 2.3
+backup (amd64) | 1.9| 1.7| 1.3| 0.9| 0.5| 0.9| 0.5| 0.7| 0.7| 0.9| 1.7
+asachi.d.o (arm64) | 1.2| 0.8| 0.9| 0.6| 0.2| 0.3| 0.3| 0.4| 0.5| 0.7| 0.7
+asachi.d.o (armel) | 0.9| 0.7| 0.9| 0.6| 0.2| 0.3| 0.3| 0.5| 0.5| 0.7| 0.9
+asachi.d.o (armhf) | 0.8| 0.7| 0.7| 0.5| 0.2| 0.2| 0.3| 0.5| 0.5| 0.7| 0.8
+plummer.d.o (ppc64el)| 2.0| 1.8| 1.6| 1.1| 0.5| 0.4| 0.5| 0.6| 0.7| 1.1| 1.9
+eller.d.o (mipsel) | 0.2| 0.2| 0.2| 0.1| 0.1| 0.1| 0.1| 0.2| 0.2| 0.2| 0.2
+
+
+Legend:
+
+* The (*) variants are case-insensitive, (**) are more optimised versions
+ of the (*) versions.
+* DJB (*) is a DJB Hash with naive lowercase conversion, DJB (**) just ORs one
+ bit into each value to get alphabetical characters to be lowercase
+* APT (*) is the AlphaHash function from APT which hashes the last 8 bytes in a
+ word in a case-insensitive manner. APT (**) is the same function unrolled.
+* All hosts except the x230 are Debian porterboxes. The x230 has a Core i5-3320M,
+ barriere has an Opteron 23xx.
+
+Notes:
+
+* The overhead is larger than needed on some platforms due to gcc inserting
+ unneeded zero extend instructions, see:
+ https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77729