After
Microsoft Visual Studio 2012 and
GCC 4.8.2, we now provide the results of our performance test suite for C++ unordered associative containers in Clang 3.4. All the tests were compiled and run by Thomas Goyne on a Mac OS X 10.9.1 box with an Intel Core i7-3720QM @2.6GHz,
Turbo Boost and
SpeedStep disabled, release settings
-O3 -std=c++11.
Clang can work with two different standard libraries,
libstdc++-v3 and
libc++: we have tried both libstdc++ 4.8.2 and libc++ 3.4.
libstdc++-v3
data:image/s3,"s3://crabby-images/bc6f7/bc6f7d203289df8a889a79d2089bfdee312685b9" alt="" |
Insertion without rehashing, unique elements. |
data:image/s3,"s3://crabby-images/3f81f/3f81fe891d2c4c5a793ae496f7b3ecb59290411e" alt="" |
Insertion with rehashing, unique elements. |
data:image/s3,"s3://crabby-images/44763/44763c1eeca191c8a70b4069116120dc007b6f25" alt="" |
Insertion without rehashing, duplicate elements, Fmax = 1, G = 5. |
data:image/s3,"s3://crabby-images/d338b/d338bdb730486ad9c66792907aca550f5e67ea57" alt="" |
Insertion without rehashing, duplicate elements, Fmax = 5, G = 5. |
data:image/s3,"s3://crabby-images/7a97b/7a97bb0e74c17c805e1e6aea123cad8f5815e67c" alt="" |
Insertion with rehashing, duplicate elements, Fmax = 1, G = 5. |
data:image/s3,"s3://crabby-images/777d2/777d2b23786b72c6f1eed5b5b7c7d5e18c6b910f" alt="" |
Insertion with rehashing, duplicate elements, Fmax = 5, G = 5. |
data:image/s3,"s3://crabby-images/08061/08061b88d73a4a571c459fd5a99ff12a9bf7ba86" alt="" |
Erasure, unique elements. |
data:image/s3,"s3://crabby-images/82705/8270558d53d371e27ed62ca30628202562464b87" alt="" |
Erasure, duplicate elements, Fmax = 1, G = 5. |
data:image/s3,"s3://crabby-images/b9f15/b9f1555558ccb5a0f2751d0936f21b80b5bb691b" alt="" |
Erasure, duplicate elements, Fmax = 5, G = 5. |
data:image/s3,"s3://crabby-images/8bed3/8bed3f71cc7d0b2acf2dc24c12c466335d0e08d4" alt="" |
Successful lookup, unique elements. |
data:image/s3,"s3://crabby-images/fe2d5/fe2d5abc90040a795ecff845f69118059eff1987" alt="" |
Unsuccessful lookup, unique elements. |
data:image/s3,"s3://crabby-images/09c14/09c1450ae32db2520686d346c1bbc0240a3cf7e8" alt="" |
Successful lookup, duplicate elements, Fmax = 1, G = 5. |
data:image/s3,"s3://crabby-images/f954e/f954ec083812b4b9ad12ff5ab749f84d7c8cdeb4" alt="" |
Unsuccessful lookup, duplicate elements, Fmax = 1, G = 5. |
data:image/s3,"s3://crabby-images/182b1/182b101600c0548a3455bd047ef1d79e2e9c30b0" alt="" |
Successful lookup, duplicate elements, Fmax = 5, G = 5. |
data:image/s3,"s3://crabby-images/d3075/d3075cb14c73f915ce6a4d63bc20f636dfeed89e" alt="" |
Unsuccessful lookup, duplicate elements, Fmax = 5, G = 5. |
As expected, the results are very similar to those obtained with GCC: the only noteworthy difference is Boost.Unordered performance in lookup with unique elements, which is somewhat poorer here —the reasons for this discrepancy are not clear to me.
libc++
data:image/s3,"s3://crabby-images/bd83e/bd83e9a59278f2c3722860c40ea64289829f4a26" alt="" |
Insertion without rehashing, unique elements. |
data:image/s3,"s3://crabby-images/5c820/5c820bb588de0100071ba4f7cf709bdf7e1e58b8" alt="" |
Insertion with rehashing, unique elements. |
data:image/s3,"s3://crabby-images/1d6cc/1d6cc5e3f620a5f194adf793fb1d7fa273ce3946" alt="" |
Insertion without rehashing, duplicate elements, Fmax = 1, G = 5. |
data:image/s3,"s3://crabby-images/14c10/14c1013f33dc5f3b93da15457a243f201b1f0cbe" alt="" |
Insertion without rehashing, duplicate elements, Fmax = 5, G = 5. |
data:image/s3,"s3://crabby-images/0bde3/0bde3ef66e200278f1133b409270e58200b4ae1d" alt="" |
Insertion with rehashing, duplicate elements, Fmax = 1, G = 5. |
data:image/s3,"s3://crabby-images/f9551/f9551b90bc030c93705441a647d1e08716375ef8" alt="" |
Insertion with rehashing, duplicate elements, Fmax = 5, G = 5. |
data:image/s3,"s3://crabby-images/b9964/b9964793352544e8cd6ff03db3f0bb706b3b6a02" alt="" |
Erasure, unique elements. |
data:image/s3,"s3://crabby-images/053bf/053bf0e8890f57a5c5efaaf81beb5a8e87280af7" alt="" |
Erasure, duplicate elements, Fmax = 1, G = 5. |
data:image/s3,"s3://crabby-images/b58c7/b58c7e31dfa540b3995d415c95fa20d1fc176006" alt="" |
Erasure, duplicate elements, Fmax = 5, G = 5. |
data:image/s3,"s3://crabby-images/cb1ab/cb1ab24ffd04847d670adcf8fa84fc3666373a97" alt="" |
Successful lookup, unique elements. |
data:image/s3,"s3://crabby-images/c2996/c2996a2a6f6d9c44fde71e7c8eef91af50116684" alt="" |
Unsuccessful lookup, unique elements. |
data:image/s3,"s3://crabby-images/bf07b/bf07b691b8f6bac45d2657cdbca2e4162f15dda7" alt="" |
Successful lookup, duplicate elements, Fmax = 1, G = 5. |
data:image/s3,"s3://crabby-images/16895/16895da5ac735df09e5032b10f89d99bca014096" alt="" |
Unsuccessful lookup, duplicate elements, Fmax = 1, G = 5. |
data:image/s3,"s3://crabby-images/7bf95/7bf9598646ad52c4519016f7010ce3452c57494b" alt="" |
Successful lookup, duplicate elements, Fmax = 5, G = 5. |
data:image/s3,"s3://crabby-images/15b77/15b77d9fc49a2c5e8d6d6cfa9862156c299f819b" alt="" |
Unsuccessful lookup, duplicate elements, Fmax = 5, G = 5. |
In general, the unordered associative containers provided by libc++ are considerably slower than those in libstdc++-v3:
- When duplicate elements are allowed, insertion of a new element in libc++ takes place at the end of the range of equivalent elements (in libstdc++ -v3 it is at the beginning), which incurs the extra cost of traversing the range.
- Erasure is unnecessarily expensive: to determine whether a given node p is located at the beginning of its bucket b, libc++ does a hash→bucket mapping on the predecessor prev of x and checks if the resulting bucket is not the same as b, when it suffices to verify if b points to prev.
- It is harder to explain why lookup (specially with unique elements) is so much slower than in libstdc++-v3: code inspection reveals that the hash→bucket mapping is less efficient in libc++ (it involves one extra logical and operation), but this fact alone seems unlikely to account for the big differences in performance.
I would like the benchmarking code; as I am working on special case (very large data-sets) hash-map implementation.
ReplyDeleteDave
The links are already available scattered across the different blog entries I devoted to this. For your convenience:
ReplyDeleteunique_running_insertion.cpp
non_unique_running_insertion.cpp
unique_scattered_erasure.cpp
non_unique_scattered_erasure.cpp
unique_scattered_lookup.cpp
non_unique_scattered_lookup.cpp
Thanks for these links.
ReplyDeleteDave