Monday, September 22, 2008

Profiling stable_vector

The two random access containers of the C++ standard library, std::vector and std::deque, have the same big-O complexity as stable_vector for the following operations:

  • push_back: amortized constant
  • middle insert: O(n)
  • operator[]: constant

Yet the differences in actual performance vary wildly. I have written a profile program that measures the time taken by the three containers at these operations for two different value types: int and a "heavy" user defined type bigobj holding an int and an std::string. The program was built with release mode settings with MSVC++ 8.0 and GNU GCC 3.4.4 for Cygwin/MinGW. Resulting times are summarized in the following table: values are normalized to the minimum among the three containers.

MSVC++ 8.0

int
bigobj

std::
vector

std::
deque

stable_
vector

std::
vector

std::
deque

stable_
vector

push_back
1.00
4.0621.37
1.001.09
1.26
insert
1.00
4.245.29
1.50
1.73
1.00
operator[]
1.00
3.35
1.01
1.00
3.27
1.09
GNU GCC 3.4.4

int
bigobj

std::
vector

std::
deque

stable_
vector

std::
vector

std::
deque

stable_
vector

push_back
1.09
1.00
91.57
1.08
1.00
1.14
insert
1.00
1.30
5.17
21.38
21.36
1.00
operator[]
1.00
41.04
1.11
1.00
81.68
2.03

For small objects like int, std::vector is, as expected, the fastest or very close to the fastest at every operation. push_back takes much longer for stable_vector than std::deque presumably because the latter allocates element chunks thus amortizing allocation time whereas stable_vector performs an allocation for every new node. The poor performance of stable_vector at insert can be attributed to the operation of adjustment of "up" pointers with each insertion operation, which is considerably slower than the simple int copying incurred by the other two containers. Finally, note that random access is very slow for std::deque because it involves some non-trivial arithmetical calculations, whereas std::vector and stable_vector only perform simple pointer offsets and indirections.

When storing heavy objects with expensive copy, the situation changes somewhat. The fact that stable_vector does not internally copy elements upon resizing or shrinking improves its relative performance with respect to the other two containers at push_back, and specially at insert, where stable_vector is the fastest.

This basic analysis suggests that the main performance handicap of stable_vector stems from the fact that it is a node-based container, as opposed to the chunk-based nature of std::vector and std::deque. This drawback might be alleviated by the use of burst-allocation, as proposed by Ion Gaztañaga.

2 comments:

  1. Is there some reason why you use an old version of GCC? Version 4.3.2 is out.

    ReplyDelete
  2. Is there some reason why you use an old version of GCC? Version 4.3.2 is out.

    No particular reason, 3.4.4. is just the version I happen to have installed. Do you have results for GCC 4.3.2? If so I can use them to replace the current figures.

    ReplyDelete