Friday, August 29, 2008

Stable vectors

Extremely convenient as they are, std::vectors have a limitation that many novice C++ programmers frequently stumble upon: iterators and references to an element of an std:vector are invalidated when a preceding element is erased or when the vector expands and needs to migrate its internal storage to a wider memory region (i.e. when the required size exceeds the vector's capacity). We say then that std::vectors are unstable: by contrast, stable containers are those for which references and iterators to a given element remain valid as long as the element is not erased: examples of stable containers within the C++ standard library are std::list and all the associative containers.

Sometimes stability is too precious a feature to live without. One particular property of std::vectors, namely element contiguity, makes it impossible to add stability to this container. So, provided we sacrifice element contiguity, how much can a stable design approach the behavior of std::vector (random access iterators, amortized constant time end insertion/deletion, minimal memory overhead, etc.)?

The following image describes the layout of a possible data structure upon which to base the design of a stable vector:

Each element is stored in its own separate node. All the nodes are referenced from a contiguous array of pointers, but also every node contains an "up" pointer referring back to the associated array cell. This up pointer is the key element to implementing stability and random accessibility:

  1. Iterators point to the nodes rather than to the pointer array. This ensures stability, as it is only the pointer array that needs to be shifted or relocated upon insertion or deletion.
  2. Random access operations can be implemented by using the pointer array as a convenient intermediate zone. For instance, if the iterator it holds a node pointer it.p and we want to advance it by n positions, we simply do:
    ‎it.p = *(it.p->up+n);
    That is, we go "up" to the pointer array, add n there and then go "down" to the resulting node.

In a later entry we will present a full implementation of an STL-conformant stable vector and compare its characteristics (space/time complexity, exception safety) with those of std::vector.

2 comments :

  1. Is this how multi-index container random access index is implemented also?

    ReplyDelete
  2. Hi Szymin,

    Yes, in essence this is the data structure on which random access indices rely.

    ReplyDelete