1
|
- Matthew J Heaney
- On2 Technologies, Inc
- email: matthewjheaney@earthlink.net
- 24/6/2005
|
2
|
- http://www.ada-auth.org/cgi-bin/cvsweb.cgi/AIs/AI-20302.TXT/
- http://charles.tigris.org/source/browse/charles/src/ai302/
|
3
|
- Sequence containers (vectors, lists) store elements at specified
positions.
- Associative containers (sets, maps) store elements in key order.
- There are alternate forms of all containers, for storing indefinite
elements and keys.
- The associative containers have both hashed and ordered forms.
|
4
|
- The standard specifies the time complexity of operations; it is not an
implementation detail. Indeed, this is practically the reason why the
library comprises a suite of containers.
- Different containers have different time semantics. You choose whichever
container has the properties best suited to the needs of your
application.
|
5
|
- There is a distinction between instantiating a generic component, and
using the instantiated component.
- The generic formal parts of the components differ, but once
instantiated the components are used in basically the same way, since
they have a more or less identical static interface. (They differ in
their execution behavior, of course.)
- Yes, container types are tagged, but that’s mostly to give you
distinguished-receiver syntax. (Subprogram parameters of a tagged type
are also implicitly aliased.)
|
6
|
|
7
|
|
8
|
|
9
|
- Containers are nothing. Elements are everything.
- A container exists for no other purpose than to store and retrieve
elements. Elements are not a hidden detail.
- A cursor (and passive iterator) provides access to the elements in a
container, without exposing container representation.
|
10
|
- Cursors allow the container to be viewed as an abstract machine, with
elements that are logically contiguous.
- You navigate among element "addresses" using a cursor, and
"dereference" the cursor to get the element at that address.
- The cursor design effectively abstracts-away the container, since all
you have are cursors, and elements designated by cursors.
|
11
|
|
12
|
- During active iteration, navigation among cursor positions is controlled
by the client.
- An active iterator (cursor) is appropriate when more than one container
(or more than a single element within the same container) is being
visited simultaneously.
- Use an active iterator to terminate the iteration without visiting every
element in the container.
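A minimal sketch of active iteration, assuming the cursor operations
(First, Next, Has_Element, Element) as standardized in
Ada.Containers.Doubly_Linked_Lists; the draft API described in these
slides may differ in detail, and the procedure name and data are
illustrative. The client advances the cursor and stops early:

```ada
with Ada.Containers.Doubly_Linked_Lists;
with Ada.Text_IO;

procedure Active_Iteration_Demo is
   package Integer_Lists is
     new Ada.Containers.Doubly_Linked_Lists (Element_Type => Integer);
   use Integer_Lists;

   L : List;
   C : Cursor;
begin
   for I in 1 .. 10 loop
      L.Append (I);
   end loop;

   --  The client controls navigation, so it can stop whenever it likes.
   C := L.First;
   while Has_Element (C) loop
      exit when Element (C) > 5;   --  terminate without visiting the rest
      Ada.Text_IO.Put_Line (Integer'Image (Element (C)));
      C := Next (C);
   end loop;
end Active_Iteration_Demo;
```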
|
13
|
|
14
|
- During passive iteration, advancement is controlled by the operation
itself.
- The passive iterator visits every element in the container. (It's designed for the common case.)
- Potentially more efficient than an active iterator, since the passive
iterator knows that it is visiting all elements in sequence, and hence
can visit elements in a way that takes advantage of the container’s
representation.
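A minimal sketch of passive iteration, assuming the Iterate operation
as standardized in Ada.Containers.Vectors (the names Add_To_Sum and
Passive_Iteration_Demo are illustrative). The container drives the
traversal and calls the supplied procedure once per element:

```ada
with Ada.Containers.Vectors;
with Ada.Text_IO;

procedure Passive_Iteration_Demo is
   package Integer_Vectors is
     new Ada.Containers.Vectors
       (Index_Type => Positive, Element_Type => Integer);
   use Integer_Vectors;

   V   : Vector;
   Sum : Integer := 0;

   --  Called by Iterate once for every cursor position, in order.
   procedure Add_To_Sum (Position : Cursor) is
   begin
      Sum := Sum + Element (Position);
   end Add_To_Sum;
begin
   for I in 1 .. 4 loop
      V.Append (I);
   end loop;

   V.Iterate (Add_To_Sum'Access);
   Ada.Text_IO.Put_Line ("Sum =" & Integer'Image (Sum));
end Passive_Iteration_Demo;
```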
|
15
|
- All containers (except hashed) allow iteration in both forward and
reverse directions.
|
16
|
|
17
|
|
18
|
- The Element function returns a copy of the element in the container.
- However, if what you really want to do is just query the element
directly, then that function can be relatively inefficient if the
element is expensive to copy (it's large, or controlled, etc).
- The operation Query_Element instead passes a constant view of the
actual container element to a procedure that you supply, so no copy is
made.
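A minimal sketch of querying an element in place, assuming the
cursor-based Query_Element profile as standardized in
Ada.Containers.Indefinite_Vectors (Print and Query_Element_Demo are
illustrative names):

```ada
with Ada.Containers.Indefinite_Vectors;
with Ada.Text_IO;

procedure Query_Element_Demo is
   package String_Vectors is
     new Ada.Containers.Indefinite_Vectors
       (Index_Type => Positive, Element_Type => String);
   use String_Vectors;

   V : Vector;

   --  Receives a constant view of the stored element; nothing is copied.
   procedure Print (Item : String) is
   begin
      Ada.Text_IO.Put_Line (Item);
   end Print;
begin
   V.Append ("a long string that we would rather not copy");
   Query_Element (V.First, Print'Access);
end Query_Element_Demo;
```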
|
19
|
|
20
|
- The Replace_Element procedure assigns a new value to the element in the
container.
- This operation alone is not general enough: we often need a way to
modify the element in place, not simply replace its value.
- The operation Update_Element passes a variable view of the actual
container element to a procedure that you supply, so the element can be
modified in place.
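A minimal sketch of modifying an element in place, assuming the
Update_Element profile as standardized in Ada.Containers.Vectors (which
takes the container as its first parameter; the draft described here
may differ). The Counter type and the other names are illustrative:

```ada
with Ada.Containers.Vectors;
with Ada.Text_IO;

procedure Update_Element_Demo is
   type Counter is record
      Hits : Natural := 0;
   end record;

   package Counter_Vectors is
     new Ada.Containers.Vectors
       (Index_Type => Positive, Element_Type => Counter);
   use Counter_Vectors;

   V : Vector;

   --  Receives a variable view of the stored element.
   procedure Increment (Item : in out Counter) is
   begin
      Item.Hits := Item.Hits + 1;
   end Increment;
begin
   V.Append (Counter'(Hits => 0));
   V.Update_Element (V.First, Increment'Access);
   Ada.Text_IO.Put_Line (Natural'Image (First_Element (V).Hits));
end Update_Element_Demo;
```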
|
21
|
|
22
|
- When Process.all is executing from within Query_Element or
Update_Element, there are things you can’t do to the container: Clear,
Move, Insert, etc. Basically, anything that can change container
cardinality is verboten.
- You also cannot call any operation that would replace the element you’re
currently visiting: Swap, Replace_Element, etc.
|
23
|
- Insert, Append, Prepend
- Find, Reverse_Find, Contains
- Delete, Delete_First, Delete_Last
- Element, Query_Element
- Replace_Element, Update_Element
- First, First_Element, Last, Last_Element
- Swap, Move, Generic_Sorting
- Iterate, Reverse_Iterate
|
24
|
- Provides random access to elements.
- Complexity of Append is amortized constant time. (But Prepend is linear time.)
- The vector model is that an internal array automatically expands as
necessary to store more elements. (That’s the model. A vector doesn’t
have to be implemented as an array.)
|
25
|
|
26
|
|
27
|
- Vectors are unique in that they have both index-based and cursor-based
operations.
- The cursor-based operations make it easier to switch between a vector
and some other container (usually a list), and provide a uniform syntax
for iteration (that applies to all containers). [See list ex. on p. 42.]
|
28
|
- The function Capacity returns the total amount of internal storage. A
vector automatically increases the capacity during insertion, when the
current length (number of elements) equals the current capacity.
- Reserve_Capacity tells the vector to preallocate a specified amount of
internal storage. If you know the total number of elements in advance of
insertion, it’s more efficient to reserve the necessary capacity, since
the expansion of the internal array is done only once.
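A minimal sketch of reserving capacity before a bulk insertion,
assuming Reserve_Capacity, Capacity, and Length as standardized in
Ada.Containers.Vectors (the element count N and the procedure name are
illustrative):

```ada
with Ada.Containers.Vectors;
with Ada.Text_IO;

procedure Reserve_Capacity_Demo is
   use Ada.Containers;   --  for Count_Type

   package Integer_Vectors is
     new Ada.Containers.Vectors
       (Index_Type => Positive, Element_Type => Integer);

   N : constant := 1_000;   --  illustrative number of elements
   V : Integer_Vectors.Vector;
begin
   V.Reserve_Capacity (N);  --  expand the internal storage once, up front
   for I in 1 .. N loop
      V.Append (I);         --  no expansion while Length < Capacity
   end loop;

   Ada.Text_IO.Put_Line ("Length  :" & Count_Type'Image (V.Length));
   Ada.Text_IO.Put_Line ("Capacity:" & Count_Type'Image (V.Capacity));
end Reserve_Capacity_Demo;
```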
|
29
|
|
30
|
- You can also set the vector length explicitly. This can be used either
to truncate the vector (and hence throw elements away), or to expand the
vector (in which case “empty” elements are appended).
|
31
|
|
32
|
- Insert_Space is another way to reserve capacity, by making room
(“space”) in the middle of the vector, without having an actual element
to insert.
|
33
|
|
34
|
- All insertion and deletion operations have a Count parameter (that
defaults to 1) to control how many elements are inserted or deleted.
There are additional overloadings of the insertion operations that
accept a vector as the New_Item parameter.
- In general, when you insert multiple elements into a vector, you should
try to do it in a way that avoids repeated expansion of the internal
array. If you know how many elements you intend to insert, then either
call Reserve_Capacity first, or use Insert_Space or Set_Length, or
specify the Count parameter of Insert.
|
35
|
- As for any other container, you can use either a passive iterator or a
cursor-based active iterator. However, since a vector also supports
index-based operations, you can also use a for loop to iterate in the
traditional way.
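A minimal sketch of index-based iteration, assuming First_Index,
Last_Index, and the index-based Element function as standardized in
Ada.Containers.Vectors (the data and procedure name are illustrative):

```ada
with Ada.Containers.Vectors;
with Ada.Text_IO;

procedure Index_Loop_Demo is
   package Integer_Vectors is
     new Ada.Containers.Vectors
       (Index_Type => Positive, Element_Type => Integer);

   V : Integer_Vectors.Vector;
begin
   V.Append (10);
   V.Append (20);
   V.Append (30);

   --  For an empty vector the range is null and the loop body is skipped.
   for I in V.First_Index .. V.Last_Index loop
      Ada.Text_IO.Put_Line (Integer'Image (V.Element (I)));
   end loop;
end Index_Loop_Demo;
```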
|
36
|
|
37
|
- The Move operation moves, not copies, the elements from one container to
another.
- In the case of a vector, Move works by simply transferring the internal
array. For other containers, Move is implemented similarly.
- Move makes it possible to use a container to assemble elements in one
part of a system, and then move them to another part of the system.
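A minimal sketch of Move, assuming the profile standardized in
Ada.Containers.Vectors, Move (Target, Source); the variable and
procedure names are illustrative. After the call the source is empty
and the target holds the elements:

```ada
with Ada.Containers.Vectors;
with Ada.Text_IO;

procedure Move_Demo is
   package Integer_Vectors is
     new Ada.Containers.Vectors
       (Index_Type => Positive, Element_Type => Integer);
   use Integer_Vectors;

   Assembled, Destination : Vector;
begin
   Assembled.Append (1);
   Assembled.Append (2);

   --  Transfers the elements; no element-by-element copy is implied.
   Move (Target => Destination, Source => Assembled);

   Ada.Text_IO.Put_Line
     ("Source empty: " & Boolean'Image (Assembled.Is_Empty));
   Ada.Text_IO.Put_Line
     ("Target length:"
      & Ada.Containers.Count_Type'Image (Destination.Length));
end Move_Demo;
```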
|
38
|
|
39
|
- Swap (logically) exchanges a pair of elements.
- Mostly intended to take advantage of the representation of indefinite
vectors, which allocate each element.
- Swap is implemented (in the indefinite vector case) by exchanging the
internal pointers, which can be potentially more efficient than
exchanging elements directly.
|
40
|
- Every container has a Find operation to search for an element. For
vectors and lists, Find performs a linear search from First to Last.
(For the maps and sets, the search works differently, and is definitely
not linear.)
- For the sequence containers, there's also a Reverse_Find, to search from
Last to First.
- Vectors also have index-based versions: Find_Index and
Reverse_Find_Index.
- Search operations for the sequence containers have a parameter (with a
suitable default) to specify from where to begin the search.
|
41
|
- Insertion and deletion have constant time complexity at all positions.
- No random access.
- The list container is monolithic, not polylithic (a la LISP); there is
no structure sharing.
- Lists are often useful for implementing a queue.
|
42
|
|
43
|
- Use Splice to either move an element (really, its node) within the same
list or even from a different list, or to move an entire list.
- Like a vector, a list can be sorted. Unlike the vector sort, the list
sort is stable.
- A pair of sorted lists can be merged, such that one list is spliced onto
the other in sort order.
- Operation Reverse_List reverses a list.
- There’s a special Swap_Links for lists, to exchange list nodes instead
of elements.
|
44
|
- Associative containers (maps, sets) store elements ordered by key.
- Maps associate a separate key object with an element. For a set, an element is its own key.
- There are both ordered (tree-based) and hashed (hash table-based)
versions.
|
45
|
- Hashed associative containers have unit time complexity, on average.
This is good for fast lookup of individual elements. (But note that the
execution behavior of a hashed container is very sensitive to the
quality of the hash function.)
- Ordered associative containers have logarithmic time complexity, even in
the worst case. This predictability is safer for real-time systems.
Ordered containers are also good for iteration over ranges of elements.
|
46
|
- A hashed container first computes the hash value of a new item, to find
the bucket. It then uses the generic formal equivalence function (not
equality) to compare the new item to the existing elements in that
bucket. [See example on p.93.]
|
47
|
|
48
|
- During insertion in an ordered map or set, keys are compared for
equivalence, not equality.
- Ordered keys are equivalent if neither key is less than the other under
“<” (which must be a “strict weak ordering”): that is, keys L and R are
equivalent when not (L < R) and not (R < L).
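A minimal sketch of equivalence derived from an ordering relation. The
case-insensitive Less function is a hypothetical ordering chosen only
to show two keys that are equivalent without being equal; it is not
part of the container library:

```ada
with Ada.Characters.Handling;
with Ada.Text_IO;

procedure Equivalence_Demo is
   use Ada.Characters.Handling;

   --  Hypothetical key ordering: compare strings ignoring case.
   function Less (Left, Right : String) return Boolean is
   begin
      return To_Lower (Left) < To_Lower (Right);
   end Less;

   --  Two keys are equivalent when neither is less than the other.
   function Equivalent (Left, Right : String) return Boolean is
   begin
      return not Less (Left, Right) and then not Less (Right, Left);
   end Equivalent;

   A : constant String := "Ada";
   B : constant String := "ADA";
begin
   Ada.Text_IO.Put_Line ("Equivalent: " & Boolean'Image (Equivalent (A, B)));
   Ada.Text_IO.Put_Line ("Equal:      " & Boolean'Image (A = B));
end Equivalence_Demo;
```

Here the two keys are reported as equivalent (TRUE) but not equal
(FALSE), so an ordered container using Less would treat them as the
same key.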
|
49
|
|
50
|
- Keys and elements are stored as pairs, ordered by key.
- For an ordered map, the “<” relation for keys determines the (sorted)
order.
- For a hashed map, the bucket (and hence the order) is determined by the
hash value of the key. If the hash function is performing well, keys
should be scattered throughout the hash table, equally distributed among
the buckets.
|
51
|
- The Contains and Find operations are used to determine whether an
element is in the map. Use Contains if all you need is a simple
membership test.
- Find returns a cursor as its result. If the cursor object has the
distinguished value No_Element (or equivalently, the predicate
Has_Element returns False), then the search failed and the key is not in
the map. Otherwise, the cursor designates the key/element pair whose key
matched.
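A minimal sketch of both search styles, assuming Contains, Find,
No_Element, Has_Element, Key, and Element as standardized in
Ada.Containers.Indefinite_Ordered_Maps (the map contents and names are
illustrative):

```ada
with Ada.Containers.Indefinite_Ordered_Maps;
with Ada.Text_IO;

procedure Find_Demo is
   package Age_Maps is
     new Ada.Containers.Indefinite_Ordered_Maps
       (Key_Type => String, Element_Type => Natural);
   use Age_Maps;

   Ages : Map;
   C    : Cursor;
begin
   Ages.Insert ("alice", 30);

   --  Simple membership test.
   if Ages.Contains ("alice") then
      Ada.Text_IO.Put_Line ("alice is a member");
   end if;

   --  A failed search yields No_Element.
   C := Ages.Find ("bob");
   if C = No_Element then   --  equivalently: not Has_Element (C)
      Ada.Text_IO.Put_Line ("bob not found");
   end if;

   --  A successful search yields a cursor designating the pair.
   C := Ages.Find ("alice");
   if Has_Element (C) then
      Ada.Text_IO.Put_Line (Key (C) & " =>" & Natural'Image (Element (C)));
   end if;
end Find_Demo;
```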
|
52
|
|
53
|
|
54
|
- For each key in the left map, hashed map equality searches for the key
in the right map. That is, it first computes the hash value of the left
key to find the corresponding bucket in the right map, and then uses
Equivalent_Keys to find the equivalent key in that bucket. If an
equivalent key is found, it then compares elements using element
equality. (This is the only time when element equality is actually
used.)
- Note that both the key and the element are used to compute hashed map
equality, but key equality is not used. (In fact, key equality is never
used in this API.)
|
55
|
|
56
|
- Unlike a hashed map, there’s no need for ordered map "=" to search for
the key, since the keys are already in sort order.
- If the key in the left map is equivalent to the corresponding key in the
right map (equivalence being defined in terms of key "<"), then ordered
map "=" compares the associated elements for equality. (This is the only
time when element "=" is actually used.)
- Note that the key and element are both used to compute ordered map
equality, but key equality is not used. (In fact, key equality is never
used in this API.)
|
57
|
|
58
|
- The Insert operation attempts to insert a key (and element) into the
map. If the key is already in the map, then Insert raises
Constraint_Error; otherwise, it inserts the new key/element pair in the
map.
- Include (a variation of Insert) attempts to insert the key, but if the
key is already in the map, it replaces the existing key/element pair
with the new key/element pair, instead of raising Constraint_Error.
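A minimal sketch contrasting the two operations, assuming the
three-parameter Insert and Include as standardized in
Ada.Containers.Indefinite_Ordered_Maps (the key "k" and the names are
illustrative):

```ada
with Ada.Containers.Indefinite_Ordered_Maps;
with Ada.Text_IO;

procedure Insert_Include_Demo is
   package Integer_Maps is
     new Ada.Containers.Indefinite_Ordered_Maps
       (Key_Type => String, Element_Type => Integer);

   M : Integer_Maps.Map;
begin
   M.Insert ("k", 1);

   begin
      M.Insert ("k", 2);    --  key already present
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line ("Insert raised Constraint_Error");
   end;

   M.Include ("k", 3);      --  replaces the existing pair instead
   Ada.Text_IO.Put_Line ("k =>" & Integer'Image (M.Element ("k")));
end Insert_Include_Demo;
```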
|
59
|
- Suppose we want to either insert a new element if this is a new key, or
modify the existing element if the key already exists.
- One technique would be to first try to Find the key, and if it's not
found, then Insert the key in the map.
|
60
|
|
61
|
- However, this technique is inefficient, because Insert must perform its
own search, thus duplicating the search performed by Find.
- A more efficient technique would be to attempt to insert the key, but
instead of an exception, let the insertion operation return a cursor
(the same as what Find does), and report back about whether the
insertion succeeded.
|
62
|
|
63
|
- If Inserted is True, then the key/element pair was inserted into the
map, and the cursor designates the newly-inserted key/element pair.
- If Inserted is False, then the key was already in the map, and the
cursor designates the existing key/element pair, which is not modified.
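A minimal sketch of the conditional form of Insert, assuming the
five-parameter profile (Container, Key, New_Item, Position, Inserted)
and the map Update_Element as standardized in
Ada.Containers.Indefinite_Ordered_Maps. Record_Word and Increment are
illustrative helpers. Only one search is performed per word:

```ada
with Ada.Containers.Indefinite_Ordered_Maps;
with Ada.Text_IO;

procedure Conditional_Insert_Demo is
   package Count_Maps is
     new Ada.Containers.Indefinite_Ordered_Maps
       (Key_Type => String, Element_Type => Natural);
   use Count_Maps;

   Counts : Map;

   procedure Increment (Word : String; Count : in out Natural) is
   begin
      Count := Count + 1;
   end Increment;

   procedure Record_Word (Word : String) is
      Position : Cursor;
      Inserted : Boolean;
   begin
      --  Inserts (Word, 0) if Word is new; otherwise leaves the map alone.
      Counts.Insert (Word, 0, Position, Inserted);
      if Inserted then
         Ada.Text_IO.Put_Line ("new word: " & Word);
      end if;
      --  Either way, Position designates the pair for Word.
      Counts.Update_Element (Position, Increment'Access);
   end Record_Word;
begin
   Record_Word ("ada");
   Record_Word ("ada");
   Ada.Text_IO.Put_Line ("ada =>" & Natural'Image (Counts.Element ("ada")));
end Conditional_Insert_Demo;
```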
|
64
|
|
65
|
- An element can be deleted either by specifying its key, or by specifying
a cursor (that designates the key/element pair).
- The key-based Delete raises Constraint_Error if the key isn’t found in
the map.
- Exclude (a variation of key-based Delete) does nothing if the key isn’t
in the map.
- The cursor-based Delete raises Constraint_Error if the cursor equals
No_Element, and raises Program_Error if the cursor designates a node in
some other map.
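A minimal sketch of the key-based deletion operations, assuming Delete
and Exclude as standardized in Ada.Containers.Indefinite_Ordered_Maps
(the key and names are illustrative):

```ada
with Ada.Containers.Indefinite_Ordered_Maps;
with Ada.Text_IO;

procedure Delete_Exclude_Demo is
   package Integer_Maps is
     new Ada.Containers.Indefinite_Ordered_Maps
       (Key_Type => String, Element_Type => Integer);

   M : Integer_Maps.Map;
begin
   M.Insert ("a", 1);
   M.Delete ("a");     --  key present: the pair is removed
   M.Exclude ("a");    --  key absent: does nothing

   begin
      M.Delete ("a");  --  key absent: raises Constraint_Error
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line ("Delete raised Constraint_Error");
   end;
end Delete_Exclude_Demo;
```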
|
66
|
|
67
|
- Replace searches the map to determine whether the key is a member. If
the key isn’t found, then it raises Constraint_Error. Otherwise, it
replaces the existing key/element pair with the new key/element pair.
- Replace differs from Include only with respect to whether the key is
already in the map.
- If you simply want to assign a new value to an existing element, then
use Replace_Element.
|
68
|
- A key might have interesting state of its own, and so Include and
Replace assign new values to both the existing key and existing element.
- Sometimes the key assignment is more than you really want, if you’re
only interested in elements. You can avoid unwanted key assignment by
using conditional Insert combined with Replace_Element.
|
69
|
|
70
|
- As elements are inserted into the hashed container, the internal hash
table automatically expands when it becomes full (defined as capacity =
length).
- The standard does not specify what the load factor is. It says only that
capacity is the maximum length before which no automatic rehashing will
occur.
- You need to care about rehashing, because it’s expensive. If you know
the total number of elements prior to insertion, use Reserve_Capacity to
preallocate the buckets array, and thus avoid rehashing.
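A minimal sketch of preallocating a hashed map, assuming
Reserve_Capacity as standardized in
Ada.Containers.Indefinite_Hashed_Maps together with the language-defined
Ada.Strings.Hash function (the Expected count, the keys, and the names
are illustrative):

```ada
with Ada.Containers.Indefinite_Hashed_Maps;
with Ada.Strings.Hash;
with Ada.Text_IO;

procedure Hashed_Reserve_Demo is
   package Number_Maps is
     new Ada.Containers.Indefinite_Hashed_Maps
       (Key_Type        => String,
        Element_Type    => Natural,
        Hash            => Ada.Strings.Hash,
        Equivalent_Keys => "=");

   Expected : constant := 10_000;   --  illustrative element count
   M        : Number_Maps.Map;
begin
   M.Reserve_Capacity (Expected);   --  size the hash table once, up front
   for I in 1 .. Expected loop
      M.Insert (Integer'Image (I), I);
   end loop;

   Ada.Text_IO.Put_Line
     ("Length:" & Ada.Containers.Count_Type'Image (M.Length));
end Hashed_Reserve_Demo;
```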
|
71
|
|
72
|
- A set is like a map, with the difference that in a set an element is its
own key. There is no separate key object, and only the element is stored
in the container.
- Ordered sets are often useful for implementing a priority queue.
|
73
|
|
74
|
- Searches normally work by computing the hash value of the item to find
the bucket, and then using Equivalent_Elements to find the matching
element in the bucket.
- Hashed set equality works a little differently. It uses element equality
("=") to compare the item to the elements in the bucket. This is the
only time when element equality is used.
|
75
|
- Computing ordered set equality is straightforward: since the elements
are already in (sorted) order, there’s no need for a search. Each
element in one set is simply compared to the corresponding element in
the other set, using element equality. This is the only time when
element equality is used.
|
76
|
- Interestingly, set containers actually have two ways of being compared.
We have already seen the first way, set equality ("="), which is
implemented in terms of element equality.
- The second way, Equivalent_Sets, is implemented in terms of the
equivalence relation for elements. (Equivalent_Elements for hashed sets,
and "<" for ordered sets.)
|
77
|
- Set containers also have the traditional operations for sets: Union,
Intersection, Difference, and Symmetric_Difference.
- Each operation has procedure, function, and operator forms.
- There are also Overlap and Is_Subset operations.
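A minimal sketch of the three forms of one set operation, assuming
Union, "or", "and", Overlap, and Is_Subset as standardized in
Ada.Containers.Ordered_Sets (the set contents and names are
illustrative):

```ada
with Ada.Containers.Ordered_Sets;
with Ada.Text_IO;

procedure Set_Operations_Demo is
   package Integer_Sets is
     new Ada.Containers.Ordered_Sets (Element_Type => Integer);
   use Integer_Sets;

   A, B       : Set;
   U1, U2, U3 : Set;
begin
   A.Insert (1);
   A.Insert (2);
   B.Insert (2);
   B.Insert (3);

   U1 := Union (A, B);   --  function form
   U2 := A or B;         --  operator form
   U3 := A;
   U3.Union (B);         --  procedure form, modifies its target

   Ada.Text_IO.Put_Line
     ("Same result: " & Boolean'Image (U1 = U2 and then U2 = U3));
   Ada.Text_IO.Put_Line
     ("Overlap:     " & Boolean'Image (Overlap (A, B)));
   Ada.Text_IO.Put_Line
     ("A subset:    "
      & Boolean'Image (Is_Subset (Subset => A, Of_Set => U1)));
   Ada.Text_IO.Put_Line
     ("Common:     "
      & Ada.Containers.Count_Type'Image (Length (A and B)));
end Set_Operations_Demo;
```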
|
78
|
- The Generic_Keys nested package can be used to manipulate a set in terms
of a key.
- Useful when the element is a record, and the element’s key is a
component of the record.
- Solves the problem of finding a set element if you only know its
key-part, and can't easily synthesize a nonce element to use as the
search item.
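A minimal sketch of key-based lookup in a set of records, assuming the
Generic_Keys nested package as standardized in
Ada.Containers.Ordered_Sets. The Employee type, Id_Of, and the other
names are hypothetical, chosen only for illustration:

```ada
with Ada.Containers.Ordered_Sets;
with Ada.Text_IO;

procedure Generic_Keys_Demo is
   type Employee is record
      Id     : Positive;
      Salary : Natural;
   end record;

   --  Elements are ordered by their key component.
   function "<" (Left, Right : Employee) return Boolean is
   begin
      return Left.Id < Right.Id;
   end "<";

   --  Extracts the key-part of an element.
   function Id_Of (Item : Employee) return Positive is
   begin
      return Item.Id;
   end Id_Of;

   package Employee_Sets is
     new Ada.Containers.Ordered_Sets (Element_Type => Employee);

   package Id_Keys is
     new Employee_Sets.Generic_Keys (Key_Type => Positive, Key => Id_Of);

   Staff : Employee_Sets.Set;
   C     : Employee_Sets.Cursor;
begin
   Staff.Insert ((Id => 42, Salary => 1_000));

   --  Search by key alone; no dummy Employee has to be built.
   C := Id_Keys.Find (Staff, 42);
   if Employee_Sets.Has_Element (C) then
      Ada.Text_IO.Put_Line
        ("Salary:" & Natural'Image (Employee_Sets.Element (C).Salary));
   end if;
end Generic_Keys_Demo;
```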
|
79
|
|
80
|
|
81
|
|
82
|
|
83
|
|
84
|
- Each container has different forms for definite and indefinite formal
types. Useful when type String is the generic actual element or key
type.
- The container library has generic operations for sorting both
constrained and unconstrained arrays.
- There are also hash functions for String and Unbounded_String, and their
wide string equivalents.
|
85
|
- Suppose we are given the task of counting the frequency of each word in
a file, and then displaying the results in frequency order.
|
86
|
- Instantiate an indefinite map (hashed or ordered -- it doesn't matter
which) indexed by String. Use the map to collect the word frequencies.
- Allocate an array of map cursors, and then sort the array in frequency
order.
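A minimal sketch of this approach, assuming
Ada.Containers.Indefinite_Ordered_Maps and
Ada.Containers.Generic_Array_Sort as standardized. To stay
self-contained it counts a fixed list of words instead of reading a
file, and all names (Count_Word, More_Frequent, Cursor_Array) are
illustrative:

```ada
with Ada.Containers.Generic_Array_Sort;
with Ada.Containers.Indefinite_Ordered_Maps;
with Ada.Text_IO;

procedure Word_Frequency_Demo is
   package Word_Maps is
     new Ada.Containers.Indefinite_Ordered_Maps
       (Key_Type => String, Element_Type => Natural);
   use Word_Maps;

   Counts : Map;

   procedure Count_Word (Word : String) is
      procedure Increment (Word_Key : String; Count : in out Natural) is
      begin
         Count := Count + 1;
      end Increment;

      Position : Cursor;
      Inserted : Boolean;
   begin
      Counts.Insert (Word, 0, Position, Inserted);
      Counts.Update_Element (Position, Increment'Access);
   end Count_Word;

   type Cursor_Array is array (Positive range <>) of Cursor;

   --  Orders cursors so that higher counts come first.
   function More_Frequent (Left, Right : Cursor) return Boolean is
   begin
      return Element (Left) > Element (Right);
   end More_Frequent;

   procedure Sort is new Ada.Containers.Generic_Array_Sort
     (Index_Type   => Positive,
      Element_Type => Cursor,
      Array_Type   => Cursor_Array,
      "<"          => More_Frequent);
begin
   Count_Word ("the");
   Count_Word ("cat");
   Count_Word ("the");
   Count_Word ("hat");
   Count_Word ("the");
   Count_Word ("cat");

   declare
      Sorted : Cursor_Array (1 .. Natural (Counts.Length));
      Filled : Natural := 0;

      procedure Collect (Position : Cursor) is
      begin
         Filled := Filled + 1;
         Sorted (Filled) := Position;
      end Collect;
   begin
      Counts.Iterate (Collect'Access);   --  one cursor per distinct word
      Sort (Sorted);                     --  frequency order, highest first
      for I in Sorted'Range loop
         Ada.Text_IO.Put_Line
           (Key (Sorted (I)) & Natural'Image (Element (Sorted (I))));
      end loop;
   end;
end Word_Frequency_Demo;
```

Sorting cursors rather than elements leaves the map untouched and
avoids copying the words.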
|
87
|
|
88
|
|
89
|
|
90
|
|
91
|
|
92
|
- Use a set with a word/count pair as the element.
|
93
|
|
94
|
|
95
|
|
96
|
- The indefinite vector has properties that make it an attractive
alternative to sets and maps. It has less storage overhead, since
there's no element node (just the element). And the elements are
allocated, so the cost of insertion is relatively low (since only
pointers to elements are moved, not elements).
- Here we use a binary search to find the insertion position such that the
vector always remains sorted.
- A separate array of cursors, that we sort after collecting all the
words, isn't necessary when using a vector, since we can explicitly sort
the vector itself.
|
97
|
|
98
|
|
99
|
|
100
|
|