1
Ada 2005 Standard Container Library
  • Matthew J Heaney
  • On2 Technologies, Inc
  • email: matthewjheaney@earthlink.net
  • 24/6/2005
2
Useful Links
  • http://www.ada-auth.org/cgi-bin/cvsweb.cgi/AIs/AI-20302.TXT/
  • http://charles.tigris.org/source/browse/charles/src/ai302/
3
Container Taxonomy
  • Sequence containers (vectors, lists) store elements at specified positions.
  • Associative containers (sets, maps) store elements in key order.
  • There are alternate forms of all containers, for storing indefinite elements and keys.
  • The associative containers have both hashed and ordered forms.
4
Time Complexity
  • The standard specifies the time complexity of operations.  It is not an implementation detail.  Indeed, this is practically the reason why the library comprises a suite of containers.
  • Different containers have different time semantics.  You choose whichever container has the particular properties best suited to the needs of your particular application.
5
Static Polymorphism
  • There is a distinction between instantiating a generic component, and using the instantiated component.
  • The generic formal parts of the components can differ, but once instantiated, the components are used in basically the same way, since they have a more or less identical static interface.  (They differ in their execution behavior, of course.)
  • Yes, container types are tagged, but that’s mostly to give you distinguished-receiver syntax.  (Subprogram parameters that are tagged are also implicitly aliased.)
6
Ordered Set
7
Hashed Set
8
Ordered Set or Hashed Set?
9
Cursors and (Passive) Iterators
  • Containers are nothing. Elements are everything.
  • A container exists for no other purpose than to store and retrieve elements.  Elements are not a hidden detail.
  • A cursor (and passive iterator) provides access to the elements in a container, without exposing container representation.
10
Machine Model
  • Cursors allow the container to be viewed as an abstract machine, with elements that are logically contiguous.
  • You navigate among element "addresses" using a cursor, and "dereference" the cursor to get the element at that address.
  • The cursor design effectively abstracts-away the container, since all you have are cursors, and elements designated by cursors.
11
Active Iterator (Cursor)
12
Active Iteration (Cursors)
  • During active iteration, navigation among cursor positions is controlled by the client.
  • An active iterator (cursor) is appropriate when more than one container (or more than a single element within the same container) is being visited simultaneously.
  • Use an active iterator to terminate the iteration without visiting every element in the container.
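A minimal sketch of client-controlled navigation follows. The list instance, its contents, and the stopping value are assumptions chosen for illustration; the subprogram names follow the final Ada 2005 standard, which may differ in detail from the draft these slides describe.

```ada
with Ada.Containers.Doubly_Linked_Lists;
with Ada.Text_IO;

procedure Active_Iteration_Demo is
   package Integer_Lists is
     new Ada.Containers.Doubly_Linked_Lists (Element_Type => Integer);
   use Integer_Lists;

   L : List;
   C : Cursor;
begin
   L.Append (10);
   L.Append (20);
   L.Append (30);

   --  The client owns the loop: it decides when to advance and when to stop.
   C := L.First;
   while Has_Element (C) loop
      exit when Element (C) = 20;  --  terminate before visiting every element
      Ada.Text_IO.Put_Line (Integer'Image (Element (C)));
      Next (C);
   end loop;
end Active_Iteration_Demo;
```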
13
Passive Iterator
14
Passive Iteration
  • During passive iteration, advancement is controlled by the operation itself.
  • The passive iterator visits every element in the container.  (It's designed for the common case.)
  • Potentially more efficient than an active iterator, since the passive iterator knows that it is visiting all elements in sequence, and hence can visit elements in a way that takes advantage of the container’s representation.
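For comparison, a sketch of passive iteration, in which the container drives the traversal and calls back into client code. The Print procedure and the list contents are hypothetical; Iterate and Reverse_Iterate are the operations named on the Sequence Containers slide.

```ada
with Ada.Containers.Doubly_Linked_Lists;
with Ada.Text_IO;

procedure Passive_Iteration_Demo is
   package Integer_Lists is
     new Ada.Containers.Doubly_Linked_Lists (Element_Type => Integer);
   use Integer_Lists;

   L : List;

   --  Called once per element; the client never touches the loop itself.
   procedure Print (Position : Cursor) is
   begin
      Ada.Text_IO.Put_Line (Integer'Image (Element (Position)));
   end Print;
begin
   L.Append (10);
   L.Append (20);
   L.Append (30);

   L.Iterate (Print'Access);          --  first to last
   L.Reverse_Iterate (Print'Access);  --  last to first
end Passive_Iteration_Demo;
```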
15
Reverse Iteration
  • All containers (except hashed) allow iteration in both forward and reverse directions.
16
Active Iteration In Reverse
17
Passive Iteration In Reverse
18
Constant View of Elements
  • The Element function returns a copy of the element in the container.
  • However, if what you really want to do is just query the element directly, then that function can be relatively inefficient if the element is expensive to copy (it's large, or controlled, etc).
  • The operation Query_Element provides a constant view of the actual container element, by passing the element to a client-supplied Process subprogram.
19
 
20
Variable View of Elements
  • The Replace_Element procedure assigns a new value to the element in the container.
  • This operation alone is not general enough: we often need a way to modify the element in place, not simply replace its value.
  • The operation Update_Element provides a variable view of the actual container element, by passing the element to a client-supplied Process subprogram.
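The two views can be sketched as follows. This assumes the profiles of the final Ada 2005 standard, where Update_Element also takes the container as a parameter (the draft may differ); Show and Double are hypothetical callbacks.

```ada
with Ada.Containers.Vectors;
with Ada.Text_IO;

procedure Element_Views_Demo is
   package Integer_Vectors is new Ada.Containers.Vectors
     (Index_Type => Positive, Element_Type => Integer);
   use Integer_Vectors;

   V : Vector;

   --  Constant view: the element can be read but not modified.
   procedure Show (Item : in Integer) is
   begin
      Ada.Text_IO.Put_Line (Integer'Image (Item));
   end Show;

   --  Variable view: the element is modified in place, without a copy.
   procedure Double (Item : in out Integer) is
   begin
      Item := Item * 2;
   end Double;
begin
   V.Append (21);
   Query_Element (V.First, Show'Access);
   V.Update_Element (V.First, Double'Access);
   Query_Element (V.First, Show'Access);
end Element_Views_Demo;
```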
21
 
22
“Tampers with Elements”
  • When Process.all is executing from within Query_Element or Update_Element, there are things you can’t do to the container: Clear, Move, Insert, etc.  Basically anything that can change container cardinality is verboten.
  • You also cannot call any operation that would replace the element you’re currently visiting: Swap, Replace_Element, etc.
23
Sequence Containers
  • Insert, Append, Prepend
  • Find, Reverse_Find, Contains
  • Delete, Delete_First, Delete_Last
  • Element, Query_Element
  • Replace_Element, Update_Element
  • First, First_Element, Last, Last_Element
  • Swap, Move, Generic_Sorting
  • Iterate, Reverse_Iterate
24
Vectors
  • Provides random access to elements.
  • Complexity of Append is amortized constant time.  (But Prepend is linear time.)
  • The vector model is that an internal array automatically expands as necessary to store more elements.  (That’s the model.  A vector doesn’t have to be implemented as an array.)
25
Index-based Operations
26
Cursor-based Operations
27
Index- vs. Cursor-based Operations
  • Vectors are unique in that they have both index-based and cursor-based operations.
  • The cursor-based operations make it easier to switch between a vector and some other container (usually a list), and provide a uniform syntax for iteration (that applies to all containers).  [See list ex. on p. 42.]
28
Capacity vs. Length
  • The function Capacity returns the total amount of internal storage.  A vector automatically increases the capacity during insertion, when the current length (number of elements) equals the current capacity.
  • Reserve_Capacity tells the vector to preallocate a specified amount of internal storage.  If you know the total number of elements in advance of insertion, it’s more efficient to reserve the necessary capacity, since the expansion of the internal array is done only once.
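A sketch of preallocating storage when the element count is known in advance. The count N is an arbitrary assumption, and the assertions are only checked when the compiler has assertions enabled.

```ada
with Ada.Containers.Vectors;

procedure Reserve_Demo is
   package Integer_Vectors is new Ada.Containers.Vectors
     (Index_Type => Positive, Element_Type => Integer);
   use type Ada.Containers.Count_Type;

   N : constant := 1_000;  --  assumed to be known before insertion begins
   V : Integer_Vectors.Vector;
begin
   --  Request room for all N elements up front, so that the appends
   --  below need not expand the internal storage.
   V.Reserve_Capacity (N);

   for I in 1 .. N loop
      V.Append (I);
   end loop;

   pragma Assert (V.Length = N);
   pragma Assert (V.Capacity >= N);
end Reserve_Demo;
```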
29
 
30
Set_Length
  • You can also set the vector length explicitly.  This can be used to either truncate the vector (and hence throw elements away), or expand the vector (in which case “empty” elements are appended).
31
 
32
Insert_Space
  • Insert_Space is another way to reserve capacity, by making room (“space”) in the middle of the vector, without having an actual element to insert.
33
 
34
Inserting Multiple Elements
  • All insertion and deletion operations have a count parameter (that defaults to 1) to control how many elements are inserted or deleted.  There are additional overloadings of insertion operations, that accept a vector as the New_Item parameter.
  • In general, when you insert multiple elements into a vector, you should try to do it in a way that avoids repeated expansion of the internal array.  If you know how many elements you intend to insert, then either Reserve_Capacity first, or Insert_Space, or Set_Length, or specify the Count parameter of Insert.
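The alternatives listed above can be sketched together. The values and positions are arbitrary; the comments trace the vector contents after each step.

```ada
with Ada.Containers.Vectors;

procedure Bulk_Insert_Demo is
   package Integer_Vectors is new Ada.Containers.Vectors
     (Index_Type => Positive, Element_Type => Integer);
   use type Ada.Containers.Count_Type;

   V : Integer_Vectors.Vector;
begin
   --  Count parameter: five copies of 0 in one call.  V = (0, 0, 0, 0, 0)
   V.Append (New_Item => 0, Count => 5);

   --  Insert_Space: open a gap of three slots before index 3, then
   --  fill it.  The values in the gap are unspecified until assigned.
   V.Insert_Space (Before => 3, Count => 3);
   for I in 3 .. 5 loop
      V.Replace_Element (I, I);
   end loop;
   --  V = (0, 0, 3, 4, 5, 0, 0, 0)

   --  Set_Length: truncate to the first four elements.  V = (0, 0, 3, 4)
   V.Set_Length (4);

   pragma Assert (V.Length = 4);
   pragma Assert (V.Element (3) = 3);
end Bulk_Insert_Demo;
```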
35
Index-based Active Iteration
  • As for any other container, you can use either a passive iterator or a cursor-based active iterator.  However, since a vector also supports index-based operations, you can also use a for loop to iterate in the traditional way.
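A sketch of the traditional for loop over a vector's index range (the vector contents are illustrative):

```ada
with Ada.Containers.Vectors;
with Ada.Text_IO;

procedure Index_Iteration_Demo is
   package Integer_Vectors is new Ada.Containers.Vectors
     (Index_Type => Positive, Element_Type => Integer);

   V : Integer_Vectors.Vector;
begin
   V.Append (10);
   V.Append (20);
   V.Append (30);

   --  For an empty vector Last_Index is First_Index - 1, so the
   --  range is null and the loop body never runs.
   for I in V.First_Index .. V.Last_Index loop
      Ada.Text_IO.Put_Line (Integer'Image (V.Element (I)));
   end loop;
end Index_Iteration_Demo;
```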
36
 
37
Move
  • The Move operation moves, not copies, the elements from one container to another.
  • In the case of a vector, Move works by simply transferring the internal array.  For other containers, Move is implemented similarly.
  • Move makes it possible to use a container to assemble elements in one part of a system, and then move them to another part of the system.
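A sketch of handing elements from one vector to another. The names Producer and Consumer are hypothetical stand-ins for two parts of a system.

```ada
with Ada.Containers.Vectors;

procedure Move_Demo is
   package Integer_Vectors is new Ada.Containers.Vectors
     (Index_Type => Positive, Element_Type => Integer);
   use type Ada.Containers.Count_Type;

   Producer, Consumer : Integer_Vectors.Vector;
begin
   --  Assemble the elements in one place ...
   Producer.Append (1);
   Producer.Append (2);
   Producer.Append (3);

   --  ... then transfer them; no element-by-element copy is required.
   Integer_Vectors.Move (Target => Consumer, Source => Producer);

   pragma Assert (Producer.Is_Empty);
   pragma Assert (Consumer.Length = 3);
end Move_Demo;
```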
38
 
39
Swap
  • Swap (logically) exchanges a pair of elements.
  • Mostly intended to take advantage of the representation of indefinite vectors, which allocate each element.
  • Swap is implemented (in the indefinite vector case) by exchanging the internal pointers, which is potentially more efficient than exchanging the elements themselves.
40
Find, Reverse_Find
  • Every container has a Find operation to search for an element.  For vectors and lists, Find performs a linear search from First to Last.  (For the maps and sets, the search works differently, and is definitely not linear.)
  • For the sequence containers, there's also a Reverse_Find, to search from Last to First.
  • Vectors also have index-based versions: Find_Index and Reverse_Find_Index.
  • Search operations for the sequence containers have a parameter (with a suitable default) to specify from where to begin the search.
41
Lists
  • Insertion and deletion have constant time complexity at all positions.
  • No random access.
  • The list container is monolithic, not polylithic (a la LISP); there is no structure sharing.
  • Lists are often useful for implementing a queue.
42
 
43
Splice, Sort, Merge, etc
  • Use Splice to either move an element (really, its node) within the same list or even from a different list, or to move an entire list.
  • Like a vector, lists can be sorted.  Unlike a vector, the sort is stable.
  • A pair of sorted lists can be merged, such that one list is spliced onto the other in sort order.
  • Operation Reverse_List reverses a list.
  • There’s a special Swap_Links for lists, to exchange list nodes instead of elements.
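A sketch of sorting, merging, splicing, and reversing lists. It uses the names of the final Ada 2005 standard, in which the reversal operation is called Reverse_Elements rather than Reverse_List; the list contents are arbitrary.

```ada
with Ada.Containers.Doubly_Linked_Lists;

procedure List_Ops_Demo is
   package Integer_Lists is
     new Ada.Containers.Doubly_Linked_Lists (Element_Type => Integer);
   --  Generic_Sorting defaults to the predefined "<" for Integer.
   package Sorting is new Integer_Lists.Generic_Sorting;
   use Integer_Lists;

   A, B : List;
begin
   A.Append (3);
   A.Append (1);
   B.Append (4);
   B.Append (2);

   Sorting.Sort (A);  --  A = (1, 3)
   Sorting.Sort (B);  --  B = (2, 4)

   --  Merge splices B's nodes into A in sort order, leaving B empty.
   Sorting.Merge (Target => A, Source => B);  --  A = (1, 2, 3, 4)
   pragma Assert (B.Is_Empty);
   pragma Assert (Sorting.Is_Sorted (A));

   --  Splice within one list: relink the last node at the front.
   A.Splice (Before => A.First, Position => A.Last);  --  A = (4, 1, 2, 3)

   A.Reverse_Elements;  --  A = (3, 2, 1, 4)
   pragma Assert (A.First_Element = 3);
end List_Ops_Demo;
```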
44
Associative Containers
  • Associative containers (maps, sets) store elements ordered by key.
  • Maps associate a separate key object with an element.  For a set, an element is its own key.
  • There are both ordered (tree-based) and hashed (hash table-based) versions.
45
Time Complexity
  • Hashed associative containers have unit time complexity, on average.  This is good for fast lookup of individual elements.  (But note that execution behavior of a hashed container is very sensitive to quality of hash function.)
  • Ordered associative containers have logarithmic time complexity, even in the worst case.  This predictability is safer for real-time systems.  Ordered containers are good for iteration over ranges of elements.
46
Hashed Container Equivalence
  • A hashed container first computes the hash value of a new item, to find the bucket.  It then uses the generic formal equivalence function (not equality) to compare the new item to the existing elements in that bucket.  [See example on p.93.]
47
 
48
Ordered Container Equivalence
  • During insertion in an ordered map or set, keys are compared for equivalence, not equality.
  • Ordered keys are equivalent if neither key is less than the other.  (This definition relies on the "<" relation for keys being a “strict weak ordering”.)
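Equivalence in terms of "<" can be sketched as below. The case-insensitive Less function is a hypothetical ordering, chosen to show two keys that are equivalent without being equal.

```ada
with Ada.Characters.Handling;
with Ada.Text_IO;

procedure Equivalence_Demo is
   --  Hypothetical ordering relation: compares keys ignoring letter case.
   function Less (Left, Right : String) return Boolean is
      use Ada.Characters.Handling;
   begin
      return To_Lower (Left) < To_Lower (Right);
   end Less;

   --  Two keys are equivalent when neither is less than the other.
   function Equivalent (Left, Right : String) return Boolean is
   begin
      return not Less (Left, Right) and then not Less (Right, Left);
   end Equivalent;

   K1 : constant String := "Ada";
   K2 : constant String := "ADA";
begin
   Ada.Text_IO.Put_Line ("Equivalent: " & Boolean'Image (Equivalent (K1, K2)));
   Ada.Text_IO.Put_Line ("Equal:      " & Boolean'Image (K1 = K2));
end Equivalence_Demo;
```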
49
 
50
Maps
  • Keys and elements are stored as pairs, ordered by key.
  • For an ordered map, the "<" relation for keys determines the (sorted) order.
  • For a hashed map, the bucket (and hence the order) is determined by the hash value of the key.  If the hash function is performing well, keys should be scattered throughout the hash table, equally distributed among the buckets.
51
Membership Tests
  • The Contains and Find operations are used to determine whether an element is in the map.  Use Contains if all you need is a simple membership test.
  • Find returns a cursor as its result.  If the cursor object has the distinguished value No_Element (or equivalently, the predicate Has_Element returns False), then the search failed and the key is not in the map.  Otherwise, the cursor designates the key/element pair whose key matched.
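Both membership tests can be sketched on a hashed map keyed by String. The map instance and its contents are assumptions for illustration.

```ada
with Ada.Containers.Indefinite_Hashed_Maps;
with Ada.Strings.Hash;
with Ada.Text_IO;

procedure Membership_Demo is
   package Word_Maps is new Ada.Containers.Indefinite_Hashed_Maps
     (Key_Type        => String,
      Element_Type    => Natural,
      Hash            => Ada.Strings.Hash,
      Equivalent_Keys => "=");
   use Word_Maps;

   M : Map;
   C : Cursor;
begin
   M.Insert ("ada", 1);

   --  Contains: a simple yes/no membership test.
   if M.Contains ("ada") then
      Ada.Text_IO.Put_Line ("ada is in the map");
   end if;

   --  Find: returns a cursor, which is No_Element when the search fails.
   C := M.Find ("eiffel");
   if C = No_Element then  --  equivalently: not Has_Element (C)
      Ada.Text_IO.Put_Line ("eiffel is not in the map");
   else
      Ada.Text_IO.Put_Line (Key (C) & Natural'Image (Element (C)));
   end if;
end Membership_Demo;
```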
52
Contains
53
Find
54
Hashed Map "="
  • For each key in the left map, hashed map equality searches for the key in the right map.  That is, it first computes the hash value of the left key to find the right bucket, and then uses Equivalent_Keys to find the equivalent key in that bucket.  If an equivalent key is found, it then compares elements using element equality.  (This is the only time when element equality is actually used.)
  • Note that both the key and the element are used to compute hashed map equality, but key equality is not used.  (In fact key equality is never used in this API.)
55
 
56
Ordered Map "="
  • Unlike a hashed map, there’s no need for ordered map "=" to search for the key, since the keys are already in sort order.
  • If the key in the left map is equivalent to the corresponding key in the right map (equivalence being defined in terms of key "<"), then ordered map "=" compares the associated elements for equality.  (This is the only time when element "=" is actually used.)
  • Note that the key and element are both used to compute ordered map equality, but key equality is not used.  (In fact key equality is never used in this API.)
57
 
58
Insertion
  • The Insert operation attempts to insert a key (and element) into the map.  If the key is already in the map, then Insert raises Constraint_Error (C_E); otherwise, it inserts the new key/element pair in the map.
  • Include (a variation of Insert) attempts to insert the key, but if the key is already in the map, it replaces the existing key/element pair with the new key/element pair, instead of raising C_E.
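A sketch contrasting the two operations on an ordered map. The key and element values are arbitrary.

```ada
with Ada.Containers.Ordered_Maps;
with Ada.Text_IO;

procedure Insert_Include_Demo is
   package Integer_Maps is new Ada.Containers.Ordered_Maps
     (Key_Type => Positive, Element_Type => Integer);

   M : Integer_Maps.Map;
begin
   M.Insert (1, 100);

   --  A second Insert with the same key fails with Constraint_Error.
   begin
      M.Insert (1, 200);
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line ("key 1 is already in the map");
   end;

   --  Include replaces the existing pair instead of raising.
   M.Include (1, 300);
   pragma Assert (M.Element (1) = 300);
end Insert_Include_Demo;
```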
59
Conditional Insertion
  • Suppose we want to either insert a new element if this is a new key, or modify the existing element if the key already exists.
  • One technique would be to first try to Find the key, and if it's not found, then Insert the key in the map.
60
 
61
Conditional Insertion (cont’d)
  • However, this technique is inefficient, because Insert must perform its own search, thus duplicating the search performed by Find.
  • A more efficient technique would be to attempt to insert the key, but instead of an exception, let the insertion operation return a cursor (the same as what Find does), and report back about whether the insertion succeeded.
62
 
63
Conditional Insertion (cont’d)
  • If Inserted is True, then the key/element pair was inserted into the map, and the cursor designates the newly-inserted key/element pair.
  • If Inserted is False, then the key was already in the map, and the cursor designates the existing key/element pair, which is not modified.
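A sketch of the single-search technique: one call to the conditional Insert, followed by Replace_Element through the returned cursor when the key already existed. The Bump procedure and the counting scheme are hypothetical, and the Replace_Element profile is that of the final Ada 2005 standard.

```ada
with Ada.Containers.Ordered_Maps;

procedure Conditional_Insert_Demo is
   package Count_Maps is new Ada.Containers.Ordered_Maps
     (Key_Type => Positive, Element_Type => Natural);
   use Count_Maps;

   Counts : Map;

   --  Hypothetical helper: counts occurrences of Key with one search.
   procedure Bump (Key : Positive) is
      Position : Cursor;
      Inserted : Boolean;
   begin
      Counts.Insert (Key, 1, Position, Inserted);
      if not Inserted then
         --  Key already present: Position designates the existing pair,
         --  so only the element is assigned; the key is left untouched.
         Counts.Replace_Element (Position, Element (Position) + 1);
      end if;
   end Bump;
begin
   Bump (7);
   Bump (7);
   Bump (9);
   pragma Assert (Counts.Element (7) = 2);
   pragma Assert (Counts.Element (9) = 1);
end Conditional_Insert_Demo;
```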
64
 
65
Deletion
  • An element can be deleted either by specifying its key, or by specifying a cursor (that designates the key/element pair).
  • The key-based Delete raises Constraint_Error if the key isn’t found in the map.
  • Exclude (a variation of key-based Delete) does nothing if the key isn’t in the map.
  • The cursor-based Delete raises C_E if the cursor equals No_Element, and raises Program_Error if the cursor designates a node in some other map.
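The deletion forms can be sketched as follows (the keys and elements are arbitrary):

```ada
with Ada.Containers.Ordered_Maps;
with Ada.Text_IO;

procedure Deletion_Demo is
   package Integer_Maps is new Ada.Containers.Ordered_Maps
     (Key_Type => Positive, Element_Type => Integer);
   use Integer_Maps;

   M : Map;
   C : Cursor;
begin
   M.Insert (1, 100);
   M.Insert (2, 200);

   M.Delete (1);   --  key-based: key 1 is present, so it is removed
   M.Exclude (1);  --  key 1 is now absent: Exclude silently does nothing

   begin
      M.Delete (1);  --  key-based Delete of an absent key
   exception
      when Constraint_Error =>
         Ada.Text_IO.Put_Line ("key 1 is not in the map");
   end;

   --  Cursor-based: delete the pair that C designates.
   C := M.Find (2);
   M.Delete (C);
   pragma Assert (not Has_Element (C));  --  C is reset to No_Element
   pragma Assert (M.Is_Empty);
end Deletion_Demo;
```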
66
 
67
Replace, Replace_Element
  • Replace searches the map to determine whether the key is a member.  If the key isn’t found, then it raises Constraint_Error.  Otherwise, it replaces the existing key/element pair with the new key/element pair.
  • Replace differs from Include only with respect to whether the key is already in the map.
  • If you simply want to assign a new value to an existing element, then use Replace_Element.
68
Keys Get Updated Too
  • A key might have interesting state of its own, and so Include and Replace assign new values to both the existing key and existing element.
  • Sometimes the key assignment is more than you really want, if you’re only interested in elements.  You can avoid unwanted key assignment by using conditional Insert combined with Replace_Element.
69
 
70
Hashed Container Capacity
  • As elements are inserted into the hashed container, the internal hash table automatically expands when it becomes full (defined as capacity = length).
  • The standard does not specify the load factor.  It says only that no automatic rehashing occurs as long as the length does not exceed the capacity.
  • You need to care about rehashing, because it’s expensive.  If you know the total number of elements prior to insertion, use Reserve_Capacity to preallocate the buckets array, and thus avoid rehashing.
71
 
72
Sets
  • A set is like a map, with the difference that in a set an element is its own key.  There is no separate key object, and only the element is stored in the container.
  • Ordered sets are often useful for implementing a priority queue.
73
 
74
Hashed Set "="
  • Searches normally work by computing the hash value of the item to find the bucket, and then using Equivalent_Elements to find the matching element in the bucket.
  • Hashed set equality works a little differently.  It uses element equality ("=") to compare the item to the elements in the bucket.  This is the only time when element equality is used.
75
Ordered Set "="
  • Computing ordered set equality is straightforward: since the elements are already in (sorted) order, there’s no need for a search.  Each element in one set is simply compared to the corresponding element in the other set, using element equality.  This is the only time when element equality is used.
76
Equivalent_Sets
  • Interestingly, set containers actually have two ways of being compared.  We have already seen the first way, set equality ("="), which is implemented in terms of element equality.
  • The second way, Equivalent_Sets, is implemented in terms of the equivalence relation for elements.  (Equivalent_Elements for hashed sets, and "<" for ordered sets.)
77
Classic Set Operations
  • Set containers also have the traditional operations for sets: Union, Intersection, Difference, and Symmetric_Difference.
  • Each operation has procedure, function, and operator forms.
  • There are also Overlap and Is_Subset operations.
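A sketch showing one operation in each of the three forms, plus the two predicates. The set contents are arbitrary; the comments trace the resulting sets.

```ada
with Ada.Containers.Ordered_Sets;

procedure Set_Ops_Demo is
   package Integer_Sets is
     new Ada.Containers.Ordered_Sets (Element_Type => Integer);
   use Integer_Sets;
   use type Ada.Containers.Count_Type;

   A, B, C : Set;
begin
   A.Insert (1);
   A.Insert (2);
   A.Insert (3);
   B.Insert (3);
   B.Insert (4);

   C := A and B;                 --  operator form of Intersection: C = {3}
   pragma Assert (C.Length = 1 and then C.Contains (3));
   pragma Assert (Is_Subset (Subset => C, Of_Set => A));
   pragma Assert (Overlap (A, B));

   C := Union (A, B);            --  function form: C = {1, 2, 3, 4}
   pragma Assert (C.Length = 4);

   A.Difference (B);             --  procedure form, in place: A = {1, 2}
   pragma Assert (not Overlap (A, B));
end Set_Ops_Demo;
```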
78
Generic_Keys
  • The Generic_Keys nested package can be used to manipulate a set in terms of a key.
  • Useful when the element is a record, and the element’s key is a component of the record.
  • Solves the problem of finding a set element if you only know its key-part, and can't easily synthesize a nonce element to use as the search item.
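A sketch of key-based set access. The Employee record, the Id_Of function, and the sample data are hypothetical; the element ordering is defined on the same Id component that serves as the key, so the two orderings agree.

```ada
with Ada.Containers.Ordered_Sets;
with Ada.Text_IO;

procedure Generic_Keys_Demo is
   type Employee is record
      Id     : Positive;  --  the key part
      Salary : Natural;
   end record;

   function "<" (Left, Right : Employee) return Boolean is
   begin
      return Left.Id < Right.Id;
   end "<";

   function Id_Of (Item : Employee) return Positive is
   begin
      return Item.Id;
   end Id_Of;

   package Employee_Sets is new Ada.Containers.Ordered_Sets
     (Element_Type => Employee, "<" => "<");

   package Id_Keys is new Employee_Sets.Generic_Keys
     (Key_Type => Positive, Key => Id_Of);

   S : Employee_Sets.Set;
   C : Employee_Sets.Cursor;
begin
   S.Insert ((Id => 7, Salary => 1_000));
   S.Insert ((Id => 9, Salary => 2_000));

   --  Look up by key alone; no dummy Employee has to be constructed.
   if Id_Keys.Contains (S, 9) then
      Ada.Text_IO.Put_Line (Natural'Image (Id_Keys.Element (S, 9).Salary));
   end if;

   C := Id_Keys.Find (S, 8);
   if not Employee_Sets.Has_Element (C) then
      Ada.Text_IO.Put_Line ("no employee with id 8");
   end if;
end Generic_Keys_Demo;
```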
79
 
80
 
81
 
82
 
83
 
84
Miscellaneous
  • Each container has different forms for definite and indefinite formal types.  Useful when type String is the generic actual element or key type.
  • The container library has generic operations for sorting both constrained and unconstrained arrays.
  • There are also hash functions for String and Unbounded_String, and their wide string equivalents.
85
Word Frequency Example
  • Suppose we are given the task of counting the frequency of each word in a file, and then displaying the results in frequency order.
86
Solution #1
  • Instantiate an indefinite map (hashed or ordered -- it doesn't matter which) indexed by String.  Use the map to collect the word frequencies.
  • Allocate an array of map cursors, and then sort the array in frequency order.
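A compact sketch of this approach, not the author's original code. To stay self-contained it counts a few hard-coded words instead of reading a file; Count_Word, More_Frequent, and the sample words are hypothetical, and the array is sorted with the library's Generic_Array_Sort.

```ada
with Ada.Containers.Generic_Array_Sort;
with Ada.Containers.Indefinite_Hashed_Maps;
with Ada.Strings.Hash;
with Ada.Text_IO;

procedure Word_Frequency_Demo is
   package Word_Maps is new Ada.Containers.Indefinite_Hashed_Maps
     (Key_Type        => String,
      Element_Type    => Natural,
      Hash            => Ada.Strings.Hash,
      Equivalent_Keys => "=");
   use Word_Maps;

   type Cursor_Array is array (Positive range <>) of Cursor;

   Counts : Map;

   procedure Count_Word (Word : String) is
      Position : Cursor;
      Inserted : Boolean;

      procedure Increment (Key : String; Count : in out Natural) is
      begin
         Count := Count + 1;
      end Increment;
   begin
      --  Insert with a count of 0 if the word is new, then increment
      --  the count in place, whether or not the word was new.
      Counts.Insert (Word, 0, Position, Inserted);
      Counts.Update_Element (Position, Increment'Access);
   end Count_Word;

   --  Ordering for the sort: higher counts come first.
   function More_Frequent (Left, Right : Cursor) return Boolean is
   begin
      return Element (Left) > Element (Right);
   end More_Frequent;

   procedure Sort is new Ada.Containers.Generic_Array_Sort
     (Index_Type   => Positive,
      Element_Type => Cursor,
      Array_Type   => Cursor_Array,
      "<"          => More_Frequent);
begin
   Count_Word ("the");
   Count_Word ("cat");
   Count_Word ("the");

   declare
      Sorted : Cursor_Array (1 .. Natural (Counts.Length));
      C      : Cursor := Counts.First;
   begin
      --  Collect one cursor per map entry, then sort the cursors.
      for I in Sorted'Range loop
         Sorted (I) := C;
         Next (C);
      end loop;

      Sort (Sorted);

      for I in Sorted'Range loop
         Ada.Text_IO.Put_Line
           (Key (Sorted (I)) & Natural'Image (Element (Sorted (I))));
      end loop;
   end;
end Word_Frequency_Demo;
```

Sorting cursors rather than key/element pairs means that no word or count is copied; the map remains the only owner of the data.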
87
 
88
 
89
 
90
 
91
 
92
Solution #2
  • Use a set with a word/count pair as the element.


93
 
94
 
95
 
96
Solution #3
  • The indefinite vector has properties that make it an attractive alternative to sets and maps.  It has less storage overhead, since there's no element node (just the element).  And the elements are allocated, so the cost of insertion is relatively low (since only pointers to elements are moved, not elements).
  • Here we use a binary search to find the insertion position such that the vector always remains sorted.
  • A separate array of cursors, that we sort after collecting all the words, isn't necessary when using a vector, since we can explicitly sort the vector itself.
97
 
98
 
99
 
100