This paper presents elementary patterns for constructing and using Linear Indexed Data Structures (often known as Vectors and Arrays).
Patterns for building and using these structures include:
You are in a situation in which you need a collection of objects of a given type, such that at least one of the following conditions is true.
Therefore, use a Linear Indexed Data Structure in a contiguous block of memory. Examples in C-based languages include the STL or Java Vector and String classes, apvector and apstring, and built-in arrays.
A linear indexed data structure is either a Fixed-Size Linear Data Structure or an Expandable Linear Data Structure.
Patterns for building and using these structures include Declare-Construct-Initialize (Variables), Indexed Random Access, Valid Index, Appended Item, Linear Indexed Traversal, and Reverse Linear Indexed Traversal.
A Linear Indexed Data Structure might not be the best choice if
You are in a situation that calls for a Linear Indexed Data Structure and you know up front how many such objects there will be.
Therefore, specify the number of items that will be in the collection as you construct it, using the Declare-Construct-Initialize pattern.
For example, the syntax for creating
a C++ STL Vector of known capacity,
or
an apvector of known size,
whose
elements are of type T
are essentially the same:
Vector<T> V(nbrItems); // STL Vector class
apvector<T> V(nbrItems); // Advanced Placement program's apvector class
Built-in arrays in C, C++, and Java are always Fixed-Size Linear
Data Structures.
In C and C++, the fixed size for a statically-defined array must be a
constant expression known at compile time.
The syntax for creating
arrays of type T is
T myArray[const_expr]; // C or C++
T[] myArray = new T[nbrItems]; // Java
Note that whether the items in the collection are actually constructed and initialized when the data structure is constructed depends on the particular data structure and the type of the objects in it. In the case of an STL Vector, for example, the number of objects in the Vector is zero until Appended Items are added to it explicitly. On the other hand, with C++ built-in arrays and apvectors, the data structure is filled to capacity with "default items" of the underlying data element type. If that type is a primitive type, the individual items are uninitialized. If the individual items are objects, they are constructed and initialized using the default constructor for their class. Whether these data elements are meaningful depends on the type of the objects, the behavior of the default constructor, and what the program wishes to do with the data items.
(The C++ AP classes are available online from the College Board's web site.)
You are in a situation that calls for a Linear Indexed Data Structure and you don't know up front how many objects it will need to hold.
Therefore, declare and construct an empty collection, using the Declare-Construct-Initialize pattern. Resize the collection when you have items to add. You may need to resize it explicitly, or this may be done automatically by methods that add items to the data structure.
You may want to keep all the meaningful items (items that contain meaningful, initialized values) in the Expandable Linear Indexed Data Structure. together at the beginning of the collection. Doing this means that if you know how many meaningful items are in the collection, you also know where they are (at which indices). If the items are not contiguous, in other words, there are holes in the collection, then you don't know which items in the collection are meaningful and which are not without asking them. If the order of the items in the collection does not matter but you want to keep them together at the front of the data structure, use the Appended Item pattern to add items.
Some examples of Expandable Linear Indexed Data Structures
include the STL Vector and Advanced Placement program
apvector classes in C++ and the ArrayList and
Vector classes in Java.
Vector<T> V; // C++ STL Vector class
apvector<T> V; // Advanced Placement program's apvector class
Vector v = new Vector(); // Java Vector class
ArrayList a = new ArrayList(); // Java ArrayList class
The STL Vector and Java Vector and
ArrayList classes support expanding the data structure by
simply appending items to it (see Appended Item).
An apvector object must be expanded using the
resize operation. Other classes also often allow the user
to explicitly resize the capacity of an Expandable Linear
Indexed Data Structure to reduce the time necessary to append
items.
You want to add an element to an
Expandable Linear Indexed Data Structure.
You either want the new element to be the last item in the list or
you don't care where it's added but
you want all meaningful items (items that contain meaningful,
initialized values)
together at the beginning of the collection,
so that knowing how many meaningful items you have
also tells you the indices of the meaningful items. (If they are not
contiguous, in other words, there are holes in the collection, then you
wouldn't know which items are
meaningful and which are not without asking them.)
Thus, an
Expandable Linear Indexed Data Structure
with n meaningful items
would store those items in the first
n entries in the data structure (for example, at indices
0 through n - 1 in C, C++, or Java).
If you add an item to such a data structure, it should have
n + 1 items in the first
n + 1 entries in the data structure (for example, at indices
0 through n in C, C++, or Java).
Therefore, add your new object to the collection by appending it to the end.
Many
Expandable Linear Indexed Data Structures
provide a method for doing this, automatically growing the size of the
data structure if necessary. Some, however, grow only when their capacity is
increased explicitly (such as apvector). In such a case, you
must check the capacity of the data structure and resize it if necessary
before appending a new item.
For example, a Java Vector is an
Expandable Linear Indexed Data Structure
that supports dynamic growth as items are appended.
If deck is a Java Vector of
playing cards, then the following
statement adds a new Card to the end of the deck:
deck.addElement(new Card());
Note that
this examples uses an Anonymous Object. It is not
necessary to create a named variable for every element in the
collection. The collection is a single named object; we refer
to the individual elements in it by specifying
its location using Indexed Random Access
rather than by using a unique variable name.
Of course, it is also possible to append an already existing object
to a collection.
Card card1 = new Card();
deck.addElement(card1);
If an
Expandable Linear Indexed Data Structure
with n elements
does not provide a method for appending items,
use an Indexed Random Access
to add a new element to the end.
If the index for the new element would not be
a Valid Index, resize the data structure first.
NOTE: This pattern is also appropriate for a Fixed-Size Linear Indexed Data Structure, but only if the data structure is already large enough to accomodate the appended item.
You want to access (retrieve or set) a specified entry in a Linear Indexed Data Structure.
Therefore, access it directly, specifying the index of the desired entry. The index should be a Valid Index for the specific Linear Indexed Data Structure.
For example, if deck is an STL Vector of playing cards, then one might use the following statement to display the 1st card in the deck (provided that there is at least 1 card in the deck):
deck[0].display();
One could display the 6th card just as easily
(provided that there are at least 6 cards in the deck):
deck[5].display();
(Note that indexing of Linear Indexed Data Structures
in C, C++, and Java, along with some other languages, begins at 0. See Valid Index for more details.)
Using a Java Vector, one could display the 6th card of a deck with at least 6 cards with
((Card) deck.elementAt(5)).display();
Since a Java Vector does not know what type of objects it contains, you
need to specify (by casting) the type of the element
you are retrieving before you
can use it as an element of that type. In this example, the 6th element
must be cast as a Card before the display
method can be invoked.
This pattern can also be used to modify an element in the data structure. For example, to swap the 1st and 52nd cards in the C++ deck above, the following code might be used:
Card temp = deck[0]; // temp is 1st card
deck[0] = deck[51]; // 1st card is now same as 52nd card
deck[51] = temp; // 52nd card is now old 1st card
Using the Java Vector above, the code would be
Card temp = (Card) deck.elementAt(0); // temp is 1st card
deck.setElementAt((Card) deck.elementAt(51), 0); // 1st card is now same as 52nd card
deck.setElementAt(temp, 51); // 52nd card is now old 1st card
You want to access a valid element in a Linear Indexed Data Structure.
Therefore, use an index in the range determined by the programming language and the size of the data structure. In many languages, such as C, C++, and Java, valid indices for a structure with N elements ranges from 0 to N-1, where 0 is the index of the 1st element in the Linear Indexed Data Structure and N-1 is the index of the last element. In other languages, such as Pascal, valid indices for a Linear Indexed Data Structure with N elements ranges from 1 to N.
How to determine the size of a data structure (the number of elements in it) also depends on the language and the particular data structure. For example, if your data structure is an STL or Java Vector, then the size of the data structure refers to the number of Appended Items that have been explicitly added to it. You can use
V.size()
to obtain the number of elements in any STL or Java Vector V.
Technically, the "number of elements" in a built-in array or
apvector depends on its capacity (which depends only on the amount of
memory that has been allocated to it). In fact, when a built-in array
or apvector is constructed, it is filled with objects created using the
default constructor for that type. Whether these are meaningful data
elements or not depends on the type of the objects and what the program
wishes to do with them. Thus, it is
possible with these data structures to have an index that is valid as
far as the language and the specification of the data structure is
concerned, but refers to an element that never had meaningful data
assigned to it. Such an index is, in fact, a Valid
Index for the left-hand side of an assignment, and a
Linear Indexed Traversal
is an appropriate way to add new elements to this kind of data structure.
The range of valid indices for assignment purposes, therefore, is 0 to
C-1, where C is the capacity of the data structure. (For an apvector
V, you can use
V.length()
to refer to its capacity.)
On the other hand, an index to a data structure element that has not been initialized with meaningful data (as far as the program is concerned) is not a Valid Index for retrieving data from the data structure. In fact, with these data structures, there is no rule that requires that all meaningful data precede the uninitialized data. In other words, it is possible to have a Valid Index that is preceded by an invalid index, at least for data retrieval purposes. In order to have a meaningful range of valid indices for data access (which would be 0 to N-1, where N is the number of elements explicitly added as new elements to the data structure), one must add each new element to the data structure as an Appended Item, or add all the elements to entries 0 through N-1 in the data structure, in whatever order, before accessing them. It is also necessary to explicitly keep track of N, as these data structures do not do that.