In my previous post with the same title I alluded to confusion I encountered when I started poking around collections in the .NET Framework. I guess the problem with them is to figure out what each of them was created for and understand their benefits and drawbacks.
The previous post was mostly about lists. This time around let’s review some dictionary-type collections.
NameValueCollection
This is a really weird one. As MSDN states, it represents a sorted collection of associated String keys and String values that can be accessed either with the key or with the index
.
First, there’s nothing sorted about it. Under the hood it uses a Hashtable which, by definition, is not suitable for sorting. I guess you can call AllKeys, which returns an array of strings, and sort it then, but I believe documentation is wrong on this account.
Second, the way you store anything in NameValueCollection is by passing a string name/string value pair, which makes it a dictionary. It does allow duplicates because the underlying storage is a hashtable of dictionary entries and every key has an ArrayList of values.
Another oddity about this class is that it doesn’t expose SyncRoot which pretty much every other collection does. MSDN simply suggests rolling your own wrapper of NameObjectCollectionBase and get to SyncRoot that way. I needed a way to synchronize access to this kind of collection in my recent DICT Client project. I didn’t want to go that far with a custom wrapper, so I dumped this class.
I must say, though, NameValueCollection is recommended for management of frequently changing data. This class is also useful for fast retrieval.
One peculiar thing about this collection (something I haven’t seen a mention of) is if you stored several values under one key, you get them back as a comma-delimited string!
public virtual string Get(string name)
{
ArrayList list1 = (ArrayList) base.BaseGet(name);
return NameValueCollection.GetAsOneString(list1);
}
private static string GetAsOneString(ArrayList list)
{
int num1 = (list != null) ? list.Count : 0;
if (num1 == 1)
return (string) list[0];
if (num1 <= 1)
return null;
StringBuilder builder1 = new StringBuilder((string) list[0]);
for (int num2 = 1; num2 < num1; num2++) {
builder1.Append(',');
builder1.Append((string) list[num2]);
}
return builder1.ToString();
}
I wish they added an extra space for readability, but it’s still pretty cool.
SortedList
This is quite an interesting class. In fact, the only class which sorts automatically. Its startup time is relatively high because it needs to sort items up front, but after that retrieval works really fast. Internally it uses BinarySearch, which we already know is fast, and as James Curran pointed out, the collection must be sorted in either case for BinarySearch to do its work right. IndexOf works fast too because it delegates lookup to BinarySearch.
This class is not recommended for storing frequently changing data for obvious reasons (need to resort). If your data changes often, put it in an ArrayList instead and run its Sort method.
Again, SortedList is defined at MSDN as a collection of key-and-value pairs that are sorted by the keys and are accessible by key and by index
, which—in essence—makes it a dictionary, not a list.
Overall this is a very helpful class with a slew of methods to retrieve all values or all keys, see if a certain value and/or key exists, etc.
StringDictionary
By definition, this is a strongly typed Hashtable which allows you to store string key/string value pairs. The benefit is that is saves a typecast from Object to String when you read something back.
This class is weird in its own way. When you add a key/value pair, it converts the key to lower case:
public virtual void Add(string key, string value)
{
this.contents.Add(
key.ToLower(CultureInfo.InvariantCulture), value);
}
Again, in my DICT Client project I needed to get a list of all keys, but having realized they were miraculously transformed to lower case I dumped StringDictionary. This is a really weird way to prevent insertion of duplicates. Especially since its storage is a plain Hashtable.
Case Insensitive Hashtable
I swear I didn’t know you could create a Hashtable which ignored casing of its keys. Almost every book out there harps on the fact that keys are case sensitive. Creating one is no rocket science (see Performing Culture-Insensitive String Operations in Collections). As an added bonus, this link also demonstrates how to create a case-insensitive comparer for a SortedList.
If you work with “tr-TR” (Turkish in Turkey) and “az -AZ-Latn” (Azeri (Latin) in Azerbaijan) cultures, I strongly suggest you read through the entire section Performing Culture-Insensitive String Operations.
Hashtable, HybridDictionary and ListDictionary
I think the name of ListDictionary is the worst one in the entire collection namespace. Is it a list? Or a dictionary? Both?? It’s a very mediocre name which tells you nothing. Open it up in Reflector and you learn it’s a linked list.
MSDN suggests you shouldn’t store more than 10 entries in it for “optimal” performance. Its methods are O(n) operations much like several other collection classes, so why 10? Linked lists don’t perform that bad so why blacklist them? I can’t see any indication why 10 is the limit, so if anyone knows, please speak up.
HybridDictionary is a chameleon class which starts off faking a ListDictionary (no more than 10 PT Cruisers per customer, please), and over 10 entries it turns into a Hashtable. HybridDictionary even has a private method called ChangeOver to kick off this mutation.
What beats me is why you can’t write one class, not three, and build all this logic into it.
Conclusion
I wish I could just draw a flow chart to help you make decisions about which collection to choose. But I can’t. In fact, every time I’m faced with such a decision, I have headaches. .NET collections are written in such a way that you’re left with a feeling the friendly car salesmen ripped you off instead of giving you the best deal. All I could do was outline some strange nuanses.
Back in my C++ days I used the Standard Type Library (STL) for years. STL, written by the leading industry specialists, is a life saver of every C++ developer. You can find several excellent books on STL by big-name authors and master it quite easily.
It’s a shame .NET collections are nowhere like STL. If you make a language so similar to C++ and Java, why not reuse the best? With the advent of generics in 2.0 I’m hoping collections will become easier to comprehend. Until then…
Happy hacking!