Share via


Dictionary with limited size

Question

Sunday, October 24, 2010 12:30 PM | 1 vote

What's the best way to create a dictionary (or similiar key-value pair lookup table) of fixed size, where the oldest entries would simply be removed (forgotten)?

All replies (17)

Monday, October 25, 2010 8:55 AM âś…Answered | 2 votes

Combine a Queue and a Dictionary:

class LimitedSizeDictionary<TKey, TValue> : IDictionary<TKey, TValue>
{
 Dictionary<TKey, TValue> dict;
 Queue<TKey> queue;
 int size;

 public LimitedSizeDictionary(int size)
 {
 this.size = size;
 dict = new Dictionary<TKey, TValue>(size + 1);
 queue = new Queue<TKey>(size);
 }

 public void Add(TKey key, TValue value)
 {
 dict.Add(key, value);
 if (queue.Count == size)
 dict.Remove(queue.Dequeue());
 queue.Enqueue(key);
 }

 public bool Remove(TKey key)
 {
 if (dict.Remove(key))
 {
 Queue<TKey> newQueue = new Queue<TKey>(size);
 foreach (TKey item in queue)
 if (!dict.Comparer.Equals(item, key))
 newQueue.Enqueue(item);
 queue = newQueue;
 return true;
 }
 else
 return false;
 }
}

Sunday, October 24, 2010 12:43 PM | 1 vote

Hello,

 

You could create a custom dictionary and do something like this:

  public class CustomDictionary<TKey, TValue> : Dictionary<TKey, TValue>
  {
    public int MaxItemsToHold { get; set; }

    private Queue<TKey> orderedKeys = new Queue<TKey>();

    public new void Add(TKey key, TValue value)
    {
      orderedKeys.Enqueue(key);
      if (this.MaxItemsToHold != 0 && this.Count >= MaxItemsToHold)
      {
        this.Remove(orderedKeys.Dequeue());
      }

      base.Add(key, value);
    }
  }

Hope this helps, if you have any other questions or comments, please let me know,

Best Regards,
Emanuel Varga

If a post answers your question, please click "Mark As Answer " on that post and "Mark as Helpful ".


Sunday, October 24, 2010 8:37 PM

Ah yes. That is easy enough. But,

Is the Last() element in a dictionary to oldest one? That is, if all were added with the Add(...), does it return the one that was added first? I realize this is true for Lists, but is it also true for Dictionaries? 

 


Sunday, October 24, 2010 8:44 PM

Hello again,

 

Sorry, i was in a rush and failed to notice that, it should be this.Remove(this.FirstOrDefault().Key); ( i was thinking of lists with InsertAt(0, item)).

 

Hope this helps, if you have any other questions or comments, please let me know,

Best Regards,
Emanuel Varga

If a post answers your question, please click "Mark As Answer " on that post and "Mark as Helpful ".


Sunday, October 24, 2010 9:54 PM

Well, you didn't answer my question: Is the Last() element that you are removing, the oldest?


Sunday, October 24, 2010 9:56 PM | 1 vote

No the Last is not guaranteed to be the oldest in almost all collections, especially when used against a Dictionary.  You have a few choices from my opinion:

  1. You can create a custom Dictionary by implementing IDictionary<TKey, TValue> so  you can have tighter control over the implementation as apposed to extending Dictionary. Internally using one of the following data structures:

    A. Use a Queue of KeyValuePair<TKey, TValue> objects and know you know they are inserted in order so the first one in the queue is the oldest. Use Linq to Objects to manage the removing out of order, queue to add, and then dequeue to remove the oldest when you need to make room
    B. You can use a Dictionary with a KeyValuePair<TKey, TValue> as the key in the dictionary with this KVP using user defied key as its key and a timestamp as the value, and the user defined value is the value to this KVP key.  
    C. You could then use a 3 dimensional array to handle, key, value, timestamp
    D. Two dictionaries on for the key, value and another to map the key's to timestamps.  

    On the ones with timestamps you can use this to know what is the oldest.

  2. You can USE a Dictionary and manage its max size and age externally with a queue or another dictionary.

  3. There is always the option to refactor design so that you do not need to have a fixed size dictionary.

 


Sunday, October 24, 2010 9:57 PM

Hello again,

 

I have updated my previous code, the First() one is the oldest if you are using Add, it is the last one only if you are using InsertAt(pos, item), and that is just for lists.

It is always the First, if you are not using a SortedDictionary.

 

Best Regards,
Emanuel Varga

If a post answers your question, please click "Mark As Answer" on that post and "Mark as Helpful".


Sunday, October 24, 2010 10:15 PM

Items in a dictionary are based on the key's hash and are broken into buckets.

The order items are in a Dictionary are not guaranteed. The only generic collections with guaranteed insertion order are Queue and Stack. Sorted collections have an ordered guaranteed based on a Comparer. And using Linq First or Last are not guaranteed to give the first or last based on insertion order on a collection that doesn't guarantee insertion order is maintained.  It will be based on where they are in the collection at the time of the Linq query.

This is why if you need a custom Dictionary you really should implement the interface and create your own data structure as not to rely on the base Dictionary implementation.

 


Sunday, October 24, 2010 10:25 PM

Items in a dictionary are based on the key's hash and are broken into buckets.

The order items are in a Dictionary are not guaranteed. The only generic collections with guaranteed insertion order are Queue and Stack. Sorted collections have an ordered guaranteed based on a Comparer. And using Linq First or Last are not guaranteed to give the first or last based on insertion order on a collection that doesn't guarantee insertion order is maintained.  It will be based on where they are in the collection at the time of the Linq query.

This is why if you need a custom Dictionary you really should implement the interface and create your own data structure as not to rely on the base Dictionary implementation.

 

Really? Please read this article :

"Dictionary vs. SortedDictionary

The difference between Dictionary and SortedDictionary start with the obvious. A Dictionary keeps elements in the order they were added . Meanwhile a SortedDictionary keeps elements sorted based on their comparer."

 

Best Regards,
Emanuel Varga

If a post answers your question, please click "Mark As Answer " on that post and "Mark as Helpful ".


Sunday, October 24, 2010 10:41 PM

I would not depend on that, even if you find it true at the current release.

 

See this quote:

For purposes of enumeration, each item in the dictionary is treated as a KeyValuePair<TKey, TValue> structure representing a value and its key. The order in which the items are returned is undefined.

Mike


Sunday, October 24, 2010 10:53 PM

First you can't beleive everything you read in sites like that, they are not perfect and make honest mistakes.  You should trust MSDN and actually knowing datastructures and then finally looking at the actual source code.

You should read the MSDN documentation for the Dictionary<Tkey, Tvalue> class, where it stated "The order in which the items are returned is undefined."  No where in the Dictionary documentation does it state anything to back up the claim on that site that they are "in the order the were added".  It actually contradicts this site with the statement I quoted and that it also states that the "Dictionary<TKey, TValue> class is implemented as a hash table".

So knowing how standard hash tables are implemented you would know that standard hash table implementations do not guarantee retrieval order or iteration order. 

You may get lucky with a small number of items that it appears to maintain insertion order, but as the number of items stored get larger you will get more and more bucket collisions with in the hash table and there by changing the retrieval and iteration order.


Monday, October 25, 2010 6:25 AM

Hello again,

 

Yes, you are right, but honestly i did some extensive tests in the past on some very large dictionary and the order remained the same in all of them.

I have corrected my original proposal by adding a queue, this way you are sure that the first item is the oldest:

  public class CustomDictionary<TKey, TValue> : Dictionary<TKey, TValue>
  {
    public int MaxItemsToHold { get; set; }

    private Queue<TKey> orderedKeys = new Queue<TKey>();

    public new void Add(TKey key, TValue value)
    {
      orderedKeys.Enqueue(key);
      if (this.MaxItemsToHold != 0 && this.Count >= MaxItemsToHold)
      {
        this.Remove(orderedKeys.Dequeue());
      }

      base.Add(key, value);
    }
  }

Hope this helps, if you have any other questions or comments, please let me know,

Best Regards,
Emanuel Varga

If a post answers your question, please click "Mark As Answer" on that post and "Mark as Helpful".


Monday, October 25, 2010 8:25 AM

Yes, a queue is "first in, first out" and so it has an obvious use here. But won't it be slow in retrieval (O(n))?

Basically, I don't get any of the speed advantage of a conventional dictionary going this route (this route being implementing a queue within a class that overrides IDictionary). Isn't that correct?

I suppose the nested Dictionaries approrach that Rodney first mentioned would work. It seems a bit in-elegant for a situation that has likely come up before.

Or maybe not. Here's a little about why I am pursuing this. I am writing numerical optimization software (see ooot.codeplex.com). Many of these methods were created 30-40 years ago when computer memory wasn't cheap. Nowadays, calls to an objective function may be expensive for use in most engineering design problems. And so it'd be a shame to waste time re-evaluating a state, x, that you already found the value for f = F(x). So, I will store ALL past x's and f's in a dictionary. But there may well be millions of calls to this function. I want to safeguard against filling up computer memory.


Monday, October 25, 2010 12:03 PM

Now I like that idea! I wish I had come up with it.

 

I implemented it and it works perfectly. I may set another constant to remove more than one item in the Add method. This way, like garbage collection, the removal won't happen quite so often, and the Add function will be just a tick quicker.

Thanks, Louis.fr


Monday, October 25, 2010 4:56 PM

Hi Matthew,

I know Louis provided a code sample, but I did provide you that as a solution yesterday. I provided several in fact, and the one marked #2 is a Dictionary that is managed by a queue.  I figured providing you several choices all of which would work, and allowing you to pick the one that best fits your needs. You didnt' provide much in the way of requirements that would narrow now which one would fit best for you and thought you could build them yourself and test which one will work best in your situation. Only you could really know what is the "best way" for you needs.

In the future you may want to mention you require working source code examples for acceptance. 


Monday, October 25, 2010 8:23 PM

In the future you may want to mention you require working source code examples for acceptance. 

That wasn't working source code.


Tuesday, October 26, 2010 3:22 PM

I think future web-surfers would appreciate both replies by Rodney Foley and Louis.fr. However, one must admit that there is much more detail in the Louis.fr post. Now that I re-read the posts, I see that this was indeed mentioned along with other good suggestions in Rodney Foley's first post, but, it is much harder to extrapolate this idea to real code. I'm not saying I couldn't do it, but with Louis.fr code now in front of all our eyes, we can more easily see how to do it, and how it'll function in the larger code.

Thanks to everyone who replied.