|
was probably having issues removing a row! HAAAAAA!
|
|
|
|
|
You probably all knew this, but I needed a definitive answer.
I'm using a ConcurrentQueue which is a FIFO collection.
I needed to know whether, when I check if the queue already contains an item, the check returns faster when the item is found at the front (next item out) than when it is found at the back (last-out position).
I assumed, but wasn't sure.
This code gives a good definitive answer.
ConcurrentQueue<int> mainList = new ConcurrentQueue<int>();
Console.WriteLine("Adding items...");
for (int x = 1; x < 10000000; x++) {
    mainList.Enqueue(x);
}
Console.WriteLine("DONE adding items...");
Stopwatch sw = new Stopwatch();
Console.WriteLine("Searching for FIRST item...");
sw.Start();
mainList.FirstOrDefault(item => item == 1);
sw.Stop();
Console.WriteLine("{0}ms", sw.ElapsedMilliseconds);
sw = new Stopwatch();
Console.WriteLine("Searching for LAST item...");
sw.Start();
mainList.FirstOrDefault(item => item == 9999999);
sw.Stop();
Console.WriteLine("{0}ms", sw.ElapsedMilliseconds);
Output
Adding items...
DONE adding items...
Searching for FIRST item...
0ms
Searching for LAST item...
99ms
The consistent result is around 100ms on my machine.
EDIT
This addition seems to bear expected results too:
for (int z = 9999999; z >= 1; z--) {
    if (z % 1000000 == 0) {
        sw = new Stopwatch();
        sw.Start();
        mainList.FirstOrDefault(item => item == z);
        sw.Stop();
        Console.WriteLine("{0} : {1}ms", z, sw.ElapsedMilliseconds);
    }
}
Output for this section is:
9000000 : 99ms
8000000 : 89ms
7000000 : 78ms
6000000 : 71ms
5000000 : 55ms
4000000 : 44ms
3000000 : 33ms
2000000 : 21ms
1000000 : 10ms
Interesting that on my machine it is about 10ms per 1,000,000 items.
EDIT 2
But, wait...there's more!
What about for objects?
I'm creating a queue of anonymous objects via dynamic -- basically because I'm lazy.
Then I test how long it takes to find the matching value. Answer: longer.
Right, we all assumed longer. The extra time could just be the runtime binding/unboxing needed to get at the value, or whatever.
This answers the question for me though, because I'm going to be putting a bunch of items in a ConcurrentQueue and I want to add more items but don't want to add an item if the collection already contains the item.
void Main()
{
    ConcurrentQueue<dynamic> mainList = new ConcurrentQueue<dynamic>();
    Console.WriteLine("Adding items...");
    for (int x = 1; x < 10000000; x++) {
        mainList.Enqueue(new { x = x, name = "x_" + x });
    }
    Console.WriteLine("DONE adding items...");
    Stopwatch sw = new Stopwatch();
    Console.WriteLine("Searching for FIRST item...");
    sw.Start();
    mainList.FirstOrDefault(item => item.x == 1);
    sw.Stop();
    Console.WriteLine("{0}ms", sw.ElapsedMilliseconds);
    sw = new Stopwatch();
    Console.WriteLine("Searching for LAST item...");
    sw.Start();
    mainList.FirstOrDefault(item => item.x == 9999999);
    sw.Stop();
    Console.WriteLine("{0}ms", sw.ElapsedMilliseconds);
    for (int z = 9999999; z >= 1; z--) {
        if (z % 1000000 == 0) {
            sw = new Stopwatch();
            sw.Start();
            mainList.FirstOrDefault(item => item.x == z);
            sw.Stop();
            Console.WriteLine("{0} : {1}ms", z, sw.ElapsedMilliseconds);
        }
    }
}
Output Looks Like
Adding items...
DONE adding items...
Searching for FIRST item...
4ms
Searching for LAST item...
753ms
9000000 : 643ms
8000000 : 575ms
7000000 : 496ms
6000000 : 430ms
5000000 : 362ms
4000000 : 285ms
3000000 : 217ms
2000000 : 141ms
1000000 : 72ms
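The slowdown here is plausibly the per-element dynamic binding on `item.x` (resolved at run time through the DLR) rather than unboxing as such. A rough comparison sketch, swapping the anonymous/dynamic objects for a small typed class (`Entry` is a made-up name; the predicate is then bound at compile time):

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;

class Entry
{
    public int X;
    public string Name;
}

class TypedVersusDynamic
{
    static void Main()
    {
        var typed = new ConcurrentQueue<Entry>();
        for (int x = 1; x < 1000000; x++)
        {
            typed.Enqueue(new Entry { X = x, Name = "x_" + x });
        }

        // Statically-typed predicate: no runtime binding per element.
        var sw = Stopwatch.StartNew();
        typed.FirstOrDefault(item => item.X == 999999);
        sw.Stop();
        Console.WriteLine("typed : {0}ms", sw.ElapsedMilliseconds);
    }
}
```

Timings will vary by machine, but the typed predicate should land much closer to the plain `int` results than the `dynamic` ones.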
modified 5-Feb-18 17:25pm.
|
|
|
|
|
raddevus wrote: I assumed, but wasn't sure.
This code gives a good definitive answer. Hence, the upvote.
Proof is always better than an assumption
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
|
|
|
|
Not particularly surprising, given the amount of work it's doing.
Reference Source: ConcurrentQueue.cs[^]
It looks like it's implemented as a singly linked list of small arrays (32 items each). Searching a linked list for the last item, or for an item that doesn't exist, is always going to take longer than finding an item close to the start.
It also seems that the order of searches has some effect. Try searching for 1st followed by 31st; then try 31st followed by 1st.
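That experiment can be sketched like this; hedged: the 32-item segment size is an internal implementation detail that may change between framework builds, so treat any pattern in the numbers accordingly:

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;

class SegmentProbe
{
    static void Main()
    {
        var queue = new ConcurrentQueue<int>();
        for (int i = 1; i <= 10000000; i++)
        {
            queue.Enqueue(i);
        }

        // Time the 1st then 31st item, then the same pair in reverse order.
        foreach (int target in new[] { 1, 31, 31, 1 })
        {
            var sw = Stopwatch.StartNew();
            queue.FirstOrDefault(item => item == target);
            sw.Stop();
            Console.WriteLine("{0} : {1} ticks", target, sw.ElapsedTicks);
        }
    }
}
```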
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
Very good confirming information on that. Thanks a lot.
Richard Deeming wrote: It also seems that the order of searches has some effect. Try searching for 1st followed by 31st; then try 31st followed by 1st
I actually noticed that too as I tested, but my original post became so long I decided not to mention more.
|
|
|
|
|
It seems to me a very bad practice to use a FIFO/LIFO-oriented collection for anything other than push/pop...
Skipper: We'll fix it.
Alex: Fix it? How you gonna fix this?
Skipper: Grit, spit and a whole lotta duct tape.
|
|
|
|
|
Kornfeld Eliyahu Peter wrote: seems to me a very bad practice to use a FIFO/LIFO oriented collection to anything else than push/pop
Yes, a good observation.
I had initially solved this with a ConcurrentDictionary.
Someone else suggested the Queue to me.
1. I have a memory collection which continues to get re-loaded from the database on an interval.
2. I only want to load new items into the collection
3. However, when I get an item, I just want to pop off the oldest item from the collection.
This lead me to two different ways to do this.
1. Queue - load up the queue on interval - ensure an item is never added more than once. -- later, simply pop off the oldest item (first in). The getter code is a bit simpler (obvious / self-documenting) with the Queue.
2. Dictionary - load up Dictionary on interval - TryAdd() handles this for me. As long as I load the dictionary from the database in reverse order (Stored Proc returns oldest first) all I have to do is call .First() on Dictionary when I want the oldest item. The Queue obviously gives you the First added item automatically.
I've now done detailed testing and using the ConcurrentDictionary is much less CPU intensive, but might be slightly more RAM hungry.
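A minimal sketch of the dictionary side of that pattern (the batch shape and names like `LoadBatch` are hypothetical; the real loader would pull rows from the stored procedure):

```csharp
using System;
using System.Collections.Concurrent;

class IntervalLoader
{
    // Keyed by a unique id; the value carries whatever the caller needs.
    static readonly ConcurrentDictionary<int, string> items =
        new ConcurrentDictionary<int, string>();

    // Called on an interval; TryAdd silently skips keys already present.
    static void LoadBatch((int Id, string Name)[] batch)
    {
        foreach (var row in batch)
        {
            items.TryAdd(row.Id, row.Name);
        }
    }

    static void Main()
    {
        LoadBatch(new[] { (1, "oldest"), (2, "newer") });
        LoadBatch(new[] { (1, "duplicate-ignored"), (3, "newest") });

        Console.WriteLine(items.Count);  // 3 entries
        Console.WriteLine(items[1]);     // still "oldest"
    }
}
```

TryAdd handles the "only load new items" requirement for free; the open question, as the rest of the thread discusses, is how to get the oldest item back out.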
|
|
|
|
|
raddevus wrote: all I have to do is call .First() on Dictionary when I want the oldest item.
That's an assumption that's going to bite you in the future. The dictionary classes don't guarantee that the enumerator will return items in any specific order. The order could change with a framework update; or it could change when your dictionary exceeds a certain size.
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
Ah, interesting. Very good info. So back to the Queue, it is.
There are always trade-offs.
I was sorting the Dictionary first via
waitingFiles.OrderByDescending(of => of.Value.FileCreated).First();
But that is really expensive because, from what I can tell, LINQ has to buffer and sort the whole collection just to return the one that matches.
So, I was hoping (and it is really fast and nice) that since I load the data into the Dictionary in ascending order (oldest files first) that I could count on just calling First().
I mean it does work now. I was actually testing the order that the files are then processed.
However, I understand that it is not guaranteed.
Interesting and why I'm trying not to make assumptions about which is better ( Dictionary or Queue ) in this instance.
|
|
|
|
|
Question, do you need to check all or any of the elements in the list, or is it enough to check the last one?
If so, you would be better off with a linked list.
There are no concurrent linked lists though. Apparently it was around in the beta versions of dotNet 4.0 but it was left out from the release version as there were no advantages over using a normal linked list with locking.
But a simple lock would still be faster since you don't need to enumerate the collection.
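A minimal sketch of that idea, assuming it really is enough to compare against the newest (tail) element -- the class and member names here are made up for illustration:

```csharp
using System;
using System.Collections.Generic;

class LockedLinkedList
{
    private readonly LinkedList<int> list = new LinkedList<int>();
    private readonly object gate = new object();

    // Adds the item only if it differs from the current tail --
    // no enumeration of the whole collection required.
    public bool AddIfNotLast(int item)
    {
        lock (gate)
        {
            if (list.Last != null && list.Last.Value == item)
            {
                return false;
            }
            list.AddLast(item);
            return true;
        }
    }

    // Pops the oldest item, mirroring the queue's first-in behaviour.
    public bool TryTakeOldest(out int item)
    {
        lock (gate)
        {
            if (list.First == null)
            {
                item = default(int);
                return false;
            }
            item = list.First.Value;
            list.RemoveFirst();
            return true;
        }
    }
}
```

Note this only enforces "not the same as the newest item", which is a weaker guarantee than "not anywhere in the collection".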
|
|
|
|
|
Thanks for chiming in on the conversation. The challenge I have here is that I need to ensure the item is not currently anywhere in the collection -- an item may have been loaded at any location.
There are probably other ways around this but just attempting to do it the easiest, yet most correct way which uses the least resources.
The collection is used in two ways :
1. caller will request the oldest unique item
2. collection will periodically load newer items into itself
The interesting thing about this is that the Queue makes one of those requirements easy and the Dictionary makes the other requirement easy.
The first one is easy with the Queue since you just get the first in item.
The second one is difficult with a Queue (there is no key) so you have to determine that the object is not already contained in the Queue for every item, each time the items are retrieved from db.
The second one is easy with a Dictionary since you can do TryAdd() with your unique key and it fails fast if the item has already been added.
However, the first one is more difficult with a Dictionary because the items are not (necessarily) ordered in a significant way and we need to get the oldest item.
It's an interesting problem.
|
|
|
|
|
If you can afford the extra memory usage, the simplest option would probably be a custom class to store the data twice - once in a HashSet<T>, and once in a Queue<T>.
You'd obviously need to add your own locking code to ensure it was thread-safe.
Off the top of my head, something like this might work:
public sealed class UniqueQueue<T> : IEnumerable<T>, IDisposable
{
    private readonly Queue<T> _queue = new Queue<T>();
    private readonly HashSet<T> _set = new HashSet<T>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public int Count
    {
        get
        {
            bool lockTaken = false;
            try
            {
                lockTaken = _lock.TryEnterReadLock(Timeout.Infinite);
                if (!lockTaken) throw new InvalidOperationException();
                return _queue.Count;
            }
            finally
            {
                if (lockTaken)
                {
                    _lock.ExitReadLock();
                }
            }
        }
    }

    public void Dispose()
    {
        _lock.Dispose();
    }

    public T Dequeue()
    {
        bool lockTaken = false;
        try
        {
            lockTaken = _lock.TryEnterWriteLock(Timeout.Infinite);
            if (!lockTaken) throw new InvalidOperationException();
            T oldestItem = _queue.Dequeue();
            _set.Remove(oldestItem);
            return oldestItem;
        }
        finally
        {
            if (lockTaken)
            {
                _lock.ExitWriteLock();
            }
        }
    }

    public void AddRange(IEnumerable<T> itemsToAdd)
    {
        if (itemsToAdd == null) throw new ArgumentNullException(nameof(itemsToAdd));
        bool lockTaken = false;
        try
        {
            lockTaken = _lock.TryEnterWriteLock(Timeout.Infinite);
            if (!lockTaken) throw new InvalidOperationException();
            foreach (T item in itemsToAdd)
            {
                if (_set.Add(item))
                {
                    _queue.Enqueue(item);
                }
            }
        }
        finally
        {
            if (lockTaken)
            {
                _lock.ExitWriteLock();
            }
        }
    }

    public UniqueQueueEnumerator GetEnumerator()
    {
        T[] copyOfItems = Array.Empty<T>();
        bool lockTaken = false;
        try
        {
            lockTaken = _lock.TryEnterReadLock(Timeout.Infinite);
            if (!lockTaken) throw new InvalidOperationException();
            copyOfItems = new T[_queue.Count];
            _queue.CopyTo(copyOfItems, 0);
        }
        finally
        {
            if (lockTaken)
            {
                _lock.ExitReadLock();
            }
        }
        return new UniqueQueueEnumerator(copyOfItems);
    }

    IEnumerator<T> IEnumerable<T>.GetEnumerator() => GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    public struct UniqueQueueEnumerator : IEnumerator<T>
    {
        private readonly T[] _items;
        private int _position;

        public UniqueQueueEnumerator(T[] items)
        {
            _items = items;
            _position = -1;
        }

        void IDisposable.Dispose()
        {
        }

        public void Reset()
        {
            if (_items == null) throw new InvalidOperationException();
            _position = -1;
        }

        public bool MoveNext()
        {
            if (_items == null) return false;
            return ++_position < _items.Length;
        }

        public T Current
        {
            get
            {
                if (_items == null) throw new InvalidOperationException();
                if (0 > _position || _position >= _items.Length) throw new InvalidOperationException();
                return _items[_position];
            }
        }

        object IEnumerator.Current => Current;
    }
}
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
This is a very good suggestion and I saw something very similar at StackOverflow.
But, as you mentioned I was attempting to keep the memory size low too and it is interesting that this does store the objects in two collections.
Also, I am actually using the ConcurrentQueue and ConcurrentDictionary which are thread-protected automatically so that would solve that part of the problem.
Thanks for the input and discussion.
|
|
|
|
|
raddevus wrote: I am actually using the ConcurrentQueue and ConcurrentDictionary which are thread-protected automatically
Ish.
Modifications to one of those collections will automatically be thread-safe. But if you're trying to read or update the two in a single atomic operation, you'll still need your own locking mechanism.
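For instance, if a ConcurrentQueue and a ConcurrentDictionary are kept in step, each call is individually thread-safe but the pair of updates is not atomic without a shared lock. A minimal sketch (class and member names are hypothetical):

```csharp
using System;
using System.Collections.Concurrent;

class QueuePlusIndex
{
    private readonly ConcurrentQueue<int> queue = new ConcurrentQueue<int>();
    private readonly ConcurrentDictionary<int, bool> seen =
        new ConcurrentDictionary<int, bool>();
    private readonly object gate = new object();

    // Without the lock, another thread could observe the key in 'seen'
    // before the item appears in 'queue', or vice versa.
    public bool TryAdd(int item)
    {
        lock (gate)
        {
            if (!seen.TryAdd(item, true)) return false;
            queue.Enqueue(item);
            return true;
        }
    }

    public bool TryTakeOldest(out int item)
    {
        lock (gate)
        {
            if (!queue.TryDequeue(out item)) return false;
            seen.TryRemove(item, out _);
            return true;
        }
    }
}
```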
"These people looked deep within my soul and assigned me a number based on the order in which I joined."
- Homer
|
|
|
|
|
I had a feeling it wouldn't be that simple.
Richard's suggestion would've been my second one, minus the code sample though. (Impressive, that!)
Second question though, you wrote in another post that you load the objects from a database.
Are the objects in the database not unique? How do you ensure the uniqueness if you have popped an object from the list?
Or is the uniqueness coming from the identity column in the database? If so, you could still use the linked list, using a paged and sorted query.
modified 9-Feb-18 13:14pm.
|
|
|
|
|
raddevus wrote: I needed to know that when I went to determine if the queue already had the item, if it would return faster when the item was found at the top (next item out) versus found at the bottom (last out position).
I assumed, but wasn't sure.
This code gives a good definitive answer
You forgot the important qualifier though -- this code gives a good definitive answer for build XXX of the library. In the next build, the implementer may find a bug with the way they're doing things and have to change it, altering the performance. If you're writing code that depends that much on the performance of some class, you need to write the class yourself.
|
|
|
|
|
|
Without showing code there's little interesting here (especially since "nothing" was fixed ) ...
There's 'yer chuckle though.
kmoorevs wrote: fixed That's like some sort of quantum foam, right?
if i'm not mistaken
|
|
|
|
|
for the pun.
"Go forth into the source" - Neal Morse
|
|
|
|
|
Is decimal portion not required?
TOMZ_KV
|
|
|
|
|
Tomz_KV wrote: Is decimal portion not required?
Yes, which was the problem... division takes the type of its operands, so dividing two integers discards the decimal portion. (I'm probably saying that wrong!)
The correction was to cast one of the operands of the division to a decimal.
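In C#, `/` takes its result type from the operand types, so dividing two integers truncates, and promoting either operand to decimal restores the fraction:

```csharp
using System;

class DivisionDemo
{
    static void Main()
    {
        int numerator = 7;
        int denominator = 2;

        // Integer division: the fractional part is discarded.
        Console.WriteLine(numerator / denominator);           // 3

        // Promote one operand to decimal and the other follows.
        Console.WriteLine((decimal)numerator / denominator);  // 3.5
    }
}
```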
I only posted here 'cause we no longer have a 'hall of shame'!
"Go forth into the source" - Neal Morse
|
|
|
|
|
List<object> ccList = new List<object>();
ccList.Add(cc);
foo.Items = ccList.ToArray();
- Create a list.
- Add the item to the list.
- Convert the list to an array.
Latest Article - Code Review - What You Can Learn From a Single Line of Code
Learning to code with python is like learning to swim with those little arm floaties. It gives you undeserved confidence and will eventually drown you. - DangerBunny
Artificial intelligence is the only remedy for natural stupidity. - CDP1802
|
|
|
|
|
Or:
var ccList = Enumerable.Range(0,1);
var objects = ccList.Select(x=> (object)cc);
foo.Items = objects.ToArray();
|
|
|
|
|
That code will work, but, like my signature says, that doesn't mean it is a good way to do it.
Just because the code works, it doesn't mean that it is good code.
|
|
|
|
|
BillW33 wrote: That code will work, but, like my signature says, that doesn't mean it is a good way to do it. And I suppose this is exactly why he is posting it in this forum.
M.D.V.
If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about?
Help me to understand what I'm saying, and I'll explain it better to you
Rating helpful answers is nice, but saying thanks can be even nicer.
|
|
|
|