|
web
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Performance of Queue(of T).Enqueue(T)When calling Enqueue, the internal array may need to be reallocated. My
question is by how much? In the old MFC array classes, you could tell MFC how many additional elements to add to the array when it needed to reallocate, which greatly boosted performance relative to adding 1 element at a time. Thanks, Mike Ober. Michael D. Ober wrote:
> When calling Enqueue, the internal array may need to be reallocated. My Unfortunately, the .NET documentation fails to make any mention of the > question is by how much? In the old MFC array classes, you could > tell MFC how many additional elements to add to the array when it > needed to reallocate, which greatly boosted performance relative to > adding 1 element at a time. reallocation strategy for any of the array-based generic containers. I'd suggest doing some experimentation to try to measure the insert performance as the queue gets larger, or use .NET Reflector to examine the IL behind the queue class to determine how it grows. Hopefully, it grows exponentially (by a factor of 1.5 - 2) because that's been shown to give the best overall performance for a dynamically sized array. -cd
Show quote
Hide quote
"Carl Daniel [VC++ MVP]" <cpdaniel_remove_this_and_nospam@mvps.org.nospam> Taking a quick look with reflector, it's clear that Queue<T> grows wrote in message news:uFsdpfT8GHA.4604@TK2MSFTNGP03.phx.gbl... > Michael D. Ober wrote: >> When calling Enqueue, the internal array may need to be reallocated. My >> question is by how much? In the old MFC array classes, you could >> tell MFC how many additional elements to add to the array when it >> needed to reallocate, which greatly boosted performance relative to >> adding 1 element at a time. > > Unfortunately, the .NET documentation fails to make any mention of the > reallocation strategy for any of the array-based generic containers. I'd > suggest doing some experimentation to try to measure the insert > performance as the queue gets larger, or use .NET Reflector to examine the > IL behind the queue class to determine how it grows. > > Hopefully, it grows exponentially (by a factor of 1.5 - 2) because that's > been shown to give the best overall performance for a dynamically sized > array. linearly - whenever the internal array is full, it's increased in size by 4 elements. That gives O(n^2) performance on average if nothing is ever removed from the queue. Since you typically wouldn't expect a queue to have a large number of elements, linear growth is probably a reasonable choice for most applications. List<T>, on the other hand, uses exponential growth (factor of 2), so it will have amortized constant insert time as it grows. -cd Thanks - I'll switch to List<T>. My application needs the quicker
reallocation. Mike. Show quoteHide quote "Carl Daniel [VC++ MVP]" <cpdaniel_remove_this_and_nospam@mvps.org.nospam> wrote in message news:e8WXF6T8GHA.2248@TK2MSFTNGP04.phx.gbl... > "Carl Daniel [VC++ MVP]" <cpdaniel_remove_this_and_nospam@mvps.org.nospam> > wrote in message news:uFsdpfT8GHA.4604@TK2MSFTNGP03.phx.gbl... >> Michael D. Ober wrote: >>> When calling Enqueue, the internal array may need to be reallocated. My >>> question is by how much? In the old MFC array classes, you could >>> tell MFC how many additional elements to add to the array when it >>> needed to reallocate, which greatly boosted performance relative to >>> adding 1 element at a time. >> >> Unfortunately, the .NET documentation fails to make any mention of the >> reallocation strategy for any of the array-based generic containers. I'd >> suggest doing some experimentation to try to measure the insert >> performance as the queue gets larger, or use .NET Reflector to examine >> the IL behind the queue class to determine how it grows. >> >> Hopefully, it grows exponentially (by a factor of 1.5 - 2) because that's >> been shown to give the best overall performance for a dynamically sized >> array. > > Taking a quick look with reflector, it's clear that Queue<T> grows > linearly - whenever the internal array is full, it's increased in size by > 4 elements. That gives O(n^2) performance on average if nothing is ever > removed from the queue. Since you typically wouldn't expect a queue to > have a large number of elements, linear growth is probably a reasonable > choice for most applications. > > List<T>, on the other hand, uses exponential growth (factor of 2), so it > will have amortized constant insert time as it grows. > > -cd > Actually, I ended up switching to LinkedList<T> to emulate the
Enqueue/Dequeue methods.. My application generates queues of a few items to thousands of items. It appears that the slight overhead of Linked Lists is worth the benefit of never reallocating an array. Mike. Show quoteHide quote "Michael D. Ober" <ober***@.alum.mit.edu.nospam> wrote in message news:e7SUzAU8GHA.4740@TK2MSFTNGP02.phx.gbl... > Thanks - I'll switch to List<T>. My application needs the quicker > reallocation. > > Mike. > > "Carl Daniel [VC++ MVP]" <cpdaniel_remove_this_and_nospam@mvps.org.nospam> > wrote in message news:e8WXF6T8GHA.2248@TK2MSFTNGP04.phx.gbl... >> "Carl Daniel [VC++ MVP]" >> <cpdaniel_remove_this_and_nospam@mvps.org.nospam> wrote in message >> news:uFsdpfT8GHA.4604@TK2MSFTNGP03.phx.gbl... >>> Michael D. Ober wrote: >>>> When calling Enqueue, the internal array may need to be reallocated. My >>>> question is by how much? In the old MFC array classes, you could >>>> tell MFC how many additional elements to add to the array when it >>>> needed to reallocate, which greatly boosted performance relative to >>>> adding 1 element at a time. >>> >>> Unfortunately, the .NET documentation fails to make any mention of the >>> reallocation strategy for any of the array-based generic containers. >>> I'd suggest doing some experimentation to try to measure the insert >>> performance as the queue gets larger, or use .NET Reflector to examine >>> the IL behind the queue class to determine how it grows. >>> >>> Hopefully, it grows exponentially (by a factor of 1.5 - 2) because >>> that's been shown to give the best overall performance for a dynamically >>> sized array. >> >> Taking a quick look with reflector, it's clear that Queue<T> grows >> linearly - whenever the internal array is full, it's increased in size by >> 4 elements. That gives O(n^2) performance on average if nothing is ever >> removed from the queue. Since you typically wouldn't expect a queue to >> have a large number of elements, linear growth is probably a reasonable >> choice for most applications. >> >> List<T>, on the other hand, uses exponential growth (factor of 2), so it >> will have amortized constant insert time as it grows. >> >> -cd >> > > Michael,
If you know the maximum/typical size of the queue you can use its constructor that sets the initial capacity: http://msdn2.microsoft.com/en-us/library/35bz1ab7.aspx I would recommend setting the initial capacity first no matter which data type you use Queue(Of T) or List(Of T). Of course LinkedList(Of T) doesn't have a capacity ;-) I would also try Queue(Of T) with varying capacities before reverting to an alternate data type. -- Show quoteHide quoteHope this helps Jay B. Harlow ..NET Application Architect, Enthusiast, & Evangelist T.S. Bradley - http://www.tsbradley.net "Michael D. Ober" <ober***@.alum.mit.edu.nospam> wrote in message news:eW1beOU8GHA.4116@TK2MSFTNGP03.phx.gbl... > Actually, I ended up switching to LinkedList<T> to emulate the > Enqueue/Dequeue methods.. My application generates queues of a few items > to thousands of items. It appears that the slight overhead of Linked > Lists is worth the benefit of never reallocating an array. > > Mike. > > "Michael D. Ober" <ober***@.alum.mit.edu.nospam> wrote in message > news:e7SUzAU8GHA.4740@TK2MSFTNGP02.phx.gbl... >> Thanks - I'll switch to List<T>. My application needs the quicker >> reallocation. >> >> Mike. >> >> "Carl Daniel [VC++ MVP]" >> <cpdaniel_remove_this_and_nospam@mvps.org.nospam> wrote in message >> news:e8WXF6T8GHA.2248@TK2MSFTNGP04.phx.gbl... >>> "Carl Daniel [VC++ MVP]" >>> <cpdaniel_remove_this_and_nospam@mvps.org.nospam> wrote in message >>> news:uFsdpfT8GHA.4604@TK2MSFTNGP03.phx.gbl... >>>> Michael D. Ober wrote: >>>>> When calling Enqueue, the internal array may need to be reallocated. >>>>> My question is by how much? In the old MFC array classes, you could >>>>> tell MFC how many additional elements to add to the array when it >>>>> needed to reallocate, which greatly boosted performance relative to >>>>> adding 1 element at a time. >>>> >>>> Unfortunately, the .NET documentation fails to make any mention of the >>>> reallocation strategy for any of the array-based generic containers. >>>> I'd suggest doing some experimentation to try to measure the insert >>>> performance as the queue gets larger, or use .NET Reflector to examine >>>> the IL behind the queue class to determine how it grows. >>>> >>>> Hopefully, it grows exponentially (by a factor of 1.5 - 2) because >>>> that's been shown to give the best overall performance for a >>>> dynamically sized array. >>> >>> Taking a quick look with reflector, it's clear that Queue<T> grows >>> linearly - whenever the internal array is full, it's increased in size >>> by 4 elements. That gives O(n^2) performance on average if nothing is >>> ever removed from the queue. Since you typically wouldn't expect a >>> queue to have a large number of elements, linear growth is probably a >>> reasonable choice for most applications. >>> >>> List<T>, on the other hand, uses exponential growth (factor of 2), so it >>> will have amortized constant insert time as it grows. >>> >>> -cd >>> >> >> > > Jay,
The maximum size of a typical queue for this application ranges from 0 to 60000 with most of the queues (note that there are several) running from 4000 - 6000, but I have absolutely no guarantees of this as the source queue max sizes aren't under my control. I also have a time constraint on the input portion so the inputs must be added as quickly as possible. Combine this with the fact that I load the queues about 10 times faster than they can be emptied and I was seeing long pauses (on the order of several minutes) while the internal arrays were being reallocated and the GC was running with both Queue<T> and List<T>. Creating a queue or list with an initial size caused the application to stall when a queue or list was first being created. It turns out that emulating a queue using LinkedList<T> was very quick and the O(1) insertion to the end and removal from the front made a huge difference in the application performance. I did notice last night that I had a ~25 minute pause for a full GC pass, which by the way dropped the memory footprint of this application from 90Mb to 20Mb. This GC pass occurred well after the queues had been loaded so the time constraint on the input portion of my application was no longer applicable. By the way, by rewriting to use multiple queues and threads, I took the runtime of this application, which is a one-way DFS between two Samba servers using a Windows Server as the controller, from over 24 hours (attempting to read input the entire time) to about 7 hours (with only the first 90 minutes reading input). I expect further performance gains now that I finally have a complete pass and I don't have to copy gigabytes of files over a T1. Mike. Show quoteHide quote "Jay B. Harlow" <Jay_Harlow_***@tsbradley.net> wrote in message news:B63BA4D8-8E33-4178-BEC5-2CDE0BB71BDA@microsoft.com... > Michael, > If you know the maximum/typical size of the queue you can use its > constructor that sets the initial capacity: > > http://msdn2.microsoft.com/en-us/library/35bz1ab7.aspx > > I would recommend setting the initial capacity first no matter which data > type you use Queue(Of T) or List(Of T). Of course LinkedList(Of T) doesn't > have a capacity ;-) > > I would also try Queue(Of T) with varying capacities before reverting to > an alternate data type. > > -- > Hope this helps > Jay B. Harlow > .NET Application Architect, Enthusiast, & Evangelist > T.S. Bradley - http://www.tsbradley.net > > > "Michael D. Ober" <ober***@.alum.mit.edu.nospam> wrote in message > news:eW1beOU8GHA.4116@TK2MSFTNGP03.phx.gbl... >> Actually, I ended up switching to LinkedList<T> to emulate the >> Enqueue/Dequeue methods.. My application generates queues of a few items >> to thousands of items. It appears that the slight overhead of Linked >> Lists is worth the benefit of never reallocating an array. >> >> Mike. >> >> "Michael D. Ober" <ober***@.alum.mit.edu.nospam> wrote in message >> news:e7SUzAU8GHA.4740@TK2MSFTNGP02.phx.gbl... >>> Thanks - I'll switch to List<T>. My application needs the quicker >>> reallocation. >>> >>> Mike. >>> >>> "Carl Daniel [VC++ MVP]" >>> <cpdaniel_remove_this_and_nospam@mvps.org.nospam> wrote in message >>> news:e8WXF6T8GHA.2248@TK2MSFTNGP04.phx.gbl... >>>> "Carl Daniel [VC++ MVP]" >>>> <cpdaniel_remove_this_and_nospam@mvps.org.nospam> wrote in message >>>> news:uFsdpfT8GHA.4604@TK2MSFTNGP03.phx.gbl... >>>>> Michael D. Ober wrote: >>>>>> When calling Enqueue, the internal array may need to be reallocated. >>>>>> My question is by how much? In the old MFC array classes, you could >>>>>> tell MFC how many additional elements to add to the array when it >>>>>> needed to reallocate, which greatly boosted performance relative to >>>>>> adding 1 element at a time. >>>>> >>>>> Unfortunately, the .NET documentation fails to make any mention of the >>>>> reallocation strategy for any of the array-based generic containers. >>>>> I'd suggest doing some experimentation to try to measure the insert >>>>> performance as the queue gets larger, or use .NET Reflector to examine >>>>> the IL behind the queue class to determine how it grows. >>>>> >>>>> Hopefully, it grows exponentially (by a factor of 1.5 - 2) because >>>>> that's been shown to give the best overall performance for a >>>>> dynamically sized array. >>>> >>>> Taking a quick look with reflector, it's clear that Queue<T> grows >>>> linearly - whenever the internal array is full, it's increased in size >>>> by 4 elements. That gives O(n^2) performance on average if nothing is >>>> ever removed from the queue. Since you typically wouldn't expect a >>>> queue to have a large number of elements, linear growth is probably a >>>> reasonable choice for most applications. >>>> >>>> List<T>, on the other hand, uses exponential growth (factor of 2), so >>>> it will have amortized constant insert time as it grows. >>>> >>>> -cd >>>> >>> >>> >> >> > Final thoughts and followup:
Average run time has stabilized around 4.5 hours with the time critical portion taking 2 and a quarter hours. Memory usage has stabilized around 32Mb (per Task Manager) and I have no long pauses for the GC. Queue<T> and List<T> both had far larger memory footprints (~90Mb) and long pauses for the GC. The reallocation of the internal arrays would be my guess for the culprit for the poor memory performance of Queue<T> and List<T>. Once again, choosing the correct data structure made a huge difference. In this case LinkedList<T> and wrapping it in a derived Queue class was the key to performance. The LinkedList<T> class has O(1) removal from the front and addition to the end methods as well as a O(1) count method. It never needs to be bulk copied to another memory location, which means the GC doesn't have to work nearly as hard, even if individual LinkItems must be moved in memory. Jay, thanks for your assistance and feedback from the reflector. Do you have any links to how to use the reflector in VS 2005 so I can do my own research when performance (or interest) dictates? Mike. Show quoteHide quote "Michael D. Ober" <ober***@.alum.mit.edu.nospam> wrote in message news:eHw%23UAf8GHA.940@TK2MSFTNGP03.phx.gbl... > Jay, > > The maximum size of a typical queue for this application ranges from 0 to > 60000 with most of the queues (note that there are several) running from > 4000 - 6000, but I have absolutely no guarantees of this as the source > queue max sizes aren't under my control. I also have a time constraint on > the input portion so the inputs must be added as quickly as possible. > Combine this with the fact that I load the queues about 10 times faster > than they can be emptied and I was seeing long pauses (on the order of > several minutes) while the internal arrays were being reallocated and the > GC was running with both Queue<T> and List<T>. Creating a queue or list > with an initial size caused the application to stall when a queue or list > was first being created. > > It turns out that emulating a queue using LinkedList<T> was very quick and > the O(1) insertion to the end and removal from the front made a huge > difference in the application performance. I did notice last night that I > had a ~25 minute pause for a full GC pass, which by the way dropped the > memory footprint of this application from 90Mb to 20Mb. This GC pass > occurred well after the queues had been loaded so the time constraint on > the input portion of my application was no longer applicable. > > By the way, by rewriting to use multiple queues and threads, I took the > runtime of this application, which is a one-way DFS between two Samba > servers using a Windows Server as the controller, from over 24 hours > (attempting to read input the entire time) to about 7 hours (with only the > first 90 minutes reading input). I expect further performance gains now > that I finally have a complete pass and I don't have to copy gigabytes of > files over a T1. > > Mike. > > "Jay B. Harlow" <Jay_Harlow_***@tsbradley.net> wrote in message > news:B63BA4D8-8E33-4178-BEC5-2CDE0BB71BDA@microsoft.com... >> Michael, >> If you know the maximum/typical size of the queue you can use its >> constructor that sets the initial capacity: >> >> http://msdn2.microsoft.com/en-us/library/35bz1ab7.aspx >> >> I would recommend setting the initial capacity first no matter which data >> type you use Queue(Of T) or List(Of T). Of course LinkedList(Of T) >> doesn't have a capacity ;-) >> >> I would also try Queue(Of T) with varying capacities before reverting to >> an alternate data type. >> >> -- >> Hope this helps >> Jay B. Harlow >> .NET Application Architect, Enthusiast, & Evangelist >> T.S. Bradley - http://www.tsbradley.net >> >> >> "Michael D. Ober" <ober***@.alum.mit.edu.nospam> wrote in message >> news:eW1beOU8GHA.4116@TK2MSFTNGP03.phx.gbl... >>> Actually, I ended up switching to LinkedList<T> to emulate the >>> Enqueue/Dequeue methods.. My application generates queues of a few >>> items to thousands of items. It appears that the slight overhead of >>> Linked Lists is worth the benefit of never reallocating an array. >>> >>> Mike. >>> >>> "Michael D. Ober" <ober***@.alum.mit.edu.nospam> wrote in message >>> news:e7SUzAU8GHA.4740@TK2MSFTNGP02.phx.gbl... >>>> Thanks - I'll switch to List<T>. My application needs the quicker >>>> reallocation. >>>> >>>> Mike. >>>> >>>> "Carl Daniel [VC++ MVP]" >>>> <cpdaniel_remove_this_and_nospam@mvps.org.nospam> wrote in message >>>> news:e8WXF6T8GHA.2248@TK2MSFTNGP04.phx.gbl... >>>>> "Carl Daniel [VC++ MVP]" >>>>> <cpdaniel_remove_this_and_nospam@mvps.org.nospam> wrote in message >>>>> news:uFsdpfT8GHA.4604@TK2MSFTNGP03.phx.gbl... >>>>>> Michael D. Ober wrote: >>>>>>> When calling Enqueue, the internal array may need to be reallocated. >>>>>>> My question is by how much? In the old MFC array classes, you could >>>>>>> tell MFC how many additional elements to add to the array when it >>>>>>> needed to reallocate, which greatly boosted performance relative to >>>>>>> adding 1 element at a time. >>>>>> >>>>>> Unfortunately, the .NET documentation fails to make any mention of >>>>>> the reallocation strategy for any of the array-based generic >>>>>> containers. I'd suggest doing some experimentation to try to measure >>>>>> the insert performance as the queue gets larger, or use .NET >>>>>> Reflector to examine the IL behind the queue class to determine how >>>>>> it grows. >>>>>> >>>>>> Hopefully, it grows exponentially (by a factor of 1.5 - 2) because >>>>>> that's been shown to give the best overall performance for a >>>>>> dynamically sized array. >>>>> >>>>> Taking a quick look with reflector, it's clear that Queue<T> grows >>>>> linearly - whenever the internal array is full, it's increased in size >>>>> by 4 elements. That gives O(n^2) performance on average if nothing is >>>>> ever removed from the queue. Since you typically wouldn't expect a >>>>> queue to have a large number of elements, linear growth is probably a >>>>> reasonable choice for most applications. >>>>> >>>>> List<T>, on the other hand, uses exponential growth (factor of 2), so >>>>> it will have amortized constant insert time as it grows. >>>>> >>>>> -cd >>>>> >>>> >>>> >>> >>> >> > >
How to convert a Byte() to an IntPtr in VB
load form in vb.net How to raise a shortcut picking off each digit of an integer How to use binary files as resources in VB 2005 Regex.Split... Can I do this?? control array with code added controls Re: What .NET classes are SLOW vs. API? Getting external IP Address Is there a equivalent to 'Last Position' in VS2005? |
|||||||||||||||||||||||