|
web
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Best data structure?looking for the best option. And after several test I am pretty confused. So I will be grateful if anyone can help me. My basic need is: Keep an "array" (or other structure) of pointers to "arrays" (or other structure) of bytes. Example: arrayPointer(1,000,000) each one point to arraByte1(), arrayByte2(), arrayByte3().., arrayByte1000000(). So it is possible to redim each arrayByte() to the dimension I need. I supposed that a configuration like the above (suppose each arrayByte hold 1 Byte) needs: Pointers: 4 bytes x 1 Million = 4 Mb Arrays: 1 byte per array x 1 Million arrays = 1 Mb I have trying different option and I obtained a very high memory comsumption, is there any way to keep memory low and achieve what I need?, I am using "GC.GetTotalMemory(True)" to find out the memory usage is that right? Array of structures Public Structure stbyte Public byt() As Byte End Structure Memory = GC.GetTotalMemory(True) Dim intPo(1000000) As stbyte ' 1 Million structures For intI = 1 To 1000000 ReDim intPo(intI - 1).byt(0) ' Redim to 1 byte each Next Memory = GC.GetTotalMemory(True) - Memory Memory = Memory / 1000000 I obtained Memory= 20 Mbytes Jagged Array of Bytes Memory = GC.GetTotalMemory(True) Dim bytes(1000000)() As Byte ' 1 Million Jagged array For intI = 1 To 1000000 ReDim bytes(intI - 1)(1) ' Each vector to 1 byte Next Memory = GC.GetTotalMemory(True) - Memory Memory = Memory / 1000000 I obtained Memory= 20 Mbytes Rectangular Array of Bytes Memory = GC.GetTotalMemory(True) Dim bytes(1000000, 0) As Byte ' 1 Million array of 1 byte each Memory = GC.GetTotalMemory(True) - Memory Memory = Memory / 1000000 I obtained Memory = 1 Mbytes (but in this case I can not redim each row). Array of BitsArrays Memory = GC.GetTotalMemory(True) Dim bitArray(1000000) As BitArray ' 1 Million bitarrays For intI = 1 To 1000000 bitArray(intI - 1) = New BitArray(8) ' Each bitarray 8 bits Next Memory = GC.GetTotalMemory(True) - Memory Memory = Memory / 1000000 I obtained Memory = 40 Mbytes James,
The first question you have to ask yourself forever with arrays is. "Are all elements of the array used". If not than in what amount not. Maybe is by instance the HashTable than a better choise. However in my opinion is there not to give a general answer based on your question as it is. Cor Probably the question is not well define. What I need:
I need the less consuming memory data structure that allow me to: 1) Keep a list of "Pointers" or "references" to data structures (for this a Hashtable is fine). 2) Each data structure can be resize dynamically, each one with different sizes. An example could be: An array of pointers, each pointer points to an array of bytes. That is: 1 array of M pointers, and M arrays of bytes (each one can hod different number of bytes). Thanks, james Show quoteHide quote "Cor Ligthert [MVP]" <notmyfirstn***@planet.nl> escribió en el mensaje news:%23nuNPkhNGHA.2912@tk2msftngp13.phx.gbl... > James, > > The first question you have to ask yourself forever with arrays is. "Are > all elements of the array used". If not than in what amount not. > > Maybe is by instance the HashTable than a better choise. > > However in my opinion is there not to give a general answer based on your > question as it is. > > Cor > > The big picture may help (is this for things like spreadsheet cells ? What
is the nature of those informations ?). For example a "sparse array" structure is a combination of elementar data structures that allows to store only filled in cells. (i.e. you can have a 200x200 cells array but non used cells won't take storage). Also as a side note that there is AFAIK not a single solution for this (for example basically it uses a list of rows/columns and each list holds a list of the cells on the row/column) but as long as the external interface is exposed as an array (cells(x,y)) you can actually change the actual implementation with no or minimal impact on the main code... -- Show quoteHide quotePatrice "James" <i***@pricetech.es> a écrit dans le message de news:u4556$hNGHA.2560@TK2MSFTNGP09.phx.gbl... > Probably the question is not well define. What I need: > > I need the less consuming memory data structure that allow me to: > > 1) Keep a list of "Pointers" or "references" to data structures (for this a > Hashtable is fine). > 2) Each data structure can be resize dynamically, each one with different > sizes. > > An example could be: > An array of pointers, each pointer points to an array of bytes. That is: 1 > array of M pointers, and M arrays of bytes (each one can hod different > number of bytes). > > Thanks, > > james > > > > "Cor Ligthert [MVP]" <notmyfirstn***@planet.nl> escribió en el mensaje > news:%23nuNPkhNGHA.2912@tk2msftngp13.phx.gbl... > > James, > > > > The first question you have to ask yourself forever with arrays is. "Are > > all elements of the array used". If not than in what amount not. > > > > Maybe is by instance the HashTable than a better choise. > > > > However in my opinion is there not to give a general answer based on your > > question as it is. > > > > Cor > > > > > > > Ok! These are my needs:
I have a list of words (more than a million different words) I keep this words in a hashtable for easy access. For each of the words I need to keep in compress (binary form) the documents ID where each word belong (it can be hundreds of documents for each word). Sometimes I need to add more documents ID to some of the words. That is: Words DocumentID Paris -> doc123, doc234, doc345, doc345,... London->doc2, doc34, doc234,... .... So I need to keep in memory (that is what i am looking for low memory usage) the Words (actully I used a hashtable) and some structure to hold the document ID's for each word. Sometimes I need to add new documents to some of the words (normally after the last document for that word), so the data structure that hold the documents must be able to increase it lenght. My first thought was to use a jagged array but it consume a lot of overhead memory: Dim bytes(1000000)() As Byte ' 1 Million Jagged array For intI = 1 To 1000000 ReDim bytes(intI - 1)(1) ' Each vector to 1 byte Next This consume: 20 Mbytes of data?? when I was just supposing it will consume 1 Mbyte Any help?? Show quoteHide quote "Patrice" <a@bc.c> escribió en el mensaje news:%230JjGaiNGHA.2528@TK2MSFTNGP12.phx.gbl... > The big picture may help (is this for things like spreadsheet cells ? What > is the nature of those informations ?). > > For example a "sparse array" structure is a combination of elementar data > structures that allows to store only filled in cells. (i.e. you can have a > 200x200 cells array but non used cells won't take storage). > > Also as a side note that there is AFAIK not a single solution for this > (for > example basically it uses a list of rows/columns and each list holds a > list > of the cells on the row/column) but as long as the external interface is > exposed as an array (cells(x,y)) you can actually change the actual > implementation with no or minimal impact on the main code... > > -- > Patrice > > "James" <i***@pricetech.es> a écrit dans le message de > news:u4556$hNGHA.2560@TK2MSFTNGP09.phx.gbl... >> Probably the question is not well define. What I need: >> >> I need the less consuming memory data structure that allow me to: >> >> 1) Keep a list of "Pointers" or "references" to data structures (for this > a >> Hashtable is fine). >> 2) Each data structure can be resize dynamically, each one with different >> sizes. >> >> An example could be: >> An array of pointers, each pointer points to an array of bytes. That is: >> 1 >> array of M pointers, and M arrays of bytes (each one can hod different >> number of bytes). >> >> Thanks, >> >> james >> >> >> >> "Cor Ligthert [MVP]" <notmyfirstn***@planet.nl> escribió en el mensaje >> news:%23nuNPkhNGHA.2912@tk2msftngp13.phx.gbl... >> > James, >> > >> > The first question you have to ask yourself forever with arrays is. >> > "Are >> > all elements of the array used". If not than in what amount not. >> > >> > Maybe is by instance the HashTable than a better choise. >> > >> > However in my opinion is there not to give a general answer based on > your >> > question as it is. >> > >> > Cor >> > >> > >> >> >> > > > Have you considered using a class object that has a "Word" property and a
Collection of Document ID"s. You would have a class instance for each word. -- Show quoteHide quoteDennis in Houston "James" wrote: > Ok! These are my needs: > > I have a list of words (more than a million different words) I keep this > words in a hashtable for easy access. For each of the words I need to keep > in compress (binary form) the documents ID where each word belong (it can be > hundreds of documents for each word). Sometimes I need to add more documents > ID to some of the words. That is: > > Words DocumentID > Paris -> doc123, doc234, doc345, doc345,... > London->doc2, doc34, doc234,... > .... > So I need to keep in memory (that is what i am looking for low memory usage) > the Words (actully I used a hashtable) and some structure to hold the > document ID's for each word. Sometimes I need to add new documents to some > of the words (normally after the last document for that word), so the data > structure that hold the documents must be able to increase it lenght. > > My first thought was to use a jagged array but it consume a lot of overhead > memory: > Dim bytes(1000000)() As Byte ' 1 Million Jagged array > For intI = 1 To 1000000 > ReDim bytes(intI - 1)(1) ' Each vector to 1 byte > Next > This consume: 20 Mbytes of data?? when I was just supposing it will consume > 1 Mbyte > > Any help?? > > > > > > > "Patrice" <a@bc.c> escribió en el mensaje > news:%230JjGaiNGHA.2528@TK2MSFTNGP12.phx.gbl... > > The big picture may help (is this for things like spreadsheet cells ? What > > is the nature of those informations ?). > > > > For example a "sparse array" structure is a combination of elementar data > > structures that allows to store only filled in cells. (i.e. you can have a > > 200x200 cells array but non used cells won't take storage). > > > > Also as a side note that there is AFAIK not a single solution for this > > (for > > example basically it uses a list of rows/columns and each list holds a > > list > > of the cells on the row/column) but as long as the external interface is > > exposed as an array (cells(x,y)) you can actually change the actual > > implementation with no or minimal impact on the main code... > > > > -- > > Patrice > > > > "James" <i***@pricetech.es> a écrit dans le message de > > news:u4556$hNGHA.2560@TK2MSFTNGP09.phx.gbl... > >> Probably the question is not well define. What I need: > >> > >> I need the less consuming memory data structure that allow me to: > >> > >> 1) Keep a list of "Pointers" or "references" to data structures (for this > > a > >> Hashtable is fine). > >> 2) Each data structure can be resize dynamically, each one with different > >> sizes. > >> > >> An example could be: > >> An array of pointers, each pointer points to an array of bytes. That is: > >> 1 > >> array of M pointers, and M arrays of bytes (each one can hod different > >> number of bytes). > >> > >> Thanks, > >> > >> james > >> > >> > >> > >> "Cor Ligthert [MVP]" <notmyfirstn***@planet.nl> escribió en el mensaje > >> news:%23nuNPkhNGHA.2912@tk2msftngp13.phx.gbl... > >> > James, > >> > > >> > The first question you have to ask yourself forever with arrays is. > >> > "Are > >> > all elements of the array used". If not than in what amount not. > >> > > >> > Maybe is by instance the HashTable than a better choise. > >> > > >> > However in my opinion is there not to give a general answer based on > > your > >> > question as it is. > >> > > >> > Cor > >> > > >> > > >> > >> > >> > > > > > > > > > > Dennis,
That might not be a bad idea. That class could encapsulate the compression/decompression of the word and associated document names for more efficient storage. Sure, there'll be a performance hit, but with 1,000,000 word associations the OP is going to have to get creative with the data structure. It'll be an excerise in balancing space vs. performance. That's one thing you didn't tell us James. I think we have a pretty good handle on your memory requirements, but what about performance. Is that important? What is the typical scenario for accessing the data structure? Brian Dennis wrote: Show quoteHide quote > Have you considered using a class object that has a "Word" property and a > Collection of Document ID"s. You would have a class instance for each word. > -- > Dennis in Houston > >
empty GUID from string in VB2005
Help creating custom ComboBox control No tables visible in SQL Server connection Installing published application fails Text Files Changing CheckedState or Checked-value without raising ItemChecked-event Opacity Unable to open file Control Arrays Basic How To Question: Class Library or Windows Control Library |
|||||||||||||||||||||||