|
web
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
What are HashTables good for?I am new to HashTables. What are they good for and what kinds of uses do
they have? I am looking for a broad generalized list of what people most commonly use them for. It is for research on my project to determine if they would be of some sort of good use in the project. Hope this helps
http://en.wikipedia.org/wiki/Hash_table Regards, -- Show quoteHide quoteAngel J. Hernández M MCP,MCAD,MCSD,MCDBA Microsoft MVP http://msmvps.com/blogs/angelhernandez *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* http://www.customware.net Tecnical Solutions Architect "Andy B." <a_bo***@sbcglobal.net> wrote in message news:ObKuU068JHA.3836@TK2MSFTNGP02.phx.gbl... > I am new to HashTables. What are they good for and what kinds of uses do > they have? I am looking for a broad generalized list of what people most > commonly use them for. It is for research on my project to determine if > they would be of some sort of good use in the project. > Like everything in Net, there is probably nobody which knows everything, so
there is no need that you try that. Be aware that you should use things as soon as you need them or when a course asks for that. The hashtable is a very old mechanism to store whatever data using tag keys which can help to find and sort things quicker. Today, it is more a kind of base class, because there are more classes which are easier and more maintainable to use. Cor Show quoteHide quote "Andy B." <a_bo***@sbcglobal.net> wrote in message news:ObKuU068JHA.3836@TK2MSFTNGP02.phx.gbl... >I am new to HashTables. What are they good for and what kinds of uses do >they have? I am looking for a broad generalized list of what people most >commonly use them for. It is for research on my project to determine if >they would be of some sort of good use in the project. > Do you have a useful replacement then?
Show quoteHide quote "Cor Ligthert[MVP]" <Notmyfirstn***@planet.nl> wrote in message news:%23X4iH878JHA.5040@TK2MSFTNGP04.phx.gbl... > Like everything in Net, there is probably nobody which knows everything, > so there is no need that you try that. > > Be aware that you should use things as soon as you need them or when a > course asks for that. > > The hashtable is a very old mechanism to store whatever data using tag > keys which can help to find and sort things quicker. > > Today, it is more a kind of base class, because there are more classes > which are easier and more maintainable to use. > > Cor > > > > "Andy B." <a_bo***@sbcglobal.net> wrote in message > news:ObKuU068JHA.3836@TK2MSFTNGP02.phx.gbl... >>I am new to HashTables. What are they good for and what kinds of uses do >>they have? I am looking for a broad generalized list of what people most >>commonly use them for. It is for research on my project to determine if >>they would be of some sort of good use in the project. >> > Andy
The hashtable is what i call a programming fundament , just like for instance arrays and collections they are always there somewhere behind the scenes of the fancy new methods however if you know how and when ( and also important when not ) to use them you can sure program more efective and can even code the fastest code as these "Low level" mechanisms are not bloated in anny way HTH Michel Show quoteHide quote "Andy B." <a_bo***@sbcglobal.net> schreef in bericht news:eOZYkv98JHA.3836@TK2MSFTNGP02.phx.gbl... > Do you have a useful replacement then? > "Cor Ligthert[MVP]" <Notmyfirstn***@planet.nl> wrote in message > news:%23X4iH878JHA.5040@TK2MSFTNGP04.phx.gbl... >> Like everything in Net, there is probably nobody which knows everything, >> so there is no need that you try that. >> >> Be aware that you should use things as soon as you need them or when a >> course asks for that. >> >> The hashtable is a very old mechanism to store whatever data using tag >> keys which can help to find and sort things quicker. >> >> Today, it is more a kind of base class, because there are more classes >> which are easier and more maintainable to use. >> >> Cor >> >> >> >> "Andy B." <a_bo***@sbcglobal.net> wrote in message >> news:ObKuU068JHA.3836@TK2MSFTNGP02.phx.gbl... >>>I am new to HashTables. What are they good for and what kinds of uses do >>>they have? I am looking for a broad generalized list of what people most >>>commonly use them for. It is for research on my project to determine if >>>they would be of some sort of good use in the project. >>> >> > > Andy B. wrote:
> I am new to HashTables. What are they good for and what kinds of uses do The short simplified answer is to serve as fast Lookup method. You > they have? I am looking for a broad generalized list of what people most > commonly use them for. It is for research on my project to determine if they > would be of some sort of good use in the project. would be very safe with that basic definition. A 2nd safe answer is they are used to create "unique" hash representation of data (i.e., common hash methods are MD5, SHA1, SHA256 which uses a specific hash table to generate a unique hash). Long answer: A HASH is a translation of a string or array of bytes to a fixed size representation, could be a string or number, so that you can use the HASH to find where something is stored in a block of data. For example, lets say you have a list of contacts with people names and email address: Contacts["Andy B."] = "a***@somewhere.com" Contacts["Mike"] = "m***@overthere.com" Behind the scene, this "mapping" concept will hash "Andy B" and "Mike" to some generic unique set of indexes in a HASH TABLE that closely match the string. Its a way of INDEXING a string that allows for a faster "ZOOM" of where the the answer might be. Lets use a very simplicity example here: lets say your hash table is an array of 26 bytes (characters) "A" to "Z" HashTable[0] = "A" HashTable[1] = "B" HashTable[2] = "C" ... HashTable[12] = "M" ... HashTable[25] = "Z" Its not really a hash table because a HASH uses some algorithm to create a numeric representation of "A" to "Z", but for the sake of simplicity, this example offers the same principle. So what is the hash index would "Andy B" fit and "Mike" fit? Well, HashTable[0] for "Andy B" because it starts with "A" and HashTable[12] for "Mike' because it starts with a "M" Thats the basic idea of a HastTable. But like I said, a HashTable generically uses some HASH algorithm or function to calculate a hash of some string. The algorithm could be just the first four bytes of a string to generate a HASH to determine where to place the string in the table. Maybe the first four character people names is good enough to separate names into different groups. What that means is that one or more people can have the same hash, ANDY B. ANDY GARCIA ANDY SMITH ANDY WHOEVER etc, would all have the same hash index. So one make ask, it is possible to had an untold number of "ANDY" in the world. How many can you fit in the same Hash location? That is where Hash Table Page Size come in to play. You might want to say a page will have 100 people before a new PAGE is create for the hash. So you can have: HashTable[0][0] = 1st group of 100 "ANDY" people HastTable[0][1] = 2nd group of 100 "ANDY" people HastTable[0][2] = 3nd group of 100 "ANDY" people, but 1/2 full etc. When it comes time to adding more "ANDY" people, it will fill up the page before a new hash page is created. Since Page 2 is 1/2 empty, it will begin there. Now, consider the situation where not everyone is ANDY, but maybe ANDI or ANDREW. Depending of hashing function, it could maybe lump all of these under the same hash. For example if only the first letter was used they would all be lumped together in page 0, HashTable[0][0]. HashTable[0][0] = "ANDY", "ANDI", "ANDREW" So this is very much dependent of how the Hash function is defined, the number of indexes you have and the page size. Sometimes in practice, the hasttable helps create a tree: HashTable[index] = array of pages of size X and each page is further expand with another tree of pages that is based on the LEFT or RIGHT sort order of a string. So for example, using ANDY, ANDI and ANDREW, the sort order is: ANDI ANDREW ANDY As the page is filled with the sorted names, once it fills up, a new page is created in the position that make its sort consistent. So for example, given new names ANDRIOD and ANDRA, you will may be placed like this in the hash table tree of pages: ANDI ANDRA ANDREW ANDRIOD ANDY The above is probably wrong placements, including the idea that pages may be moved around, but overall the ideas is that left and right pages based on sorting. The basic idea of using hash table is to create a way to store a list of information using a specific hash indexing method that will allow for a fast lookup. Almost all, if not all, the collection classes or templates use a hash table for the key=value associations. Generally, the type of collection is based on your needs: For example, like in the mapping example used above: Collection[key] = value If the collection is one dimension, using the same key MAY replace the existing one: Contacts["Andy B."] = "a***@somewhere1.com" Contacts["Andy B."] = "a***@somewhere2.com" The 2nd one is a replacement of the value for the same key. So just like DATABASE ideas of UNIQUE or ALLOW DUPES, you have classes that may offer the same idea. If that was the case here, then you might has two "ANDY B." keys. It all depends on the collection class behavior and what it offers you. These are things you might look for. Some classes even allow you to replace it default HASH function. A good example for that is case sensitivity: Contacts["Andy B."] = "a***@somewhere1.com" Contacts["andy b."] = "a***@somewhere2.com" That could be two or one key=value pairs depending on collection class you are using. If the class does not offer a option for case sensitivity, it may offer you a HASH function override where in hash function override you lower or upper case the key before calculating the hash. As you can probably imagine, this need is common to consider, "Andy B" and "andy b" are really the same person. :-) Finally, what are practical examples of a hash function? I can't tell you what the .NET collection class uses, but the C/C++ CMap collection class template uses this: template<class VALUE, class ARG_VALUE> AFX_INLINE UINT CMapEx<VALUE, ARG_VALUE>::HashString(TKEYARG Key) const { LPCSTR key = Key; UINT nHash = 0; while (*key) { nHash = (nHash<<5) + nHash + towlower(*key++); } return nHash; } It loops thru all key characters. It lower cases the character. Makes the key=value association case insensitive. It accumulates the lower case with the last hash value and the binary bit left shifting of the hash value. Go figure! Probably some computer science guy can tell us why the above is good enough. :-) But other common hash functions used in practice are: CRC Cyclic Redundancy Check, Adds all bytes, result 2 bytes CRC32 32 bit (four byte) version of CRC MD5 Message Digest SHA1 Secure Hash Algorithm SHA256 Secure Hash Algorithm 256 bit version SOUNDEX Popular method for getting a common phonetic of a word and many others, but the above the long time common ones you hear about used in many things and embedded in many things we take for granted. Modern Modem and File transfers and even the internet (TCP) uses cyclic redundancy checks to make sure what was was sent is the same that was received. MD5 and SHAx uses specific HASH TABLES to generate a final hash value. With these the idea is to create an unique hash for a given string. You should explore th .NET crypto assembly for these. DIM hash as string = MD5("Andy B.") SHA1 and SHA256 is popular in security applications (like SSL, DKIM). SHA1 is now considered obsolete since the Chinese has shown it is "easy" (relatively speaking) for two different strings can produce the same hash value. This is called clobber. SHA256 is the replacement recommendation and more secured (harder to clobber), but it was also been recently shown that it has weaknesses for security purposes but good enough for the next 2-3 years. -- Andy B. wrote:
> I am new to HashTables. What are they good for and what kinds of uses do Nothing - you should use the Dictionary<> class instead. It's the > they have? I am looking for a broad generalized list of what people most > commonly use them for. It is for research on my project to determine if they > would be of some sort of good use in the project. generic replacement for the HashTable class, which means that it's strongly typed so that you don't have to cast everything that you read from it. A Dictionary is used whenever you want a fast lookup on the key for a key-value pair. Getting an item by key is close to an O(1) operation, which means that it practically takes the same time to get the item regardless of how many items there are in the Dictionary. I use the hashtable a lot in my projects , for the simple reasson that it is
lightweight and fast in fact it is unbeatable in terms of speed when retrieving key value pairs ( tested and proofed by Balena see the 2003 book ) Where do i use it for ? If for some reasson i need to store a key value and must check if this value already exists in the table in a later stage or if i want to filter duplicates and simply want to ignore the same keys ( if you set the item property in a hashtable you do not need to check for existence ) or if i need a storage that can hold anny sort of object mixed I you have a rough estimation of data and initilaize the table with so manny slots ( or with a slight higher margin ) then the hashtable uses turbo boost Regards Michel Show quoteHide quote "Andy B." <a_bo***@sbcglobal.net> schreef in bericht news:ObKuU068JHA.3836@TK2MSFTNGP02.phx.gbl... >I am new to HashTables. What are they good for and what kinds of uses do >they have? I am looking for a broad generalized list of what people most >commonly use them for. It is for research on my project to determine if >they would be of some sort of good use in the project. >
.net 3.5 with vs 2005
Need some help from VB Programmers... How to hide base class member variable in derived class (w/o shadows)? Application Icon distributing software Alternative to strongly typed datasets winmm book? Uisng ADO.Net The Entity Framework "Random" numbers cause pattern? Need help in finding error in SOAP web service call |
|||||||||||||||||||||||