
String comparison algorithms

Author
18 Jul 2006 3:26 PM
almurph@altavista.com
Hi,


    Hope you can help me with this one. I'm looking for some good string
comparison algorithms. I want to be able to compare 2 strings (fairly
small, fewer than 50 characters) and return a % score of how similar
they are. So, 2 strings that are absolutely identical will return 100%.
Strings that are radically different will return numbers near 0%:


Token examples:

String1      String2    % Comparison
Albatross    Car        5
Car          Car        100



    I would appreciate any
comments/code-samples/suggestions/user-experiences that you may have...


Thanks in advance,
Al.

PS: I have already implemented Levenshtein distance, so no worries there.

Author
18 Jul 2006 5:31 PM
Samuel Shulman
It all depends on how you really want to rate the similarity.

In the case where you returned 5, I don't see any similarity at all.
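
To illustrate how much the score depends on the measure chosen, here is a minimal Python sketch using the standard library's difflib.SequenceMatcher, which rates similarity by matching blocks of characters. The function name ratio_percent and the decision to ignore case are assumptions made for this illustration; neither comes from the thread.

```python
from difflib import SequenceMatcher


def ratio_percent(a, b):
    """Similarity as a percentage, based on matching character blocks."""
    # Assumption: the comparison ignores case; drop .lower() to make it
    # case-sensitive.
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    for s1, s2 in [("Albatross", "Car"), ("Car", "Car")]:
        print(s1, s2, round(ratio_percent(s1, s2), 1))
```

Identical strings score 100. A pair like Albatross and Car still scores well above zero under this measure, because the two share a couple of letters in the same order, so a different measure may be needed if such pairs should land near 0.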



Author
19 Jul 2006 9:49 AM
nime
MySQL has a similar function: SOUNDEX

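  // $ax is the user's search term; it is assumed to be escaped already,
  // since it is interpolated directly into the SQL string.
  $sql = "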
    SELECT title FROM entries
    WHERE
     (
     title LIKE '$ax%'
     OR soundex(title) LIKE soundex('$ax')
     )
    LIMIT 20
   ";




Author
19 Jul 2006 2:36 PM
Chris Dunaway
nime wrote:
> MySQL has got a similar command: soundex

But soundex is not a string comparison method.  It compares words by
their sound (hence the name).  It doesn't give a "score" of how well two
words compare.
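
A rough Python sketch of the point above: Soundex reduces each word to a short code, so two words either share a code or they don't, and there is no graded score. This is a simplified version of the classic American Soundex rules written for illustration; it is an assumption that it behaves like MySQL's SOUNDEX(), which may return different (for example, longer) codes.

```python
# Build the letter-to-digit table used by classic American Soundex.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return a 4-character Soundex code (simplified, illustrative)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    result = first.upper()
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (result + "000")[:4]  # pad with zeros, then cut to 4 characters


if __name__ == "__main__":
    print(soundex("Robert") == soundex("Rupert"))   # same code
    print(soundex("Albatross") == soundex("Car"))   # different codes
```

The only comparison available is equality of codes, which gives a yes/no answer rather than a percentage.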
Author
19 Jul 2006 3:13 PM
G Himangi
I think there is a dynamic programming algorithm to determine this: it's
called the edit distance between the 2 strings. Google it and I am sure you
will find some code.


---------
- G Himangi, Sky Software http://www.ssware.com
Shell MegaPack : Drop-In Explorer GUI Controls For Your Apps (.Net & ActiveX
Editions Available)
EZNamespaceExtensions.Net : Develop namespace extensions rapidly in .Net
EZShellExtensions.Net : Develop all shell extensions rapidly in .Net
---------
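
As a sketch of the dynamic programming approach mentioned in this post, the Python below computes the Levenshtein (edit) distance and then turns it into a percentage. Dividing by the length of the longer string is one possible normalization, assumed here for illustration; the thread does not specify one, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between a and b using the standard DP recurrence."""
    # previous[j] is the distance between the prefix of a processed so far
    # (one character shorter than the current row) and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from the first i chars of a to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep on a match)
            ))
        previous = current
    return previous[-1]


def similarity_percent(a, b):
    """Assumed normalization: 100 * (1 - distance / length of longer string)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(a, b) / longest)


if __name__ == "__main__":
    for s1, s2 in [("Albatross", "Car"), ("Car", "Car")]:
        print(s1, s2, round(similarity_percent(s1, s2), 1))
```

With this normalization identical strings give 100, and the score falls toward 0 as more edits are needed relative to the longer string. The comparison is case-sensitive as written; lowercasing both inputs first is an easy variation.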


