count occurrences of string within a string

Author
22 Aug 2006 12:15 AM
Dana King
I'm looking for other developers' opinions. I'm trying to find the best
way to count occurrences of a string within a string using VB.NET.

I have tested five methods and found that String.Replace is the
fastest and Regex.Matches.Count the slowest. I posted my results
and source code to my web site. If you could take a look and maybe suggest
an even faster method, I'd like to hear from you. Also, if anyone can tell me
why Regex is so much slower than the rest, I'd like to know.

Thanks.

http://www.dotnetmaniac.com/ArticleViewer.aspx?Key={4d9141de-2f82-4490-80ca-c9f725c4e291}
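
For readers who cannot reach the linked article: the String.Replace approach is usually written roughly as below. This is a minimal sketch, not the code from the article; the names CountByReplace, haystack and needle are made up for illustration. It removes every occurrence of the search string and divides the drop in length by the length of the search string.

```vbnet
Module ReplaceCount
    ' Hypothetical helper: counts non-overlapping occurrences of needle
    ' by deleting them and measuring how much shorter the string becomes.
    Function CountByReplace(ByVal haystack As String, ByVal needle As String) As Integer
        ' String.Replace throws on an empty search string, so guard first.
        If String.IsNullOrEmpty(haystack) OrElse String.IsNullOrEmpty(needle) Then
            Return 0
        End If
        Dim stripped As String = haystack.Replace(needle, String.Empty)
        ' Integer division: each removed occurrence shortens the string by needle.Length.
        Return (haystack.Length - stripped.Length) \ needle.Length
    End Function

    Sub Main()
        Console.WriteLine(CountByReplace("banana", "an"))   ' prints 2
    End Sub
End Module
```

Note that this allocates a second string about as large as the input, which may matter for very large inputs even when the running time looks good.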

Author
22 Aug 2006 2:26 AM
GhostInAK
Hello Dana,

When looking for only a single character, I can get about twice the speed
using a char loop.

Using For Next and String.Equals...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.21875

Using Do Loop with InStr function...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.25

Using Do Loop with String.IndexOf...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.109375

Using String.Replace...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.0625

Using RegEx.Matches.Count...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.828125

Using Char Loop...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.015625

snippet:
'== Using char loop
    Private Sub test6(ByVal totalLength As Integer, ByVal fileContents As String)

        Dim one As Integer = 0                  ' running count of "1" characters
        Dim startTime As Long = Now.Ticks

        ' Compare each character directly as a Char; no substrings are created.
        For tCount As Integer = 0 To fileContents.Length - 1
            If fileContents(tCount) = "1"c Then
                one += 1
            End If
        Next

        Dim endTime As Long = Now.Ticks
        Dim elapsedTime As Double = TimeSpan.FromTicks(endTime - startTime).TotalSeconds
        Console.WriteLine("Using Char Loop...")
        Console.WriteLine("1's found: " & Format(one, "#,#") & " in " & Format(totalLength, "#,#") & " characters.")
        Console.WriteLine("elpased time: " & elapsedTime.ToString)
        Console.WriteLine()
        Console.WriteLine()
    End Sub


-Boo
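
One caveat on the timings above: every reported value is a whole multiple of 0.015625 s (1/64 s), which suggests the Now.Ticks clock only advances in steps of about that size, so the fastest results sit at the limit of what it can resolve. Below is a sketch of the same kind of measurement using System.Diagnostics.Stopwatch (available from .NET 2.0). The module name, the CountChar helper and the all-ones sample string are assumptions for illustration, not part of the posted test harness.

```vbnet
Imports System.Diagnostics

Module TimingSketch
    ' Hypothetical helper: same char loop as test6, without the timing code.
    Function CountChar(ByVal text As String, ByVal target As Char) As Integer
        Dim count As Integer = 0
        For i As Integer = 0 To text.Length - 1
            If text(i) = target Then count += 1
        Next
        Return count
    End Function

    Sub Main()
        ' Assumed sample data: five million "1" characters.
        Dim sample As New String("1"c, 5000000)

        Dim watch As Stopwatch = Stopwatch.StartNew()
        Dim found As Integer = CountChar(sample, "1"c)
        watch.Stop()

        Console.WriteLine("1's found: " & found.ToString("#,#"))
        Console.WriteLine("elapsed time: " & watch.Elapsed.TotalSeconds.ToString())
    End Sub
End Module
```

Repeating each test several times and comparing the totals would also make the differences between the fast methods easier to trust.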

Author
22 Aug 2006 3:37 AM
Mudhead
In test1 try this:

For i As Integer = 0 To totalLength - 1
     If fileContents(i) = "1"c Then one += 1
Next


In test3, make sure you specify a Char (i.e. "1"c) and not a String ("1"):

Do
    Result = fileContents.IndexOf("1"c, Start)
    If Result = -1 Then Exit Do
    one += 1
    Start = Result + 1
Loop
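
Wrapped up as functions, the IndexOf loop above might look like the sketch below. CountChar and CountSubstring are hypothetical names. The substring version passes StringComparison.Ordinal explicitly, because IndexOf(String) with no comparison argument performs a culture-sensitive search, while IndexOf(Char) is an ordinal search; that difference is one likely reason the Char overload is faster.

```vbnet
Module IndexOfCount
    ' Hypothetical helper: counts occurrences of a single character.
    Function CountChar(ByVal text As String, ByVal target As Char) As Integer
        If String.IsNullOrEmpty(text) Then Return 0
        Dim count As Integer = 0
        Dim pos As Integer = text.IndexOf(target)
        Do While pos <> -1
            count += 1
            ' pos + 1 may equal text.Length; IndexOf then simply returns -1.
            pos = text.IndexOf(target, pos + 1)
        Loop
        Return count
    End Function

    ' Hypothetical helper: counts non-overlapping occurrences of a substring.
    Function CountSubstring(ByVal text As String, ByVal needle As String) As Integer
        If String.IsNullOrEmpty(text) OrElse String.IsNullOrEmpty(needle) Then Return 0
        Dim count As Integer = 0
        Dim pos As Integer = text.IndexOf(needle, 0, StringComparison.Ordinal)
        Do While pos <> -1
            count += 1
            ' Skip past the whole match so matches do not overlap.
            pos = text.IndexOf(needle, pos + needle.Length, StringComparison.Ordinal)
        Loop
        Return count
    End Function

    Sub Main()
        Console.WriteLine(CountChar("10101", "1"c))         ' prints 3
        Console.WriteLine(CountSubstring("banana", "an"))   ' prints 2
    End Sub
End Module
```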



Author
22 Aug 2006 5:30 AM
Cor Ligthert [MVP]
Dana,

We tested this some years ago; what follows is from memory.

To count occurrences of a string within a string, the best method is to use
the VB.NET InStr function while moving forward through the string.

To count occurrences of a single character, the best method is
String.IndexOf("x"c), moving forward in the same way.

http://msdn2.microsoft.com/en-us/library/8460tsh1.aspx

I clearly remember that InStr is about twice as fast as IndexOf when
counting strings. Regex and String.Split were, as far as I remember,
about 100 times slower. Using InStr with a single character is much
slower than IndexOf with a Char.

I hope this helps,

Cor
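
The "InStr while going forward" idea described above can be sketched as follows. CountWithInStr is a hypothetical name, and this is one plausible reading of the approach rather than the code that was originally benchmarked. InStr positions are 1-based and a result of 0 means "not found".

```vbnet
Imports Microsoft.VisualBasic

Module InStrCount
    ' Hypothetical helper: counts non-overlapping occurrences of needle
    ' by calling InStr repeatedly, each time starting after the last match.
    Function CountWithInStr(ByVal haystack As String, ByVal needle As String) As Integer
        ' InStr returns the start position for an empty search string,
        ' which would loop forever here, so guard against it.
        If String.IsNullOrEmpty(haystack) OrElse String.IsNullOrEmpty(needle) Then
            Return 0
        End If

        Dim count As Integer = 0
        Dim pos As Integer = InStr(1, haystack, needle, CompareMethod.Binary)
        Do While pos > 0
            count += 1
            ' A start position past the end of haystack makes InStr return 0.
            pos = InStr(pos + needle.Length, haystack, needle, CompareMethod.Binary)
        Loop
        Return count
    End Function

    Sub Main()
        Console.WriteLine(CountWithInStr("banana", "an"))   ' prints 2
    End Sub
End Module
```

CompareMethod.Binary is passed explicitly so the result does not depend on the project's Option Compare setting.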


