Search for values in between two values in a string?

Author
13 May 2009 5:13 AM
pat
The subject may not be written well but what I'm trying to do is
search for a value in a string by searching for its span class.  The
string is an HTML file so what I want is the value in between <span
class="myclass"> and </span>

Is there anything in vb.net that can do that?   c#.net has a function,
but I can't seem to find it in vb.net.  Thanks

Author
13 May 2009 7:41 AM
Mike
pat wrote:
> The subject may not be written well but what I'm trying to do is
> search for a value in a string by searching for its span class.  The
> string is an HTML file so what I want is the value in between <span
> class="myclass"> and </span>
>
> Is there anything in vb.net that can do that?   c#.net has a function,
> but I can't seem to find it in vb.net.  Thanks

You can use XPath or LINQ to do XML or HTML lookups.  I'm not
familiar with LINQ, and I believe it requires .NET 3.5, which I haven't
upgraded to yet.  But in XPath, it's fairly simple:

Example HTML file:

<!-- File: c:\spantest.htm-->
<html>
<body>
<span class="myclass">text</span>
<span class="myclass1">text1</span>
<span class="myclass2">text2</span>
<span class="myclass2">text2.2</span>
<span class="myclass2">text2.3</span>
<span class="myclass2">text2.4</span>
<span class="myclass3">text3</span>
<span class="myclass4">text4</span>
<span class="myclass5">text5</span>
</body>
</html>

Using XPATH VB.NET Example:

Imports System
Imports System.Collections.Generic
Imports System.Xml.XPath

Class Test_Xpath

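     ' Load the XML document, run the XPath query, and return the
     ' matching node values as an array.
     Shared Function GetHtmlValue( _
              ByVal xpq As String, _
              ByVal xmlfn As String) As String()
         Dim result As New List(Of String)
         Try
             ' XPathDocument throws if the file is not well-formed XML,
             ' so the load belongs inside the Try block as well.
             Dim xmldoc As New XPathDocument(xmlfn)
             Dim nav As XPathNavigator = xmldoc.CreateNavigator()
             Dim iterator As XPathNodeIterator = nav.Select(xpq)
             Do While iterator.MoveNext()
                 result.Add(iterator.Current.Value)
             Loop
         Catch ex As Exception
             ' Report the failure instead of silently returning nothing.
             Console.Error.WriteLine("Query failed: {0}", ex.Message)
         End Try
         Return result.ToArray()
     End Function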

     Shared Sub DoTest1(ByVal query As String, ByVal fn As String)
         Dim res As String() = GetHtmlValue(query, fn)
         Console.WriteLine("Total DOM elements Found: {0}", res.Length)
         For Each s As String In res
             Console.WriteLine("{0}", s)
         Next
         Console.ReadKey(True)
     End Sub

     Shared Sub Main()

         ' expecting one
         DoTest1("//child::*/span[@class='myclass']", "c:\spantest.htm")

         ' expecting multiple
         DoTest1("//child::*/span[@class='myclass2']", "c:\spantest.htm")

         ' expecting none
         DoTest1("//child::*/span[@class='myclassXX']", "c:\spantest.htm")

     End Sub
End Class

Of course, learning to write XPath expressions is the trick. The
syntax is well documented on MSDN.
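
Since the original question mentions that the HTML is already held in a string, here is a minimal sketch of the same XPath idea without a file on disk. It assumes the markup is well-formed XML; the module name, the SpanValues helper, and the sample markup are made up for illustration and are not part of the code above. The shorter query //span[@class='...'] selects every span with that exact class value anywhere in the document.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.XPath

Module XPathStringSketch

    ' Hypothetical helper: returns the text of every <span> whose class
    ' attribute equals className. Assumes html is well-formed XML and that
    ' className contains no apostrophe (it is pasted into the query as-is).
    Function SpanValues(ByVal html As String, ByVal className As String) As String()
        Dim result As New List(Of String)
        Using reader As New StringReader(html)
            ' XPathDocument reads the whole document from the TextReader.
            Dim nav As XPathNavigator = New XPathDocument(reader).CreateNavigator()
            Dim iterator As XPathNodeIterator = _
                nav.Select("//span[@class='" & className & "']")
            Do While iterator.MoveNext()
                result.Add(iterator.Current.Value)
            Loop
        End Using
        Return result.ToArray()
    End Function

    Sub Main()
        Dim html As String = "<html><body>" & _
            "<span class=""myclass"">text</span>" & _
            "<span class=""myclass2"">text2</span>" & _
            "<span class=""myclass2"">text2.2</span>" & _
            "</body></html>"
        ' Only the two spans with class "myclass2" are listed.
        For Each s As String In SpanValues(html, "myclass2")
            Console.WriteLine(s)
        Next
    End Sub

End Module
```

The attribute test is an exact string comparison, so asking for "myclass" would not pick up the "myclass2" spans.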

--
Author
13 May 2009 7:33 PM
Herfried K. Wagner [MVP]
"Mike" <unkn***@unknown.tv> wrote:
>> The subject may not be written well but what I'm trying to do is
>> search for a value in a string by searching for its span class.  The
>> string is an HTML file so what I want is the value in between <span
>> class="myclass"> and </span>
>>
>> Is there anything in vb.net that can do that?   c#.net has a function,
>> but I can't seem to find it in vb.net.  Thanks
>
> You can  use XPATH or LINQ to perform XML or HTML lookups.  I'm  not
> familar with LINQ and I believe its for .NET 3.0 with I haven't upgraded
> to yet.  But in XPATH, its fairly simple;

Note that this only works with XHTML, or with HTML that happens to be
well-formed XML.  It will not work with arbitrary HTML documents, because
the XML parser rejects things like unclosed tags.  For those, using an
SGML parser might be an option.
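
To make the limitation concrete, here is a rough sketch that checks whether a piece of markup can be loaded by XPathDocument at all. The module name, the CanLoadAsXml helper, and the two sample strings are assumptions for illustration only; it does not show an SGML parser, just how to detect the case where one would be needed.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.XPath

Module WellFormedSketch

    ' Hypothetical helper: True if XPathDocument can parse the markup,
    ' False if the XML parser rejects it as not well-formed.
    Function CanLoadAsXml(ByVal markup As String) As Boolean
        Try
            Using reader As New StringReader(markup)
                Dim doc As New XPathDocument(reader)
            End Using
            Return True
        Catch ex As XmlException
            Return False
        End Try
    End Function

    Sub Main()
        ' Well-formed: every tag is closed and the attribute is quoted.
        Console.WriteLine(CanLoadAsXml("<p><span class=""myclass"">text</span></p>"))
        ' Ordinary HTML: <br> is never closed, so loading fails.
        Console.WriteLine(CanLoadAsXml("<p>line one<br>line two</p>"))
    End Sub

End Module
```

The first call reports True and the second False, since the unclosed br element does not match the closing p tag.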

--
M S   Herfried K. Wagner
M V P  <URL:http://dotnet.mvps.org/>
V B   <URL:http://dotnet.mvps.org/dotnet/faqs/>