Regex Output to Textbox

I am extremely new to programming, so I apologize in advance for my lack of knowledge and vocabulary, and if my code is sloppy. I will take any suggestions to clean up my code and to follow any best practices that I missed.

I am taking a stab at writing a program in Visual Basic 2005 Express Edition that will (don't laugh...) keep track of, search through and find new friends with specific interests on myspace. The functionality of myspace is quite a letdown, so I thought I would try to create a program to fill my need.

I am stuck at the moment on trying to do an HttpWebRequest and a regex that give me an output of friend IDs in a textbox. Here is my code so far, which I have figured out through researching the boards, MSDN, and using Expresso for my regular expression:

    ' Imports System.Text.RegularExpressions

    Private Sub GetIDsButton_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles GetIDsButton.Click
        Dim myURL As String = ToolStripTextBox1.Text
        Dim myReg As Net.HttpWebRequest = DirectCast(Net.WebRequest.Create(myURL), _
            Net.HttpWebRequest)
        Dim myResp As Net.HttpWebResponse = _
            DirectCast(myReg.GetResponse(), Net.HttpWebResponse)
        Dim myStream As IO.Stream = myResp.GetResponseStream()
        Dim myreader As New IO.StreamReader(myStream)
        Dim myPage As String = myreader.ReadToEnd()
        myResp.Close()

        '' Regular expression built for Visual Basic on: Thu, Jul 6, 2006, 12:07:39 PM
        '' Using Expresso Version: 2.1.2150, http://www.ultrapico.com
        ''
        '' A description of the regular expression:
        ''
        '' [1]: A numbered capture group. [\d{5,9}]
        ''     Any digit, between 5 and 9 repetitions
        '' Match a suffix but exclude it from the capture. ["><img\b]
        ''     "><img\b
        ''         "><img
        ''         First or last character in a word
        '
        Dim r As Regex = New Regex( _
            "(\d{5,9})(?=""><img\b)" + vbCrLf + "", _
            RegexOptions.IgnoreCase _
            Or RegexOptions.IgnorePatternWhitespace _
            )
        Dim m As RegularExpressions.Match
        Dim mLink As RegularExpressions.Match
        For Each m In r.Matches(myPage)
        Next
        TextBox1.Text = mLink

I am getting the error:

    Error 1 Value of type 'System.Text.RegularExpressions.Match' cannot be converted to 'String'.

Can someone help me figure out what I am doing wrong?


I haven't checked yet, but isn't Matches the collection of matches
returned? Therefore your code would need to look like this:

    For Each m In r.Matches
    Next

Then of course you still need to pass your page, probably through another method, I think. But I could be wrong. The error indicates a type mismatch, and I am guessing that r.Matches(myPage) returns strings.

> For Each m In r.Matches(myPage)
> Next


Thanks for writing me back, Steven. I think you are exactly right. It
seems that I am trying to return a string from the collection of matches, which isn't possible. However, I am totally lost about how to work around this. I think I figured out how to get my webpage into a stream, and I think I even have the regular expression figured out, but I am not sure how to pass my streamed text to the regular expression and then have the results output into a textbox. Any ideas?

Also, just to clarify: the error is highlighting mLink and saying "Value of type 'System.Text.RegularExpressions.Match' cannot be converted to 'String'." I think it is telling me exactly what you wrote below, but I just wanted to add that. Any ideas how I should code this to run my stream through and output into a textbox?

Thanks again for your help.

Steven Nagy wrote:
> I haven't checked yet, but isn't Matches the collection of matches
> returned?
> Therefore your code would need to look like this:
>
> For Each m In r.Matches
> Next
>
> Then of course you still need to pass your page, probably another
> method I thinks.
> But I could be wrong. The error indicates a type mismatch and I am
> guessing that r.Matches(myPage) returns strings.
>
> > For Each m In r.Matches(myPage)
> > Next


Check Larry's post below.
I'm not 100% sure of the syntax you need, and the MSDN site is running too slowly to use at the moment (is it just me?), so I can't check the usage of the Regex class. Larry's suggestion of a tree representation of elements sounds quite useful, but I'm not sure what sort of performance issues there are.


Sarah wrote:
> I am extremely new to programming so I apologize in advance for my lack
> of knowledge, vocabulary and if my code is sloppy. I will take any
> suggestions to clean up my code and to follow any best practices that I
> missed. I am taking a stab at writing a program through Visual Basic
> 2005 Express Edition that will (don't laugh...) keep track, search
> through and find new friends with specific interests on myspace. The
> functionality of myspace is quite a let down so I thought I would try
> to create a program to fill my need. I am stuck at the moment on
> trying to do a httpWebRequest and regex

Hi - I would strongly suggest you stop trying to use regex to parse HTML, and instead track down HtmlAgilityPack, which parses HTML and gives you a nice document tree back. As a newbie you might find this intimidating to start with, but believe me, getting used to working with a document tree will, in both the short and the long term, be far more enjoyable and useful than trying to get regexes for parsing HTML right.

--
Larry Lard
Replies to group please
When starting a new topic, please mention which version of VB/C# you are using
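
For anyone landing on this thread with the same error, here is a minimal sketch of one way the matching loop could be written. It is not a confirmed fix from anyone in the thread. It relies on two facts about the .NET Regex class: Regex.Matches returns a MatchCollection (not strings), and each Match exposes its text through .Value or .Groups(n).Value. The module name FriendIdExtractor, the function ExtractFriendIds and the sample markup are hypothetical, made up for illustration. The pattern is the one from the original post, with the trailing vbCrLf and IgnorePatternWhitespace dropped, on the assumption that they were only Expresso output formatting.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module FriendIdExtractor

    ' Pattern from the original post: 5 to 9 digits followed by "><img.
    ' The lookahead keeps that suffix out of the match itself.
    Private ReadOnly IdPattern As New Regex("(\d{5,9})(?=""><img\b)", RegexOptions.IgnoreCase)

    ' Hypothetical helper: returns the text of every ID found in the page.
    Function ExtractFriendIds(ByVal page As String) As List(Of String)
        Dim ids As New List(Of String)
        For Each m As Match In IdPattern.Matches(page)
            ' A Match is an object, not a String; its text is in .Value,
            ' and capture group 1 is in .Groups(1).Value.
            ids.Add(m.Groups(1).Value)
        Next
        Return ids
    End Function

    Sub Main()
        ' Made-up sample markup standing in for myPage.
        Dim sample As String = "<a href=""profile?friendid=1234567""><img src=""a.jpg""></a>" & _
                               "<a href=""profile?friendid=7654321""><img src=""b.jpg""></a>"
        Dim ids As List(Of String) = ExtractFriendIds(sample)
        ' One ID per line.
        Console.WriteLine(String.Join(Environment.NewLine, ids.ToArray()))
    End Sub

End Module
```

In the button handler from the original post, the equivalent of the last line would be something like TextBox1.Text = String.Join(vbCrLf, ExtractFriendIds(myPage).ToArray()), assuming TextBox1 has Multiline set to True so each ID appears on its own line.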