Awe Forget it

This is a bunch of bull cr*p. I have tried copying tables out on the web and there are so many variations that it's not feasible to write a single regex for every situation.

So, I give up.

Hello...
Please try to keep your related posts together in one thread ;)

And now a suggestion: try the HTML DOM and look at the tags there. Each node has an inner-text property, which can be used to extract the text from any HTML node or even the whole document. Or you can examine all the tables, table rows, or table data cells and extract the information from there.

But scraping information from websites is usually quite hard ;)

This will get all the tables. Set the IgnoreCase and SingleLine options. Use groups.

<table .*?</table>

"Just Me" <news.microsoft.com> wrote in message news:%23iJpLxMLHHA.1008@TK2MSFTNGP06.phx.gbl...
> This is a bunch of bull cr*p. I have tried copying tables out on the web
> and there are so many variations that it's not feasible to write a single
> regex for every situation.
>
> So, I give up.

Well shux, why don't you just read the file one char at a time and use "if" statements and comparison operators? It won't be a minimal task, but it won't be that tough, either.

HTML Parser
http://www.codeproject.com/dotnet/apmilhtml.asp

Just Me,

Why use Regex? With MSHTML it is much easier to get information out of web documents. Be aware that a page can consist of several documents (frames).

http://www.vb-tips.com/dbpages.aspx?ID=541adf13-d9c0-435c-893f-56dbb63fdf1c

Be aware that our website is under heavy reconstruction these weeks.

I hope this helps,

Cor
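
For anyone comparing the two suggestions in this thread, here is a rough sketch of both side by side. It is written in Python rather than VB.NET, purely as an illustration: `re.IGNORECASE` and `re.DOTALL` stand in for the .NET IgnoreCase and SingleLine options, and the standard-library `html.parser` stands in for the DOM/MSHTML approach. The sample page, `LOOSE_PATTERN`, `TableTextParser`, and `extract_tables` are made-up names for this example, not anything from the posts above.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample page; note the second table has no attributes.
SAMPLE = """<html><body>
<TABLE border="1">
<tr><td>a</td><td>b</td></tr>
</TABLE>
<table><tr><td>c</td></tr></table>
</body></html>"""

# The pattern as posted in the thread. It requires a space after "<table",
# so a bare "<table>" tag is not matched.
THREAD_PATTERN = re.compile(r"<table .*?</table>", re.IGNORECASE | re.DOTALL)

# Assumed variant (not from the thread): \b accepts "<table>" and "<table ...>".
LOOSE_PATTERN = re.compile(r"<table\b.*?</table>", re.IGNORECASE | re.DOTALL)


class TableTextParser(HTMLParser):
    """Collects cell text as tables -> rows -> cells.

    Simplification: nested tables are not handled specially.
    """

    def __init__(self):
        super().__init__()
        self.tables = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self.tables[-1].append([])
        elif tag in ("td", "th") and self.tables and self.tables[-1]:
            self.tables[-1][-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only text inside a cell is kept.
        if self._in_cell:
            self.tables[-1][-1][-1] += data


def extract_tables(html):
    """Return every table in the page as a list of rows of stripped cell text."""
    parser = TableTextParser()
    parser.feed(html)
    parser.close()
    return [[[cell.strip() for cell in row] for row in table]
            for table in parser.tables]


if __name__ == "__main__":
    print(len(THREAD_PATTERN.findall(SAMPLE)))  # tables found by the posted regex
    print(len(LOOSE_PATTERN.findall(SAMPLE)))   # tables found by the looser variant
    print(extract_tables(SAMPLE))
```

On this sample the posted regex finds only the first table, while the parser returns the cell text of both, which is roughly the point the DOM replies were making. Real pages with nested tables or broken markup would need more care than this sketch takes.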