[Regular Expression] extraction when bounds are vbCr and vbLf

I need to extract a subtext from a text. The subtext must contain a given word.

The subtext bounds are:

  vbCr (return)
  vbLf (new line)
  vbCrLf (return + new line)
  the very beginning of the text
  the very end of the text

I tried with:

  ^
  \n
  \r
  $

so as to have:

    Dim myText As String
    Dim myPattern As String = "^\n\r" & myWord & "\n\r$"

    Dim match As Match = Regex.Match(myText, myPattern, RegexOptions.Multiline Or RegexOptions.IgnoreCase)

but I had problems.

Try this, where Text is the subtext:

    Dim FoundMatch As Boolean
    Try
        'Escape the subtext so any regex metacharacters in it are taken literally
        FoundMatch = Regex.IsMatch(SubjectString, "^" & Regex.Escape(Text) & "$", RegexOptions.Multiline)
    Catch ex As ArgumentException
        'Syntax error in the regular expression
    End Try

HTH
Chris

Hi Teo,

Just to clarify, are you trying to find all the lines in a given file that contain a particular word?

What does your data look like? Are these strictly text files? Can you give me an example that I can test on? Wherever there is a vbLf, vbCr, or vbCrLf, you can just make a note of it, like this:

  This is some text VbCrLf
  that I want to test VbCrLf
  against.

Regulazy or The Regulator by Roy Osherove might help as well.
http://tools.osherove.com/

Chris

I uploaded a zip file (2 Kb) that contains a .rtf file
with the explanation and a sample, here:
http://www.zshare.net/download/regexsmp-zip.html
(no java required)

Hi Teo,

Thanks for putting that up there. It helped nicely.

Try the following code:

    Imports System.Text.RegularExpressions
    Imports System.Windows.Forms
    Imports System.IO

    Public Module Module1

        Public Sub Main()
            Dim fileName As String = InputBox("Give me the file to parse", _
                                              "File name input box")
            CheckContents(fileName)
        End Sub

        ''' <summary>
        ''' Check the contents of a file
        ''' </summary>
        ''' <param name="Filename"></param>
        ''' <remarks>Could be expanded to check against multiple
        ''' keywords by adding another argument that contains the
        ''' keyword and inserting it in place of the DIO characters</remarks>
        Public Sub CheckContents(ByVal Filename As String)

            'Declare RegExp
            Dim dioRegex As New Regex(".*DIO.*(\n|\r|\r\n)", RegexOptions.IgnoreCase)

            'Make sure the file is really there
            Dim fileExists As Boolean
            fileExists = My.Computer.FileSystem.FileExists(Filename)

            'Throw exception if the file is not there
            If Not fileExists Then Throw New FileNotFoundException

            'Get the contents of the file
            Dim fileContents As String
            fileContents = My.Computer.FileSystem.ReadAllText(Filename)

            'Check file contents against the regex
            Dim dioMatches As MatchCollection = dioRegex.Matches(fileContents)

            'Loop through all of the matches and do something cool with them
            For Each dioMatch As Match In dioMatches

                'Your cool code goes here :o)

                'I'm just going to print the results to a messagebox
                MsgBox(dioMatch.Value)

            Next

        End Sub
    End Module

Please keep in mind that some of the RTF formatting characters are left in. I didn't know if you wanted them kept, but you should be able to easily strip out the /p and other character combinations using Str.Replace(oldChar, newChar), where Str is your data.

Best regards,
Chris

I made a few tests and ran into one problem:
the last sentence is never matched

(that is, if the word is in the last sentence I'm not able to extract the sentence; while if it is in the first sentence, it is all OK...)

Hi Teo,

I missed the case where the text does not end with a line feed, a carriage return, or the combination. Try replacing the dioRegex, in the CheckContents sub, with the following:

    Dim dioRegex As New Regex(".*DIO.*((\n|\r|\r\n)|.*)", RegexOptions.IgnoreCase)

Hope that helps,
Chris
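
A note on the patterns above, offered as a sketch and not as the thread's final answer: in .NET, "." matches every character except \n, so ".*" can run across a bare vbCr, and "$" with RegexOptions.Multiline matches only before \n, not before \r. One way to cover all five bounds from the original question (vbCr, vbLf, vbCrLf, start of text, end of text) is to describe a line as a run of characters that are neither \r nor \n. The module name, the function name LinesContaining, and the sample string below are made up for illustration; only Regex.Matches and Regex.Escape are real library calls.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module LineExtractionSketch

    ' Hypothetical helper: returns every line of source that contains word.
    ' A "line" is any run of characters bounded by vbCr, vbLf, vbCrLf,
    ' or by the start/end of the text.
    Function LinesContaining(ByVal source As String, ByVal word As String) As List(Of String)
        ' [^\r\n]* matches characters that are neither CR nor LF, so a match
        ' can never cross a line boundary, and no terminator is required
        ' after the last line. Regex.Escape makes the word a literal.
        Dim pattern As String = "[^\r\n]*" & Regex.Escape(word) & "[^\r\n]*"

        Dim result As New List(Of String)
        For Each m As Match In Regex.Matches(source, pattern, RegexOptions.IgnoreCase)
            result.Add(m.Value)
        Next
        Return result
    End Function

    Sub Main()
        ' Made-up sample mixing all three terminators; the last line has none.
        Dim sample As String = "first DIO line" & vbCrLf & _
                               "nothing here" & vbCr & _
                               "dio again" & vbLf & _
                               "last line with DIO"

        For Each foundLine As String In LinesContaining(sample, "DIO")
            Console.WriteLine(foundLine)
        Next
    End Sub

End Module
```

With this sample, the loop should print "first DIO line", "dio again", and "last line with DIO", each without its terminator. The match is a plain substring test, as in the DIO example; wrapping the escaped word in \b on both sides would restrict it to whole words, if that is what is wanted.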