Splitting a large string variable into lines <= 70 chars

I need to be able to split large string variables into an array of lines, each line can be no longer than 70 chars.

The string variables are text, so I would additionally like the lines to end at the end of a word, if you catch my drift.

For example, I have a large string variable containing the text:
"I've seen things you people wouldn't believe. Attack ships on fire off the Shoulder of Orion. I watched C-beams glistening in the moonlight at the Tannhauser Gate. All these moments will be lost, like tears in rain. Time to die."

Now, with a limit of 70 chars per line (and lines *must* end with a completed word), I want this to appear like this:
"I've seen things you people wouldn't believe. Attack ships on fire"
"off the Shoulder of Orion. I watched C-beams glistening in the"
"moonlight at the Tannhauser Gate. All these moments will be lost,"
"like tears in rain. Time to die."

ie, split into an array of x lines.

Thanks,

Daren

Just some thoughts off hand and I don't know the commands only that they exist.

Locate the first space from the right starting at 70 in the input string. You then can take everything before that as the first string. Everything after that becomes the new input string. Repeat until the input string is less than 70.

Or count one char at a time through the input string until you get to 70 saving the position of the last space you find.

Or you might split the input string with the split command giving it the space as the delimiter. Then stack the words into new strings checking that they don't add to more than 70 in each string.

Hi Daren,
I tried out one of CJ's excellent suggestions (the first one, actually) and it works for me. This seems to be the most efficient method to me. As CMM said, it's an interesting exercise to try out yourself.

If you're still stuck, here's the code.

(I'm using a preferable Line Length of 65, since we need to search for a space after this length. This means that each line is about 65-75 chars in length, on average, depending on how big the last word is.)

============================================

'StringBuilder requires Imports System.Text
Dim LineLength As Integer = 65
Dim currPos As Integer
Dim theText As String = "My Large String goes here"
Dim thisLine As String
Dim allLines As New StringBuilder()

'Locate the first space after specified no. of chars.(LineLength)
While theText.Length > LineLength
    'Locate the first space at or after position LineLength.
    currPos = theText.IndexOf(" ", LineLength)
    If currPos > -1 Then
        'Get all the text from start of string to currPos
        thisLine = theText.Substring(0, currPos + 1)
        'Remove this extracted part from the original string too.
        theText = theText.Remove(0, currPos + 1)
        'Append this line and a CrLf to the StringBuilder
        allLines.Append(thisLine)
        allLines.Append(vbCrLf)
    End If
End While

'Append the remaining part of the text(last line) to the StringBuilder
allLines.Append(theText)
'Display the Text in a Multiline Textbox
TextBox1.Text = allLines.ToString()

============================================

HTH,

Regards,

Cerebrus.

Cerebrus,
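The loop above searches forward from LineLength, so lines can run past the limit, and it does not advance when IndexOf returns -1. What follows is only a minimal sketch of one possible variant, not the poster's method: it assumes String.LastIndexOf(Char, Int32, Int32) to search backwards from the limit, keeps a start index instead of calling Remove on each pass, and hard-breaks any "word" longer than the limit so the loop always moves on. The names WrapSketch, WrapText, text and maxLen are made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module WrapSketch
    ' Hypothetical helper: returns lines of at most maxLen chars,
    ' breaking at spaces where possible.
    Function WrapText(ByVal text As String, ByVal maxLen As Integer) As List(Of String)
        If maxLen < 1 Then Throw New ArgumentOutOfRangeException("maxLen")

        Dim lines As New List(Of String)()
        Dim start As Integer = 0   ' index where the next line begins

        While start < text.Length
            ' Skip spaces left over from the previous break.
            If text(start) = " "c Then
                start += 1
                Continue While
            End If

            ' The remainder fits on one line: emit it and stop.
            If text.Length - start <= maxLen Then
                lines.Add(text.Substring(start).TrimEnd())
                Exit While
            End If

            ' Search backwards for a space, starting at the first index past
            ' the limit and examining at most maxLen + 1 characters.
            Dim breakPos As Integer = text.LastIndexOf(" "c, start + maxLen, maxLen + 1)

            If breakPos > start Then
                lines.Add(text.Substring(start, breakPos - start).TrimEnd())
                start = breakPos + 1
            Else
                ' No space within the limit: hard-break the over-long "word".
                lines.Add(text.Substring(start, maxLen))
                start += maxLen
            End If
        End While

        Return lines
    End Function

    Sub Main()
        Dim sample As String = "I've seen things you people wouldn't believe. " & _
            "Attack ships on fire off the Shoulder of Orion. " & _
            "I watched C-beams glistening in the moonlight at the Tannhauser Gate."
        For Each wrapped As String In WrapText(sample, 70)
            Console.WriteLine(wrapped)
        Next
    End Sub
End Module
```

This version packs as many words as fit within the limit, so its breaks need not match the hand-wrapped example in the original question. Whether it is faster than the Remove-based loop would have to be measured.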
Thanks for the kind words about my ideas. I knew there was some reason I'm still employed. And I doubt it's for my knowledge of VB.net :)

Actually I was thinking in suggestion #1 of using the InStrRev function to locate the last space before the 70th char. (I had to go look it up to write this reply.) It might be an older command from VB6 era but it is in the .net help.

I've done a lot of string manipulation in my career. Unfortunately not much in VB.

Lol, you're welcome, CJ.
> I knew there was some reason I'm still employed. And I doubt it's for my knowledge of VB.net :)

Well, I'm not yet employed. Still looking for a job! ;-(

The InStrRev function seems perfect for the job in this case. I tried to find a .NET equivalent, but nothing else will do the job in this situation. (Since we're breaking the string *after* finding the space.)

Just a reminder for anyone planning to use similar code, the InStrRev function returns a 1-based index, so you'd need to allow for that offset when using the zero-based Substring method. In my code, I used "currPos + 1" to get the trailing space as well into the substring. (Forgot to trim it later!)

Regards,

Cerebrus.

> The InStrRev function seems perfect for the job in this case. I tried
> to find a .NET equivalent, but nothing else will do the job in this

There is String.LastIndexOf method which does the same as InStrRev.

--
Peter Macej
Helixoft - http://www.vbdocman.com
VBdocman - Automatic generator of technical documentation for VB, VB .NET and ASP .NET code

Hi Peter,
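The index difference between InStrRev and String.LastIndexOf can be seen in a small sketch. It assumes the Microsoft.VisualBasic runtime function InStrRev(StringCheck, StringMatch, Start) and the String.LastIndexOf(Char, Int32) overload; the module name and the sample string are invented for illustration.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module IndexBaseSketch
    Sub Main()
        Dim s As String = "one two three"

        ' InStrRev takes and returns 1-based positions.
        Dim p1 As Integer = InStrRev(s, " ", 10)

        ' LastIndexOf takes and returns 0-based indexes and searches
        ' backwards from the given start index.
        Dim p0 As Integer = s.LastIndexOf(" "c, 9)

        ' Both calls find the same space, so the results differ by one.
        Console.WriteLine(p1 = p0 + 1)

        ' The text before that space, written both ways.
        Console.WriteLine(s.Substring(0, p0))
        Console.WriteLine(s.Substring(0, p1 - 1))
    End Sub
End Module
```

Either call can drive a backwards search for the break position; the only thing to watch is which base the returned position uses before passing it to Substring.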
I did consider the String.LastIndexOf() method, but it didn't seem suited for the job, since in this case (if you analyse the original question), we need to *start* searching backwards from the 70th character for the *first* space. While, the LastIndexOf function will search *forward* for the *last* space. Since we break the string, only after searching for the space, LastIndexOf didn't seem appropriate. If the String.IndexOf() function had a "direction" parameter, it could have been used.

Please let me know if you can think of a way to do it using .NET functions.

Regards,

Cerebrus.

> While, the LastIndexOf function will
> search *forward* for the *last* space.

Sorry, but that's wrong. LastIndexOf searches BACKWARDS. There is also overloaded method with starting index, in your case 70:
String.LastIndexOf Method (String, Int32)
see
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemstringclasslastindexoftopic4.asp

From the documentation:
"The search begins at the startIndex character position of this instance and proceeds backwards towards the beginning until either value is found or the first character position has been examined."

--
Peter Macej

Oops! It seems I missed that part. Thank you so much for that correction, Peter.

I stand corrected. :-)

Regards,

Cerebrus.

Yep, Peter located the .net replacement. I knew someone would. Now to remember that for when I need to use it.

On Mon, 27 Feb 2006 09:51:34 +0100, Peter Macej <pe***@vbdocman.com>
wrote:

>> While, the LastIndexOf function will
>> search *forward* for the *last* space.
>
>Sorry, but that's wrong. LastIndexOf searches BACKWARDS. There is also
>overloaded method with starting index, in your case 70:
>String.LastIndexOf Method (String, Int32)
>see
>http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemstringclasslastindexoftopic4.asp
>
> From the documentation:
>"The search begins at the startIndex character position of this instance
>and proceeds backwards towards the beginning until either value is found
>or the first character position has been examined."

LastIndexOf goes through quite a lot of validation before finally making it to an InternalCall to the CLR. Since you are only looking for a single space and you are likely to find it (on average) within 5-6 iterations(?), I think the fastest approach is to search for it "manually", looping back from pos 70 of each line down to the first space.

Another comment about your source sample: In your sample, you remove the lines you found from the original string. This is easy to read, but is likely to be costly in terms of performance.

Instead, do not modify the original string at all during the loop, but just keep track of where your next line begins, i.e. last line end found becomes the next line start position.

If you are just testing it with a single paragraph or page, you are unlikely to see any effect of this optimization, but if you are writing an eBook converter or high-volume data import function, it could be noticeable.

/JB

Two more comments:

1) After you find your lines, make sure you trim them.

2) Make sure your algorithm handles lines with "words" that are longer than the line length specified. I haven't checked, but I am fairly sure the posted sample would enter an infinite loop if such a beast was encountered. Yes, this could happen. Or have you never seen something like

klajsdflkajsdflkjasdklfjaslkjdflkjasdlkjfkasdfjklasdjklflkjadskljfklasdflkjlkjasdflkjlkajsdfljkasjkldfjklalsdkjflkjasdflkjalkjsdflkjalskjfdlkjasdlkjfjkladflkjalkjsdflkjasdfljkalsdkjfjlafdljkakljfd

in a text file?

/JB

I kinda agree with you that perhaps the string search methods and
functions like InStrRev and LastIndexOf will not be the fastest way. Also perhaps not the fastest way but I'm impressed with the, new to me at least, split command and can see this as parsing the whole thing out into words then adding up words.

Still, only Daren knows how fast it needs to be. Many times the difference isn't that much. Many times for me it comes down to what I understand best. For me if it works usually everyone is happy. Something like this if I had the time might intrigue me to test it all 3 ways on a huge chunk of data. I'd have the program time itself.

It goes without saying you are correct of course on the need for error detection. Interesting you should point out "words" larger than 70. That's an error a lot of folks could overlook but that could occur.

This conversation on the fastest way makes me think of something I've noticed over the years. Please note, I don't condone this and I have NOT done this, on purpose, before. Throwing together a slow app that gets the job done wins you praise for getting the program written quickly. Wait till they grumble it's slow and then throw in a faster routine and you're a hero again! I heard of a programmer who took this to the extreme. He built wait loops into his code to purposely slow it down. Months later when given the project to try to speed up the processing he said he'd try. Weeks later he was praised for making it so much faster. All he'd done was reduce the number of iterations his code spent in the wait loops. Makes you sick doesn't it? Of course this only works if you're the only one that sees the code! I think that's how he got caught.

What have I learned from these observations and this fellow? People want the job done NOW. It's what I'm paid for. I do the best I can making sure it's done within the time allotted. Everyone's happy. (Except me, I'm rarely happy with my code but the realization that getting it done even if not the best way IS doing my job helps me cope.)

I then continue to work on the code as I have time until I get it right and put in the changes. Of course if you follow my lead on this, make darn sure you are improving things with your changes. You don't want to introduce bugs into something that's working.

Joergen Bech <jbech<NOSPAM>@ wrote:
> 2) Make sure your algorithm handles lines with
> "words" that are longer than the line length specified.

---snip---
>This conversation on the fastest way makes me think of something I've
>noticed over the years. Please note, I don't condone this and I have
>NOT done this, on purpose, before. Throwing together a slow app that
>gets the job done wins you praise for getting the program written
>quickly. Wait till they grumble it's slow and then throw in a faster
---snip---

First, write for clarity. Second, measure performance. Third, optimize if necessary.

As for removing each line from the original string, I was merely pointing this out because this *is* a common "error" - just as bad as creating one large string by concatenating many small strings rather than using the StringBuilder class.

The Split approach would avoid the ">70-characters line" problem. I am sure the final code would be cleaner, but not shorter than keeping track of start/end positions and extracting substrings, but I would guess that performance would be worse.

/JB

Sorry to hear about your employment situation. I've been there too.
The IT job market is still tough in a lot of areas. You know .net well and that will help.

You're looking for a line wrapping algorithm. I don't know if there is a built-in function to do it for you.... this is fun and a good exercise to try and come up with on your own without help. It's not hard.

Basically you split your text into an array of words. Fill a string with the words until you determine that adding the next word would surpass the length limit, make a typewriter DING sound in your head, add the string to your lines array, move on to the next line.

"Daren" <spe***@gmail.com> wrote in message news:1140794260.254113.219400@e56g2000cwe.googlegroups.com...
> I need to be able to split large string variables into an array of
> lines, each line can be no longer than 70 chars.
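The word-by-word approach described in the thread (split the text into words, fill a line until the next word would overflow it, then start a new line) might be sketched like this. It is a sketch under assumptions, not anyone's posted solution: it assumes the String.Split(Char(), StringSplitOptions) overload from .NET 2.0, and the names WordFillSketch, FillLines, text and maxLen are invented for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module WordFillSketch
    ' Hypothetical helper: packs space-separated words into lines of at
    ' most maxLen chars. A single word longer than maxLen is left on a
    ' line of its own and will exceed the limit.
    Function FillLines(ByVal text As String, ByVal maxLen As Integer) As List(Of String)
        Dim lines As New List(Of String)()
        Dim current As New StringBuilder()

        ' RemoveEmptyEntries drops the empty strings that runs of
        ' consecutive spaces would otherwise produce.
        Dim words() As String = text.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)

        For Each word As String In words
            ' The + 1 accounts for the space that would join the word to the line.
            If current.Length > 0 AndAlso current.Length + 1 + word.Length > maxLen Then
                lines.Add(current.ToString())   ' the typewriter DING
                current.Length = 0              ' start a fresh line
            End If
            If current.Length > 0 Then current.Append(" "c)
            current.Append(word)
        Next

        ' Flush the last, partially filled line.
        If current.Length > 0 Then lines.Add(current.ToString())
        Return lines
    End Function

    Sub Main()
        Dim sample As String = "All these moments will be lost, like tears in rain. Time to die."
        For Each wrapped As String In FillLines(sample, 30)
            Console.WriteLine(wrapped)
        Next
    End Sub
End Module
```

Because the words are rejoined with single spaces, no trimming is needed afterwards, at the cost of collapsing any runs of spaces in the input.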