
IO function in Vb.Net slower than in Vb6.0

Author
28 Mar 2005 10:38 PM
hillcountry74
Hi,

I'm rewriting a VB6 app in VB.NET. It reads a text file one line at a
time with a StreamReader, parses each line with the Substring and Trim
functions, and writes the parsed string to an output text file with a
StreamWriter. While testing I noticed that it is 15 secs slower than
the VB6 app, and I wonder why. Can someone give me some pointers?

Thanks. Appreciate your time.

Author
28 Mar 2005 10:45 PM
Herfried K. Wagner [MVP]
"hillcountry74" <shruth***@yahoo.com> wrote:
> I'm re-writing a VB6 app in Vb.Net. This basically reads a text file
> using streamreader one line at a time, parses the string using
> substring, trim functions and writes the parsed string to an output
> text file using streamwriter. I've noticed while testing that this is
> 15 secs slower than the VB6 app. Wonder why it is slow. Can someone
> give me some pointers?

VB.NET applications are stored in IL (Intermediate Language) instead of
native code.  At runtime, the CLR's JIT compiler converts the methods' IL to
native code.  This process will take some time and can influence the runtime
of your application.  However, I think that there might be a different
reason for the performance differences.  Could you post the VB6 code and the
corresponding VB.NET version of this code?

--
M S   Herfried K. Wagner
M V P  <URL:http://dotnet.mvps.org/>
V B   <URL:http://classicvb.org/petition/>
Author
28 Mar 2005 10:49 PM
hillcountry74
Thanks for your response

Here is the code:

Public Overrides Sub PreferredInputProcessing()
        Dim InputFileReader As StreamReader
        Dim UndefinedBenefitsFileWriter As StreamWriter
        Dim DataBlock As MHNet.ApplicationBlocks.Data.SqlHelper
        Dim ds As New DataSet()
        Dim dr As SqlClient.SqlDataReader
        Dim sSql As New StringBuilder()
        Dim iNewOptionCount As Integer
        Dim sPreferredInputFile As System.String
        Dim iRecordsProcessed As System.Int32

        'Open input file
        InputFileReader = New StreamReader(_sInputFileLocation)

        'Open output file
        OpenOutputFiles()

        'Create UndefinedBenefits file
        UndefinedBenefitsFileWriter = New StreamWriter("C:\EDIFiles\" & _sRateCode & "UndefinedBenefits.csv")

        'Zero out the number of records processed.
        ProcessReport.RecordsProcessed = 0

        'Set validation properties
        _bValidateEnrollType = True
        _bValidateHeadofHouse = True
        'The Celanese benefits don't have a value for PrimaryStatus in the elig file,
        'but are actually = Primary. So, clsMain defaults it to primary status.
        Select Case _sRateCode
            Case "CELANESEMBH", "MMSI", "GLOBALHEALTH"
                _bValidatePrimaryStatus = False
            Case Else
                _bValidatePrimaryStatus = True
        End Select
        _bValidateMaritalStatus = False

        Do While InputFileReader.Peek > -1

            InitializeInputVariables()
            sPreferredInputFile = Nothing
            sPreferredInputFile = InputFileReader.ReadLine
            '/ skip blank lines
            If sPreferredInputFile.Trim <> "" AndAlso _
               sPreferredInputFile.Trim("?"c) <> "" AndAlso _
               sPreferredInputFile.Trim.Length >= 439 Then
                iRecordsProcessed = iRecordsProcessed + 1

                '/ update display every 100 records
                If iRecordsProcessed Mod 100 = 0 Then
                    Status = "Records processed: " & iRecordsProcessed
                    RaiseEvent ProcessStatus(Me, New System.EventArgs())
                End If

                'Set the input properties by extracting specific
                'values from the input record.
                _sInActionCode = Trim(Mid(sPreferredInputFile, 1, 1))
                _sInCarrierMemId = Trim(Mid(sPreferredInputFile, 2, 25))
                _sInLastName = Trim(Mid(sPreferredInputFile, 27, 60))
                _sInFirstName = Trim(Mid(sPreferredInputFile, 87, 30))
                _sInMiddleName = Trim(Mid(sPreferredInputFile, 117, 15))
                _sInAddr1 = Trim(Mid(sPreferredInputFile, 132, 60))
                _sInAddr2 = Trim(Mid(sPreferredInputFile, 192, 60))
                _sInCity = Trim(Mid(sPreferredInputFile, 252, 30))
                _sInState = Trim(Mid(sPreferredInputFile, 282, 2))
                _sInZip = Trim(Mid(sPreferredInputFile, 284, 10))
                _sInBenefitOption = Trim(Mid(sPreferredInputFile, 294, 60))
                _sInEmployerGroup = Trim(Mid(sPreferredInputFile, 354, 15))
                _sInOptionEffDate = Trim(Mid(sPreferredInputFile, 369, 8))
                _sInHPEffDate = Trim(Mid(sPreferredInputFile, 377, 8))
                _sInTermDate = Trim(Mid(sPreferredInputFile, 385, 8))

                If _sInTermDate = "" Or Not IsDateValid(AddDateDashes(_sInTermDate)) Then
                    _sInTermDate = _sMagicTermDate
                End If
                'TERMING PLANS to set date to manual date
                Select Case _sRateCode
                    Case "MMSI"
                        If _sInTermDate > "20041231" Then
                            _sInTermDate = "20041231"
                        End If
                End Select
                _sInSex = sPreferredInputFile.Substring(392, 1).Trim
                Dim sTmp As System.String
                sTmp = Trim(Mid(sPreferredInputFile, 394, 8))
                If sTmp <> "" Then
                    _sInDOB = Trim(Mid(sTmp, 1, 4)) & "-" & Trim(Mid(sTmp, 5, 2)) & _
                                "-" & Trim(Mid(sTmp, 7, 2))
                End If
                _sInSSN = Trim(Mid(sPreferredInputFile, 402, 9))
                _sInPhone = Trim(Mid(sPreferredInputFile, 411, 12))
                If _sInPhone.Length = 12 Then
                    _sInPhone = Trim(Mid(_sInPhone, 1, 3)) & _
                                Trim(Mid(_sInPhone, 5, 3)) & Trim(Mid(_sInPhone, 9, 4))
                End If
                sTmp = sPreferredInputFile.Substring(422, 8).Trim
                If sTmp <> "" Then
                    _sInEmployerGroupAnivDate = Trim(Mid(sTmp, 1, 4)) & _
                                                "-" & Trim(Mid(sTmp, 5, 2)) & _
                                                "-" & Trim(Mid(sTmp, 7, 2))
                End If

                _sInHeadOfHouse = Trim(Mid(sPreferredInputFile, 431, 9))
                If _sInHeadOfHouse = "" Then
                    _sInHeadOfHouse = Trim(Mid(_sInCarrierMemId, 2, 9))
                End If
                _sInPrimaryStatus = Trim(Mid(sPreferredInputFile, 440, 1))
                _sInEnrollType = Trim(Mid(sPreferredInputFile, 441, 1))

                Try
                    _sInMaritalStatus = Trim(Mid(sPreferredInputFile, 442, 1))
                Catch ex As System.ArgumentOutOfRangeException
                    If ex.Message.IndexOf("Index and length must refer " & _
                        "to a location within the string") > 0 Then _sInMaritalStatus = ""
                End Try
                'Validate the incoming record.
                Validate()
                If _bValidated Then
                    BuildOutputRecord()
                    WriteOutputRecord()
                    ProcessReport.TotalSuccessfulRecords = _
                        ProcessReport.TotalRecordsProcessed - ProcessReport.TotalErrorRecords
                Else
                    WriteOutputErrorRecord()
                    ProcessReport.TotalErrorRecords = ProcessReport.TotalErrorRecords + 1
                End If
            End If 'skip blank lines
        Loop

This is the main parsing routine.
Thanks for your help.
Author
28 Mar 2005 11:05 PM
Herfried K. Wagner [MVP]
"hillcountry74" <shruth***@yahoo.com> wrote:
>                Try
>                    _sInMaritalStatus = Trim(Mid(sPreferredInputFile, 442, 1))
>                Catch ex As System.ArgumentOutOfRangeException
>                    If ex.Message.IndexOf("Index and length must refer " & _
>                        "to a location within the string") > 0 Then _sInMaritalStatus = ""
>                End Try

How often is this exception thrown?  Instead of catching an exception, make
sure that the indices are valid.  In addition, check the performance
of the release version (not the debug version) of the application when it's
started outside the IDE.
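
As a rough illustration of validating the indices up front, here is a minimal sketch. The SafeField helper and its parameter names are made up for this example and are not part of the posted code; startIndex is zero-based as in String.Substring, and the sketch assumes non-negative arguments.

```vbnet
Imports System

Module SafeFieldSketch
    ' Hypothetical helper (not from the posted code): returns the trimmed
    ' field, or an empty string when the record is too short to contain it.
    Function SafeField(ByVal record As String, ByVal startIndex As Integer, ByVal length As Integer) As String
        If record Is Nothing OrElse startIndex >= record.Length Then
            Return String.Empty
        End If
        ' Clamp the length so Substring never reads past the end of the record.
        Dim available As Integer = Math.Min(length, record.Length - startIndex)
        Return record.Substring(startIndex, available).Trim()
    End Function

    Sub Main()
        Dim shortRecord As String = "A12345"
        ' One-based position 442 is zero-based index 441; this record is far shorter.
        Console.WriteLine("[" & SafeField(shortRecord, 441, 1) & "]")
        ' A field that is only partly present is returned as far as it exists.
        Console.WriteLine("[" & SafeField(shortRecord, 1, 25) & "]")
    End Sub
End Module
```

With a check like this the Try/Catch around the last field would no longer be needed, whatever the share of short records turns out to be.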

Author
29 Mar 2005 12:24 AM
hillcountry74
This exception is thrown maybe 1 out of 1000 times. I tried commenting
out this piece of code to see if it makes a difference, but there was
no difference.

I've compiled in release mode and run the resulting exe. But there is
also a .pdb file, which I think was created when I ran the app in
debug mode.

Is there anything else I'm missing? Do you think the Substring and Trim
functions slow it down? Or is the IO the cause?

Thanks for your time.
Author
29 Mar 2005 12:52 AM
MeltingPoint
I've spent a year or so on VB.NET and still consider myself new, but in
my opinion it is the many small reads that are slowing you down. Since I
don't know the structure of the file (it sounds like a text file with
the records all smashed together), I'll suggest a couple of ways that
*I THINK* will speed it up.
1) (If the file has delimiters) Read the whole file into a string and use
String.Split() to create an array that you can then map to variables or
just write straight out -> outFile.Write(array(elementNumber))

2) Read the whole file into a string and use RegularExpressions.Regex
and RegularExpressions.MatchCollection to break the string into parts
and process from there (done right, this should solve the "Trim" problem).

3) If you have control over the file format (which I assume you don't),
fix the file format so you can read it in line by line without further
processing.

4) Out of ideas :) Let me know if it helps or if you need help with one of
the above :)
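
To give an idea of option 2, here is a minimal sketch of a regular expression that splits a fixed-width record into named groups. The field widths (1, 25, 60) are taken from the first three Mid calls in the posted code; the group names, the ParseFields helper and the sample record are made up for illustration. Whether this is actually faster than Mid or Substring is an open question and would need measuring.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexFieldSketch
    ' One named group per fixed-width field: action code (1 char),
    ' carrier member id (25 chars), last name (60 chars).
    Private ReadOnly RecordPattern As New Regex( _
        "^(?<action>.{1})(?<memberId>.{25})(?<lastName>.{60})", _
        RegexOptions.Compiled)

    ' Hypothetical helper: returns the three trimmed fields,
    ' or Nothing when the record is too short to match.
    Function ParseFields(ByVal record As String) As String()
        Dim m As Match = RecordPattern.Match(record)
        If Not m.Success Then Return Nothing
        Return New String() { _
            m.Groups("action").Value.Trim(), _
            m.Groups("memberId").Value.Trim(), _
            m.Groups("lastName").Value.Trim()}
    End Function

    Sub Main()
        ' Made-up record: 1 + 25 + 60 characters, padded with spaces.
        Dim record As String = "A" & "12345".PadRight(25) & "SMITH".PadRight(60)
        Dim fields As String() = ParseFields(record)
        If fields IsNot Nothing Then
            Console.WriteLine(String.Join("|", fields))
        End If
    End Sub
End Module
```

Note that the values still have to be trimmed after matching, so the pattern mainly replaces the chain of Mid calls rather than the Trim calls.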

MP
Author
29 Mar 2005 1:45 AM
Stephany Young
At face value there does not appear to be an obvious
bottleneck. However, you call a number of methods that you have not
described (OpenOutputFiles, InitializeInputVariables, Validate,
BuildOutputRecord, WriteOutputRecord, WriteOutputErrorRecord),
and it is possible that there is a bottleneck in any
of those. In addition, you raise the ProcessStatus event regularly, and it would
be prudent to ensure that whatever is handling that event is not
blocking the process for an inordinate length of time.

From my point of view the number of calls to Trim could be a factor and
perhaps some of them are redundant. For example, take the line:

  _sInActionCode = Trim(Mid(sPreferredInputFile, 1, 1))

If the first character of a 'record' always contains a non-space character
then Trim is redundant. In this case there are 3 new strings being created
(remember that strings are immutable), and there is an overhead, albeit
small, involved in the creation of each string. Removing the Trim from this
line would mean that there are only 2 new strings being created, thus
reducing the overhead accordingly. With the number of string operations in
your PreferredInputProcessing method this could be significant.

You might also try modifying the string parsing to the '.NET way', for
example:

  _sInActionCode = sPreferredInputFile.Substring(0, 1).Trim

or

  _sInActionCode = sPreferredInputFile.Substring(0, 1)

I do not have any benchmarking data but it is possible that you might find a
performance increase.

Another place where, in my view, there is extraneous overhead is:

  If sPreferredInputFile.Trim <> "" AndAlso _
     sPreferredInputFile.Trim("?"c) <> "" AndAlso _
     sPreferredInputFile.Trim.Length >= 439 Then

Note here that you are using the System.String.Trim method rather than the
Microsoft.VisualBasic.Trim function. The Microsoft.VisualBasic.Trim function
returns the source string with leading and trailing space (&H20) characters
removed, while the System.String.Trim method returns the source string after
white space characters are removed from the beginning and end. Note that
there is a difference between 'space' characters and 'white space'
characters. It is unclear what actual character is being specified in the
sPreferredInputFile.Trim("?"c) clause, but it is highly likely that it
qualifies as 'white space' and is therefore being removed by the first
clause. I would be inclined to code the test like this:

  sPreferredInputFile = sPreferredInputFile.Trim()
  If sPreferredInputFile.Length >= 439 Then

The 3 string operations are now reduced to 1 and the number of comparison
operations is also reduced from three to one. Given the above you might be
able to refine your parsing code and identify further redundancies.
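
To see the 'space' versus 'white space' difference for yourself, a small sketch along these lines should do. The sample string is made up, and the comments describe the behaviour as I understand it from the documentation, so please verify on your own machine.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module TrimSketch
    Sub Main()
        ' Made-up sample: a tab and a space on each side of the text.
        Dim line As String = vbTab & " RECORD " & vbTab

        ' Microsoft.VisualBasic.Strings.Trim is documented to strip spaces only,
        ' so the outer tabs (and the spaces behind them) are expected to stay.
        Dim vbTrimmed As String = Strings.Trim(line)

        ' System.String.Trim() strips all leading and trailing white space.
        Dim netTrimmed As String = line.Trim()

        Console.WriteLine("Original length:     " & line.Length.ToString())
        Console.WriteLine("Strings.Trim length: " & vbTrimmed.Length.ToString())
        Console.WriteLine("String.Trim length:  " & netTrimmed.Length.ToString())
    End Sub
End Module
```

If the two lengths differ for your real input lines, the two functions are not interchangeable in the parsing code and switching between them may change the output.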


Author
29 Mar 2005 9:23 AM
Cor Ligthert
HillCountry,

> I'm re-writing a VB6 app in Vb.Net. This basically reads a text file
> using streamreader one line at a time, parses the string using
> substring, trim functions and writes the parsed string to an output
> text file using streamwriter. I've noticed while testing that this is
> 15 secs slower than the VB6 app. Wonder why it is slow. Can someone
> give me some pointers?
>
When you want to test this, then you should use comparable code.

That means

Read inputline
outputline = inputline
Write outputline.

Because I don't have VB6 installed, I cannot test that.

However it looks strange to me.
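
A baseline along those lines could look like the sketch below: it only copies lines from one file to another and times the loop, so the pure IO cost can be compared with the full routine. The temporary files, the record length of 442 characters and the count of 100000 lines are assumptions made up so that the sketch runs on its own; replace them with the real input file.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO

Module BaselineCopySketch
    Sub Main()
        ' Hypothetical files; a sample input is generated so the sketch runs as-is.
        Dim inputPath As String = Path.GetTempFileName()
        Dim outputPath As String = Path.GetTempFileName()
        Using seed As New StreamWriter(inputPath)
            For i As Integer = 1 To 100000
                seed.WriteLine(New String("x"c, 442))
            Next
        End Using

        ' Time only the read-one-line, write-one-line loop.
        Dim timer As Stopwatch = Stopwatch.StartNew()
        Using reader As New StreamReader(inputPath)
            Using writer As New StreamWriter(outputPath)
                Dim line As String = reader.ReadLine()
                Do While line IsNot Nothing
                    writer.WriteLine(line)
                    line = reader.ReadLine()
                Loop
            End Using
        End Using
        timer.Stop()
        Console.WriteLine("Copy took " & timer.ElapsedMilliseconds.ToString() & " ms")

        File.Delete(inputPath)
        File.Delete(outputPath)
    End Sub
End Module
```

If this loop alone is fast, the difference most likely sits in the parsing or validation code rather than in StreamReader and StreamWriter.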

Cor
Author
29 Mar 2005 3:13 PM
hillcountry74
Thanks guys for your suggestions.

MP,
The file does not have delimiters but follows a specific format, and
hence I used Mid to parse it.

Can you please give me more info on using regular expressions as a
replacement for the Trim function?


Stephany,
The file might contain a valid character in position 1. So, I still
need to use Trim. I also suspect Trim to be the cause. I read this
article on MSDN,
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_vstechart/html/vbtchmicrosoftvisualbasicnetinternals.asp
which recommends using Mid instead of Substring. I couldn't see a difference.

I will change the If condition as you have mentioned and let you know
the results. As for "?", I think it is a Unicode character. Earlier, I
was not checking for it, and in one of the files it appeared after the
last record; when the code tried to do a Substring on it, it threw an
exception. I was previously using Substring instead of Mid and changed
it after reading the above article.

Guys, more suggestions would be really appreciated. I've been stuck on
this issue for the past week. Please help!!

Thanks again for your time.
Author
29 Mar 2005 3:21 PM
hillcountry74
I wanted to add to the above: the parsing routine is in a DLL and is
called from the front-end app, which is a separate project in
the same solution. Could this architecture be a problem?
Author
29 Mar 2005 3:36 PM
hillcountry74
Guys,

I commented out the call to the Validate method and it was faster by 10 secs.
Here is the Validate method code. Please let me know how I can optimize
this. (Please note that ValidateLastName, ValidateFirstName,
ValidateStateName etc. are all the same.) I'm using the IndexOf method in
the DoesBadCharExist method. Is there a better way? Thanks

Protected Sub Validate()
        '**********************************************************
        ' Validate the current input record
        '**********************************************************
        Dim sTmpHold As System.String
        '/ Set the validated flag to True.
        _bValidated = True


        '/ Initialize the output error record.
        BuildOutputErrorRecord()

        '/ Member ID Validation
        If _sInCarrierMemId = "" Then
            Throw New InvalidFieldException.MissingMemberIDException()
        End If

        '/ Mhnet Member Validation
        If Not _bMhnetMember Then
            Select Case _sRateCode
                Case "HCUSA", "HUMANAFLHMO", "HUMANAFLPPO", "HUMANA"
                    'Commented out BB 2003-03-04 line from criteria because added humanafl
                    'And frmMain.cboRateCode = "HCUSA" Then
                    Throw New InvalidFieldException.MHNetMemberException()
                Case Else
            End Select
        End If

        '/ Last name Validation - chars "A-Z,.-'0-9"
        Select Case ValidateLastName(_sInLastName, _sLastnameChars)
            Case 0
                Throw New InvalidFieldException.MissingLastnameException()
            Case 1
                Throw New InvalidFieldException.BadFormatLastnameException()
            Case Else
        End Select

        '/ First name validation - chars "A-Z.'"
        Select Case ValidateFirstName(_sInFirstName, _sFirstNameChars)
            Case 0
                Throw New InvalidFieldException.MissingFirstnameException()
            Case 1
                Throw New InvalidFieldException.BadFormatFirstnameException()
            Case Else
        End Select

        '/ Middle name validation - chars "A-Z"
        If ValidateMiddleName(_sInMiddleName, _sMiddleNameChars) <> True Then
            Throw New InvalidFieldException.BadFormatMiddlenameException()
        End If

        '/ City name validation - chars "A-Z.-'"
        If _sInCity <> "" Then
            _sInCity = _sInCity.Replace("/", "")  'added 20050221 BB
            _sInCity = _sInCity.Replace("\", "")  'added 20050221 BB
            _sInCity = _sInCity.Replace(",", "") 'added 20050221 BB
            If ValidateCityName(_sInCity, _sCityNameChars) <> True Then
                Throw New InvalidFieldException.BadFormatCityException()
            End If
        End If

        '/ State name validation - chars "A-Z"
        If ValidateStateName(_sInState, _sStateNameChars) <> True Then
            Throw New InvalidFieldException.BadFormatStateException()
        End If

        '/ SSN validation - make sure SSN is only numeric if it exists
        If _sInSSN <> "" Then
            _sInSSN = Mid(_sInSSN, 1, 9)
            If Not IsNumeric(_sInSSN) Then
                Throw New InvalidFieldException.BadFormatSSNException()
            End If
        End If

        '/ Phone validation - make sure Phone is only numeric if it exists
        If _sInPhone <> "" Then
            _sInPhone = _sInPhone.Replace("-", "")
            _sInPhone = _sInPhone.Replace(" ", "")   'added 20050221 BB
            _sInPhone = _sInPhone.Replace("*", "")   'added 20050221 BB
            _sInPhone = _sInPhone.Replace(".", "")   'added 20050221 BB
            _sInPhone = _sInPhone.Replace("/", "")   'added 20050221 BB
            If Not IsNumeric(_sInPhone) Then
                Throw New InvalidFieldException.BadFormatPhoneException()
            End If
        End If

        '/ Date of Birth Validation
        sTmpHold = AddDateDashes(_sInDOB)
        If Not IsDateValid(sTmpHold) Then
            Throw New InvalidFieldException.DateOfBirthException()
        Else
            _sInDOB = System.String.Format("{0:yyyyMMdd}", CType(sTmpHold, Date))
        End If
        If sTmpHold > _sMagicTermDateWithDashes Then
            _sInDOB = _sMagicTermDate
        End If
        '_sInDOB = CheckMaxDate(_sInDOB)

        sTmpHold = AddDateDashes(_sInOptionEffDate)
        If IsDateValid(sTmpHold) Then
            If System.String.Format("{0:yyyyMMdd}", _sInOptionEffDate) < _
               System.String.Format("{0:yyyyMMdd}", _sInceptionDate) Then
                _sInOptionEffDate = System.String.Format("{0:yyyyMMdd}", _sInceptionDate)
            Else
                _sInOptionEffDate = System.String.Format("{0:yyyyMMdd}", _sInOptionEffDate)
            End If
        Else ' _sInOptionEffDate is not valid
            Throw New InvalidFieldException.OptionEffDateException()
        End If
        'End If
        'Commented out above Code, goes with above comment out code block  bb 2002-12-12
        If sTmpHold > _sMagicTermDateWithDashes Then
            _sInOptionEffDate = _sMagicTermDate
        End If
        '_sInOptionEffDate = CheckMaxDate(_sInOptionEffDate)

        '/ if this contains "        " (8 blanks) then this allows the
        'code to choose inception or filedate for hpeffdate
        'Commented code below while re-writing in .Net
        'as for an invalid date it was always defaulting to an empty string and was processed
        'when it should actually be errored.
        'If Not IsDateValid(AddDateDashes(_sInHPEffDate)) Then
        '    _sInHPEffDate = ""
        'End If

        sTmpHold = AddDateDashes(_sInHPEffDate)
        If _sInHPEffDate = "" Then
            'If the file date is before the inception date, use the inception date.
            'Otherwise, use the file date.
            If System.String.Format("{0:yyyyMMdd}", FileDateTime(_sInputFileLocation).ToString) < _
               System.String.Format("{0:yyyyMMdd}", _sInceptionDate) Then
                _sInHPEffDate = System.String.Format("{0:yyyyMMdd}", _sInceptionDate)
            Else
                _sInHPEffDate = System.String.Format("{0:yyyyMMdd}", _
                    FileDateTime(_sInputFileLocation).ToString)
            End If
        Else
            If IsDateValid(sTmpHold) Then
                _sInHPEffDate = System.String.Format("{0:yyyyMMdd}", CType(sTmpHold, Date))
            Else
                Throw New InvalidFieldException.HPEffDateException()
            End If
        End If
        '_sInHPEffDate = CheckMaxDate(_sInHPEffDate)
        If sTmpHold > _sMagicTermDateWithDashes Then
            _sInHPEffDate = _sMagicTermDate
        End If
        '/ Benefit Option Validation
        If _sInBenefitOption = "" Then
            Throw New InvalidFieldException.MissingBenefitOptionException()
        End If

        '/ Employer Group Validation
        If _sInEmployerGroup = "" And _bValidateEmployerGroup Then
            Throw New InvalidFieldException.MissingEmployerGroupException()
        End If

        '/ Set the Term Date to 12.31.2078 if the Term date is not a valid date.
        sTmpHold = AddDateDashes(_sInTermDate)
        If IsDateValid(sTmpHold) Then
            _sInTermDate = System.String.Format("{0:yyyyMMdd}", CType(sTmpHold, Date))
        Else
            _sInTermDate = _sMagicTermDate
        End If

        ''/ Term Date Validation
        ''/ If (msInTermDate < Format(Now(), "yyyymmdd")) Then
        If (_sInTermDate < System.String.Format("{0:yyyyMMdd}", CType(_sInceptionDate, Date))) Then
            'msOutErrorRec = msOutErrorRec & " Invalid Term Date Error: " & msInTermDate
            '/ changed per Kit on 4-17-2001
            _sInTermDate = System.String.Format("{0:yyyyMMdd}", CType(_sInceptionDate, Date))
            Throw New InvalidFieldException.TermDateException()
        End If
        '_sInTermDate = CheckMaxDate(_sInTermDate)
        If _sInTermDate > _sMagicTermDateWithDashes Then
            _sInTermDate = _sMagicTermDate
        End If

        '/ Employer Group Aniversary date validation
        If Not IsDateValid(AddDateDashes(_sInEmployerGroupAnivDate)) Then
            _sInEmployerGroupAnivDate = ""
        Else
            _sInEmployerGroupAnivDate = System.String.Format("{yyyymmdd}", _
                AddDateDashes(_sInEmployerGroupAnivDate))
        End If
        '_sInEmployerGroupAnivDate = CheckMaxDate(_sInEmployerGroupAnivDate)
        If _sInEmployerGroupAnivDate > _sMagicTermDateWithDashes Then
            _sInEmployerGroupAnivDate = _sMagicTermDate
        End If

        '/ If the Head of House is blank and the element was NOT supplied in the
        '/ submitted positive enrollment file, use the left nine characters
        '/ of the Carrier Member ID.
        If _sInHeadOfHouse = "" And _bValidateHeadofHouse = False Then
            _sInHeadOfHouse = _sInCarrierMemId.Substring(0, 9)
        End If

        '/ Head of House Validation - chars "A-Z,.-'0-9"
        Select Case ValidateHeadHouse(_sInHeadOfHouse, _sHeadOfHouseChars)
            Case 0 'If the Head of House element was supplied and was blank, reject the record.
                Throw New InvalidFieldException.MissingHeadofHouseException()
                '/ If Head of House contains garbage chars reject the record
            Case 1
                Throw New InvalidFieldException.BadFormatHeadofHouseException()
            Case Else
        End Select

        '/ Primary Status Validation
        '/ If the Primary Status is blank and the Primary Status element was NOT
        '/ submitted as an element of the positive enrollment file, use "P"
        If _sInPrimaryStatus = "" And _bValidatePrimaryStatus = False Then
            _sInPrimaryStatus = "P"
            '/ If the Primary Status element was supplied and was blank, reject the record.
        ElseIf _sInPrimaryStatus = "" And _bValidatePrimaryStatus = True Then
            Throw New InvalidFieldException.MissingPrimaryStatusException()
        Else '/ it was supplied, make sure it is a P or S
            Select Case _sInPrimaryStatus.ToUpper
                Case "P", "S"
                Case Else
                    Throw New InvalidFieldException.BadFormatPrimaryStatusException()
            End Select
        End If

        '/ Enroll Type Validation
        '/ If Enroll Type is blank and it was not one of the supplied elements in
        '/ the health plans positive enrollment file, set Enroll Type to "I".
        If _sInEnrollType = "" And _bValidateEnrollType = False Then
            _sInEnrollType = "I"
            '/ If the Enroll Type element was supplied and was blank, reject the record.
        ElseIf _sInEnrollType = "" And _bValidateEnrollType = True Then
            Throw New InvalidFieldException.MissingEnrollTypeException()
        Else '/ it was supplied, make sure it is an I, S, D, or C
            Select Case _sInEnrollType.ToUpper
                Case "I", "S", "D", "C"
                Case Else
                    Throw New InvalidFieldException.BadFormatEnrollTypeException()
            End Select
        End If

        '/ If Marital status is supplied and was blank reject
        If _sInMaritalStatus = "" And _bValidateMaritalStatus = True Then
            Throw New InvalidFieldException.MissingMaritalStatusException()
            '/ assure that only "S" and "M" are passed
        Else
            Select Case _sInMaritalStatus.ToUpper
                Case "S", "M", ""
                Case Else
                    Throw New InvalidFieldException.BadFormatMaritalStatusException()
            End Select
        End If
    End Sub

    Protected Overridable Function ValidateLastName(ByVal sSuspect As String, ByVal sGoodChars As String) As Integer
        If sSuspect.Length = 0 Then
            Return 0
        End If
        If DoesBadCharExist(sSuspect, sGoodChars) = True Then
            Return 1
        Else
            Return 2
        End If
    End Function

    Protected Overridable Function ValidateFirstName(ByVal sSuspect As String, ByVal sGoodChars As String) As Integer
        If sSuspect.Length = 0 Then
            Return 0
        End If
        If DoesBadCharExist(sSuspect, sGoodChars) = True Then
            Return 1
        Else
            Return 2
        End If
    End Function

    Protected Overridable Function AddDateDashes(ByVal sSuspect As String) As String
        '/ add dashes to dates so that the IsDate function will work properly
        '/ 2000-12-26 rlt
        Dim sCached As String

        sCached = sSuspect.Trim
        If sCached.Length = 8 Then
            Return sCached.Substring(0, 4) & "-" & sCached.Substring(4, 2) & "-" & sCached.Substring(6, 2)
        Else
            Return sCached
        End If
    End Function

    Protected Overridable Function DoesBadCharExist(ByVal sSuspect As String, ByVal sGoodChars As String) As Boolean
        Dim iCount As Integer
        For iCount = 0 To sSuspect.Length - 1
            If sGoodChars.IndexOf(sSuspect.ToUpper.Chars(iCount)) < 0 Then
                Return True
            End If
        Next iCount
        Return False
    End Function
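
One thing that stands out in DoesBadCharExist is that sSuspect.ToUpper is evaluated on every pass through the loop, so a new upper-cased copy of the whole string is built for each character checked. Below is a minimal sketch of a variant that upper-cases once before the loop. The names BadCharCheck and DoesBadCharExistFast are made up for illustration, and how much of the 10 seconds this accounts for would have to be measured.

```vbnet
Imports System

Module BadCharCheck
    ' Hypothetical variant of DoesBadCharExist: same result, but the
    ' upper-cased copy of the input is created once instead of once per character.
    Function DoesBadCharExistFast(ByVal sSuspect As String, ByVal sGoodChars As String) As Boolean
        Dim sUpper As String = sSuspect.ToUpper()
        For iCount As Integer = 0 To sUpper.Length - 1
            ' IndexOf(Char) returns -1 when the character is not in the allowed set
            If sGoodChars.IndexOf(sUpper.Chars(iCount)) < 0 Then
                Return True
            End If
        Next
        Return False
    End Function
End Module
```

Because the check itself is unchanged, it should be possible to swap this in for the original and compare timings on the same input file.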
Author
29 Mar 2005 10:17 PM
Stephany Young
I could be wrong but, I'm sure that MeltingPoint didn't realise that you are
dealing with a 'fixed' record when he alluded to using RegEx for the
parsing.

However RegEx would certainly be of assistance in the validation. Careful
construction of RegEx expressions would bring efficiencies to this method, e.g. it
would make DoesBadCharExist obsolete.

Now, don't take this the wrong way here, but from the fragments you have
supplied and the obvious complexity of the operation, it is getting into the
area where you might be better off engaging a consultant to review the
project and make recommendations. Analysing the overall operation and making
the appropriate recommendations would take a number of hours, if not days,
and it would be unfair to expect those who donate their time and expertise,
quite freely I might add, to advise on something with the scope of your
project without being given the full picture.

My analysis of your fragments is that there is a lot more to your
'problem' than meets the eye and I consider that if you try to get advice
'piecemeal' then you won't end up getting the performance boost you are
looking for and/or you will get advice that is entirely appropriate for the
fragment in question but might cause problems for you in the 'bigger
picture'.

That said, feel free to post 'questions' about specific things that you would like
advice on like 'How would I go about doing a benchmark test to see if Mid is
more efficient than SubString' or 'How would I construct a Regex expression
to make sure a string contains only certain characters'.
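
As a rough illustration of the second kind of question, here is one way a Regex could check that a string contains only certain characters. This is only a sketch: the allowed set is taken from the code comment for last names ("A-Z,.-'0-9"), the real contents of _sLastnameChars are not shown in the thread, and the names AllowedCharsCheck and ContainsOnlyAllowedChars are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AllowedCharsCheck
    ' Assumed allowed set for last names: letters, digits, comma, period,
    ' hyphen and apostrophe. The Regex is built once and reused for every record.
    Private ReadOnly LastNamePattern As New Regex("^[A-Z0-9,.'\-]*\z", RegexOptions.IgnoreCase)

    ' Returns True when every character of sSuspect is in the allowed set.
    ' An empty string also returns True, so the "missing" test stays separate.
    Function ContainsOnlyAllowedChars(ByVal sSuspect As String) As Boolean
        Return LastNamePattern.IsMatch(sSuspect)
    End Function
End Module
```

RegexOptions.IgnoreCase stands in for the ToUpper call in the original routine. Whether this is faster than the IndexOf loop is an open question that a benchmark on real data would have to answer.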


Author
30 Mar 2005 12:53 AM
MeltingPoint
"hillcountry74" <shruth***@yahoo.com> wrote in
news:1112110584.467762.175290@o13g2000cwo.googlegroups.com:

<lots o code>

Some good ideas so far. I've started to put the regex expression together
for you, could have it done in a few hours. If you want to send me one of
these files, (important info changed of course) I could fine tune the
expression. macmanic(zero)(zero)atHotmail.com

Note to anyone else reading this thread: any ideas on the speed of regex as
opposed to Substring/IndexOf? I can say for sure that I've parsed a 4 MB
file with regex in a few hundred milliseconds.
Author
30 Mar 2005 3:18 PM
hillcountry74
MeltingPoint,

Thanks for your help. I've just emailed a sample file (3.4MB).


Author
30 Mar 2005 1:14 AM
MeltingPoint
"hillcountry74" <shruth***@yahoo.com> wrote in
news:1112110584.467762.175290@o13g2000cwo.googlegroups.com:

<lots o code>

Some good ideas so far. I've started to put the regex expression
together for you, could have it done in a few hours. If you want to send
me one of these files, (important info changed of course) I could fine
tune the expression. macmanic(zero)(zero)atHotmail.com

Note to anyone else reading this thread: any ideas on the speed of regex
as opposed to Substring/IndexOf? I can say for sure that I've parsed a
4 MB file with regex in a few hundred milliseconds.

++Just saw Stef's comment. I'm not sure what difference it makes as to
whether it's fixed or not. RegEx still works and it's a lot easier on the
eyes:)
((?<ActionCode>.)
(?<CarrierID>\d{0,25})
(?<LastName>\w{0,60}\s*\b)
(?<FirstName>\w{0,30}\s*\b)
(?<MiddleName>\w{0,15}\s*\b)
(?<Addr1>.{0,60}\s*\b)
(?<Addr2>.{0,60}\s*\b)
(?<City>.{0,30}\s*\b)
(?<State>.{0,2}\s*\b)
(?<Zip>.{0,10}\s*\b))
Actually the fact that it's fixed makes it easier.

And a note as to how close I was paying attention:
sPreferredInputFile.Trim.Length >= 439
does not 'allude' to me that it is totally fixed.

However, I don't know Stef, she probably knows more than me, considering
I just started using RegEx a month ago. But the above Regex does match
the following:

e8374837463784958473627495Sc9ott                         8nglis                                       
Micheal        554 sdf sdf                                               
667 rtert ertwert                                        Hell                   
FL90210

Which I think is what the record looks like (at least so far)

Let me know, both of you :)
MP
Author
30 Mar 2005 2:32 AM
Stephany Young
I surrender. I was having an aberration and thinking of Regex for simple
pattern matching rather than its 'extraction' capability.

_sInHeadOfHouse = Trim(Mid(sPreferredInputFile, 431, 9))
....
_sInPrimaryStatus = Trim(Mid(sPreferredInputFile, 440, 1))
_sInEnrollType = Trim(Mid(sPreferredInputFile, 441, 1))
Try
    _sInMaritalStatus = Trim(Mid(sPreferredInputFile, 442, 1))
....

This stuff here indicates to me that the record is, more than likely, fixed.
Note the Try ... Catch ... End Try to catch if there are not 442 characters,
but there is no matching construct for position 440 and 441. The earlier
test is for a record length of 439 characters or more, so the record might
be 439, 440, 441 or 442 characters. The catcher on position 442 implies that
characters 440 and 441 are always present. I read between the lines and
decided that 442 'should' always be present. Given that hillcountry74 hasn't
provided all the information this was a 50/50 call but for the purposes of
the exercise is largely irrelevant.

I have no problem with being proved wrong, but I don't think that your regex
will work for parsing here.

In your example thus far you rely on there being exactly 25 digits for
CarrierID. If there are fewer then your match attempt for LastName won't
start at position 27. Remember that the start position for each component of
the string is specifically defined. Also there is no indication that
CarrierID is numeric which means that it should use . instead of \d. To read
the correct number of characters, the quantifier must be {25} rather than
{0,25} and this means that you have read any trailing spaces as well which
still have to be trimmed off when the matches are read out.

(?<LastName>\w{0,60}\s*\b) will only handle simple names - those with no
imbedded spaces or punctuation characters like "van Allen", "O'Brien",
"Mandeville-Brown". Also it is common for company names to be stored in a
LastName field and other name fields left blank like "Acme Inc.". \w will
miss imbedded spaces, apostrophes, hyphens and periods. Another factor is
that you get idiots hitting the spacebar just as they are starting to type a
name and never correcting it so you can get " Smith". The \w will report no
match at all in this case. Use of the \b will only make things worse in such
cases.

In this case I think that the Mid or SubString methods are best for the
actual parsing, however regex will certainly make the validation routine
more compact and efficient because here you are operating on each individual
string rather than trying to pick the character sequence from position x to
position y and therefore second-guessing what is actually there or not there as
the case may be.
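
If Substring is used instead of Mid for the slicing, one difference to keep in mind is that Substring throws an ArgumentOutOfRangeException when the record is shorter than the requested range, whereas Mid just returns whatever is there. A small sketch of a helper that keeps Mid's forgiving behaviour while using Substring follows. The names FixedWidth and Field are invented for illustration, and the 1-based start is an assumption chosen to match the Mid positions quoted above.

```vbnet
Imports System

Module FixedWidth
    ' Returns the trimmed field that starts at the 1-based position "start"
    ' and is up to "length" characters long. Short records yield a shorter
    ' (possibly empty) string instead of an exception.
    Function Field(ByVal record As String, ByVal start As Integer, ByVal length As Integer) As String
        Dim zeroBased As Integer = start - 1
        If zeroBased >= record.Length Then
            Return ""
        End If
        Dim available As Integer = Math.Min(length, record.Length - zeroBased)
        Return record.Substring(zeroBased, available).Trim()
    End Function
End Module
```

With a helper like this, something such as Field(sPreferredInputFile, 442, 1) would not need its own Try ... Catch for records that stop at 441 characters.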

BTW: I have a perfectly good name - there is no need to assume that it needs
contracting or that the spelling needs changing.


Author
30 Mar 2005 3:48 AM
MeltingPoint

I knew I would catch it for that :) Force of habit from my personal
life:)

OK just checked it, imbedded spaces screw it up. And there's nothing I
can think of readily. I've seen some funky regexes - I'm sure it can
be done but not by me:) I tried just doing:

((?<ActionCode>.{1})" _
& "(?<CarrierID>.{25})" _
& "(?<LastName>.{60})" _
& "(?<FirstName>.{30})" _
& "(?<MiddleName>.{15})" _
& "(?<Addr1>.{60})" _
& "(?<Addr2>.{60})" _
& "(?<City>.{30)" _
& "(?<State>.{2})" _
& "(?<Zip>.{10}))"

....and my computer actually laughed at me!!

Back to the drawing board...
Author
30 Mar 2005 6:44 AM
Stephany Young
You have a typo in your "(?<City>.{30)" - a missing }

Anyway, this works a treat with the caveat that the target string has to be
the expected length (442) or longer.

On my machine 10000 iterations take 1 second give or take a few milliseconds and
100000 iterations take 10 seconds give or take a few milliseconds. It is
fair to say that, as writ and on my machine, as a parser it will handle
approx 1000 records per second.
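
For anyone wanting to reproduce timings like these, a rough sketch of a timing loop is below. It is not the code that produced the figures above. It assumes .NET 2.0 or later for the Stopwatch class (on 1.1, Environment.TickCount could be substituted), and the names ParseBenchmark and TimeRegexParse are hypothetical.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module ParseBenchmark
    ' Runs the same match "iterations" times and returns elapsed milliseconds.
    ' The Regex is constructed once, outside the timed loop.
    Function TimeRegexParse(ByVal pattern As String, ByVal record As String, _
                            ByVal groupName As String, ByVal iterations As Integer) As Long
        Dim r As New Regex(pattern)
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            Dim m As Match = r.Match(record)
            ' Read one group so the work of extracting a field is included
            Dim value As String = m.Groups(groupName).Value.Trim()
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function
End Module
```

Timing a Substring-based parse of the same record with the same loop would give a like-for-like comparison. The first call also includes JIT compilation, so it may be worth discarding one warm-up run.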

So, I stand educated, you can do rudimentary parsing with Regex so long as
the expression is very carefully constructed.

Dim _s As String = "ACarrierID<16 spaces>" & _
  "LastName<52 spaces>" & _
  "FirstName<21 spaces>" & _
  "MiddleName<5 spaces>" & _
  "Addr1<55 spaces>" & _
  "Addr2<55 spaces>" & _
  "City<26 spaces>" & _
  "StZip<7 spaces>" & _
  "BenefitOption<47 spaces>" & _
  "EmployerGroup  OptionEfHPEffDatTermDate" & _
  "SDOB<5 spaces>" & _
  "SSN<6 spaces>" & _
  "Phone<7 spaces>" & _
  "EmployerHeadOfHouPM"

Dim _exp As String = "(?<ActionCode>.{1})" & _
  "(?<CarrierID>.{25})" & _
  "(?<LastName>.{60})" & _
  "(?<FirstName>.{30})" & _
  "(?<MiddleName>.{15})" & _
  "(?<Addr1>.{60})" & _
  "(?<Addr2>.{60})" & _
  "(?<City>.{30})" & _
  "(?<State>.{2})" & _
  "(?<Zip>.{10})" & _
  "(?<BenefitOption>.{60})" & _
  "(?<EmployerGroup>.{15})" & _
  "(?<OptionEffDate>.{8})" & _
  "(?<HPEffDate>.{8})" & _
  "(?<TermDate>.{8})" & _
  "(?<Sex>.{1})" & _
  "(?<DOB>.{8})" & _
  "(?<SSN>.{9})" & _
  "(?<Phone>.{12})" & _
  "(?<EmployerGroupAnivDate>.{8})" & _
  "(?<HeadOfHouse>.{9})" & _
  "(?<PrimaryStatus>.{1})" & _
  "(?<MaritalStatus>.{1})"

Dim r As Regex = New Regex(_exp)

Dim m As Match = r.Match(_s)

Dim _sInActionCode As String = m.Groups("ActionCode").ToString.Trim
Dim _sInCarrierID As String = m.Groups("CarrierID").ToString.Trim
Dim _sInLastName As String = m.Groups("LastName").ToString.Trim
Dim _sInFirstName As String = m.Groups("FirstName").ToString.Trim
Dim _sInMiddleName As String = m.Groups("MiddleName").ToString.Trim
Dim _sInAddr1 As String = m.Groups("Addr1").ToString.Trim
Dim _sInAddr2 As String = m.Groups("Addr2").ToString.Trim
Dim _sInCity As String = m.Groups("City").ToString.Trim
Dim _sInState As String = m.Groups("State").ToString.Trim
Dim _sInZip As String = m.Groups("Zip").ToString.Trim
Dim _sInBenefitOption As String = m.Groups("BenefitOption").ToString.Trim
Dim _sInEmployerGroup As String = m.Groups("EmployerGroup").ToString.Trim
Dim _sInOptionEffDate As String = m.Groups("OptionEffDate").ToString.Trim
Dim _sInHPEffDate As String = m.Groups("HPEffDate").ToString.Trim
Dim _sInTermDate As String = m.Groups("TermDate").ToString.Trim
Dim _sInSex As String = m.Groups("Sex").ToString.Trim
Dim _sInDOB As String = m.Groups("DOB").ToString.Trim
Dim _sInSSN As String = m.Groups("SSN").ToString.Trim
Dim _sInPhone As String = m.Groups("Phone").ToString.Trim
Dim _sInEmployerGroupAnivDate As String = _
  m.Groups("EmployerGroupAnivDate").ToString.Trim
Dim _sInHeadOfHouse As String = m.Groups("HeadOfHouse").ToString.Trim
Dim _sInPrimaryStatus As String = m.Groups("PrimaryStatus").ToString.Trim
Dim _sInMaritalStatus As String = m.Groups("MaritalStatus").ToString.Trim

Console.WriteLine("_sInActionCode = " & _sInActionCode)
Console.WriteLine("_sInCarrierID = " & _sInCarrierID)
Console.WriteLine("_sInLastName = " & _sInLastName)
Console.WriteLine("_sInFirstName = " & _sInFirstName)
Console.WriteLine("_sInMiddleName = " & _sInMiddleName)
Console.WriteLine("_sInAddr1 = " & _sInAddr1)
Console.WriteLine("_sInAddr2 = " & _sInAddr2)
Console.WriteLine("_sInCity = " & _sInCity)
Console.WriteLine("_sInState = " & _sInState)
Console.WriteLine("_sInZip = " & _sInZip)
Console.WriteLine("_sInBenefitOption = " & _sInBenefitOption)
Console.WriteLine("_sInEmployerGroup = " & _sInEmployerGroup)
Console.WriteLine("_sInOptionEffDate = " & _sInOptionEffDate)
Console.WriteLine("_sInHPEffDate = " & _sInHPEffDate)
Console.WriteLine("_sInTermDate = " & _sInTermDate)
Console.WriteLine("_sInSex = " & _sInSex)
Console.WriteLine("_sInDOB = " & _sInDOB)
Console.WriteLine("_sInSSN = " & _sInSSN)
Console.WriteLine("_sInPhone = " & _sInPhone)
Console.WriteLine("_sInEmployerGroupAnivDate = " & _
  _sInEmployerGroupAnivDate)
Console.WriteLine("_sInHeadOfHouse = " & _sInHeadOfHouse)
Console.WriteLine("_sInPrimaryStatus = " & _sInPrimaryStatus)
Console.WriteLine("_sInMaritalStatus = " & _sInMaritalStatus)

"MeltingPoint" <n***@all.com> wrote in message
news:YpmdndvMEa8bvNffRVn-sg@rogers.com...
> "Stephany Young" <noone@localhost> wrote in
> news:O$CKyCNNFHA.1176@TK2MSFTNGP15.phx.gbl:
>
> <snip>
>
> OK just checked it, imbedded spaces screw it up. And theres nothing I
> can think of readily. I've seen some funky reg exp's - I'm sure it can
> be done but not by me:) I tried just doing:
>
> ((?<ActionCode>.{1})" _
> & "(?<CarrierID>.{25})" _
> & "(?<LastName>.{60})" _
> & "(?<FirstName>.{30})" _
> & "(?<MiddleName>.{15})" _
> & "(?<Addr1>.{60})" _
> & "(?<Addr2>.{60})" _
> & "(?<City>.{30)" _
> & "(?<State>.{2})" _
> & "(?<Zip>.{10}))"
>
> ...and my computer actually laughed at me!!
>
> Back to the drawing board...
Author
30 Mar 2005 4:48 PM
hillcountry74
Stephany,

Thanks for the code. I tried your sample, it doesn't seem to work. I'm
assuming _s variable is the string to be parsed and need not
necessarily have the fieldnames like Lastname etc, right?

How does the regex engine know to take 26 characters for extracting
City, and that they are not the first 26 characters? Please explain, and
excuse my ignorance. I have never used regular expressions.
Author
30 Mar 2005 8:51 PM
MeltingPoint
"hillcountry74" <shruth***@yahoo.com> wrote in
news:1112201281.988499.58130@l41g2000cwc.googlegroups.com:

> Stephany,
>
> Thanks for the code. I tried your sample, it doesn't seem to work. I'm
> assuming _s variable is the string to be parsed and need not
> necessarily have the fieldnames like Lastname etc, right?
>
> How does the regex engine know to take 26 characters for extracting
> City and that it is not the first 26 chrs. Please explain. And excuse
> me for my ignorance. Never used reg exprs.
>

Imports System.Text
Imports System.IO
Imports System.Text.RegularExpressions
Module Module1

    Sub Main()

        Dim aStreamReader As TextReader
        aStreamReader = New StreamReader("C:\SAMPLE FILE.txt")
        Dim _s As String = aStreamReader.ReadToEnd
        aStreamReader.Close()

        Dim _exp As String = "((?<ActionCode>.{1})" & _
          "(?<CarrierID>.{25})" & _
          "(?<LastName>.{60})" & _
          "(?<FirstName>.{30})" & _
          "(?<MiddleName>.{15})" & _
          "(?<Addr1>.{60})" & _
          "(?<Addr2>.{60})" & _
          "(?<City>.{30})" & _
          "(?<State>.{2})" & _
          "(?<Zip>.{10})" & _
          "(?<BenefitOption>.{60})" & _
          "(?<EmployerGroup>.{15})" & _
          "(?<OptionEffDate>.{8})" & _
          "(?<HPEffDate>.{8})" & _
          "(?<TermDate>.{8})" & _
          "(?<Sex>.{1})" & _
          "(?<DOB>.{8})" & _
          "(?<SSN>.{9})" & _
          "(?<Phone>.{12})" & _
          "(?<EmployerGroupAnivDate>.{8})" & _
          "(?<HeadOfHouse>.{9})" & _
          "(?<PrimaryStatus>.{1})" & _
          "(?<MaritalStatus>.{1}))"

        Dim r As Regex = New Regex(_exp)

        Dim g As MatchCollection = r.Matches(_s)
        Dim m As Match

        Dim _sInActionCode As String
        Dim _sInCarrierID As String
        Dim _sInLastName As String
        Dim _sInFirstName As String
        Dim _sInMiddleName As String
        Dim _sInAddr1 As String
        Dim _sInAddr2 As String
        Dim _sInCity As String
        Dim _sInState As String
        Dim _sInZip As String
        Dim _sInBenefitOption As String
        Dim _sInEmployerGroup As String
        Dim _sInOptionEffDate As String
        Dim _sInHPEffDate As String
        Dim _sInTermDate As String
        Dim _sInSex As String
        Dim _sInDOB As String
        Dim _sInSSN As String
        Dim _sInPhone As String
        Dim _sInEmployerGroupAnivDate As String
        Dim _sInHeadOfHouse As String
        Dim _sInPrimaryStatus As String
        Dim _sInMaritalStatus As String
        Dim d As New DateTime
        Dim dt As Double

        d = DateTime.Now

        For i As Int32 = 0 To g.Count - 1
            m = g.Item(i)

            _sInActionCode = m.Groups("ActionCode").ToString.Trim
            _sInCarrierID = m.Groups("CarrierID").ToString.Trim
            _sInLastName = m.Groups("LastName").ToString.Trim
            _sInFirstName = m.Groups("FirstName").ToString.Trim
            _sInMiddleName = m.Groups("MiddleName").ToString.Trim
            _sInAddr1 = m.Groups("Addr1").ToString.Trim
            _sInAddr2 = m.Groups("Addr2").ToString.Trim
            _sInCity = m.Groups("City").ToString.Trim
            _sInState = m.Groups("State").ToString.Trim
            _sInZip = m.Groups("Zip").ToString.Trim
            _sInBenefitOption = m.Groups("BenefitOption").ToString.Trim
            _sInEmployerGroup = m.Groups("EmployerGroup").ToString.Trim
            _sInOptionEffDate = m.Groups("OptionEffDate").ToString.Trim
            _sInHPEffDate = m.Groups("OptionEffDate").ToString.Trim
            _sInTermDate = m.Groups("HPEffDate").ToString.Trim
            _sInSex = m.Groups("TermDate").ToString.Trim
            _sInDOB = m.Groups("DOB").ToString.Trim
            _sInSSN = m.Groups("SSN").ToString.Trim
            _sInPhone = m.Groups("Phone").ToString.Trim
            _sInEmployerGroupAnivDate = _
              m.Groups("EmployerGroupAnivDate").ToString.Trim
            _sInHeadOfHouse = m.Groups("HeadOfHouse").ToString.Trim
            _sInPrimaryStatus = m.Groups("PrimaryStatus").ToString.Trim
            _sInMaritalStatus = m.Groups("MaritalStatus").ToString.Trim
            'Console.WriteLine()
            Console.WriteLine(i)
            'Console.WriteLine()
            'Console.WriteLine("_sInActionCode = " & _sInActionCode)
            'Console.WriteLine("_sInCarrierID = " & _sInCarrierID)
            'Console.WriteLine("_sInLastName = " & _sInLastName)
            'Console.WriteLine("_sInFirstName = " & _sInFirstName)
            'Console.WriteLine("_sInMiddleName = " & _sInMiddleName)
            'Console.WriteLine("_sInAddr1 = " & _sInAddr1)
            'Console.WriteLine("_sInAddr2 = " & _sInAddr2)
            'Console.WriteLine("_sInCity = " & _sInCity)
            'Console.WriteLine("_sInState = " & _sInState)
            'Console.WriteLine("_sInZip = " & _sInZip)
            'Console.WriteLine("_sInBenefitOption = " & _sInBenefitOption)
            'Console.WriteLine("_sInEmployerGroup = " & _sInEmployerGroup)
            'Console.WriteLine("_sInOptionEffDate = " & _sInOptionEffDate)
            'Console.WriteLine("_sInHPEffDate = " & _sInHPEffDate)
            'Console.WriteLine("_sInTermDate = " & _sInTermDate)
            'Console.WriteLine("_sInSex = " & _sInSex)
            'Console.WriteLine("_sInDOB = " & _sInDOB)
            'Console.WriteLine("_sInSSN = " & _sInSSN)
            'Console.WriteLine("_sInPhone = " & _sInPhone)
            'Console.WriteLine("_sInEmployerGroupAnivDate = " & _sInEmployerGroupAnivDate)
            'Console.WriteLine("_sInHeadOfHouse = " & _sInHeadOfHouse)
            'Console.WriteLine("_sInPrimaryStatus = " & _sInPrimaryStatus)
            'Console.WriteLine("_sInMaritalStatus = " & _sInMaritalStatus)
        Next
        ' Elapsed time in seconds, stored in the Double declared above
        dt = DateTime.Now.Subtract(d).TotalSeconds

        Console.WriteLine(dt)
        Console.ReadLine()
    End Sub

End Module

Sorry the code is a little messy, but it works. It parsed 8064 records in 2
seconds flat. Simulate some other work by outputting everything to a
console window and it takes 34 seconds.

To answer your question: the regex engine keeps a position marker,
*similar* to the position when reading a file, that advances by the
number of characters each group consumes (for comparison's sake). So
just telling it how much to read for each field is good enough.
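
To make the position idea concrete, here is a minimal sketch. It assumes a made-up three-field layout (widths 1, 5 and 4) rather than the real record, and the names Code, Name and City are only for illustration. Each group starts where the previous one stopped, which is why the last field is not taken from the first characters of the line.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FixedWidthDemo
    Sub Main()
        ' Hypothetical 3-field layout (widths 1, 5 and 4), not the real record.
        Dim pattern As String = "(?<Code>.{1})(?<Name>.{5})(?<City>.{4})"
        Dim line As String = "ASMITHTX  "

        Dim m As Match = Regex.Match(line, pattern)

        ' Group.Index is where the group started; Group.Length is its width.
        For Each name As String In New String() {"Code", "Name", "City"}
            Dim g As Group = m.Groups(name)
            Console.WriteLine("{0}: index={1}, length={2}, value='{3}'", _
                              name, g.Index, g.Length, g.Value.Trim())
        Next
    End Sub
End Module
```

The City group reports index 6 because the two groups before it have already consumed 1 + 5 characters.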

Thank you, Stephany, for writing that all out :)

A note about your sample file: I hope fields were left blank, and things
like HeadOfHouse is a number, otherwise this isn't working.
Sample:
_sInActionCode =
_sInCarrierID = 00000050101
_sInLastName = SMITH
_sInFirstName = VICKI
_sInMiddleName =
_sInAddr1 = C/O SUE EDDY  -  MISD BENEFITS
_sInAddr2 = 405 EAST DAVIS
_sInCity = MESQUITE
_sInState = TX
_sInZip = 75149
_sInBenefitOption = 001
_sInEmployerGroup = 2002MISD
_sInOptionEffDate = 20050301
_sInHPEffDate = 20050301
_sInTermDate = 20040401
_sInSex = 20050331
_sInDOB = 19510125
_sInSSN = 000010009
_sInPhone =
_sInEmployerGroupAnivDate =
_sInHeadOfHouse = 464088770
_sInPrimaryStatus = P
_sInMaritalStatus = I

Let me know,
MP
Author
30 Mar 2005 11:02 PM
hillcountry74
Thanks a lot MP. Really appreciate your help.

Can you please paste the regular expression for this? Can't find it in
the code.

Also, on the headofhouse, it could be alphanumeric. And yes, some
fields would be blank.

There could be files of size 400MB. In such a case, reading to the end
of the file in one go might not work. If it is changed to reading one
line at a time instead, do you think the speed will drop?

Thanks again.
Author
30 Mar 2005 11:23 PM
MeltingPoint
This is the Regular Expression:
Dim _exp As String = "((?<ActionCode>.{1})" & _
          "(?<CarrierID>.{25})" & _
          "(?<LastName>.{60})" & _
          "(?<FirstName>.{30})" & _
          "(?<MiddleName>.{15})" & _
          "(?<Addr1>.{60})" & _
          "(?<Addr2>.{60})" & _
          "(?<City>.{30})" & _
          "(?<State>.{2})" & _
          "(?<Zip>.{10})" & _
          "(?<BenefitOption>.{60})" & _
          "(?<EmployerGroup>.{15})" & _
          "(?<OptionEffDate>.{8})" & _
          "(?<HPEffDate>.{8})" & _
          "(?<TermDate>.{8})" & _
          "(?<Sex>.{1})" & _
          "(?<DOB>.{8})" & _
          "(?<SSN>.{9})" & _
          "(?<Phone>.{12})" & _
          "(?<EmployerGroupAnivDate>.{8})" & _
          "(?<HeadOfHouse>.{9})" & _
          "(?<PrimaryStatus>.{1})" & _
          "(?<MaritalStatus>.{1}))"

A pretty good definition can be found on MSDN. Search For RegEx or
Regular Expressions. :)

I'll try to "simulate" (wink wink :) a 400MB file and check performance.
Reading one line at a time is out of the question for this experiment,
as it would require a couple of million reads (guessing); reading in
442 bytes * nRecords would be better, if not the best way to do it. BUT
this and ReadToEnd both REQUIRE every record to be 442 bytes (or
whatever it is). Off by one byte, and kiss your records goodbye.
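
A rough sketch of the block-reading idea, under stated assumptions: the widths are copied from the pattern above (by my sum they add up to 441 characters, so treat the 442 as approximate), each record is assumed to be followed by CR/LF, and the file name is the same sample path used earlier. It is only meant to show the shape of the loop, not a tested replacement.

```vbnet
Imports System
Imports System.IO

Module BlockReadSketch
    Sub Main()
        ' Field widths copied from the pattern above.
        Dim widths() As Integer = {1, 25, 60, 30, 15, 60, 60, 30, 2, 10, 60, _
                                   15, 8, 8, 8, 1, 8, 9, 12, 8, 9, 1, 1}
        Dim recordLength As Integer = 0
        For Each w As Integer In widths
            recordLength += w
        Next
        Console.WriteLine("Characters per record: " & recordLength)

        ' Assumption: each record is followed by CR/LF (2 more characters).
        Dim blockLength As Integer = recordLength + 2
        Dim buffer(blockLength - 1) As Char

        Dim reader As New StreamReader("C:\SAMPLE FILE.txt")
        Try
            Dim count As Integer = 0
            ' ReadBlock keeps reading until the buffer is full or the file ends.
            While reader.ReadBlock(buffer, 0, blockLength) >= recordLength
                ' The record without its trailing CR/LF; parse it here.
                Dim record As New String(buffer, 0, recordLength)
                count += 1
            End While
            Console.WriteLine("Records read: " & count)
        Finally
            reader.Close()
        End Try
    End Sub
End Module
```

As said above, this only holds up if every record really has the same length; one short line and every following block is misaligned.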

MP
Author
31 Mar 2005 12:05 AM
Stephany Young
I'm just a little concerned that you might have missed a critical point here,
MeltingPoint.

Each record in the file is terminated by a LF or a CR/LF pair. This is shown
by the use of the ReadLine method in the original code fragment.

A line is defined as a sequence of characters followed by a line feed or a
carriage return immediately followed by a line feed. The string that is
returned does not contain the terminating carriage return or line feed. The
returned value is a null reference (Nothing in Visual Basic) if the end of
the input stream is reached.

If you use the ReadToEnd method then you have to identify what the record
delimiter is and split the input into 'records' based on that delimiter
before you can apply the regex anyway. Unless, of course, the regex is
preceded by a '^' (with RegexOptions.Multiline) to indicate that it starts
at the beginning of each line.

If you don't handle this then each record, subsequent to the first, will be
off by 1 or 2 characters, compounding.
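
As a small illustration of anchoring at line starts, here is a sketch with a hypothetical two-field record (widths 1 and 5) and CR/LF between records. Two things it relies on: with RegexOptions.Multiline, '^' matches at the start of the input and after every line feed, and '.' does not match a line feed by default, so a match cannot run across into the next line.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module MultilineSketch
    Sub Main()
        ' Hypothetical 2-field records (widths 1 and 5) separated by CR/LF.
        Dim text As String = "ASMITH" & vbCrLf & "BJONES" & vbCrLf

        ' ^ with RegexOptions.Multiline anchors each match at the start of a line.
        Dim r As New Regex("^(?<Code>.{1})(?<Name>.{5})", RegexOptions.Multiline)

        For Each m As Match In r.Matches(text)
            Console.WriteLine(m.Groups("Code").Value & " / " & m.Groups("Name").Value)
        Next
    End Sub
End Module
```

Each match begins on a fresh line, so the delimiter characters never shift the field offsets of later records.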



Author
31 Mar 2005 12:45 AM
MeltingPoint
I figured it out. Either way, if it is a fixed-length record then the CR would
be included in the record size. So the above would be 443*nRecords. No
harm, no foul. The point is that the file size can be evenly divided by the
number of bytes in a record plus the delimiter (which is the first thing I
asked him 10 posts ago, and was told there was no delimiter).

MP
Author
31 Mar 2005 2:05 AM
Stephany Young
What was said was that the fields were not delimited.

The fact that there is a record delimiter is a given because of the use of
the ReadLine method.

Remember that the code in VB6 works and the 'ReadLine' method is a straight
conversion of the 'Line Input' statement which does, ostensibly, the same
thing.

Anyway the detectives have been at work.

Parsing a file of 100000 records of 422 characters each, in a line by line
read using regex, takes approx 73 seconds on my workstation.

Parsing the same file in a line by line read using the Trim and Mid
functions takes approx 15 seconds.

Parsing the same file in a line by line read using the String.SubString and
String.Trim methods takes approx 17 seconds.

The VB6 equivalent takes approx 40 seconds.

As I said in my first post, it is highly likely that the '15 second'
difference was due to one of the other methods that are executed on a per
record basis, rather than the reading and parsing of the file, and these
results bear that out.

Although it has been an interesting exercise, I don't think that regex is
the way to go in this case.
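
For reference, the line by line String.SubString and String.Trim variant looks roughly like the sketch below. It is a cut-down illustration, not the code that was timed: only the first three fields are shown, the offsets follow the widths in the regex posted earlier, and the output file name and the '|' separator are invented for the example.

```vbnet
Imports System
Imports System.IO

Module SubstringSketch
    Sub Main()
        ' Hypothetical file names; offsets follow the widths in the regex
        ' posted earlier (ActionCode 1, CarrierID 25, LastName 60, ...).
        Dim reader As New StreamReader("C:\SAMPLE FILE.txt")
        Dim writer As New StreamWriter("C:\OUTPUT FILE.txt")
        Try
            Dim line As String = reader.ReadLine()
            While Not line Is Nothing
                ' Skip lines too short to hold the first three fields.
                If line.Length >= 86 Then
                    Dim actionCode As String = line.Substring(0, 1).Trim()
                    Dim carrierID As String = line.Substring(1, 25).Trim()
                    Dim lastName As String = line.Substring(26, 60).Trim()
                    writer.WriteLine(actionCode & "|" & carrierID & "|" & lastName)
                End If
                line = reader.ReadLine()
            End While
        Finally
            writer.Close()
            reader.Close()
        End Try
    End Sub
End Module
```

Because ReadLine already strips the record delimiter, the offsets stay the same for every record.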


Author
30 Mar 2005 11:50 PM
Stephany Young
The speed when dealing with IO devices (disks, networks, etc.) is largely
subjective because it depends on things like disk rpm, network bandwidth,
network usage, processor type and speed, memory size and a lot of other
factors that are really not worth losing any sleep over.

Can we digress back to your original post and address your perception of
'slowness'?

You said that your VB.NET version takes 15 seconds longer than your VB6
version.

Now, that is 15 seconds longer in relation to what?

  - If you take a specific file and run it through the VB6 version then how
long does it take?

  -  If you run that same file through the VB.NET version then how long does
that take?

  - How many records were in the file?

If you run that same file through the VB.NET version again almost
immediately, then is the run time any different from the first time?

Then we come to some usage scenario questions:

At what time of day does the VB6 version run?

  - Is it run by a user during the course of the business day?

  - Is it run as a 'batch process' at an 'off peak' time?

  - At what time of day were you running the VB.NET version?

  - Was the VB.NET version run on the same hardware as the VB6 version?

Do you see what I'm driving at? The question really is - Are we comparing
apples with apples and is a '15 second' difference really relevant?

If the VB6 version takes, for example, more than 2 minutes to process, say,
10000 records (approx 4MB), then I would suggest that an additional 15
seconds is insignificant.

If, however, the VB6 version takes, for example, less than 10 seconds to
process, say, 10000 records (approx 4MB), then, obviously, an additional 15
seconds is highly significant.

You say that files could be up to 400MB, which indicates somewhere around
1000000 records.

  - Is this file size a regular occurrence or does this size occur only
occasionally?

  - How long does the VB6 version take to process a file of this size and
how long does the VB.NET version take?

  - If it takes longer, is the time difference proportional to the number of
records?

For example, if it takes 15 seconds longer to process 10000 records, does it
take 1500 seconds longer to process 1000000 records (100 times the records,
ergo 100 times longer)?

If, for instance, it always takes 15 seconds longer, regardless of the
number of records, then that would indicate that it's nothing to do with
your processing code at all, rather the 'problem' would lie in the general
program overhead under .NET.
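
One way to put numbers on that distinction is to time two files of different sizes and fit a straight line through the two results. The sketch below does just that arithmetic; the four measurement values are placeholders that I made up, so substitute your own timings.

```vbnet
Imports System

Module OverheadSketch
    Sub Main()
        ' Hypothetical measurements: extra seconds the VB.NET version needs
        ' over VB6 for two files of different sizes. Replace with real timings.
        Dim records1 As Double = 10000
        Dim extra1 As Double = 15
        Dim records2 As Double = 100000
        Dim extra2 As Double = 150

        ' Fit extra = fixedOverhead + perRecord * records through both points.
        Dim perRecord As Double = (extra2 - extra1) / (records2 - records1)
        Dim fixedOverhead As Double = extra1 - perRecord * records1

        Console.WriteLine("Per-record cost (s): " & perRecord)
        Console.WriteLine("Fixed overhead (s): " & fixedOverhead)
    End Sub
End Module
```

A fixed overhead near zero points at the per-record processing code; a per-record cost near zero points at start-up overhead instead.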

It would be interesting to hear your comments and/or findings on any/all of
these factors, remembering, of course, that the factors I have thrown
into the ring are really only scratching the surface.


Author
31 Mar 2005 3:08 PM
hillcountry74
Stephany, MP

I'm still reading the thread. I guess it's the time diff and so I'm not
around when you guys are discussing.

Stephany, to answer some of your questions:

I've tested the same file in both Vb6 and VB.Net. This specific file
has 4564 records and Vb6 processes it in avg of 35 secs and VB.Net
takes an avg of 45 secs. If I process the same file again in Vb.Net,
it's about the same speed +/- 1 sec.

On the usage:
1. Yes, it is run by a user during the business day, and sometimes when
the file is too big, like 400MB, it runs through to the following day.
This is for the VB6 version.
2. No, it is not run as a batch. Basically, the user selects a file,
processes it and then, if there are additional files, continues to
process them one at a time.
3. I've been testing the .Net version throughout the day to check if time
makes a difference. But I've noticed that the VB6 version at times (no
specific time of the day) processes really fast, then slows down for the
same file, and then picks up speed again. Note that no
other application is running. Not sure what causes this. On the other hand,
the .Net version always processes at about the same speed.

Well, the 400MB files have more than 350K records.

Exactly, I was also wondering whether I am comparing apples to
apples here or not.

A 400MB file size is regular. Basically, this file is sent by our client
and we process the files, convert them to our format and then run a
backend job to update the database with this info. We are in the
healthcare industry.

I've not tested the VB.Net version with a 400MB file. I just found out
from the user who runs the VB6 version and he said it takes about 7 1/2
hrs. So, I don't know if there will be a significant difference or not. To
begin with, I started testing a smaller file. Since this is 15 secs
slower, I decided to debug and try to optimize if necessary before
testing a bigger file.

We are re-writing this in Vb.Net as most of our other appls are already
in .net and this is one of the older apps.

I'm planning to test the 400MB file sometime today. Will keep you guys
posted.

Thanks.
Author
31 Mar 2005 5:18 PM
Cor Ligthert
HillCountry,

I told you before that you should test this clean.
In your sample I have seen a DataSet, generic VB.NET stuff and more that is
not possible in VB6, and (from a quick look) you have probably not used the
most optimal methods to load bulk data in that part.

Therefore, when you ask if the VB.NET IO function is slower than VB6, then you
should in my opinion test the most common VB6 IO functions against the most
common VB.NET IO functions to write files.

What you are doing now is in my opinion comparing apples with fishes.

Just my thought,

Cor
Author
31 Mar 2005 6:02 PM
Stephany Young
I have just sent an email to the email address that 'shows' for you.

Please let me know either way if you do or don't get it.


Author
30 Mar 2005 11:08 PM
Stephany Young
In your output, note that you've got some fields out of whack.

>            _sInHPEffDate = m.Groups("OptionEffDate").ToString.Trim
>            _sInTermDate = m.Groups("HPEffDate").ToString.Trim
>            _sInSex = m.Groups("TermDate").ToString.Trim

This shows in that the display of _sInSex is not 1 character.

I don't understand your last comment:

> A note about your sample file: I hope fields were left blank, and things
> like HeadOfHouse is a number, otherwise this isn't working.

What do you mean by 'I hope fields were left blank'? If you mean, for
example, ActionCode being left blank if it is not supplied, i.e. a space
character as a place holder for it, then I would assume yes because
otherwise the original parsing routine would never work in the first place.
When I refer to a value being 'missing' then I am really saying that the
character positions that it would normally occupy are filled with spaces.

I don't think that you can assume (unless you are privy to something that
I'm not) that HeadOfHouse is numeric. From an earlier post - Head of House
Validation - chars "A-Z,.-'0-9" so this indicates any combination of
characters in the list. The only other thing that can be implied about it is
that if it is 'missing' then it is assigned the first 9 characters of
CarrierId and there is no indication that CarrierId should be numeric.
Anyway, checking that is the role of validation rather than parsing.


"MeltingPoint" <n***@all.com> wrote in message
news:fOCdnd23SO3QjNbfRVn-vA@rogers.com...
> "hillcountry74" <shruth***@yahoo.com> wrote in
> news:1112201281.988499.58130@l41g2000cwc.googlegroups.com:
>
>> Stephany,
>>
>> Thanks for the code. I tried your sample, it doesn't seem to work. I'm
>> assuming _s variable is the string to be parsed and need not
>> necessarily have the fieldnames like Lastname etc, right?
>>
>> How does the regex engine know to take 26 characters for extracting
>> City and that it is not the first 26 chrs. Please explain. And excuse
>> me for my ignorance. Never used reg exprs.
>>
>
> Imports System.Text
> Imports System.IO
> Imports System.Text.RegularExpressions
> Module Module1
>
>    Sub Main()
>
>        Dim aStreamReader As TextReader
>        aStreamReader = New StreamReader("C:\SAMPLE FILE.txt")
>        Dim _s As String = aStreamReader.ReadToEnd
>        aStreamReader.Close()
>
>        Dim _exp As String = "((?<ActionCode>.{1})" & _
>          "(?<CarrierID>.{25})" & _
>          "(?<LastName>.{60})" & _
>          "(?<FirstName>.{30})" & _
>          "(?<MiddleName>.{15})" & _
>          "(?<Addr1>.{60})" & _
>          "(?<Addr2>.{60})" & _
>          "(?<City>.{30})" & _
>          "(?<State>.{2})" & _
>          "(?<Zip>.{10})" & _
>          "(?<BenefitOption>.{60})" & _
>          "(?<EmployerGroup>.{15})" & _
>          "(?<OptionEffDate>.{8})" & _
>          "(?<HPEffDate>.{8})" & _
>          "(?<TermDate>.{8})" & _
>          "(?<Sex>.{1})" & _
>          "(?<DOB>.{8})" & _
>          "(?<SSN>.{9})" & _
>          "(?<Phone>.{12})" & _
>          "(?<EmployerGroupAnivDate>.{8})" & _
>          "(?<HeadOfHouse>.{9})" & _
>          "(?<PrimaryStatus>.{1})" & _
>          "(?<MaritalStatus>.{1}))"
>
>        Dim r As Regex = New Regex(_exp)
>
>        Dim g As MatchCollection = r.Matches(_s)
>        Dim m As Match
>
>        Dim _sInActionCode As String
>        Dim _sInCarrierID As String
>        Dim _sInLastName As String
>        Dim _sInFirstName As String
>        Dim _sInMiddleName As String
>        Dim _sInAddr1 As String
>        Dim _sInAddr2 As String
>        Dim _sInCity As String
>        Dim _sInState As String
>        Dim _sInZip As String
>        Dim _sInBenefitOption As String
>        Dim _sInEmployerGroup As String
>        Dim _sInOptionEffDate As String
>        Dim _sInHPEffDate As String
>        Dim _sInTermDate As String
>        Dim _sInSex As String
>        Dim _sInDOB As String
>        Dim _sInSSN As String
>        Dim _sInPhone As String
>        Dim _sInEmployerGroupAnivDate As String
>        Dim _sInHeadOfHouse As String
>        Dim _sInPrimaryStatus As String
>        Dim _sInMaritalStatus As String
>        Dim d As New DateTime
>        Dim dt As Double
>
>        d = DateTime.Now
>
>        For i As Int32 = 0 To g.Count - 1
>            m = g.Item(i)
>
>            _sInActionCode = m.Groups("ActionCode").ToString.Trim
>            _sInCarrierID = m.Groups("CarrierID").ToString.Trim
>            _sInLastName = m.Groups("LastName").ToString.Trim
>            _sInFirstName = m.Groups("FirstName").ToString.Trim
>            _sInMiddleName = m.Groups("MiddleName").ToString.Trim
>            _sInAddr1 = m.Groups("Addr1").ToString.Trim
>            _sInAddr2 = m.Groups("Addr2").ToString.Trim
>            _sInCity = m.Groups("City").ToString.Trim
>            _sInState = m.Groups("State").ToString.Trim
>            _sInZip = m.Groups("Zip").ToString.Trim
>            _sInBenefitOption = m.Groups("BenefitOption").ToString.Trim
>            _sInEmployerGroup = m.Groups("EmployerGroup").ToString.Trim
>            _sInOptionEffDate = m.Groups("OptionEffDate").ToString.Trim
>            _sInHPEffDate = m.Groups("OptionEffDate").ToString.Trim
>            _sInTermDate = m.Groups("HPEffDate").ToString.Trim
>            _sInSex = m.Groups("TermDate").ToString.Trim
>            _sInDOB = m.Groups("DOB").ToString.Trim
>            _sInSSN = m.Groups("SSN").ToString.Trim
>            _sInPhone = m.Groups("Phone").ToString.Trim
>            _sInEmployerGroupAnivDate = m.Groups("EmployerGroupAnivDate").ToString.Trim()
>            _sInHeadOfHouse = m.Groups("HeadOfHouse").ToString.Trim
>            _sInPrimaryStatus = m.Groups("PrimaryStatus").ToString.Trim
>            _sInMaritalStatus = m.Groups("MaritalStatus").ToString.Trim
>            'Console.WriteLine()
>            Console.WriteLine(i)
>            'Console.WriteLine()
>            'Console.WriteLine("_sInActionCode = " & _sInActionCode)
>            'Console.WriteLine("_sInCarrierID = " & _sInCarrierID)
>            'Console.WriteLine("_sInLastName = " & _sInLastName)
>            'Console.WriteLine("_sInFirstName = " & _sInFirstName)
>            'Console.WriteLine("_sInMiddleName = " & _sInMiddleName)
>            'Console.WriteLine("_sInAddr1 = " & _sInAddr1)
>            'Console.WriteLine("_sInAddr2 = " & _sInAddr2)
>            'Console.WriteLine("_sInCity = " & _sInCity)
>            'Console.WriteLine("_sInState = " & _sInState)
>            'Console.WriteLine("_sInZip = " & _sInZip)
>            'Console.WriteLine("_sInBenefitOption = " & _sInBenefitOption)
>            'Console.WriteLine("_sInEmployerGroup = " & _sInEmployerGroup)
>            'Console.WriteLine("_sInOptionEffDate = " & _sInOptionEffDate)
>            'Console.WriteLine("_sInHPEffDate = " & _sInHPEffDate)
>            'Console.WriteLine("_sInTermDate = " & _sInTermDate)
>            'Console.WriteLine("_sInSex = " & _sInSex)
>            'Console.WriteLine("_sInDOB = " & _sInDOB)
>            'Console.WriteLine("_sInSSN = " & _sInSSN)
>            'Console.WriteLine("_sInPhone = " & _sInPhone)
>            'Console.WriteLine("_sInEmployerGroupAnivDate = " & _sInEmployerGroupAnivDate)
>            'Console.WriteLine("_sInHeadOfHouse = " & _sInHeadOfHouse)
>            'Console.WriteLine("_sInPrimaryStatus = " & _sInPrimaryStatus)
>            'Console.WriteLine("_sInMaritalStatus = " & _sInMaritalStatus)
>        Next
>        Dim dt2 = DateTime.Now.Subtract(d).TotalSeconds
>
>        Console.WriteLine(dt2)
>        Console.ReadLine()
>    End Sub
>
> End Module
>
> Sorry the code is a little messy. But it works. Parsed 8064 Records in 2
> seconds flat. Simulate some other work by outputing everything to a
> console window an it takes 34 seconds.
>
> To answer your question RegEx uses a position marker *simular* to that
> of reading a file where the position is incremented relative to the
> amount read (for comparison sakes). So just telling it how much to read
> is good enough.
>
> Thankyou Stephany for writing that all out:)
>
> A note about your sample file: I hope fields were left blank, and things
> like HeadOfHouse is a number, otherwise this isn't working.
> Sample:
> _sInActionCode =
> _sInCarrierID = 00000050101
> _sInLastName = SMITH
> _sInFirstName = VICKI
> _sInMiddleName =
> _sInAddr1 = C/O SUE EDDY  -  MISD BENEFITS
> _sInAddr2 = 405 EAST DAVIS
> _sInCity = MESQUITE
> _sInState = TX
> _sInZip = 75149
> _sInBenefitOption = 001
> _sInEmployerGroup = 2002MISD
> _sInOptionEffDate = 20050301
> _sInHPEffDate = 20050301
> _sInTermDate = 20040401
> _sInSex = 20050331
> _sInDOB = 19510125
> _sInSSN = 000010009
> _sInPhone =
> _sInEmployerGroupAnivDate =
> _sInHeadOfHouse = 464088770
> _sInPrimaryStatus = P
> _sInMaritalStatus = I
>
> Let me know,
> MP
>
Author
31 Mar 2005 12:10 AM
MeltingPoint
"Stephany Young" <noone@localhost> wrote in
news:#TjZp1XNFHA.2132@TK2MSFTNGP14.phx.gbl:

> In you output, note that you've got some fields out of whack.
>
>>            _sInHPEffDate = m.Groups("OptionEffDate").ToString.Trim
>>            _sInTermDate = m.Groups("HPEffDate").ToString.Trim
>>            _sInSex = m.Groups("TermDate").ToString.Trim
>
> This shows in that the display of _sInSex is not 1 character.
>

Sorry Stephany, post order is getting screwed up. The above is copied from
one of your posts (and thank you for it again), but thanks for pointing it
out, I was wondering why sex was "45738495". As for the rest of your
comment, I was sent some sample data from hillcountry74, and thought I was
replying under his thread. Sorry for the confusion. By the way, do you mind
if I ask what field you're in?

Cheers,
MP
Author
31 Mar 2005 12:27 AM
Stephany Young
I'm an IT Consultant, with close to 30 years' experience in the industry.

Since 1994 I have specialised in VB related software and have been using
VB.NET and C#.NET since their first 'retail' release.

I still have a few applications that I support in VB4, VB5 and VB6 but all
new development is in VB.NET or C#.NET.


"MeltingPoint" <n***@all.com> wrote in message
news:QI-dnblgf6FyotbfRVn-uw@rogers.com...
> "Stephany Young" <noone@localhost> wrote in
> news:#TjZp1XNFHA.2132@TK2MSFTNGP14.phx.gbl:
>
>> In you output, note that you've got some fields out of whack.
>>
>>>            _sInHPEffDate = m.Groups("OptionEffDate").ToString.Trim
>>>            _sInTermDate = m.Groups("HPEffDate").ToString.Trim
>>>            _sInSex = m.Groups("TermDate").ToString.Trim
>>
>> This shows in that the display of _sInSex is not 1 character.
>>
>
> Sorry Stephany, post order is getting screwed up. The above is copied from
> one of your posts(and thank you for it again), but thanks for pointing it
> out, I was wondering why sex was "45738495". As for the rest of your
> comment, I was sent some sample data from hillcountry74, and thought I was
> replying under his thread. Sorry for the confusion. By the way, do you
> mind
> if I ask what field your in?
>
> Cheers,
> MP
Author
30 Mar 2005 11:14 PM
hillcountry74
MP,
Posting this msg for the 2nd time.

Thanks a lot for the code. I really appreciate your help and time.

Can you please post the regular expr for this as I can't find it in the
code?

As there could be files of size 400MB, reading to the end of the file might
not work. Instead, if it is changed to read one line at a time, will it
slow down the parsing?

Also, on the headofhouse, it can be alphanumeric. And yes, some fields
could be blank.

Thanks.


MeltingPoint wrote:
> "hillcountry74" <shruth***@yahoo.com> wrote in
> news:1112201281.988499.58130@l41g2000cwc.googlegroups.com:
>
> > Stephany,
> >
> > Thanks for the code. I tried your sample, it doesn't seem to work. I'm
> > assuming _s variable is the string to be parsed and need not
> > necessarily have the fieldnames like Lastname etc, right?
> >
> > How does the regex engine know to take 26 characters for extracting
> > City and that it is not the first 26 chrs. Please explain. And excuse
> > me for my ignorance. Never used reg exprs.
> >
>
> Imports System.Text
> Imports System.IO
> Imports System.Text.RegularExpressions
> Module Module1
>
>     Sub Main()
>
>         Dim aStreamReader As TextReader
>         aStreamReader = New StreamReader("C:\SAMPLE FILE.txt")
>         Dim _s As String = aStreamReader.ReadToEnd
>         aStreamReader.Close()
>
>         Dim _exp As String = "((?<ActionCode>.{1})" & _
>           "(?<CarrierID>.{25})" & _
>           "(?<LastName>.{60})" & _
>           "(?<FirstName>.{30})" & _
>           "(?<MiddleName>.{15})" & _
>           "(?<Addr1>.{60})" & _
>           "(?<Addr2>.{60})" & _
>           "(?<City>.{30})" & _
>           "(?<State>.{2})" & _
>           "(?<Zip>.{10})" & _
>           "(?<BenefitOption>.{60})" & _
>           "(?<EmployerGroup>.{15})" & _
>           "(?<OptionEffDate>.{8})" & _
>           "(?<HPEffDate>.{8})" & _
>           "(?<TermDate>.{8})" & _
>           "(?<Sex>.{1})" & _
>           "(?<DOB>.{8})" & _
>           "(?<SSN>.{9})" & _
>           "(?<Phone>.{12})" & _
>           "(?<EmployerGroupAnivDate>.{8})" & _
>           "(?<HeadOfHouse>.{9})" & _
>           "(?<PrimaryStatus>.{1})" & _
>           "(?<MaritalStatus>.{1}))"
>
>         Dim r As Regex = New Regex(_exp)
>
>         Dim g As MatchCollection = r.Matches(_s)
>         Dim m As Match
>
>         Dim _sInActionCode As String
>         Dim _sInCarrierID As String
>         Dim _sInLastName As String
>         Dim _sInFirstName As String
>         Dim _sInMiddleName As String
>         Dim _sInAddr1 As String
>         Dim _sInAddr2 As String
>         Dim _sInCity As String
>         Dim _sInState As String
>         Dim _sInZip As String
>         Dim _sInBenefitOption As String
>         Dim _sInEmployerGroup As String
>         Dim _sInOptionEffDate As String
>         Dim _sInHPEffDate As String
>         Dim _sInTermDate As String
>         Dim _sInSex As String
>         Dim _sInDOB As String
>         Dim _sInSSN As String
>         Dim _sInPhone As String
>         Dim _sInEmployerGroupAnivDate As String
>         Dim _sInHeadOfHouse As String
>         Dim _sInPrimaryStatus As String
>         Dim _sInMaritalStatus As String
>         Dim d As New DateTime
>         Dim dt As Double
>
>         d = DateTime.Now
>
>         For i As Int32 = 0 To g.Count - 1
>             m = g.Item(i)
>
>             _sInActionCode = m.Groups("ActionCode").ToString.Trim
>             _sInCarrierID = m.Groups("CarrierID").ToString.Trim
>             _sInLastName = m.Groups("LastName").ToString.Trim
>             _sInFirstName = m.Groups("FirstName").ToString.Trim
>             _sInMiddleName = m.Groups("MiddleName").ToString.Trim
>             _sInAddr1 = m.Groups("Addr1").ToString.Trim
>             _sInAddr2 = m.Groups("Addr2").ToString.Trim
>             _sInCity = m.Groups("City").ToString.Trim
>             _sInState = m.Groups("State").ToString.Trim
>             _sInZip = m.Groups("Zip").ToString.Trim
>             _sInBenefitOption = m.Groups("BenefitOption").ToString.Trim
>             _sInEmployerGroup = m.Groups("EmployerGroup").ToString.Trim
>             _sInOptionEffDate = m.Groups("OptionEffDate").ToString.Trim
>             _sInHPEffDate = m.Groups("OptionEffDate").ToString.Trim
>             _sInTermDate = m.Groups("HPEffDate").ToString.Trim
>             _sInSex = m.Groups("TermDate").ToString.Trim
>             _sInDOB = m.Groups("DOB").ToString.Trim
>             _sInSSN = m.Groups("SSN").ToString.Trim
>             _sInPhone = m.Groups("Phone").ToString.Trim
>             _sInEmployerGroupAnivDate = m.Groups("EmployerGroupAnivDate").ToString.Trim()
>             _sInHeadOfHouse = m.Groups("HeadOfHouse").ToString.Trim
>             _sInPrimaryStatus = m.Groups("PrimaryStatus").ToString.Trim
>             _sInMaritalStatus = m.Groups("MaritalStatus").ToString.Trim
>             'Console.WriteLine()
>             Console.WriteLine(i)
>             'Console.WriteLine()
>             'Console.WriteLine("_sInActionCode = " & _sInActionCode)
>             'Console.WriteLine("_sInCarrierID = " & _sInCarrierID)
>             'Console.WriteLine("_sInLastName = " & _sInLastName)
>             'Console.WriteLine("_sInFirstName = " & _sInFirstName)
>             'Console.WriteLine("_sInMiddleName = " & _sInMiddleName)
>             'Console.WriteLine("_sInAddr1 = " & _sInAddr1)
>             'Console.WriteLine("_sInAddr2 = " & _sInAddr2)
>             'Console.WriteLine("_sInCity = " & _sInCity)
>             'Console.WriteLine("_sInState = " & _sInState)
>             'Console.WriteLine("_sInZip = " & _sInZip)
>             'Console.WriteLine("_sInBenefitOption = " & _sInBenefitOption)
>             'Console.WriteLine("_sInEmployerGroup = " & _sInEmployerGroup)
>             'Console.WriteLine("_sInOptionEffDate = " & _sInOptionEffDate)
>             'Console.WriteLine("_sInHPEffDate = " & _sInHPEffDate)
>             'Console.WriteLine("_sInTermDate = " & _sInTermDate)
>             'Console.WriteLine("_sInSex = " & _sInSex)
>             'Console.WriteLine("_sInDOB = " & _sInDOB)
>             'Console.WriteLine("_sInSSN = " & _sInSSN)
>             'Console.WriteLine("_sInPhone = " & _sInPhone)
>             'Console.WriteLine("_sInEmployerGroupAnivDate = " & _sInEmployerGroupAnivDate)
>             'Console.WriteLine("_sInHeadOfHouse = " & _sInHeadOfHouse)
>             'Console.WriteLine("_sInPrimaryStatus = " & _sInPrimaryStatus)
>             'Console.WriteLine("_sInMaritalStatus = " & _sInMaritalStatus)
>         Next
>         Dim dt2 = DateTime.Now.Subtract(d).TotalSeconds
>
>         Console.WriteLine(dt2)
>         Console.ReadLine()
>     End Sub
>
> End Module
>
> Sorry the code is a little messy. But it works. Parsed 8064 Records in 2
> seconds flat. Simulate some other work by outputing everything to a
> console window an it takes 34 seconds.
>
> To answer your question RegEx uses a position marker *simular* to that
> of reading a file where the position is incremented relative to the
> amount read (for comparison sakes). So just telling it how much to read
> is good enough.
>
> Thankyou Stephany for writing that all out:)
>
> A note about your sample file: I hope fields were left blank, and things
> like HeadOfHouse is a number, otherwise this isn't working.
> Sample:
> _sInActionCode =
> _sInCarrierID = 00000050101
> _sInLastName = SMITH
> _sInFirstName = VICKI
> _sInMiddleName =
> _sInAddr1 = C/O SUE EDDY  -  MISD BENEFITS
> _sInAddr2 = 405 EAST DAVIS
> _sInCity = MESQUITE
> _sInState = TX
> _sInZip = 75149
> _sInBenefitOption = 001
> _sInEmployerGroup = 2002MISD
> _sInOptionEffDate = 20050301
> _sInHPEffDate = 20050301
> _sInTermDate = 20040401
> _sInSex = 20050331
> _sInDOB = 19510125
> _sInSSN = 000010009
> _sInPhone =
> _sInEmployerGroupAnivDate =
> _sInHeadOfHouse = 464088770
> _sInPrimaryStatus = P
> _sInMaritalStatus = I
>
> Let me know,
> MP
Author
31 Mar 2005 12:33 AM
MeltingPoint
"hillcountry74" <shruth***@yahoo.com> wrote in
news:1112224455.933751.78520@l41g2000cwc.googlegroups.com:

> MP,
> Posting this msg for the 2nd time.
>
> Thanks a lot for the code. I really appreciate your help and time.
>
> Can you please post the regular expr for this as I can't find it in the
> code?
>
> As there could be files of size 400MB, reading till endoffile might not
> work. Instead, if it is changed to reading one line at a time, will it
> slowdown the parsing?
>
> Also, on the headofhouse, it can be alphanumeric. And yes, some fields
> could be blank.
>
> Thanks.

I just thought of something. How can you be using ReadLine if there are no
delimiters? The answer is: there is a delimiter, a carriage return at
the end of every record. However, I don't think this helps the regex
thing. But 'just so ya know', a delimiter can be anything, not just a
comma! :)
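
A minimal sketch of the line-at-a-time idea, assuming each record ends with a line break. The two-field layout (widths 3 + 2), the module name and the CountRecords helper are made up for illustration; the real layout is much wider. A StringReader stands in for the StreamReader so the example runs without a file:

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module LineAtATimeDemo
    ' Hypothetical two-field layout: Id (3 chars) then State (2 chars).
    Private ReadOnly RecordPattern As New Regex("^(?<Id>.{3})(?<State>.{2})", RegexOptions.Compiled)

    ' Reads one record per line and returns how many lines fit the layout.
    ' Only the current line is held in memory, so file size is not a limit.
    Function CountRecords(ByVal reader As TextReader) As Integer
        Dim count As Integer = 0
        Dim line As String = reader.ReadLine()
        While line IsNot Nothing
            If RecordPattern.IsMatch(line) Then count += 1
            line = reader.ReadLine()
        End While
        Return count
    End Function

    Sub Main()
        ' Two well-formed records and one line that is too short to match.
        Dim sample As String = "001TX" & Environment.NewLine & _
                               "002FL" & Environment.NewLine & _
                               "bad" & Environment.NewLine
        Using reader As New StringReader(sample)
            Console.WriteLine(CountRecords(reader))
        End Using
    End Sub
End Module
```

ReadLine strips the line break before the regex sees the line, so the pattern only has to describe the fields themselves.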

I'm just cleaning up the code so you can read in chunks at a time (1.5
gigs of RAM wasn't even enough to read in 400MB of text); will post back
soon. By the by, are you using VB.NET?

MP
Author
30 Mar 2005 10:45 PM
Stephany Young
Yes, you are correct, _s is a string to simulate a record read from your
input file and my use of fieldnames etc. was just to create some
placeholders. Where it says e.g. <7 spaces>, you need to replace that bit
(including the angle brackets) with that number of space characters. Because
of the way newsgroup readers wrap text etc, it was difficult to show the
actual values.

The numbers in the curly brackets, e.g. {8}, in the regex expression tell it
how many character positions each element takes up. If you add them all up
you will find that they come to 442, which is why your input record must
be a minimum of 442 characters long. If it is shorter then it won't work. If
it is longer then only the first 442 characters are utilised.
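
To illustrate the point about the {n} counts, here is a small sketch using a made-up three-field layout (widths 1 + 5 + 2 = 8) rather than the real record. The pattern, module name and TryParse helper are assumptions for the example only:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FixedWidthDemo
    ' Hypothetical layout: Code (1 char), Name (5 chars), State (2 chars).
    Private Const Pattern As String = "(?<Code>.{1})(?<Name>.{5})(?<State>.{2})"

    ' Returns True and fills in the trimmed Name field when the line is
    ' at least as long as the layout; returns False when it is too short.
    Function TryParse(ByVal line As String, ByRef name As String) As Boolean
        Dim m As Match = Regex.Match(line, Pattern)
        If Not m.Success Then Return False
        name = m.Groups("Name").Value.Trim()
        Return True
    End Function

    Sub Main()
        Dim name As String = Nothing
        Console.WriteLine(TryParse("ASMITHTX", name))      ' exactly 8 characters
        Console.WriteLine(TryParse("ASMITHTXextra", name)) ' longer: only the first 8 are used
        Console.WriteLine(TryParse("ASMIT", name))         ' shorter than 8: no match
    End Sub
End Module
```

Each group consumes exactly its own width, so the next group starts where the previous one stopped; no field names or delimiters need to appear in the data.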

I have been assuming that your input records are, in fact, fixed-length
fields in a fixed-length record. Is this actually the case or are there some
records that are shorter? It would be nice if you could show a sample record
(doctored to hide sensitive information of course).

In addition, what did you find about my other comments regarding the stray
'unicode' character etc.?


"hillcountry74" <shruth***@yahoo.com> wrote in message
news:1112201281.988499.58130@l41g2000cwc.googlegroups.com...
> Stephany,
>
> Thanks for the code. I tried your sample, it doesn't seem to work. I'm
> assuming _s variable is the string to be parsed and need not
> necessarily have the fieldnames like Lastname etc, right?
>
> How does the regex engine know to take 26 characters for extracting
> City and that it is not the first 26 chrs. Please explain. And excuse
> me for my ignorance. Never used reg exprs.
>
Author
30 Mar 2005 1:19 AM
MeltingPoint
"hillcountry74" <shruth***@yahoo.com> wrote in
news:1112110584.467762.175290@o13g2000cwo.googlegroups.com:



Some good ideas so far. I've started to put the regex expression
together for you, could have it done in a few hours. If you want to send
me one of these files, (important info changed of course) I could fine
tune the expression. macmanic(zero)(zero)atHotmail.com

Note to anyone else reading this thread: any ideas on the speed of regex
as opposed to Substring/IndexOf? I can say for sure that I've parsed a
4MB file with regex in a few hundred milliseconds.
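
One way to get a rough answer is to time both approaches on the same record. The sketch below assumes a made-up layout (Id of 3 characters, Name of 10) and hypothetical helper names; the numbers it prints depend entirely on the machine and runtime, so treat it as a measuring aid rather than a result:

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module ParseTimingDemo
    ' Hypothetical layout: Id (3 chars) followed by Name (10 chars).
    Private ReadOnly Layout As New Regex("^(?<Id>.{3})(?<Name>.{10})", RegexOptions.Compiled)

    ' Extracts the Name field with the regex.
    Function NameByRegex(ByVal line As String) As String
        Return Layout.Match(line).Groups("Name").Value.Trim()
    End Function

    ' Extracts the same field by position with Substring.
    Function NameBySubstring(ByVal line As String) As String
        Return line.Substring(3, 10).Trim()
    End Function

    Sub Main()
        Dim line As String = "001" & "SMITH".PadRight(10)
        Const Passes As Integer = 100000

        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To Passes
            NameByRegex(line)
        Next
        Console.WriteLine("Regex:     " & sw.ElapsedMilliseconds & " ms")

        sw = Stopwatch.StartNew()
        For i As Integer = 1 To Passes
            NameBySubstring(line)
        Next
        Console.WriteLine("Substring: " & sw.ElapsedMilliseconds & " ms")
    End Sub
End Module
```

Running it as a compiled release build rather than inside the IDE gives more representative figures.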

++Just saw Stef's comment. I'm not sure what difference it makes as to
whether it's fixed or not. RegEx still works and it's a lot easier on the
eyes :)
((?<ActionCode>.)
(?<CarrierID>\d{0,25})
(?<LastName>\w{0,60}\s*\b)
(?<FirstName>\w{0,30}\s*\b)
(?<MiddleName>\w{0,15}\s*\b)
(?<Addr1>.{0,60}\s*\b)
(?<Addr2>.{0,60}\s*\b)
(?<City>.{0,30}\s*\b)
(?<State>.{0,2}\s*)
(?<Zip>.{0,10}\s*\b))
Actually the fact that it's fixed makes it easier.

And a note as to how closely I was paying attention:
sPreferredInputFile.Trim.Length >= 439
does not 'allude' to me that it is totally fixed.

However, I don't know Stef, she probably knows more than me, considering
I just started using RegEx a month ago. But the above Regex does match
the following:

e8374837463784958473627495Sc9ott                         8nglis                                       
Micheal        554 sdf sdf                                               
667 rtert ertwert                                        Hell                   
FL90210

Which I think is what the record looks like (at least so far)

Let me know, both of you :)
MP
Author
30 Mar 2005 1:24 AM
MeltingPoint
Sorry about all the posts (xnews acting up)


<lots o code>

Some good ideas so far. I've started to put the regex expression
together for you, could have it done in a few hours. If you want to send
me one of these files, (important info changed of course) I could fine
tune the expression. macmanic(zero)(zero)atHotmail.com

Note to anyone else reading this thread: any ideas on the speed of regex
as opposed to Substring/IndexOf? I can say for sure that I've parsed a
4MB file with regex in a few hundred milliseconds.

++Just saw Stef's comment. I'm not sure what difference it makes as to
whether it's fixed or not. RegEx still works and it's a lot easier on the
eyes :)
((?<ActionCode>.)
(?<CarrierID>\d{0,25})
(?<LastName>\w{0,60}\s*\b)
(?<FirstName>\w{0,30}\s*\b)
(?<MiddleName>\w{0,15}\s*\b)
(?<Addr1>.{0,60}\s*\b)
(?<Addr2>.{0,60}\s*\b)
(?<City>.{0,30}\s*\b)
(?<State>.{0,2}\s*)
(?<Zip>.{0,10}\s*\b))
Actually the fact that it's fixed makes it easier.

And a note as to how closely I was paying attention:
sPreferredInputFile.Trim.Length >= 439
does not 'allude' to me that it is totally fixed.

However, I don't know Stef, she probably knows more than me, considering
I just started using RegEx a month ago. But the above Regex does match
the following:

e8374837463784958473627495Tomlin                         Nilmot                                       
Micheal        554 Some Street                                               
667 Some Other Street                                        Hell                   
FL90210

Which I think is what the record looks like (at least so far)

Let me know, both of you :)
MP
Author
31 Mar 2005 2:47 AM
MeltingPoint
"hillcountry74" <shruth***@yahoo.com> wrote in news:1112049500.118960.294240@f14g2000cwb.googlegroups.com:

Inside the IDE - Not being displayed to console.
Processed 83265 records in 5.6875 seconds. At 2 Records per pass.
Processed 83265 records in 4.28125 seconds. At 20 Records per pass.
Processed 83265 records in 4.046875 seconds. At 50 Records per pass.
Processed 83265 records in 4.046875 seconds. At 75 Records per pass.
Processed 83265 records in 4.765625 seconds. At 100 Records per pass. Breaking Point Reached.

Compiled Application
Processed 83265 records in 3.53125 seconds. At 75 Records per pass.
Processed 83265 records in 3.625 seconds. At 100 Records per pass.
Processed 83265 records in 3.59375 seconds. At 200 Records per pass.
Processed 83265 records in 3.609375 seconds. At 500 Records per pass.
Processed 83265 records in 3.625 seconds. At 1000 Records per pass.
Processed 83265 records in 3.609375 seconds. At 10000 Records per pass.
Processed 83265 records in 3.59375 seconds. At 50000 Records per pass.

You be the judge.
Here's the source code.
Let me know if you need help with the verify routines.

Imports System.Text
Imports System.IO
Imports System.Text.RegularExpressions
Module Module1

    Sub Main()
        'File path and number of records to parse per pass
        ReadAndParse("C:\SAMPLE FILE.txt", 1000)
    End Sub

#Region " Expression Definition "
    Dim _exp As String = "((?<ActionCode>.{1})" & _
      "(?<CarrierID>.{25})" & _
      "(?<LastName>.{60})" & _
      "(?<FirstName>.{30})" & _
      "(?<MiddleName>.{15})" & _
      "(?<Addr1>.{60})" & _
      "(?<Addr2>.{60})" & _
      "(?<City>.{30})" & _
      "(?<State>.{2})" & _
      "(?<Zip>.{10})" & _
      "(?<BenefitOption>.{60})" & _
      "(?<EmployerGroup>.{15})" & _
      "(?<OptionEffDate>.{8})" & _
      "(?<HPEffDate>.{8})" & _
      "(?<TermDate>.{8})" & _
      "(?<Sex>.{1})" & _
      "(?<DOB>.{8})" & _
      "(?<SSN>.{9})" & _
      "(?<Phone>.{12})" & _
      "(?<EmployerGroupAnivDate>.{8})" & _
      "(?<HeadOfHouse>.{9})" & _
      "(?<PrimaryStatus>.{1})" & _
      "(?<MaritalStatus>.{1}))"
#End Region

#Region " Label Definitions "
    Dim _sInActionCode As String
    Dim _sInCarrierID As String
    Dim _sInLastName As String
    Dim _sInFirstName As String
    Dim _sInMiddleName As String
    Dim _sInAddr1 As String
    Dim _sInAddr2 As String
    Dim _sInCity As String
    Dim _sInState As String
    Dim _sInZip As String
    Dim _sInBenefitOption As String
    Dim _sInEmployerGroup As String
    Dim _sInOptionEffDate As String
    Dim _sInHPEffDate As String
    Dim _sInTermDate As String
    Dim _sInSex As String
    Dim _sInDOB As String
    Dim _sInSSN As String
    Dim _sInPhone As String
    Dim _sInEmployerGroupAnivDate As String
    Dim _sInHeadOfHouse As String
    Dim _sInPrimaryStatus As String
    Dim _sInMaritalStatus As String
#End Region

#Region " Timing "
    Dim startTime As New DateTime
    Dim finishTime As Double
#End Region



    Sub ReadAndParse(ByVal inFilePath As String, ByVal numRecordsPerBlock As Int32)
        Const RECORD_SIZE As Int32 = 443
        Dim inputFile As New FileInfo(inFilePath)
        Dim inputFileLen As Int64 = inputFile.Length
        Dim iterations As Int32
        Dim bytesPerIteration As Int32
        Dim totalRecords As Int32
        Dim moreRecords As Boolean

        'Verify Length
        If Not inputFileLen Mod RECORD_SIZE = 0 Then
            Throw New ApplicationException("File Length Error")
        End If

        'Bytes(records) per loop
        bytesPerIteration = numRecordsPerBlock * RECORD_SIZE
        'Figure out how many full blocks to loop over
        iterations = CInt(inputFileLen \ bytesPerIteration)
        'Check whether a partial block is left over after the full blocks
        moreRecords = ((CLng(iterations) * bytesPerIteration) <> inputFileLen)
        'reset total records
        totalRecords = 0

        'Get input stream
        Dim inStream As New StreamReader(inputFile.FullName)
        Dim inputBlock As String
        Dim buf(bytesPerIteration - 1) As Char 'Upper bound is inclusive, so this holds exactly bytesPerIteration chars

        'Set up regex
        Dim regExp As New Regex(_exp, RegexOptions.Compiled) ' I think this speeds it up'
        Dim mc As MatchCollection
        Dim record As Match

        'Set up and loop
        startTime = Now()
        For i As Int32 = 1 To iterations

            inStream.ReadBlock(buf, 0, bytesPerIteration)
            inputBlock = New String(buf)

            'Parse it
            mc = regExp.Matches(inputBlock)
            For j As Int32 = 0 To mc.Count - 1
                record = mc.Item(j)
                'Verify record proc here
                totalRecords += 1
                _sInActionCode = record.Groups("ActionCode").ToString.Trim
                _sInCarrierID = record.Groups("CarrierID").ToString.Trim
                _sInLastName = record.Groups("LastName").ToString.Trim
                _sInFirstName = record.Groups("FirstName").ToString.Trim
                _sInMiddleName = record.Groups("MiddleName").ToString.Trim
                _sInAddr1 = record.Groups("Addr1").ToString.Trim
                _sInAddr2 = record.Groups("Addr2").ToString.Trim
                _sInCity = record.Groups("City").ToString.Trim
                _sInState = record.Groups("State").ToString.Trim
                _sInZip = record.Groups("Zip").ToString.Trim
                _sInBenefitOption = record.Groups("BenefitOption").ToString.Trim
                _sInEmployerGroup = record.Groups("EmployerGroup").ToString.Trim
                _sInOptionEffDate = record.Groups("OptionEffDate").ToString.Trim
                _sInHPEffDate = record.Groups("HPEffDate").ToString.Trim
                _sInTermDate = record.Groups("TermDate").ToString.Trim
                _sInSex = record.Groups("Sex").ToString.Trim
                _sInDOB = record.Groups("DOB").ToString.Trim
                _sInSSN = record.Groups("SSN").ToString.Trim
                _sInPhone = record.Groups("Phone").ToString.Trim
                _sInEmployerGroupAnivDate = record.Groups("EmployerGroupAnivDate").ToString.Trim()
                _sInHeadOfHouse = record.Groups("HeadOfHouse").ToString.Trim
                _sInPrimaryStatus = record.Groups("PrimaryStatus").ToString.Trim
                _sInMaritalStatus = record.Groups("MaritalStatus").ToString.Trim
                'REMOVE
                DisplayToConsole(record)
                'END REMOVE

            Next
        Next

        'One last time through
        If moreRecords Then
            inputBlock = inStream.ReadToEnd() 'Finish off reading
            inStream.Close()
            mc = regExp.Matches(inputBlock)
            For j As Int32 = 0 To mc.Count - 1
                record = mc.Item(j)
                'Verify record proc here
                totalRecords += 1
                _sInActionCode = record.Groups("ActionCode").ToString.Trim
                _sInCarrierID = record.Groups("CarrierID").ToString.Trim
                _sInLastName = record.Groups("LastName").ToString.Trim
                _sInFirstName = record.Groups("FirstName").ToString.Trim
                _sInMiddleName = record.Groups("MiddleName").ToString.Trim
                _sInAddr1 = record.Groups("Addr1").ToString.Trim
                _sInAddr2 = record.Groups("Addr2").ToString.Trim
                _sInCity = record.Groups("City").ToString.Trim
                _sInState = record.Groups("State").ToString.Trim
                _sInZip = record.Groups("Zip").ToString.Trim
                _sInBenefitOption = record.Groups("BenefitOption").ToString.Trim
                _sInEmployerGroup = record.Groups("EmployerGroup").ToString.Trim
                _sInOptionEffDate = record.Groups("OptionEffDate").ToString.Trim
                _sInHPEffDate = record.Groups("HPEffDate").ToString.Trim
                _sInTermDate = record.Groups("TermDate").ToString.Trim
                _sInSex = record.Groups("Sex").ToString.Trim
                _sInDOB = record.Groups("DOB").ToString.Trim
                _sInSSN = record.Groups("SSN").ToString.Trim
                _sInPhone = record.Groups("Phone").ToString.Trim
                _sInEmployerGroupAnivDate = record.Groups("EmployerGroupAnivDate").ToString.Trim()
                _sInHeadOfHouse = record.Groups("HeadOfHouse").ToString.Trim
                _sInPrimaryStatus = record.Groups("PrimaryStatus").ToString.Trim
                _sInMaritalStatus = record.Groups("MaritalStatus").ToString.Trim
                'REMOVE
                DisplayToConsole(record)
                'END REMOVE
            Next
        Else
            inStream.Close()
        End If

        Dim finishTime As Double = DateTime.Now.Subtract(startTime).TotalSeconds
        Console.WriteLine()
        Console.WriteLine("Processed {0} records in {1} seconds.", totalRecords, finishTime)
        Console.ReadLine()
    End Sub
    Sub DisplayToConsole(ByVal record As Match)
        _sInActionCode = record.Groups("ActionCode").ToString.Trim
        _sInCarrierID = record.Groups("CarrierID").ToString.Trim
        _sInLastName = record.Groups("LastName").ToString.Trim
        _sInFirstName = record.Groups("FirstName").ToString.Trim
        _sInMiddleName = record.Groups("MiddleName").ToString.Trim
        _sInAddr1 = record.Groups("Addr1").ToString.Trim
        _sInAddr2 = record.Groups("Addr2").ToString.Trim
        _sInCity = record.Groups("City").ToString.Trim
        _sInState = record.Groups("State").ToString.Trim
        _sInZip = record.Groups("Zip").ToString.Trim
        _sInBenefitOption = record.Groups("BenefitOption").ToString.Trim
        _sInEmployerGroup = record.Groups("EmployerGroup").ToString.Trim
        _sInOptionEffDate = record.Groups("OptionEffDate").ToString.Trim
        _sInHPEffDate = record.Groups("HPEffDate").ToString.Trim
        _sInTermDate = record.Groups("TermDate").ToString.Trim
        _sInSex = record.Groups("Sex").ToString.Trim
        _sInDOB = record.Groups("DOB").ToString.Trim
        _sInSSN = record.Groups("SSN").ToString.Trim
        _sInPhone = record.Groups("Phone").ToString.Trim
        _sInEmployerGroupAnivDate = record.Groups("EmployerGroupAnivDate").ToString.Trim()
        _sInHeadOfHouse = record.Groups("HeadOfHouse").ToString.Trim
        _sInPrimaryStatus = record.Groups("PrimaryStatus").ToString.Trim
        _sInMaritalStatus = record.Groups("MaritalStatus").ToString.Trim

        Console.WriteLine()
        Console.WriteLine("_sInActionCode = " & _sInActionCode)
        Console.WriteLine("_sInCarrierID = " & _sInCarrierID)
        Console.WriteLine("_sInLastName = " & _sInLastName)
        Console.WriteLine("_sInFirstName = " & _sInFirstName)
        Console.WriteLine("_sInMiddleName = " & _sInMiddleName)
        Console.WriteLine("_sInAddr1 = " & _sInAddr1)
        Console.WriteLine("_sInAddr2 = " & _sInAddr2)
        Console.WriteLine("_sInCity = " & _sInCity)
        Console.WriteLine("_sInState = " & _sInState)
        Console.WriteLine("_sInZip = " & _sInZip)
        Console.WriteLine("_sInBenefitOption = " & _sInBenefitOption)
        Console.WriteLine("_sInEmployerGroup = " & _sInEmployerGroup)
        Console.WriteLine("_sInOptionEffDate = " & _sInOptionEffDate)
        Console.WriteLine("_sInHPEffDate = " & _sInHPEffDate)
        Console.WriteLine("_sInTermDate = " & _sInTermDate)
        Console.WriteLine("_sInSex = " & _sInSex)
        Console.WriteLine("_sInDOB = " & _sInDOB)
        Console.WriteLine("_sInSSN = " & _sInSSN)
        Console.WriteLine("_sInPhone = " & _sInPhone)
        Console.WriteLine("_sInEmployerGroupAnivDate = " & _sInEmployerGroupAnivDate)
        Console.WriteLine("_sInHeadOfHouse = " & _sInHeadOfHouse)
        Console.WriteLine("_sInPrimaryStatus = " & _sInPrimaryStatus)
        Console.WriteLine("_sInMaritalStatus = " & _sInMaritalStatus)
    End Sub
End Module
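
A side note on the block-reading loop: TextReader.ReadBlock returns the number of characters it actually read, and in VB.NET an array declared as buf(n) has n + 1 elements. A minimal sketch of a read helper that honours both points is below. The module name, ReadChunk, and the in-memory StringReader test data are assumptions for illustration only; this is not part of the timed program.

```vbnet
Imports System
Imports System.IO

Module BlockReadSketch

    ' Reads up to blockSize characters and returns only those actually read.
    ' (Hypothetical helper, not taken from the posted program.)
    Function ReadChunk(ByVal reader As TextReader, ByVal blockSize As Integer) As String
        ' VB upper bounds are inclusive, so (blockSize - 1) gives blockSize elements.
        Dim buf(blockSize - 1) As Char
        Dim charsRead As Integer = reader.ReadBlock(buf, 0, blockSize)
        ' Build the string from the characters read, not from the whole buffer.
        Return New String(buf, 0, charsRead)
    End Function

    Sub Main()
        ' StringReader stands in for the StreamReader over the real input file.
        Dim reader As New StringReader("ABCDEFGHIJ")
        Dim chunk As String = ReadChunk(reader, 4)
        While chunk.Length > 0
            Console.WriteLine("{0} chars: {1}", chunk.Length, chunk)
            chunk = ReadChunk(reader, 4)
        End While
    End Sub
End Module
```

With the ten-character sample the helper returns chunks of 4, 4 and 2 characters, so a short final block never carries stale or empty buffer positions into the regex.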
Author
31 Mar 2005 5:18 AM
Stephany Young
Using your code, I am seeing similar results, give or take a few
milliseconds. I noticed that the best results were achieved using about 40
records per block (40 x 443 bytes is roughly 17 KB). Maybe RegEx uses some
sort of optimisation based on blocks of around 16KB.

We obviously have differing amounts of RAM because I can do a ReadToEnd on a
200MB+ file quite happily. My machine spits its dummy at just over 260 MB.
That level, of course, will vary depending on whatever else is running at the
time.

Using the ReadLine method on an 83265-record file, and using different
combinations of Mid, Trim, String.Substring and String.Trim, I am seeing
results of between 1.5 and 2 seconds to parse the entire file, depending on
the combination. The difference between running it compiled for debug
configuration in the IDE and release configuration is insignificant (less
than 100 milliseconds).
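
For illustration, the ReadLine plus Substring/Trim approach described above might look like the following minimal sketch. The field widths are taken from the regular expression posted earlier in the thread. The module name, ParseRecord, and the command-line handling are assumptions, and this is not the code that produced the timings quoted here. It also assumes VB 2005 or later for the Using statement.

```vbnet
Imports System
Imports System.IO

Module SubstringParseSketch

    ' Field widths copied from the regular expression posted earlier in the
    ' thread; they add up to 441 characters (443 with the CR/LF terminator).
    Private ReadOnly FieldWidths() As Integer = _
        {1, 25, 60, 30, 15, 60, 60, 30, 2, 10, 60, 15, _
         8, 8, 8, 1, 8, 9, 12, 8, 9, 1, 1}

    ' Splits one fixed-width line into trimmed fields (hypothetical helper).
    ' Assumes the line holds at least 441 characters; Substring throws otherwise.
    Function ParseRecord(ByVal line As String) As String()
        Dim fields(FieldWidths.Length - 1) As String
        Dim offset As Integer = 0
        For i As Integer = 0 To FieldWidths.Length - 1
            fields(i) = line.Substring(offset, FieldWidths(i)).Trim()
            offset += FieldWidths(i)
        Next
        Return fields
    End Function

    Sub Main(ByVal args() As String)
        If args.Length = 0 Then
            Console.WriteLine("Usage: SubstringParseSketch <input file>")
            Return
        End If

        Dim total As Integer = 0
        Dim started As DateTime = DateTime.Now
        Using reader As New StreamReader(args(0))
            Dim line As String = reader.ReadLine()
            While line IsNot Nothing
                ' Per-record verification of the fields would go here.
                Dim fields() As String = ParseRecord(line)
                total += 1
                line = reader.ReadLine()
            End While
        End Using
        Console.WriteLine("Processed {0} records in {1} seconds.", _
            total, DateTime.Now.Subtract(started).TotalSeconds)
    End Sub
End Module
```

Driving the split from a table of widths keeps the offsets in one place, so a change to the record layout only touches the array.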

Unfortunately you haven't provided any comparative results for your machine.

The evidence I see is that RegEx is actually a poor performer compared to
more conventional string parsing in this particular case.

I am still of the opinion that the 'perceived slowness' is in one of the
other functions that is called on a per-record basis rather than in the file
IO/record parsing area per se.
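
One way to check that suspicion is to time each per-record step separately and compare the totals. The sketch below is only an outline of the idea: ParseLine and VerifyRecord are hypothetical stand-ins for the real parsing and verification routines, the sample line is made up, and it assumes .NET 2.0 or later for System.Diagnostics.Stopwatch.

```vbnet
Imports System
Imports System.Diagnostics

Module PerRecordTimingSketch

    ' Hypothetical stand-in for the real record parsing.
    Function ParseLine(ByVal line As String) As String
        Return line.Substring(0, 1).Trim()
    End Function

    ' Hypothetical stand-in for a per-record verification routine.
    Function VerifyRecord(ByVal actionCode As String) As Boolean
        Return actionCode.Length > 0
    End Function

    Sub Main()
        Dim parseWatch As New Stopwatch()
        Dim verifyWatch As New Stopwatch()
        Dim sampleLine As String = "A".PadRight(441)
        Dim valid As Integer = 0

        For i As Integer = 1 To 83265
            ' Start/Stop accumulate, so each watch ends up holding the
            ' total time spent in its own step across all records.
            parseWatch.Start()
            Dim actionCode As String = ParseLine(sampleLine)
            parseWatch.Stop()

            verifyWatch.Start()
            If VerifyRecord(actionCode) Then valid += 1
            verifyWatch.Stop()
        Next

        Console.WriteLine("Parsing:      {0} ms", parseWatch.ElapsedMilliseconds)
        Console.WriteLine("Verification: {0} ms", verifyWatch.ElapsedMilliseconds)
        Console.WriteLine("Valid records: {0}", valid)
    End Sub
End Module
```

Starting and stopping a Stopwatch has a small cost of its own, so the totals are better read as a rough ranking of the steps than as exact figures.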



"MeltingPoint" <n***@all.com> wrote in message
news:PfednS7FVvZH-dbfRVn-1Q@rogers.com...
> "hillcountry74" <shruth***@yahoo.com> wrote in
> news:1112049500.118960.294240@f14g2000cwb.googlegroups.com:
>
> Inside the IDE - Not being displayed to console.
> Processed 83265 records in 5.6875 seconds. At 2 Records per pass.
> Processed 83265 records in 4.28125 seconds. At 20 Records per pass.
> Processed 83265 records in 4.046875 seconds. At 50 Records per pass.
> Processed 83265 records in 4.046875 seconds. At 75 Records per pass.
> Processed 83265 records in 4.765625 seconds. At 100 Records per pass.
> Breaking Point Reached.
>
> Compiled Application
> Processed 83265 records in 3.53125 seconds. At 75 Records per pass.
> Processed 83265 records in 3.625 seconds. At 100 Records per pass.
> Processed 83265 records in 3.59375 seconds. At 200 Records per pass.
> Processed 83265 records in 3.609375 seconds. At 500 Records per pass.
> Processed 83265 records in 3.625 seconds. At 1000 Records per pass.
> Processed 83265 records in 3.609375 seconds. At 10000 Records per pass.
> Processed 83265 records in 3.59375 seconds. At 50000 Records per pass.
>
> You be the judge.
Author
31 Mar 2005 6:17 AM
MeltingPoint
"Stephany Young" <noone@localhost> wrote in
news:#8CTfEbNFHA.3296@TK2MSFTNGP15.phx.gbl:

So... are we saying that the core of his parsing code was the fastest to
begin with, give or take a Trim? I won't argue with that. I started with the
impression that that many reads were causing significant overhead, and
looked for a solution that required fewer reads. As far as comparative
results go, I think we have them: if your best time was 1.5 and mine 3.5, then
the results are in :) For a box-to-box test, just post the code and I'll run
it here and let you know the results (and hillcountry, if he's still reading
this thread :)

Night,
MP
Author
1 Apr 2005 9:52 PM
Stephany Young
Hi, MeltingPoint.

I attempted to send you an email about 18 hours ago. The address I used is
one I interpreted from an earlier post in this thread. Did you get it, or did
I misinterpret the address?

