Home All Groups Group Topic Archive Search About
Author
13 Sep 2006 11:02 AM
JimmyKoolPantz
Situation:

I am writing a program that opens up a data file (file ext .dbf),
imports the information into a dataset.  The program also, searches
through the dataset for key words ("company names") and then displays
all records with company names in a datagridview.

Once the information has been binded to the datagridview control, I
allow the user to select all records that are not companies, for
example, during the search the key word "HONDA" may have pulled a
record that has the name "Rhonda Smith".  Since Rhonda contains the
word "HONDA" it was mistakenly pulled and marked as a company name and
displayed in the datagridview control.  In this particualar instance
the user would check a box in the datagridview control indicating this
is not a company name and they would like to keep this record in the
data file.

Once the user has checked all the records they want to keep, they
process the file.  Processing consists of A) Backing up the original
file B) Create a new file with just company names, and C) update the
original file (remove all records that are marked as companies).

Problem:

Processing time.  The time it takes to update the files is extreemly
long, and I'm sure the problem is logic related.  It approximately took
me around 70 seconds to update a file that contains 440 records.  I
tested a file the contains 22,000 records and the time it took was not
exceptable.

Question:

I'm looking for some advice on what would be the best way to process
the records that would speed up processing time.  I'm currently backing
up the original file, then removing all the records from the original
file (just keeping header information) and then then ,during
processing, import one record at a time.  If the record contains
company inforamtion, I import it into the the companies .dbf, and if it
does not contain company information, I import it into the original
file.

Possible solutions:

Export a dataset as a DBF file? which I am not sure, at this point, if
this is possible, and if it is, would it keep all field lengths for
each column.  For example, I need to keep the field "Fullname" as text
and it has to be at the length of 65 characters long.  If this is a
possible solution, I would need to find a way, to get field length
information from the original file, since all fields will never be the
same on all files.

Another possible solution, instead of removing all records from the
original file and then inserting one record at a time, it may be
logical just to write a query, the would just remove a record  that I
do not want to keep in the original file.  Since, most files I deal
with have less then 200 records that have company information.  It
seems feasible, it would take me less time to remove 200 records then
to insert 2000 records.

Any advice would be greatly appreciated.

Code Samples:

The following is a few methods which will illistrate the way I am
currently processing the records, I think the key method below would
probably be the "Processing" routine.  There are some routines that I
did leave out, so if I need to post further information please let me
know.  Please, note this program has not gone live, so you will notice
I still need to do some exception handling.

Private Sub openfile()
        Dim qy As String
        Dim file As New FileProcessing
        Dim dr As DataRow
        Dim dc As DataColumn
        Dim iFullname As Integer = 0
        Dim iFirst As Integer = 0
        Dim iLast As Integer = 0
        Dim boolFirst As Boolean = False
        Dim boolLast As Boolean = False
        Dim boolCompany As Boolean = False
        Dim i As Integer = 0
        Dim int As Integer = 0
        Dim Match As Boolean
        Dim dataCol As DataColumn
        Dim tbRow As DataRow
        Dim FirstRow As Boolean = True
        Dim temp As String = ""

        CrossThreadingCallsException = False
        Try
            qy = "select * from " + sFileName
            ds = New DataSet
            ds = file.Open_File(sPath, qy, sFileType)
            Application.DoEvents()

            btnUpdate.Visible = False
            btnRemove.Visible = False
            btnAdd.Visible = False

            tbFilter = New DataTable

            btnUpdate.Visible = False
            btnRemove.Visible = False
            btnAdd.Visible = False

            dgCheckBox = New DataGridViewCheckBoxColumn
            dgCheckBox.Name = "KEEP"
            dgCheckBox.Visible = False

            dgView.Columns.Add(dgCheckBox)

            For Each dc In ds.Tables(0).Columns

                If dc.ColumnName.ToUpper = "FULLNAME" Then
                    tbFilter.Columns.Add(dc.ColumnName)
                    iFullname = i
                End If
                If dc.ColumnName.ToUpper = "FIRST" Then
                    tbFilter.Columns.Add(dc.ColumnName)
                    iFirst = i
                    boolFirst = True
                End If
                If dc.ColumnName.ToUpper = "LAST" Then
                    tbFilter.Columns.Add(dc.ColumnName)
                    iLast = i
                    boolLast = True
                End If
                i += 1
            Next
            tbFilter.Columns.Add("FILTER")
            For Each dr In ds.Tables(0).Rows
                If dr.Item(iFullname) IsNot System.DBNull.Value Then
                    Match = CompanyFilter(dr.Item(iFullname))
                    If Match = True Then
                        tbRow = tbFilter.NewRow
                        For Each dataCol In tbFilter.Columns

                            If dataCol.ColumnName.ToUpper = "FULLNAME"
Then
                                tbRow.Item(int) = dr.Item(iFullname)
                            End If
                            If dataCol.ColumnName.ToUpper = "FIRST"
Then
                                If boolFirst = True Then
                                    tbRow.Item(int) = dr.Item(iFirst)
                                End If
                            End If
                            If dataCol.ColumnName.ToUpper = "LAST" Then
                                If boolLast = True Then
                                    tbRow.Item(int) = dr.Item(iLast)
                                End If
                            End If
                            int += 1
                        Next
                        tbRow.Item(3) = Filter
                        tbFilter.Rows.Add(tbRow)
                        Filter = ""
                        int = 0

                    End If
                Else
                    boolBlankNames = True
                End If
            Next
            formMessage.Close()
            txtSearch.Text = sFileName + ".DBF"
            tempSearch = txtSearch.Text
        Catch ex As Exception
        End Try

Private Function CompanyFilter(ByVal Fullname As String) As Boolean
        Dim dr As DataRow
        Dim drNames As DataRow
        Dim sTemp As String
        Dim tempArray() As String
        Dim tempName As String
        Dim NameMatch As Boolean = False

        For Each dr In dsFilter.Tables(0).Rows
            sTemp = dr.Item(0).ToString.ToUpper
            If sTemp.Length > 4 Then
                If Fullname.ToUpper.Contains(sTemp) Then
                    tempName = Fullname.ToUpper.Trim
                    For Each drNames In dsNames.Tables(0).Rows
                        If tempName = drNames.Item(0).ToString.ToUpper
Then
                            NameMatch = True
                            Exit For
                        End If
                    Next
                    If NameMatch = False Then
                        For Each drNames In dsNames.Tables(0).Rows
                            tempArray = Fullname.Split(" ")
                            If tempArray.GetUpperBound(0) > 0 Then
                                For i As Integer = 0 To
tempArray.GetUpperBound(0)
                                    tempName = tempArray(i).Trim
                                    If tempName.Length > 4 Then
                                        If
tempName.ToUpper.Contains(drNames.Item(0).ToString.ToUpper) Then
                                            NameMatch = True
                                            Exit For
                                        End If
                                    Else
                                        If tempName.ToUpper =
drNames.Item(0).ToString.ToUpper Then
                                            NameMatch = True
                                            Exit For
                                        End If
                                    End If

                                Next
                            Else
                                tempName = sTemp.Trim
                                If tempName.Length > 4 Then
                                    If
tempName.ToUpper.Contains(drNames.Item(0).ToString.ToUpper) Then
                                        NameMatch = True
                                        Exit For
                                    End If
                                Else
                                    If tempName =
drNames.Item(0).ToString.ToUpper Then
                                        NameMatch = True
                                        Exit For
                                    End If
                                End If

                            End If
                        Next
                    End If

                    If NameMatch = False Then
                        Filter = "%" + sTemp + "%"
                        Return True
                        Exit Function
                    Else
                        NameMatch = False
                    End If
                End If
            Else
                If Fullname.ToUpper = sTemp Then
                    Filter = "%" + sTemp + "%"
                    Return True
                    Exit Function
                End If
            End If
        Next
        Return False
    End Function

Private Sub Processing()
        Dim qy As String
        Dim qyCompanies As String = ""
        Dim qyNotCompanies As String = ""
        Dim path As String
        Dim temp() As String
        Dim sName As String
        Dim sfile As New FileProcessing
        Dim dr As DataRow
        Dim sTemp As String = ""
        Dim item As String = ""
        Dim iLength As Integer = 0
        Dim tempCompany As String
        Dim match As Boolean = False
        Dim dc As DataColumn
        Dim int As Integer = 0


        'back up original file
        File.Copy(SourceFile, DestinationFile)
        temp = txtSearch.Text.Split(".")
        sName = temp(0)

        'Delete all rows in original file
        Dim ext As String = "DBF"
        qy = "Delete * From " + sName
        path = SourcePath
        sfile.UpdateDBF(path, qy, ext)

        totProcessed = 0
        totRecords = 0
        totCompanies = 0
        totResidents = 0
        'create temperary holding file for companies
        'since the name of a dbf file is limited to 8 characters
        'I had to create a temp file for companies (if I named the file
12345Compnaies
        '.net would not allow me to insert records.  The file name can
not be longer then
        ' 8 characters.  After I insert the records into the file then
I will rename the file.

        tempCompany = path + sName + "c.DBF"
        File.Copy(SourceFile, tempCompany)

        'process compare dataset names to filter table names
        'if match then save in companies table else save in original
file
        'totRecords = ds.Tables(0).Rows.Count
        ' FormCompanyInfo.lblTotRecordsCount.Text = totRecords.ToString

        qy = ""
        'create companies query statement
        qyCompanies = "insert into " + sName + "c("
        For Each dc In ds.Tables(0).Columns
            sTemp = sTemp + "[" + dc.Caption + "],"
        Next
        sTemp = sTemp.Substring(0, sTemp.Length - 1) + ")"
        qyCompanies = qyCompanies + sTemp + "Values("

        sTemp = ""
        For Each dc In ds.Tables(0).Columns
            sTemp = sTemp + "@" + dc.Caption + ","
        Next
        sTemp = sTemp.Substring(0, sTemp.Length - 1) + ")"
        qyCompanies = qyCompanies + sTemp

        'create Not Companies query statement
        sTemp = ""
        qyNotCompanies = "insert into " + sName + "("
        For Each dc In ds.Tables(0).Columns
            sTemp = sTemp + "[" + dc.Caption + "],"
        Next
        sTemp = sTemp.Substring(0, sTemp.Length - 1) + ")"
        qyNotCompanies = qyNotCompanies + sTemp + "Values("

        sTemp = ""
        For Each dc In ds.Tables(0).Columns
            sTemp = sTemp + "@" + dc.Caption + ","
        Next
        sTemp = sTemp.Substring(0, sTemp.Length - 1) + ")"
        qyNotCompanies = qyNotCompanies + sTemp

        ds.Tables(0).Columns.Add("VALUE")
        int = ds.Tables(0).Columns.Count - 1
        For i As Integer = 0 To tbFilter.Rows.Count - 1
            If dgView.Item(0, i).Value = False Then
                For Each dr In ds.Tables(0).Rows
                    Dim tem As String
                    Dim tem2 As String
                    tem = tbFilter.Rows(i).Item(0)
                    tem2 = dr.Item(1).ToString.ToUpper
                    If tem.ToUpper = dr.Item(1).ToString.ToUpper Then
                        If dr.Item(int).ToString <> "x" Then
                            dr.Item(int) = "x"
                            Exit For
                        End If
                    End If
                Next
            End If
        Next

        int = ds.Tables(0).Columns.Count - 1
        Dim test As Integer = 0
        For Each dr In ds.Tables(0).Rows

            If dr.Item(int).ToString = "x" Then
                test += 1
                'this is a company name
                'save in company file
                Dim itemArray(0) As String
                Dim itemp As Integer = 0
                For Each dc In ds.Tables(0).Columns
                    itemArray(itemp) = dr.Item(itemp).ToString
                    ReDim Preserve itemArray(itemp + 1)
                    itemp += 1
                Next
                sfile.UpdateDBF(path, qyCompanies, ext, itemArray, ds)
                'totCompanies += 1

            Else
                'this is a resident name
                'save in original data file
                Dim itemArray(0) As String
                Dim itemp As Integer = 0
                For Each dc In ds.Tables(0).Columns
                    itemArray(itemp) = dr.Item(itemp).ToString
                    ReDim Preserve itemArray(itemp + 1)
                    itemp += 1
                Next
                sfile.UpdateDBF(path, qyNotCompanies, ext, itemArray,
ds)
                'totResidents += 1

            End If
            'totProcessed += 1
            'FormCompanyInfo.lblTotProcessedCount.Text =
totProcessed.ToString
            'FormCompanyInfo.lblTotCompaniesCount.Text =
totCompanies.ToString
            'FormCompanyInfo.lblTotResidentsCount.Text =
totResidents.ToString
        Next
        File.Move(tempCompany, CompanyFile)
        FormCompanyInfo.Close()
        MsgBox(Now)
    End Sub

Public Function Open_File(ByVal FilePath As String, ByVal Query As
String, ByVal FileExt As String) As DataSet
        Dim ds As DataSet
        Try
            'important note: make sure the file path does "NOT" have
the file name
            'in the string.  This will cause stress, and waisted time.
Make sure
            'file path is only the path to the directory where the file
is stored.
            cs = Connection_String(FilePath, FileExt)
            'If FileExt.ToUpper = "DBF" Or FileExt.ToUpper = "XLS" Then
            Dim cn As OleDbConnection
            Dim da As OleDbDataAdapter
            cn = New OleDbConnection(cs)
            cn.Open()
            da = New OleDbDataAdapter(Query, cn)
            ds = New DataSet
            da.Fill(ds)
            cn.Close()
            Return ds
        Catch ex As Exception
            Throw ex
        Finally
            cs = Nothing
            ds = Nothing
        End Try
    End Function

Public Sub UpdateDBF(ByVal FilePath As String, ByVal qy As String,
ByVal FileExt As String, ByVal itemArray() As String, ByVal ds As
DataSet)
        Dim cs As String
        Dim cn As OleDbConnection
        Dim dc As OleDbCommand
        Dim col As DataColumn
        Dim temp As String = ""
        Dim size As String
        Dim i As Integer = 0

        Try
            cs = Connection_String(FilePath, FileExt)
            cn = New OleDbConnection(cs)
            cn.Open()
            dc = New OleDbCommand(qy, cn)

            For Each col In ds.Tables(0).Columns
                temp = "@" + col.Caption
                size = itemArray(i).Length
                dc.Parameters.Add(temp, OleDbType.Char, size,
col.Caption)
                dc.Parameters(temp).Value = itemArray(i)
                i += 1
            Next
            dc.ExecuteNonQuery()
            cn.Close()
        Catch ex As Exception
            Throw ex
        End Try
    End Sub

Author
13 Sep 2006 1:46 PM
JimmyKoolPantz
After posting my thoughts, I think I came up with a solution.  I think
I am going to add a field to the original file named something like
"Remove" and then write a query to remove records that have a value in
the new field. 

I think this might solve my processing problem.
Author
13 Sep 2006 3:34 PM
JimmyKoolPantz
That sucked, I could not alter a dbf table (add column) to a dbf file
that already contained data.  Any suggestions?
Author
14 Sep 2006 7:10 AM
GhostInAK
Hello JimmyKoolPantz,

I asume yer using OleDb or ODBC to read and write the DBF.  While I would
normally applaud this, as it uses the standard data access scheme and is
the correct way of doping this, I recently had the same trouble.  I had to
write a DBF file that had ~680,000 records.  And I had to write it more than
once.  Turns out writing DBF files is SLOWER THAN MOLASSAS ON A CLOD WINTER
DAY.  I wont make guesses here as to why, but I have an educated guess or
three.  Point is, I wrote my own DBF reader and writer that munched on the
DBF file bits directly.  I can now write 680,000 record DBF's in less than
30 seconds. Whee! Look at me fly ma! No hands!

You might benefit from the same approach.

-Boo

Show quoteHide quote
> That sucked, I could not alter a dbf table (add column) to a dbf file
> that already contained data.  Any suggestions?
>
Author
16 Sep 2006 6:11 AM
JimmyKoolPantz
Boo.. can you teach me to fly?

Yes, I am using oledb .  I have been pondering on a few solutions to my
problem.  One solution (since, I have not found a way to alter a table
that already has data), would be to edit a field that I know will
always be in the table.  Which would be the "Job ID" field.  I could
possible change the value of the "Job ID" field for the records I need
to delete, and then just write a query to remove records with a
specific Job ID value.  However, looking forward I see a problem.  I
really need to find a way to alter a DBF table that contains data.
Because some of my future projects will be to add a field to a dbf
file, such as "winning number" and some other calculations.

Could you please give me some flight instructions, on how to fly?  I
really need to find a way to either 1) alter a table that contains data
2) be able to quickly insert records into a dbf file.  Most important,
data processing must be exceptionally fast.   If I try inserting
records one at a time using oledb, its very slow.. inserting 100
records is extremely slow.

I'm kind of curious, why is it vfp can alter tables with data, but
vb.net oledb does not allow you to do this?   Am I missing something
here?




GhostInAK wrote:
Show quoteHide quote
> Hello JimmyKoolPantz,
>
> I asume yer using OleDb or ODBC to read and write the DBF.  While I would
> normally applaud this, as it uses the standard data access scheme and is
> the correct way of doping this, I recently had the same trouble.  I had to
> write a DBF file that had ~680,000 records.  And I had to write it more than
> once.  Turns out writing DBF files is SLOWER THAN MOLASSAS ON A CLOD WINTER
> DAY.  I wont make guesses here as to why, but I have an educated guess or
> three.  Point is, I wrote my own DBF reader and writer that munched on the
> DBF file bits directly.  I can now write 680,000 record DBF's in less than
> 30 seconds. Whee! Look at me fly ma! No hands!
>
> You might benefit from the same approach.
>
> -Boo
>
> > That sucked, I could not alter a dbf table (add column) to a dbf file
> > that already contained data.  Any suggestions?
> >
Author
17 Sep 2006 12:15 AM
GhostInAK
Hello JimmyKoolPantz,

The first page of a google search for "dbf file format" yeilds http://www.clicketyclick.dk/databases/xbase/format/dbf.html#DBF_STRUCT
I used this spec to build a DBFStreamReader and DBFStreamWriter pair (in
conjunction with classes that modeled the DBF format in an object structure:
DBFFile, DBFFileHeader, DBFRecord, etc etc).  This was done as part of a
larger project for reading/writing SHP files (they call them SHP files..
but an ESRI SHP is actually a minimum of 3 distinct files: a shp, shx, and
dbf.)

I can't share the code as I do not own it, so it's not mine to give away.
The BitConverter class will come in handy.

-Boo

Show quoteHide quote
> Boo.. can you teach me to fly?
>
> Yes, I am using oledb .  I have been pondering on a few solutions to
> my problem.  One solution (since, I have not found a way to alter a
> table that already has data), would be to edit a field that I know
> will always be in the table.  Which would be the "Job ID" field.  I
> could possible change the value of the "Job ID" field for the records
> I need to delete, and then just write a query to remove records with a
> specific Job ID value.  However, looking forward I see a problem.  I
> really need to find a way to alter a DBF table that contains data.
> Because some of my future projects will be to add a field to a dbf
> file, such as "winning number" and some other calculations.
>
> Could you please give me some flight instructions, on how to fly?  I
> really need to find a way to either 1) alter a table that contains
> data 2) be able to quickly insert records into a dbf file.  Most
> important, data processing must be exceptionally fast.   If I try
> inserting records one at a time using oledb, its very slow.. inserting
> 100 records is extremely slow.
>
> I'm kind of curious, why is it vfp can alter tables with data, but
> vb.net oledb does not allow you to do this?   Am I missing something
> here?
>
> GhostInAK wrote:
>
>> Hello JimmyKoolPantz,
>>
>> I asume yer using OleDb or ODBC to read and write the DBF.  While I
>> would
>> normally applaud this, as it uses the standard data access scheme and
>> is
>> the correct way of doping this, I recently had the same trouble.  I
>> had to
>> write a DBF file that had ~680,000 records.  And I had to write it
>> more than
>> once.  Turns out writing DBF files is SLOWER THAN MOLASSAS ON A CLOD
>> WINTER
>> DAY.  I wont make guesses here as to why, but I have an educated
>> guess or
>> three.  Point is, I wrote my own DBF reader and writer that munched
>> on the
>> DBF file bits directly.  I can now write 680,000 record DBF's in less
>> than
>> 30 seconds. Whee! Look at me fly ma! No hands!
>> You might benefit from the same approach.
>>
>> -Boo
>>
>>> That sucked, I could not alter a dbf table (add column) to a dbf
>>> file that already contained data.  Any suggestions?
>>>