Home All Groups Group Topic Archive Search About

Reading .csv files containing the "-" character

Author
10 Apr 2006 8:31 PM
stephane8
I'm able to open and read my .csv file. But when the character "-" is
in one of my fields (ex : 18802-002) I get a system.DBNull value ! Even
if I edit the file with notepad and put "18802-002" I still get a
DBNull value... I'm not able to figure this one out, any help would be
appreciated.

Here's my code :

Dim ConnectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data
Source=" & sPath & ";Extended Properties=""Text;HDR=NO;FMT=Delimited"""

Dim Conn As New System.Data.OleDb.OleDbConnection(ConnectionString)
        Conn.Open()

        Dim da As New System.Data.OleDb.OleDbDataAdapter("SELECT * FROM
" & sFile, Conn)
        Dim ds As New DataSet("TextFiles")

        da.Fill(ds, "XYZ")

        Dim dt As DataTable = ds.Tables("XYZ")

        For Each sRow As DataRow In dt.Rows
               ....
        Next

Author
11 Apr 2006 1:59 AM
tommaso.gastaldi
Hi stephane,

I have tried the same, by using the datareader and I get correct data
with
"18802-002" (no Null).

Could you post the text (a few records and 2-3 fields) so I can make a
trial on
the actual data which gives you problems?

-tom

stepha***@gmail.com ha scritto:

Show quoteHide quote
> I'm able to open and read my .csv file. But when the character "-" is
> in one of my fields (ex : 18802-002) I get a system.DBNull value ! Even
> if I edit the file with notepad and put "18802-002" I still get a
> DBNull value... I'm not able to figure this one out, any help would be
> appreciated.
>
> Here's my code :
>
> Dim ConnectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data
> Source=" & sPath & ";Extended Properties=""Text;HDR=NO;FMT=Delimited"""
>
> Dim Conn As New System.Data.OleDb.OleDbConnection(ConnectionString)
>         Conn.Open()
>
>         Dim da As New System.Data.OleDb.OleDbDataAdapter("SELECT * FROM
> " & sFile, Conn)
>         Dim ds As New DataSet("TextFiles")
>
>         da.Fill(ds, "XYZ")
>
>         Dim dt As DataTable = ds.Tables("XYZ")
>
>         For Each sRow As DataRow In dt.Rows
>                ....
>         Next
Author
11 Apr 2006 2:42 AM
shriop
You can optionally try my csv parsing component that I sell and be done
with the hassles that come out of using the jet driver, with a
significant speed boost, http://www.csvreader.com .

Bruce Dunwiddie

tommaso.gasta***@uniroma1.it wrote:
Show quoteHide quote
> Hi stephane,
>
> I have tried the same, by using the datareader and I get correct data
> with
> "18802-002" (no Null).
>
> Could you post the text (a few records and 2-3 fields) so I can make a
> trial on
> the actual data which gives you problems?
>
> -tom
>
> stepha***@gmail.com ha scritto:
>
> > I'm able to open and read my .csv file. But when the character "-" is
> > in one of my fields (ex : 18802-002) I get a system.DBNull value ! Even
> > if I edit the file with notepad and put "18802-002" I still get a
> > DBNull value... I'm not able to figure this one out, any help would be
> > appreciated.
> >
> > Here's my code :
> >
> > Dim ConnectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data
> > Source=" & sPath & ";Extended Properties=""Text;HDR=NO;FMT=Delimited"""
> >
> > Dim Conn As New System.Data.OleDb.OleDbConnection(ConnectionString)
> >         Conn.Open()
> >
> >         Dim da As New System.Data.OleDb.OleDbDataAdapter("SELECT * FROM
> > " & sFile, Conn)
> >         Dim ds As New DataSet("TextFiles")
> >
> >         da.Fill(ds, "XYZ")
> >
> >         Dim dt As DataTable = ds.Tables("XYZ")
> >
> >         For Each sRow As DataRow In dt.Rows
> >                ....
> >         Next
Author
11 Apr 2006 3:09 AM
tommaso.gastaldi
I also tried your code (OleDbDataAdapter + FILL) using:

"34";"4";"6"
"18802-002"; "5"; "7"
"xx"; "5"; "5"

works perfectly fine to me (no Null)   (?).

-t

stepha***@gmail.com ha scritto:

Show quoteHide quote
> I'm able to open and read my .csv file. But when the character "-" is
> in one of my fields (ex : 18802-002) I get a system.DBNull value ! Even
> if I edit the file with notepad and put "18802-002" I still get a
> DBNull value... I'm not able to figure this one out, any help would be
> appreciated.
>
> Here's my code :
>
> Dim ConnectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data
> Source=" & sPath & ";Extended Properties=""Text;HDR=NO;FMT=Delimited"""
>
> Dim Conn As New System.Data.OleDb.OleDbConnection(ConnectionString)
>         Conn.Open()
>
>         Dim da As New System.Data.OleDb.OleDbDataAdapter("SELECT * FROM
> " & sFile, Conn)
>         Dim ds As New DataSet("TextFiles")
>
>         da.Fill(ds, "XYZ")
>
>         Dim dt As DataTable = ds.Tables("XYZ")
>
>         For Each sRow As DataRow In dt.Rows
>                ....
>         Next
Author
11 Apr 2006 1:23 PM
stephane8
Seems like my problem is in my .csv file. I've made some tests with
different values and found out that it was not related to the "-"
character.

This in my test file :
"1",yo,"3"
"4",5,"6"
"4564-564",564,"rger"

I get a DBNull value for 'yo', but if I change my file for :
"1",yo,"3"
"4","5","6"
"4564-564",564,"rger"

Now everything is fine ! Even if i didn't modified 'yo' who was
actually the wrong one.

What i'm not understanding is why it wouldn't work if I put quotes only
to "yo", then again I get a DBNull value
"1","yo","3"
"4",5,"6"
"4564-564",564,"rger"


Here's my real situation :

I have to import a big file (about 20 000 records) who is generated by
an other system. Here's a part of the file :

95912,"""HRD, Rubber feet, NLX chassis""                   "
95913,"""HRD, NLX Rail Guide""                             "
95914,"""HRD, 9400 front panel Clip""                      "
95957,"""Dis, 8510 Colour 10.4  800x600 250 NIT""          "
95958,"""ASY, 8510 HDD 4.0GB""                             "
95959,"""Asy, 8510 grounding Cables""                      "
96375,Power Supply Low  Temp
97063,"""Assy, Hard Disk Drive 6.4 Gb""                    "
97172, Connector Rentention  menchanism
97182,"""Assy, Intel Juneau Motherboard (NEW)""            "
97221,"""Assy,Intel PII 500 MHZ Processor""                "
18802-002            ,8045 Firmware (Ext temp)

16352-001           ,Assy Cable 7015 7 pin din to 9 pin Dsub
16365-001           ,Bottom Plastic 7015/7020
16849-001           ,"""25' antenna cable,RG213/U N-M to UHF-M (for 902
S"
17366-003           ,Power Cable Std Temp
17366-004           ,Power Cable Low Temp
17393-001           ,Modem 9600 Baud
17393-002           ,Modem 4800 Baud Conf.coat
17393-003           ,Modem9600 Baud Conf.coat
17483-003           ,Modem 9600 Baud for 8030/9011
17693-001           ,8025 Main logic bd

Everything goes fine until the 18802-002 line. Even if I put "18802-002
           ","8045 Firmware (Ext temp)" it doesn't work.

I'm not used to play with .csv files, if anybody could suggest me
something it would be really appreciated, thanx
Author
11 Apr 2006 6:47 PM
tommaso.gastaldi
Try this:

-Make a text file called schema.ini

-Place it in the same directory of your delimited file, say
"NameOfYourFile.txt" (or.csv), with data

-Place the following text into schema.ini

[NameOfYourFile.txt]
ColNameHeader=False
Format=Delimited(,)
MaxScanRows=0


    where in place of NameOfYourFile.txt you write the name of your
delimited text file

-Run again your program

let me know ...

-tom



PS
here in italy I am force to use (;)
Author
11 Apr 2006 7:52 PM
tommaso.gastaldi
An alternative (for any delimited file)

start REGEDIT clicking non yous start > execute menu

find
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text\MaxScanRows

right click on the entry > change and set 0 in place of the current
value (I have 25)

this will prevent from scanning the row to attempt to determine the
type. So it will use all string type.
(See ... you should also see "Majority Type" that is it would determine
the type based on most of the scanned values.)


Another solution would be setting the value programmatically when
necessary...

-tom
Author
11 Apr 2006 8:08 PM
stephane8
Thanx Tom for your replies, I found an other way to do it. I'm using
the Regex object with a crazy pattern string ! Even if I have to read
each line of the file by myself it's gonna make the job for me.

Here's my code :

Private Shared Function ParseLine(ByVal oneLine As String) As String()
        ' Returns an array containing the values of the comma-separated
fields.

        ' This pattern actually recognizes the correct commas.
        ' The Regex.Split() command later gets text between the commas.
        Dim pattern As String =
",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
        Dim r As System.Text.RegularExpressions.Regex = _
                New System.Text.RegularExpressions.Regex(pattern)

        Return r.Split(oneLine)
    End Function

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As
System.EventArgs) Handles Button1.Click

        Dim op As New OpenFileDialog
        op.ShowDialog()

        Dim sLine As String
        Dim s() As String

        Dim oReader As New IO.StreamReader(op.FileName)

        Do
            sLine = oReader.ReadLine()

            If sLine <> "" Then
                s = ParseLine(sLine)
                MsgBox(s(0))
            End If

        Loop Until sLine = ""

    End Sub

Thanx again for the help !
-Stephane
Author
11 Apr 2006 8:27 PM
tommaso.gastaldi
Ah :) , if you want to do the parsing yourself, that's all another
story.

Watch out, there might be several subtleties... is that a tested
pattern that can be trusted ?

http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm