
Read text-segment from binary file

Author
10 Jun 2006 8:43 PM
Klaus Jensen
Hi

I have some binary files (jpeg), which contain a lot of image-data - and
some embedded XML (XMP actually).

If I view the file in a hex-editor, there is a lot of binary data - and then
in the middle of everything:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
[etc etc]

I need to load this file in some kind of reader, search for the <?xml as
start and </plist> as the end of the xml, and return this string for further
processing.

How do I do that?

I tried just loading it into a streamreader to check out what happened, but
I only got garbage from the reader...

Any help will be greatly appreciated :)

Regards

- Klaus

Author
10 Jun 2006 10:05 PM
Göran Andersson
You have to read the file as binary data. Then you can extract the part
of it that is the xml file and decode that to a string.

Author
10 Jun 2006 11:09 PM
Klaus Jensen
"Göran Andersson" <gu***@guffa.com> wrote in message
news:O2ECFoNjGHA.4748@TK2MSFTNGP04.phx.gbl...
> You have to read the file as binary data. Then you can extract the part of
> it that is the xml file and decode that to a string.

Could you give an example, please? Working with the binary-objects is new to
me.
Author
11 Jun 2006 4:17 PM
Michel Posseth [MCP]
You can just use a StreamReader and loop until you reach the XML. From
that point, read all the data into a string until you reach the end of the
XML.

' Reads the file line by line as text and prints each line.
Using sr As New System.IO.StreamReader("C:\afile.bin")
    Do Until sr.EndOfStream
        Debug.WriteLine(sr.ReadLine())
    Loop
End Using

Note that ReadLine returns a String object.

hth

Michel Posseth [MCP]

Author
12 Jun 2006 5:32 PM
Göran Andersson
You can't read a binary file using a StreamReader.
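
A small sketch of why this goes wrong, assuming the reader decodes the file as UTF-8 (the default for StreamReader): arbitrary binary bytes are not valid UTF-8, so they are replaced during decoding and cannot be recovered from the resulting string. The module and function names below are made up for illustration.

```vbnet
Imports System
Imports System.Text

Module DecodeRoundTrip

    ' Returns True if the bytes survive a UTF-8 decode/encode round trip unchanged.
    Function SurvivesUtf8RoundTrip(ByVal data As Byte()) As Boolean
        Dim decoded As String = Encoding.UTF8.GetString(data)
        Dim back As Byte() = Encoding.UTF8.GetBytes(decoded)
        If back.Length <> data.Length Then Return False
        For i As Integer = 0 To data.Length - 1
            If back(i) <> data(i) Then Return False
        Next
        Return True
    End Function

    Sub Main()
        ' FF D8 is the JPEG start-of-image marker, FF E1 an APP1 marker.
        ' None of these bytes form valid UTF-8 sequences.
        Dim jpegStart As Byte() = {&HFF, &HD8, &HFF, &HE1}
        ' Plain ASCII text is valid UTF-8 and decodes cleanly.
        Dim plainText As Byte() = Encoding.UTF8.GetBytes("<?xml")

        Console.WriteLine(SurvivesUtf8RoundTrip(jpegStart))  ' False
        Console.WriteLine(SurvivesUtf8RoundTrip(plainText))  ' True
    End Sub

End Module
```

This is why the text portion has to be located in the raw bytes first and only that portion decoded.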

Author
12 Jun 2006 4:50 PM
Göran Andersson
Klaus Jensen wrote:
> "Göran Andersson" <gu***@guffa.com> wrote in message
> news:O2ECFoNjGHA.4748@TK2MSFTNGP04.phx.gbl...
>> You have to read the file as binary data. Then you can extract the part of
>> it that is the xml file and decode that to a string.
>
> Could you give an example, please? Working with the binary-objects is new to
> me.
>

If the file is not too large, you could read it all into a byte array.
Then locate the data in the array and create a MemoryStream from that
section of the array, and use a StreamReader to read it.
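
A minimal sketch of that approach, assuming the file fits comfortably in memory and that the XML starts at the first "<?xml" and ends at the first "</plist>" after it, as described in the original question. The module name, the helper names, and the command-line handling are illustrative, not part of any library.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module EmbeddedXmlExtractor

    ' Returns the index of the first occurrence of pattern in data at or after
    ' startIndex, or -1 if it is not found. Simple byte-by-byte scan.
    Function IndexOfBytes(ByVal data As Byte(), ByVal pattern As Byte(), ByVal startIndex As Integer) As Integer
        For i As Integer = startIndex To data.Length - pattern.Length
            Dim matched As Boolean = True
            For j As Integer = 0 To pattern.Length - 1
                If data(i + j) <> pattern(j) Then
                    matched = False
                    Exit For
                End If
            Next
            If matched Then Return i
        Next
        Return -1
    End Function

    ' Extracts the bytes from "<?xml" through "</plist>" and decodes them as UTF-8.
    ' Returns Nothing if either marker is missing.
    Function ExtractEmbeddedXml(ByVal data As Byte()) As String
        Dim startTag As Byte() = Encoding.ASCII.GetBytes("<?xml")
        Dim endTag As Byte() = Encoding.ASCII.GetBytes("</plist>")

        Dim first As Integer = IndexOfBytes(data, startTag, 0)
        If first < 0 Then Return Nothing

        Dim last As Integer = IndexOfBytes(data, endTag, first)
        If last < 0 Then Return Nothing

        Dim count As Integer = last + endTag.Length - first

        ' Wrap only the XML section of the array and read it as text.
        Using ms As New MemoryStream(data, first, count)
            Using reader As New StreamReader(ms, Encoding.UTF8)
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main(ByVal args As String())
        If args.Length = 0 Then
            Console.WriteLine("Usage: EmbeddedXmlExtractor <path-to-file>")
            Return
        End If

        Dim data As Byte() = File.ReadAllBytes(args(0))
        Dim xml As String = ExtractEmbeddedXml(data)
        If xml Is Nothing Then
            Console.WriteLine("No embedded XML found.")
        Else
            Console.WriteLine(xml)
        End If
    End Sub

End Module
```

The search runs on bytes rather than on a decoded string, so the surrounding image data never passes through a text decoder. For very large files a streaming search would be needed instead of File.ReadAllBytes.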
Author
11 Jun 2006 11:01 PM
GhostInAK
Hello Klaus,

I strongly recommend you read the document at:  http://partners.adobe.com/public/developer/en/xmp/sdk/xmpspecification.pdf

Small Excerpt from page 93:
JPEG
In JPEG files, an APP1 marker designates the location of the XMP Packet.
The following table shows the entry format.

Byte Offset : Field value : Field name : Length(bytes) : Comments
0 : 0xFFE1 : APP1 : 2 : APP1 marker.
2 : 2 + length of namespace (29) + length of XMP Packet : Lp : 2 : Size in bytes of this count plus the following two portions.
4 : Null-terminated ASCII string without quotation marks. : namespace : 29 : XMP namespace URI, used as unique ID: http://ns.adobe.com/xap/1.0/
33 : < XMP Packet > : : Must be encoded as UTF-8.

The header plus the following data must be less than 64 KB bytes. The XMP
Packet cannot be split across the multiple markers, so the size of the XMP
Packet must be at most 65502 bytes.
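
The layout in the excerpt can be read with a short segment walk. The following is only a sketch under some assumptions: the file starts with the JPEG start-of-image marker (FF D8), every segment before the image data has the marker-plus-length form shown in the table, and scanning stops at the start-of-scan marker (FF DA). The module and function names are made up for illustration. The thread does not confirm that the plist XML from the original question is stored in this segment, so the function may well return Nothing for those files.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module XmpPacketReader

    ' Namespace URI from the excerpt plus its terminating null byte (29 bytes total).
    Private ReadOnly XmpHeader As Byte() = Encoding.ASCII.GetBytes("http://ns.adobe.com/xap/1.0/" & ChrW(0))

    ' Returns True if data contains header starting at offset.
    Private Function StartsWithAt(ByVal data As Byte(), ByVal offset As Integer, ByVal header As Byte()) As Boolean
        If offset + header.Length > data.Length Then Return False
        For i As Integer = 0 To header.Length - 1
            If data(offset + i) <> header(i) Then Return False
        Next
        Return True
    End Function

    ' Walks the JPEG marker segments and returns the XMP packet as a string,
    ' or Nothing if no APP1 segment with the XMP namespace is found.
    Function ReadXmpPacket(ByVal data As Byte()) As String
        ' A JPEG file starts with the SOI marker FF D8.
        If data.Length < 4 OrElse data(0) <> &HFF OrElse data(1) <> &HD8 Then Return Nothing

        Dim pos As Integer = 2
        Do While pos + 4 <= data.Length
            If data(pos) <> &HFF Then Return Nothing ' not at a marker; give up
            Dim marker As Byte = data(pos + 1)
            If marker = &HDA Then Return Nothing ' SOS: image data follows, stop

            ' Lp is big-endian and counts itself plus the rest of the segment.
            Dim lp As Integer = (CInt(data(pos + 2)) << 8) Or CInt(data(pos + 3))
            If lp < 2 OrElse pos + 2 + lp > data.Length Then Return Nothing

            If marker = &HE1 AndAlso StartsWithAt(data, pos + 4, XmpHeader) Then
                Dim packetStart As Integer = pos + 4 + XmpHeader.Length
                Dim packetLength As Integer = lp - 2 - XmpHeader.Length
                If packetLength < 0 Then Return Nothing
                ' The packet must be UTF-8 according to the excerpt.
                Return Encoding.UTF8.GetString(data, packetStart, packetLength)
            End If

            ' Skip the 2-byte marker and the Lp bytes to reach the next segment.
            pos += 2 + lp
        Loop
        Return Nothing
    End Function

    Sub Main(ByVal args As String())
        If args.Length = 0 Then
            Console.WriteLine("Usage: XmpPacketReader <path-to-jpeg>")
            Return
        End If

        Dim packet As String = ReadXmpPacket(File.ReadAllBytes(args(0)))
        If packet Is Nothing Then
            Console.WriteLine("No XMP packet found.")
        Else
            Console.WriteLine(packet)
        End If
    End Sub

End Module
```

Other APP1 segments, such as Exif data, are skipped because their payload does not begin with the XMP namespace string.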



-Boo


