Outlook MSG file reading

Hello everyone,

I am attempting to extract some header information from typical Microsoft Outlook MSG files in VB.NET. I am not after a complete message or attachments that may be enclosed. I am particularly interested in the Message ID field. I have examined MSG files in Notepad and hex editors. I can see that the Internet Headers are present. I can search for Message-ID and locate it without any problems in Notepad. The only display issue I have seen so far is that each letter is separated by hex character 00. Thus the Message-ID string actually appears as M e s s a g e - I D.

I don't want to use Outlook automation. I have found it to be cumbersome and slow. I also don't want to be reliant on an installation of Office.

Since the file is binary, I have attempted to use the System.IO.StreamFile object to read the file. However, I have not been able to successfully walk through the file and obtain any readable text. I have played around with various encodings, such as ASCII and Unicode. I think that MSG files are BASE64/MIME encoded though. Perhaps that could be part of my trouble.

I have downloaded several example applications that mimic Notepad. However, none of them have been able to read the encoding of MSG files. I have gained a new level of appreciation for Notepad :). I wonder what it is that Notepad uses to detect the file encoding and display it in such a readable way.

Does anyone have any experience with reading Outlook data? Again, I am not after pretty formatting, I just want to extract certain text fragments from these binary files. Can someone point me in the right direction? I would think that I just need to be able to read a byte stream from the file with the correct encoding and convert it to ASCII text. I have been totally unsuccessful so far.

Thanks,
Dmitry

"Dmitry Akselrod" <dmitry@nospam.com> wrote in message news:hKKdnRpyzo83ud3Z4p2dnA@comcast.com...
> Does anyone have any experience with reading Outlook data? Again, I am
> not after pretty formatting, I just want to extract certain text fragments
> from these binary files. Can someone point me in the right direction? I
> would think that I just need to be able to read a byte stream from the file
> with the correct encoding and convert it to ASCII text. I have been
> totally unsuccessful so far.

Outlook can be automated, just like Word, Excel etc. It's a bit cranky, but I have done it. Have you tried adding a reference to it?

Hi,
That's the whole thing: I don't want to automate Outlook. It's very clunky. I need to be able to process millions of MSG files, and Office products (e.g. Access) suck with that many files. Thank you though.

dmitry

"Homer J Simpson" <nob***@nowhere.com> wrote in message news:QkW%f.89413$%H.47856@clgrps13...
>
> "Dmitry Akselrod" <dmitry@nospam.com> wrote in message
> news:hKKdnRpyzo83ud3Z4p2dnA@comcast.com...
>
>> Does anyone have any experience with reading Outlook data? Again, I am
>> not after pretty formatting, I just want to extract certain text
>> fragments from these binary files. Can someone point me in the right
>> direction? I would think that I just need to be able to read a byte stream
>> from the file with the correct encoding and convert it to ASCII text. I
>> have been totally unsuccessful so far.
>
> Outlook can be automated, just like Word, Excel etc. It's a bit cranky,
> but I have done it. Have you tried adding a reference to it?

"Dmitry Akselrod" <dmitry@nospam.com> wrote in message news:27OdnVKqzY5GpN3ZnZ2dnUVZ_vqdnZ2d@comcast.com...
> That's the whole thing: I don't want to automate Outlook. It's very
> clunky. I need to be able to process millions of MSG files, and Office
> products (e.g. Access) suck with that many files.

In that case I'd start searching for third party tools. I assume that MSFT aren't offering to divulge the details of the format.

No, MS is definitely not documenting their MSG format. I did find this
article:

http://www.msusenet.com/archive/topic.php/t-288764.html

A gentleman named Eduardo A. Morcillo has developed some .NET classes that wrap the Office OLE storage. They are pretty good so far. The classes are here:

http://www.mvps.org/emorcillo/en/code/grl/storage.shtml

I have been able to take a couple of MSG files and obtain a list of streams (properties) and their values. However, I am still missing the Internet Headers. They must lie somewhere else in the file. All of this is quite annoying, thanks to Microsoft.

The only known working API I have seen so far (used by many forensic applications) is from Fookes Software. These guys are great and their tools are phenomenal, but the API is a little outside my price range.

Being able to obtain the Sender, Recipient, Subject, etc. is definitely a plus, but I need the Message ID. I guess it's back to more research.

Dmitry

Basically, the MSG file format is a series of binary streams.

"Homer J Simpson" <nob***@nowhere.com> wrote in message news:UsX%f.89569$%H.59346@clgrps13...
>
> "Dmitry Akselrod" <dmitry@nospam.com> wrote in message
> news:27OdnVKqzY5GpN3ZnZ2dnUVZ_vqdnZ2d@comcast.com...
>
>> That's the whole thing: I don't want to automate Outlook. It's
>> very clunky. I need to be able to process millions of MSG files, and
>> Office products (e.g. Access) suck with that many files.
>
> In that case I'd start searching for third party tools. I assume that MSFT
> aren't offering to divulge the details of the format.

Actually, never mind on the Internet Headers, they are there. They happen
to be in the stream __substg1.0_007D001F. I just had some issues with data formatting and conversion.

I think that my problem is solved, thanks to Mr. Morcillo.

dmitry

"Dmitry Akselrod" <dmitry@nospam.com> wrote in message news:ScWdnZVeMu_8zd3ZnZ2dnUVZ_u2dnZ2d@comcast.com...
> No, MS is definitely not documenting their MSG format. I did find this
> article:
>
> http://www.msusenet.com/archive/topic.php/t-288764.html
>
> A gentleman named Eduardo A. Morcillo has developed some .NET classes
> that wrap the Office OLE storage. They are pretty good so far. The
> classes are here:
>
> http://www.mvps.org/emorcillo/en/code/grl/storage.shtml
>
> I have been able to take a couple of MSG files and obtain a list of
> streams (properties) and their values. However, I am still missing the
> Internet Headers. They must lie somewhere else in the file. All of this
> is quite annoying, thanks to Microsoft.
>
> The only known working API I have seen so far (used by many forensic
> applications) is from Fookes Software. These guys are great and their
> tools are phenomenal, but the API is a little outside my price range.
>
> Being able to obtain the Sender, Recipient, Subject, etc. is definitely a
> plus, but I need the Message ID. I guess it's back to more research.
>
> Dmitry
>
> Basically, the MSG file format is a series of binary streams.
>
> "Homer J Simpson" <nob***@nowhere.com> wrote in message
> news:UsX%f.89569$%H.59346@clgrps13...
>>
>> "Dmitry Akselrod" <dmitry@nospam.com> wrote in message
>> news:27OdnVKqzY5GpN3ZnZ2dnUVZ_vqdnZ2d@comcast.com...
>>
>>> That's the whole thing: I don't want to automate Outlook. It's
>>> very clunky. I need to be able to process millions of MSG files, and
>>> Office products (e.g. Access) suck with that many files.
>>
>> In that case I'd start searching for third party tools. I assume that
>> MSFT aren't offering to divulge the details of the format.

Check out the Redemption COM object:
http://www.dimastr.com/redemption/

Dmitry Akselrod wrote:
> Hello everyone,
>
> I am attempting to extract some header information from typical Microsoft
> Outlook MSG files in VB.NET. I am not after a complete message or
> attachments that may be enclosed. I am particularly interested in the
> Message ID field. I have examined MSG files in Notepad and hex editors.
> I can see that the Internet Headers are present. I can search for
> Message-ID and locate it without any problems in Notepad. The only
> display issue I have seen so far is that each letter is separated by hex
> character 00. Thus the Message-ID string actually appears as
> M e s s a g e - I D.
>
> I don't want to use Outlook automation. I have found it to be cumbersome
> and slow. I also don't want to be reliant on an installation of Office.
>
> Since the file is binary, I have attempted to use the System.IO.StreamFile
> object to read the file. However, I have not been able to successfully
> walk through the file and obtain any readable text. I have played around
> with various encodings, such as ASCII and Unicode. I think that MSG files
> are BASE64/MIME encoded though. Perhaps that could be part of my trouble.
>
> I have downloaded several example applications that mimic Notepad.
> However, none of them have been able to read the encoding of MSG files.
> I have gained a new level of appreciation for Notepad :). I wonder what
> it is that Notepad uses to detect the file encoding and display it in
> such a readable way.
>
> Does anyone have any experience with reading Outlook data? Again, I am
> not after pretty formatting, I just want to extract certain text fragments
> from these binary files. Can someone point me in the right direction? I
> would think that I just need to be able to read a byte stream from the
> file with the correct encoding and convert it to ASCII text. I have been
> totally unsuccessful so far.
>
> Thanks,
> Dmitry
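
For anyone landing on this thread later: the "00 between every letter" symptom described in the first post is what UTF-16LE text looks like in a hex editor, and the 001F suffix on the stream name __substg1.0_007D001F marks a Unicode string property. Below is a minimal sketch of the decoding step only, written in Python for illustration rather than the VB.NET used in the thread (in .NET the matching decoder would be System.Text.Encoding.Unicode). It assumes the raw bytes of that stream have already been pulled out of the file by some OLE storage reader, such as the classes mentioned above. The function name and the sample headers are hypothetical, not taken from the thread.

```python
import re
from typing import Optional

# Hypothetical helper: `raw` is assumed to be the full byte content of the
# __substg1.0_007D001F stream, already extracted by an OLE storage reader.
def message_id_from_header_stream(raw: bytes) -> Optional[str]:
    # The stream holds UTF-16LE text, which is why every ASCII letter is
    # followed by a 0x00 byte when viewed in a hex editor.
    text = raw.decode("utf-16-le", errors="replace")

    # Drop any trailing NUL terminator left at the end of the stream.
    text = text.rstrip("\x00")

    # Find the Message-ID header line; header names are case-insensitive.
    match = re.search(
        r"^Message-ID:\s*(.+?)\s*$",
        text,
        re.IGNORECASE | re.MULTILINE,
    )
    return match.group(1) if match else None


if __name__ == "__main__":
    # Made-up sample headers, encoded the way the stream stores them.
    sample = (
        "Received: from mail.example.com\r\n"
        "Message-ID: <abc123@example.com>\r\n"
        "Subject: test\r\n"
    ).encode("utf-16-le") + b"\x00\x00"

    # Searching the raw bytes only works with the UTF-16LE form of the name.
    print("Message-ID".encode("utf-16-le") in sample)
    print(message_id_from_header_stream(sample))
```

Decoding the whole stream once and then searching the text tends to be simpler than walking the bytes and skipping the 00s by hand, and it keeps non-ASCII characters in the headers intact.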