
ISO-8859-1 encoding of an xml string

Author
13 Jul 2006 8:00 PM
Christina
Hey Guys,

Currently, I am using the below code:

Dim oReqDoc As XmlDocument
Dim requiredBytes As Byte()
requiredBytes = System.Text.Encoding.UTF8.GetBytes(oReqDoc.InnerXml)

Here, I am encoding my xml string in UTF8 format.
I want to convert my code to encode the xml string in 'ISO-8859-1'
format.

Any help is appreciated..

Thx,
Chris

Author
13 Jul 2006 8:55 PM
Branco Medeiros
Christina wrote:
<snip>
> Dim oReqDoc as XmlDocument
> Dim requiredBytes As Byte()
> requiredBytes =
> System.Text.UTF8Encoding.UTF8.GetBytes(oReqDoc.InnerXml).
>
> Here, I am encoding my xml string in UTF8 format.
> I want to convert my code to encode the xml string in 'ISO-8859-1'
> format.
<snip>

Use the GetEncoding method of the Encoding class to look up a specific
encoding by name.

> Dim oReqDoc as XmlDocument
> Dim requiredBytes As Byte()

  Dim E As System.Text.Encoding
  E = System.Text.Encoding.GetEncoding("ISO-8859-1")
  requiredBytes = E.GetBytes(oReqDoc.InnerXml)
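
One thing to watch with the snippet above: if the document carries an XML declaration, its encoding attribute should name the same encoding as the bytes. Here is a minimal sketch of that, assuming the declaration (if present) is the document's first child. The helper name ToLatin1Bytes and the sample XML are made up for illustration. Characters that Latin 1 cannot represent would come out as "?".

```vb
Imports System
Imports System.Text
Imports System.Xml

Module Latin1Bytes
    ' Hypothetical helper: returns the document's markup as ISO-8859-1 bytes,
    ' keeping the XML declaration (if any) in step with the byte encoding.
    Function ToLatin1Bytes(ByVal doc As XmlDocument) As Byte()
        Dim latin1 As Encoding = Encoding.GetEncoding("ISO-8859-1")
        Dim decl As XmlDeclaration = TryCast(doc.FirstChild, XmlDeclaration)
        If decl IsNot Nothing Then
            ' Make the declaration name the encoding we are about to use.
            decl.Encoding = latin1.WebName
        End If
        Return latin1.GetBytes(doc.InnerXml)
    End Function

    Sub Main()
        ' Sample document (illustrative only); ChrW(233) is "e" with an acute accent.
        Dim doc As New XmlDocument()
        doc.LoadXml("<?xml version=""1.0"" encoding=""utf-8""?><name>Jos" & ChrW(233) & "</name>")
        Dim bytes As Byte() = ToLatin1Bytes(doc)
        ' Decode again only to show what was produced.
        Console.WriteLine(Encoding.GetEncoding("ISO-8859-1").GetString(bytes))
    End Sub
End Module
```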


HTH,

Branco.
Author
13 Jul 2006 11:06 PM
Christina
Thanks Branco!!
That was helpful. I still have another question.

I made the changes. Now, when I debug the application and check at the
location where I posted the XML, Request.ContentEncoding has the value
System.Text.UTF8Encoding.

1) Is it because of the web.config's default:
<globalization requestEncoding="utf-8" responseEncoding="utf-8" />

2) How can I check that the message is ISO-8859-1 encoded ?

Thanks again..
Chris

Author
14 Jul 2006 2:34 AM
Branco Medeiros
Christina wrote:
<snip>
> I made the changes. Now, when I debug the application and check at
> location where I posted the xml, Request.contentEncoding  has value
> System.Text.UTF8Encoding.
>
> 1) Is it because of the web.configs default :
>  <globalization requestEncoding="utf-8" responseEncoding="utf-8" />

Uh, I guess you answered your question...
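
To add a bit more: as far as I know, ASP.NET first looks at the charset in the request's Content-Type header and only falls back to requestEncoding when none is sent, so it may be enough to declare the charset when posting. A rough sketch, assuming the XML is posted with HttpWebRequest (the thread does not show how it is posted); PostLatin1 and targetUrl are hypothetical names.

```vb
Imports System.IO
Imports System.Net
Imports System.Text

Module PostXml
    ' Hypothetical helper; targetUrl is a placeholder supplied by the caller.
    Sub PostLatin1(ByVal targetUrl As String, ByVal xml As String)
        Dim latin1 As Encoding = Encoding.GetEncoding("ISO-8859-1")
        Dim body As Byte() = latin1.GetBytes(xml)

        Dim req As HttpWebRequest = CType(WebRequest.Create(targetUrl), HttpWebRequest)
        req.Method = "POST"
        ' Declare the charset so the receiver does not fall back to its default.
        req.ContentType = "text/xml; charset=" & latin1.WebName
        req.ContentLength = body.Length

        ' Send the encoded bytes as the request body.
        Using s As Stream = req.GetRequestStream()
            s.Write(body, 0, body.Length)
        End Using
        req.GetResponse().Close()
    End Sub
End Module
```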

> 2) How can I check that the message is ISO-8859-1 encoded ?

Apart from visually inspecting the contents of the generated file in a
binary editor (!!!), I have no idea. ISO-8859-1 has no preamble bytes
(byte order mark), unlike UTF-8 through UTF-32, so there's no 'mark' you
can look for, other than the difference in how certain chars are
converted... for instance, UTF-8 will turn any char with a code above 127
into a sequence of two or more bytes, while Latin 1 will convert such a
char (up to code 255) to a single byte.

Therefore Text.Encoding.UTF8.GetByteCount(ChrW(128)) will return 2,
while Text.Encoding.GetEncoding("iso-8859-1").GetByteCount(ChrW(128))
will return 1... =P
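
A small sketch of that kind of check, in case it helps: it dumps the bytes in hex and also tries a strict UTF-8 decode, which rejects a lone Latin 1 byte above 127. IsValidUtf8 and the sample text are made up for illustration, and this is only a heuristic, since pure ASCII text produces identical bytes in both encodings.

```vb
Imports System
Imports System.Text

Module EncodingCheck
    ' Returns True when the bytes decode as strict UTF-8.
    Function IsValidUtf8(ByVal data As Byte()) As Boolean
        ' Second argument True: throw on invalid byte sequences.
        Dim strictUtf8 As New UTF8Encoding(False, True)
        Try
            strictUtf8.GetString(data)
            Return True
        Catch ex As ArgumentException
            Return False
        End Try
    End Function

    Sub Main()
        ' "caf" followed by e-acute (U+00E9).
        Dim sample As String = "caf" & ChrW(233)
        Dim latin1Bytes As Byte() = Encoding.GetEncoding("ISO-8859-1").GetBytes(sample)
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(sample)

        Console.WriteLine(BitConverter.ToString(latin1Bytes)) ' 63-61-66-E9
        Console.WriteLine(BitConverter.ToString(utf8Bytes))   ' 63-61-66-C3-A9
        Console.WriteLine(IsValidUtf8(latin1Bytes))           ' False
        Console.WriteLine(IsValidUtf8(utf8Bytes))             ' True
    End Sub
End Module
```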

HTH...

Regards,

Branco.
Author
14 Jul 2006 9:01 AM
Michel Posseth [MCP]
Write your XML with an encoding-aware StringWriter:


Imports System.Text
Imports System.IO
Namespace Text
    ''' <summary>
    ''' Implements a StringWriter that lets you choose the encoding yourself.
    ''' The data is stored in a StringBuilder.
    ''' </summary>
    Public Class EncodedStringWriter
        Inherits StringWriter
        ' Backing field for the read-only Encoding property
        Private _Encoding As Encoding
        ''' <summary>
        ''' Initializes a new instance of the <see cref="T:EncodedStringWriter" /> class.
        ''' </summary>
        ''' <param name="sb">The StringBuilder to write to.</param>
        ''' <param name="Encoding">The encoding.</param>
        Public Sub New(ByVal sb As StringBuilder, ByVal Encoding As Encoding)
            MyBase.New(sb)
            _Encoding = Encoding
        End Sub
        ''' <summary>
        ''' Getter for the string encoding.
        ''' </summary>
        ''' <returns>The encoding in which the output is written.</returns>
        ''' <filterpriority>1</filterpriority>
        Public Overrides ReadOnly Property Encoding() As Encoding
            Get
                Return _Encoding
            End Get
        End Property
    End Class
End Namespace

USAGE:

        Mvar_sb = New StringBuilder(500)
        ' Choose your encoding here
        Mvar_XMLDoc = New XmlTextWriter(New Text.EncodedStringWriter(Mvar_sb, Encoding.UTF8))

        ' Now write the XML
        Mvar_XMLDoc.Formatting = Formatting.Indented
        Mvar_XMLDoc.Indentation = 2
        Mvar_XMLDoc.WriteStartDocument()
        ' ... omitted ...

        ' To get the contents of the XML document
        Mvar_sb.ToString()

As you can see, the XML declaration now names the requested encoding.
Note that the StringBuilder still holds an ordinary .NET string, so to get
bytes you still call GetBytes with that same encoding.
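
For the ISO-8859-1 case asked about in this thread, the whole thing could look roughly like this. It is a sketch only: the nested class is a compact stand-in for the EncodedStringWriter above, and the element name and value are made up.

```vb
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module WriterDemo
    ' Compact stand-in for the EncodedStringWriter class shown earlier.
    Class EncodedStringWriter
        Inherits StringWriter
        Private ReadOnly _encoding As Encoding
        Public Sub New(ByVal sb As StringBuilder, ByVal enc As Encoding)
            MyBase.New(sb)
            _encoding = enc
        End Sub
        Public Overrides ReadOnly Property Encoding() As Encoding
            Get
                Return _encoding
            End Get
        End Property
    End Class

    Sub Main()
        Dim latin1 As Encoding = Encoding.GetEncoding("ISO-8859-1")
        Dim sb As New StringBuilder()
        Dim writer As New XmlTextWriter(New EncodedStringWriter(sb, latin1))
        writer.WriteStartDocument()
        ' Illustrative content; ChrW(233) is "e" with an acute accent.
        writer.WriteElementString("name", "Jos" & ChrW(233))
        writer.WriteEndDocument()
        writer.Close()

        ' The string's declaration is taken from the writer's Encoding property.
        Console.WriteLine(sb.ToString())

        ' Turning the string into bytes is still a separate step.
        Dim bytes As Byte() = latin1.GetBytes(sb.ToString())
        Console.WriteLine(bytes.Length)
    End Sub
End Module
```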


regards

Michel Posseth [MCP]