
Display Unicode characters on Winforms

Author
19 Jul 2006 12:01 AM
Bill Nguyen
I'm getting data from a MySQL database (default character set = UTF-8).
I need to display the data as Unicode, but I get only mongolian characters like
this: Ph&#7841;m Th&#7883; Ng&#7885;c

I changed the textbox font to Arial Unicode MS, but it still doesn't work.

Do I need to convert the data stored in the MySQL database before displaying it?
Thanks

Bill

Author
19 Jul 2006 12:31 AM
Herfried K. Wagner [MVP]
"Bill Nguyen" <billn_nospam_please@jaco.com> wrote:
> I'm getting data from a mySQL database (default char set = UTF-8).
> I need to display data in Unicode but got only mongolian characters like
> this: Ph&#7841;m Th&#7883; Ng&#7885;c
>
> I changed the textbox font to Arial Unicode MS but still not working.
>
> Do I need conversion of data stored in mySQL database before displaying?

Windows Forms controls do not convert numeric character references like
'&#7841;' to the corresponding characters.  You may want to replace each
"&#<number>;" with the value of 'ChrW(<number>)', or simply not store the
characters in the database in that encoded form.
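
As an alternative to replacing each reference by hand, the framework's own HTML decoder can be tried. This is a minimal sketch, assuming .NET Framework 4.0 or later, where `System.Net.WebUtility.HtmlDecode` is available (earlier versions offer `System.Web.HttpUtility.HtmlDecode` in System.Web.dll). The module name `HtmlDecodeSketch` and the control name `TextBox1` are hypothetical.

```vbnet
Imports System.Net

Module HtmlDecodeSketch
    Sub Main()
        ' Sample value from the thread, as it comes back from the database.
        Dim stored As String = "Ph&#7841;m Th&#7883; Ng&#7885;c"

        ' HtmlDecode turns numeric character references (and named
        ' entities such as &amp;) into the characters they stand for.
        Dim decoded As String = WebUtility.HtmlDecode(stored)

        ' In a form this would be assigned to the control instead,
        ' e.g. TextBox1.Text = decoded (TextBox1 is a hypothetical name).
        Console.WriteLine(decoded)
    End Sub
End Module
```

The console may not render the Vietnamese characters with its default code page; a Windows Forms TextBox with a suitable font should.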

--
M S   Herfried K. Wagner
M V P  <URL:http://dotnet.mvps.org/>
V B   <URL:http://classicvb.org/petition/>
Author
19 Jul 2006 5:05 AM
Bill nguyen
Herfried;
I did not encode the data. It must be part of the ISP's procedure.
The text is displayed correctly in browsers, both IE and Firefox.
It's going to be a big task to convert each <number> with ChrW because
they are mixed in with ordinary characters all over the text.

Bill



Author
19 Jul 2006 12:48 PM
Bill nguyen
Herfried;

I don't know if this will work, but I need help to try it.
Here's a sample of the text string:

"Nghiên C&#7913;u - Phê Bình"

I need to read each byte in the text string, then use ChrW to convert it to
Unicode.

I tried ChrW(AscW(textString)), but it only converts the first letter.

Is there a function to read all the bytes in the text string in one pass?
Thanks

Bill



Author
19 Jul 2006 11:36 PM
Jay B. Harlow [MVP - Outlook]
Bill,
You could use a RegEx to convert the char escape codes to chars.

You could implement what Herfried suggested with something like:

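        ' Requires: Imports System.Text.RegularExpressions
        ' These statements go inside the calling procedure.
        Const input As String = "Nghiên C&#7913;u - Phê Bình"

        ' Matches a numeric character reference such as "&#7913;" and
        ' captures its digits (any count, not just four) in group 1.
        Const pattern As String = "&#(\d+);"
        Static parser As New Regex(pattern, RegexOptions.Compiled)
        Dim output As String = parser.Replace(input, AddressOf MatchEvaluator)

    ' Called once per match; returns the character for the captured code.
    Private Function MatchEvaluator(ByVal input As Match) As String
        Dim value As String = input.Groups(1).Value
        Return ChrW(CInt(value))
    End Function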


Does the 7913 represent a 4-digit decimal or hexadecimal number? You may
need to change the call to CInt accordingly...
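
On the decimal-versus-hexadecimal question: in HTML, a reference written as "&#7913;" is decimal, while the hexadecimal form carries an "x" prefix, as in "&#x1EE9;" (both denote U+1EE9). The following is a sketch only, handling both forms in one pass; the module and function names (`NumericReferenceSketch`, `DecodeReferences`, `DecodeMatch`) are made up for illustration.

```vbnet
Imports System.Globalization
Imports System.Text.RegularExpressions

Module NumericReferenceSketch
    ' Group "hex" captures the digits of &#x1EE9; and group "dec"
    ' captures the digits of &#7913;.
    Private ReadOnly Parser As New Regex(
        "&#(?:[xX](?<hex>[0-9A-Fa-f]+)|(?<dec>[0-9]+));")

    Function DecodeReferences(ByVal source As String) As String
        Return Parser.Replace(source, AddressOf DecodeMatch)
    End Function

    Private Function DecodeMatch(ByVal m As Match) As String
        Dim codePoint As Integer
        If m.Groups("hex").Success Then
            codePoint = Integer.Parse(m.Groups("hex").Value,
                                      NumberStyles.HexNumber,
                                      CultureInfo.InvariantCulture)
        Else
            codePoint = Integer.Parse(m.Groups("dec").Value,
                                      CultureInfo.InvariantCulture)
        End If
        ' ConvertFromUtf32 also covers code points above U+FFFF,
        ' which ChrW cannot return as a single Char.
        Return Char.ConvertFromUtf32(codePoint)
    End Function

    Sub Main()
        ' Both spellings should decode to the same string.
        Console.WriteLine(DecodeReferences("Nghiên C&#7913;u") =
                          DecodeReferences("Nghiên C&#x1EE9;u"))
    End Sub
End Module
```

Note that this sketch does not guard against malformed input: `Integer.Parse` throws on values too large for an Integer, and `Char.ConvertFromUtf32` throws on numbers that are not valid Unicode code points.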

--
Hope this helps
Jay B. Harlow [MVP - Outlook]
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net


Author
21 Jul 2006 3:41 PM
Bill Nguyen
Jay;

If you look at the string again, you'll see that it's not only the 4-digit groups
that need to be translated but other characters as well (those in
square brackets below):

Nghi[ê]n C&#7913;u - Ph[ê] B[ì]nh

I'm using phpWebsite and a MySQL database from an ISP (IpowerWeb.com).
The input text is Unicode when a webpage is created/updated.
The text string above is what gets stored in the MySQL table instead.
I guess I have to convert the text back to Unicode to view/edit it, then put it
back. MySQL probably converts the text to the above format by itself.

Any suggestion on how to accomplish this?

Thanks again

Bill


Author
21 Jul 2006 11:02 PM
Jay B. Harlow [MVP - Outlook]
Bill,
I would extend the pattern to match the square brackets as well, then
modify the MatchEvaluator function to handle either the first escape
sequence or the second escape sequence, depending on which one matched...
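
A rough sketch of that idea follows. It assumes the square brackets are literally present in the stored text, which the thread does not confirm (they may only be markers added for the post), and it assumes the bracketed content should be kept as-is with the brackets dropped. The names `TwoSequenceSketch`, `Evaluate`, `num` and `lit` are hypothetical.

```vbnet
Imports System.Text.RegularExpressions

Module TwoSequenceSketch
    ' First alternative: decimal numeric reference, e.g. &#7913;
    ' Second alternative: text wrapped in square brackets, e.g. [ê]
    Private ReadOnly Parser As New Regex(
        "&#(?<num>[0-9]+);|\[(?<lit>[^\]]*)\]")

    ' Decides per match which of the two sequences was found.
    Private Function Evaluate(ByVal m As Match) As String
        If m.Groups("num").Success Then
            ' ChrW accepts values up to 65535 only.
            Return ChrW(Integer.Parse(m.Groups("num").Value))
        End If
        ' Bracketed text: drop the brackets, keep the content unchanged.
        Return m.Groups("lit").Value
    End Function

    Sub Main()
        Dim result As String =
            Parser.Replace("Nghi[ê]n C&#7913;u", AddressOf Evaluate)
        ' Compare against the expected string built with ChrW.
        Console.WriteLine(result = "Nghiên C" & ChrW(7913) & "u")
    End Sub
End Module
```

If the bracketed characters need a real conversion rather than unwrapping, only the second branch of `Evaluate` would have to change.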



--
Hope this helps
Jay B. Harlow [MVP - Outlook]
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net

