Unicode in Java
Author |
Message |
Hikaru79
|
|
|
|
|
rizzix
|
Posted: Tue Aug 02, 2005 7:43 pm Post subject: (No subject) |
|
|
May I have that file, the one you're trying to read? Attach it, please.
|
|
|
|
|
|
Hikaru79
|
Posted: Tue Aug 02, 2005 8:22 pm Post subject: (No subject) |
|
|
The one I'm trying to read? Sure. Here it is.
Thanks in advance
EDIT: Again, this is with UNIX line breaks, so opening it in Notepad will make it look strange.
Attachment: sample.conf.txt (71 bytes)
Description: Configuration file with Unicode.
|
|
|
|
|
|
rizzix
|
Posted: Tue Aug 02, 2005 10:12 pm Post subject: (No subject) |
|
|
OK, I tried opening your file in a UTF-8 editor. Same results. Your file is not encoded as UTF-8.
Try UTF-16; it should work. Oh, and don't mix ASCII and UTF-16. Stick to one format.
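To illustrate why the same bytes look fine under one encoding and broken under another, here is a minimal sketch. It assumes Java 7 or later (for `StandardCharsets`); the class name, method names, and the sample string are made up for illustration and are not taken from the attached file.

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Encode text as UTF-16. Java's "UTF-16" charset writes a big-endian
    // byte-order mark (FE FF) in front of the encoded text.
    public static byte[] encodeUtf16(String text) {
        return text.getBytes(StandardCharsets.UTF_16);
    }

    // Decode with the wrong assumption: treat the bytes as UTF-8.
    public static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decode with the matching charset; the byte-order mark is consumed.
    public static String decodeAsUtf16(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // Hypothetical config line: "name=" followed by two CJK characters,
        // written with \u escapes so the source file itself stays ASCII.
        String original = "name=\u8A2D\u5B9A";
        byte[] bytes = encodeUtf16(original);

        System.out.println(original.equals(decodeAsUtf16(bytes))); // true
        System.out.println(original.equals(decodeAsUtf8(bytes)));  // false
    }
}
```

The point is that the reader has to be told which encoding the file uses; guessing wrong does not raise an error here, it just produces different text.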
|
|
|
|
|
|
Hikaru79
|
Posted: Wed Aug 03, 2005 5:52 am Post subject: (No subject) |
|
|
Okay, hmm... I'll try that.
rizzix wrote: Oh, and don't mix ASCII and UTF-16. Stick to one format.
Isn't ASCII a subset of UTF-16?
|
|
|
|
|
|
rizzix
|
Posted: Wed Aug 03, 2005 10:43 am Post subject: (No subject) |
|
|
No, it's a subset of UTF-8.
|
|
|
|
|
|
Hikaru79
|
Posted: Wed Aug 03, 2005 5:28 pm Post subject: (No subject) |
|
|
rizzix wrote: No, it's a subset of UTF-8.
And isn't UTF-8 a subset of UTF-16? Man, I'm confused, since I've never even bothered dealing with internationalization until now. And this time it has to happen right =/
Let me re-ask the question. I'm trying to achieve a model whereby the program can handle input and output in any of the following four languages and scripts: English, Chinese, Japanese (kana and kanji), and Korean. Will this be ridiculously difficult? If not, how can it be done? If so, where can I go to find out how it can be done? ^_^;
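One way to approach this, sketched under assumptions: pick a single encoding (UTF-8 here) and name it explicitly on both the writing and the reading side. The example assumes Java 8 or later for `java.nio.file`; the class name, the `roundTrip` helper, and the sample strings are hypothetical and only there to show the pattern.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MultilingualRoundTrip {
    // Sample greetings, written with \u escapes so the source stays ASCII.
    public static final List<String> SAMPLES = Arrays.asList(
        "Hello",                                        // English
        "\u4F60\u597D",                                 // Chinese
        "\u3053\u3093\u306B\u3061\u306F \u6F22\u5B57",  // Japanese kana and kanji
        "\uC548\uB155\uD558\uC138\uC694");              // Korean

    // Write the lines to a temporary file as UTF-8, read them back as UTF-8,
    // and return what was read. The temporary file is deleted afterwards.
    public static List<String> roundTrip(List<String> lines) {
        try {
            Path file = Files.createTempFile("i18n-demo", ".txt");
            try {
                try (BufferedWriter out =
                         Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                    for (String line : lines) {
                        out.write(line);
                        out.newLine();
                    }
                }
                List<String> result = new ArrayList<>();
                try (BufferedReader in =
                         Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        result.add(line);
                    }
                }
                return result;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Compare in memory rather than printing the text, since the
        // console's own encoding may not be able to display it.
        System.out.println(SAMPLES.equals(roundTrip(SAMPLES)));
    }
}
```

Inside the program, a Java `String` is already Unicode, so the encoding only matters at the boundaries where text is converted to or from bytes.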
|
|
|
|
|
|
rizzix
|
Posted: Wed Aug 03, 2005 7:14 pm Post subject: (No subject) |
|
|
UTF-16 is built from 2-byte units: most characters take one unit, and characters outside the Basic Multilingual Plane take two (4 bytes). UTF-8 is a variable-width format that uses 1 to 4 bytes per character. ASCII is 1 byte per character, so all ASCII characters can be represented in UTF-8, and UTF-8 ensures that they keep their same old ASCII codes. The rest of the characters are encoded as multi-byte sequences.
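A quick way to see these widths is to count the encoded bytes. This is a small sketch assuming Java 7 or later; the class and method names are invented for the example. `UTF_16BE` is used instead of `UTF_16` so that no byte-order mark is added to the count.

```java
import java.nio.charset.StandardCharsets;

public class EncodingWidths {
    // Number of bytes the string occupies when encoded as UTF-8.
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as big-endian UTF-16 without a BOM.
    public static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",            // U+0041, ASCII letter
            "\u00E9",       // U+00E9, Latin small e with acute
            "\u65E5",       // U+65E5, a CJK ideograph
            "\uD83D\uDE00"  // U+1F600, outside the BMP (surrogate pair)
        };
        for (String s : samples) {
            System.out.println("U+" + Integer.toHexString(s.codePointAt(0)).toUpperCase()
                + ": UTF-8 = " + utf8Length(s) + " bytes, UTF-16 = " + utf16Length(s) + " bytes");
        }
        // The UTF-8 byte for 'A' is the same as its ASCII byte.
        System.out.println("A".getBytes(StandardCharsets.UTF_8)[0]
            == "A".getBytes(StandardCharsets.US_ASCII)[0]); // true
    }
}
```

For the four samples, the UTF-8 lengths are 1, 2, 3, and 4 bytes, and the UTF-16 lengths are 2, 2, 2, and 4 bytes.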
Java has great internationalization support, so it should be easy. Since internationalization is not a critical issue for the common developer, you rarely see good tutorials on it. Companies like IBM, Apple, and Microsoft do have tutorials, but they usually require you to register first, and some of them are not free.
I would suggest you take a look at these articles hosted by Sun:
http://java.sun.com/developer/technicalArticles/Intl/index.html
And then there's this: http://java.sun.com/docs/books/tutorial/i18n/index.html
Hopefully they are of some use to you.
|
|
|
|
|
|
Hikaru79
|
Posted: Wed Aug 03, 2005 8:51 pm Post subject: (No subject) |
|
|
As always, rizzix, you have been of great help! ^__^ I looked through those and they look mighty helpful. They're being sent off to the printer now. Thanks!
|
|
|
|
|
|
Hikaru79
|
Posted: Sat Aug 27, 2005 10:26 am Post subject: (No subject) |
|
|
A-ha! Finally, success! Thanks, rizzix, problem solved!
|
|
|
|
|
|
rizzix
|
Posted: Sat Aug 27, 2005 1:03 pm Post subject: (No subject) |
|
|
Cool. Maybe you could write a tutorial to share that knowledge, hmm?
|
|
|
|
|
|
Hikaru79
|
Posted: Sat Aug 27, 2005 10:34 pm Post subject: (No subject) |
|
|
rizzix wrote: Cool. Maybe you could write a tutorial to share that knowledge, hmm?
Deal. I'll get to it tonight, hopefully.
|
|
|
|
|
|
|
|