W3C home > Mailing lists > Public > public-evangelist@w3.org > October 2002

Re: [WD]: Mandarin language website

From: steph <sniffles@unadorned.org>
Date: Tue, 8 Oct 2002 07:23:45 +1000
To: public-evangelist@w3.org
Message-ID: <20021007212345.GB6903@bund.com.au>

On Mon, Oct 07, 2002 at 06:13:53AM -0400, SanJo wrote:
> I understand that XML is perfect for this although I don't know if it's
> possible to convert XML to HTML or if XHTML has an equivalent for it.

XML -> XHTML can be done with XSLT.

I wrote a private reply to Mike on this subject, I think what
he's asking is somewhat OT for this list, but I'm forwarding it
here for completeness' sake, or if anyone else can give Mike
a hand.

If anything, something like this raises an issue about
language encoding standards such as Unicode vs "old" 8-bit
standards that have been in existence for many years.

Here it is:

From: steph <sniffles@unadorned.org>
To: mike@maupuia.com
Subject: Re: Fwd: [WD]: Mandarin language website
Lines: 53

Hey Mike :)

I saw the forwarded version of your email to webdesign-L, which
I'm no longer on ...

I don't know how much help I'd really be ... :) but given that
I do know Mandarin and did some related stuff for my Honours
thesis some years ago now, maybe I can help a bit.

> Forward---------debut
> Date: Mon, 7 Oct 2002 17:26:03 +1300
> From: Michael Brown <mike@maupuia.com>
> To: list@webdesign-l.com
> Subject: [WD]: Mandarin language website
> We've been asked to give some advice on the issues involved with
> having an English language website translated into Mandarin. This of
> course is not something we've done before! :)
> We're not so much concerned with the translation (as that will be done
> by the client) but with the issues involved in marking-up and
> displaying the HTML pages.

I'm pretty sure you're aware of the two main encodings for
Mandarin: Big-5 for Traditional Character set, and GB for
the Simplified Character set. Unicode contains both sets
of characters for Mandarin/Chinese. I am not aware of an
editor that outputs stuff in Unicode, but it's not hard
to represent Unicode chars in HTML (pretty sure you know that

There has been editors around for ages for word processing (I don't 
know what encoding Chinese Star uses, but it was a very popular 
software), and it's more than likely that these word processing
packages output HTML, I guess.

I will almost wager that the current Mandarin websites out
there use either Big-5 or GB, which require extra plug-ins. I 
believe you can get a Unicode plug-in for Windows and it is 
not too much of a hassle either.

Both Big-5 and GB are 8-bit, Unicode is 16-bit. :)
(You probably know that too!)

But anyway, hope that helps a little bit. 

Received on Monday, 7 October 2002 17:24:06 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 6 January 2015 20:16:17 UTC