RE: Comments on XML Part 1 from Japanese experts

Dear Tim and Jon,

As Murata-san mailed below, we had a meeting among W3C Japanese members.
We are really worrying about the handling of 2-Byte character codes.
Murata-san will gather the opitions from Japanese members and
make a proposal. We hope it will be included in the Part 1 Draft.
Regards,

				--- tomy

-----------------------------------
Dr Tomihisa Kamada
Executive Vice-President, R&D
ACCESS CO.,LTD.
Hirata-Bld. 8F, 2-8-16 Sarugaku-cho
Chiyoda-ku, Tokyo, 101 JAPAN
TEL +81-3-5259-3535
FAX +81-3-5259-3536

----------
From: 	Murata Makoto
Sent: 	Saturday, June 7, 1997 6:15 PM
To: 	w3c-sgml-wg@w3.org
Subject: 	Re: Comments on XML Part 1 from Japanese experts

Yesterday, I attended an informal W3C meeting chaired by Prof. Saito. 
I spoke about XML and asked their opinions about a new part for 
Japanese text

We can develop such a new part, although we need more time to find 
the right procedure to do it.  We can mention EUC_JP and Shift-JIS 
in that part.  (Hanakaku kana for names should be disallowed in Part 
1, though.  And the idegraphic space should not be allowed 
as a delimiter.)

After the meeting, I spoke with Internet experts and HTML experts 
, and asked their opinion about the ideographic space character 
and HTTP content labelling.

1) The ideographic space

They unanimously agree with me in that the ideographic 
space character must not be a delimiter.  Period.

A plea to the ERB: please remove the ideographic space from the 
definition of S.

2) HTTP content labelling

Although they (and I) unanimously disagree with Gavin, it is not 
easy to write a complete proposal right now.

I will circulate my proposal to Japanese collegues and ask for 
their comments.  Then, I will submit the conclusion to this 
mailing list.

To give some idea of the proposed change to the ERB, I give a 
rough sketch.

Addition: we might propose to add encoding declarations 
as a part of external entity declarations. (See I18N of HTML.)

Encoding decision

Priority 1:  If BOM or an encoding declaration exists at the beginning 
		of this external entity or document entity, they 
		are used to detect the encoding.
Priority 2:  If this entity is not a document entity but an external 
		entity, then:
	2-1: If an encoding declaration is attached to the external 
		entity declaration that declares this external 
		entity, this encoding declaration is used.
	2-2: The encoding of the external entity 
		or document entity that refers to this external entity is 
		used.
Priority 3:  HTTP content labelling
Priority 4(optional):  any kind of autodetect or user preference list 
			or locale-setting
Priority 5:  UTF-8


P.S.  Gavin, please reply to the group only.

Makoto
 
Fuji Xerox Information Systems
 
Tel: 044-812-7230   Fax: 044-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp

Received on Saturday, 7 June 1997 08:52:02 UTC