[WSUS] Sec 2.1.x Data Integrity --- does Unicode guarantee character encoding interoperability?

This is a comment on Section 2.1.x in:
 http://www.w3.org/TR/2002/WD-ws-i18n-scenarios-20021220/.

I have a concern about a clause in Section 2.1.1.2 "Description"
that reads:  
	using UTF-8 or UTF-16 guarantees character encoding 
	interoperability on the SOAP layer.
and a sentence in Section 2.1.2.2 "Description" that reads:
	 XML Japanese profile document [XML-JP] describes 
	that using non-Unicode encodings such as Shift_JIS 
	cannot provide interoperability in information interchange.

I have quite the opposite view of the role Unicode plays in
the problem described here.  My experience and my understanding
of the interoperability problem is that
	Use of Unicode CAUSES the problem, rather than
	solving it.

Before the introduction of Unicode, Japanese characters
were transmitted in one of three legacy encodings:
Shift_JIS, EUC-JP and ISO-2022-JP.  Because all of them
are defined on the same base Japanese national
code sets, conversion among them was well defined, and
no character loss occurred.
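
To illustrate why such conversion can be lossless, here is a minimal
sketch of a Shift_JIS to EUC-JP converter that works by byte arithmetic
alone, with no Unicode step in between.  The function name is
hypothetical, and the sketch assumes well-formed input limited to
ASCII, JIS X 0201 half-width katakana and JIS X 0208 two-byte
characters; vendor extensions and JIS X 0212 are not handled.

```python
def sjis_to_eucjp(data: bytes) -> bytes:
    """Convert Shift_JIS bytes to EUC-JP arithmetically (no Unicode step)."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                      # single-byte range: copied as-is
            out.append(b)
            i += 1
        elif 0xA1 <= b <= 0xDF:           # JIS X 0201 half-width katakana
            out += bytes([0x8E, b])       # EUC-JP prefixes these with SS2
            i += 1
        else:                             # JIS X 0208 two-byte character
            s1, s2 = b, data[i + 1]
            s1 -= 0x81 if s1 <= 0x9F else 0xC1
            s2 -= 0x40 if s2 <= 0x7E else 0x41
            row = s1 * 2 + s2 // 94       # 0-based JIS X 0208 row
            cell = s2 % 94                # 0-based cell within the row
            out += bytes([0xA1 + row, 0xA1 + cell])
            i += 2
    return bytes(out)


# Row 1, cell 33 of JIS X 0208, as Shift_JIS bytes in and EUC-JP bytes out.
converted = sjis_to_eucjp(b"\x81\x60")
```

Because both encodings are just different byte layouts of the same
row/cell positions, the conversion never needs to decide which
Unicode code point a character "is".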

After the introduction of Unicode, if we use a Unicode-based
encoding as the transmission encoding between two systems
that use the same or different legacy encodings, data
loss occurs because of inconsistencies in the
legacy-to-Unicode mappings between the two systems.
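
One commonly cited instance of such an inconsistency can be sketched
as follows.  The two "systems" are hypothetical, and Python's "cp932"
and "shift_jis" codecs are used here only as stand-ins for two
different legacy-to-Unicode mapping tables; real products may use
other tables.

```python
LEGACY = b"\x81\x60"   # Shift_JIS bytes for JIS X 0208 row 1, cell 33

# System A (hypothetical) uses a Microsoft-style table (Python's "cp932").
from_a = LEGACY.decode("cp932")        # U+FF5E FULLWIDTH TILDE
# System B (hypothetical) uses a JIS-style table (Python's "shift_jis").
from_b = LEGACY.decode("shift_jis")    # U+301C WAVE DASH

# A sends the character to B in UTF-8; the transport itself is lossless.
wire = from_a.encode("utf-8")
received = wire.decode("utf-8")

# B tries to store the text in its legacy encoding using its own table.
try:
    back = received.encode("shift_jis")
    lost = False
except UnicodeEncodeError:
    # B's table has no entry for the code point A chose, so the
    # character is replaced and the original bytes cannot be recovered.
    back = received.encode("shift_jis", errors="replace")
    lost = True
```

Both systems started from the same legacy bytes, yet the character
does not survive the trip through Unicode, because the two tables
disagree on the code point for that one position.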

Unicode improves interoperability between two
encodings that share only some characters, such as
EUC-JP and EUC-KR, from 0% to some degree,
but it also reduces interoperability between legacy
encodings that share the same base code sets,
from 100% to 99.99%, in my opinion.

So I would like to propose removing the clause and
the sentence mentioned above, and adding a note that
warns of this interoperability issue.

----
T. "Kuro" Kurosaka, Internationalization Architect
IONA Technologies, Santa Clara, CA USA / +1 408 350-9684 

Received on Thursday, 2 January 2003 15:01:58 UTC