Re: mapping of XML names into programming language from Felix Sasaki on 2006-01-31 (public-i18n-core@w3.org from January to March 2006)

From: Felix Sasaki <fsasaki@w3.org>
Date: Tue, 31 Jan 2006 14:53:42 +0900
To: paul.downey@bt.com, public-i18n-core@w3.org
Cc: public-xsd-databinding@w3.org
Message-ID: <op.s38ansgax1753t@ibm-60d333fc0ec.mag.keio.ac.jp>

Dear Paul,

Thank you for your mail. I am not yet sure if I understand your problem.  
Do you want to be able to map something like
<nihon>... (written with Japanese characters 日本) into
<nihon> ... (written with Latin characters only)?
This is the mapping from Kanji to Romaji that you mention below.
Unfortunately, this works only with a lexicon and on a per-language basis.
It is also not reversible, e.g. "nihon" can be mapped back to 日本 or 二本 or
others. I guess this kind of mapping is not what you want for your
tasks.
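
To illustrate why a lexicon-based mapping cannot be reversed, here is a
minimal sketch in Python. The two-entry lexicon and the function names are
assumptions made up for this example; a real Kanji-to-Romaji conversion
would need a full dictionary and language-specific reading rules.

```python
# Toy lexicon (hypothetical, for illustration only).
KANJI_TO_ROMAJI = {
    "日本": "nihon",  # "Japan"
    "二本": "nihon",  # "two (long, thin objects)"
}


def to_romaji(name: str) -> str:
    """Look up the Romaji reading of a whole name in the toy lexicon."""
    return KANJI_TO_ROMAJI[name]


def from_romaji(romaji: str) -> list[str]:
    """Return every Kanji spelling in the lexicon that reads as `romaji`."""
    return sorted(k for k, v in KANJI_TO_ROMAJI.items() if v == romaji)


print(to_romaji("日本"))      # the forward direction has one answer
print(from_romaji("nihon"))  # the reverse direction has more than one
```

Because both entries share one reading, the reverse lookup returns two
candidates and cannot tell which one was the original name.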

Currently, the names in XML Schema are defined at
http://www.w3.org/TR/1999/REC-xml-names-19990114/#NT-NCName as
NCName   ::=  (Letter | '_') (NCNameChar)*

I guess what you need is a mapping of "Letter" and "NCNameChar" to a
subset of these character ranges that fits programming language
requirements. Is that right? Then the next question would be whether you
need, in your scenario, to go back to the original XML name. If the
answer is "yes", you will have the same ambiguity as with the mapping from
"nihon" back to "二本" (or "日本").
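
The following Python sketch shows how that ambiguity can arise with a
simple character-subset mapping. It assumes, purely for illustration, that
the target language allows only ASCII letters, digits and "_" in
identifiers; the function name to_ascii_identifier is hypothetical and not
taken from any specification or tool.

```python
def to_ascii_identifier(xml_name: str) -> str:
    """Lossy mapping of an XML name to an ASCII-only identifier.

    Assumption: the target allows only ASCII letters, digits and '_'.
    Every other character is replaced by '_'.
    """
    out = "".join(
        ch if ch.isascii() and (ch.isalnum() or ch == "_") else "_"
        for ch in xml_name
    )
    # Many languages do not allow an identifier to start with a digit.
    if not out or out[0].isdigit():
        out = "_" + out
    return out


# Distinct XML names that end up as the same identifier:
print(to_ascii_identifier("日本"), to_ascii_identifier("二本"))
print(to_ascii_identifier("a-b"), to_ascii_identifier("a.b"))
```

Since several XML names map to the same identifier, the original name
cannot be recovered from the identifier alone, and two names in one schema
may clash after mapping.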

If you could give more details on your requirements, the i18n core
working group and I will take a closer look at possible solutions.

Regards, Felix.

On Mon, 30 Jan 2006 22:05:57 +0900, <paul.downey@bt.com> wrote:

>
> Dear i18n-core,
>
> The XML Schema Patterns for Databinding WG has an issue
> concerning how to advise implementers of binding
> tools on representing XML Schema 1.0 names, such as elements,
> types and enumerated type values, in the typically more
> constrained world of databases and programming languages.
>
> One way forward is to simply warn product developers to
> expect to have to provide a manual step to handle the
> mapping of characters invalid in their processing environment
> and to avoid any possible symbol clashes.
>
> However, several members of the WG felt sure that there may
> already be some approaches for mapping characters, such as
> from Kanji to Roman which could also be referenced.
>
> We therefore wondered if there was any advice or pointers to
> existing advice the i18n WG could offer us in this area?
>
> Regards,
> Paul
> --
> Chair
> XML Schema Patterns for Databinding WG
> http://www.w3.org/2002/ws/databinding/
>

Received on Tuesday, 31 January 2006 05:53:50 UTC