Re: mapping of XML names into programming language

Hi Paul, all,

We discussed this problem at the i18n core call yesterday. The group saw
the same problems with romanization that Martin and I had mentioned
before. Francois Yergeau proposed using an XML-like escaping mechanism
(see ):
[66]    CharRef    ::=    '&#' [0-9]+ ';'
                        | '&#x' [0-9a-fA-F]+ ';'
I guess you cannot use '&#' as the start of an escape in your usage
scenario, but if you define something else as a "marker" for the
beginning of a character reference, that would be fine, I think.
This solution would of course be reversible.
Would that solve your problems?
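
To make the idea concrete, here is a minimal sketch in Python. The
marker syntax "_x<HEX>_" and the function names escape_name and
unescape_name are assumptions made up for this illustration; they are
not taken from any specification. The underscore itself is escaped too,
so the mapping stays reversible.

```python
import re

# Hypothetical marker syntax: "_x" + uppercase hex code point + "_".
_ESCAPE = re.compile(r"_x([0-9A-F]+)_")


def escape_name(name: str) -> str:
    """Map an XML name to an ASCII-only identifier, reversibly."""
    out = []
    for ch in name:
        if ch.isascii() and ch.isalnum():
            # ASCII letters and digits pass through unchanged.
            out.append(ch)
        else:
            # Everything else, including "_" itself, becomes a
            # character reference introduced by the "_x" marker.
            out.append(f"_x{ord(ch):X}_")
    return "".join(out)


def unescape_name(ident: str) -> str:
    """Recover the original XML name from an escaped identifier."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), ident)


# "\u65e5\u672c" is "nihon" written in Kanji.
escaped = escape_name("\u65e5\u672c")
print(escaped)                                    # _x65E5__x672C_
print(unescape_name(escaped) == "\u65e5\u672c")   # True
```

Because a bare underscore never appears in the output except as part of
an escape, decoding is unambiguous. The price is readability: names
that are mostly non-ASCII turn into long runs of hex digits.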

Regards, Felix.

On Tue, 31 Jan 2006 15:41:36 +0900, <> wrote:

> Hi Felix,
>> I am not yet sure if I understand your problem.
>> Do you want to be able to map something like
>> <nihon>... (written with Japanese characters ??) into
>> <nihon> ... (written with Latin characters only)?
>> This is the mapping from Kanji to Romaji you are mentioning below.
>> Unfortunately this works only with a lexicon and on a per-language
>> basis.
> OK, understood.
>> It is also not reversible, e.g. "nihon" can be mapped to ?? or ??? or
>> others.
> understood. (curse my web mail, btw)
>> This kind of mapping is something I guess you don't want for your
>> tasks.
>> Currently, the names in XML Schema are defined at
>> as
>> NCName   ::=  (Letter | '_') (NCNameChar)*
>> I guess what you need is a mapping of "Letter" and "NCNameChar" to a
>> subset of these character ranges, which fits programming language
>> requirements. Is that right?
> Exactly! Please note we're not expecting to find anything
> definitive here, but would welcome hearing about existing
> works in this area we could possibly reference.
>> Then the next question would be if you have
>> the need in your scenario to go back to the original XML name. If the
>> answer is "yes", you will have the same ambiguity as with the mapping
>> from "nihon" to "??" (or "??").
> That might not be required, since a databinding could hold a map
> for 'decoding' and resolve clashes by adding a prefix or a suffix,
> nihon1, nihon2, etc.
>> If you could give more details on your requirements, the i18n core
>> working group and I will take a closer look at possible solutions.
> thanks!
> Paul
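
The alternative Paul describes in the quoted text (a lossy mapping plus
a map held by the databinding for decoding, with clashes resolved by a
numeric suffix) could be sketched roughly as follows. The function name
bind_names and the toy lexicon are hypothetical and only serve the
illustration; a real romanization would need a full per-language
lexicon.

```python
def bind_names(xml_names, romanize):
    """Build a decode map from generated identifiers to XML names.

    `romanize` is any lossy mapping from an XML name to an ASCII
    identifier. Clashes are resolved by appending 1, 2, ...
    """
    decode = {}
    for name in xml_names:
        base = romanize(name)
        candidate, n = base, 0
        while candidate in decode:
            # Identifier already taken: try the next numeric suffix.
            n += 1
            candidate = f"{base}{n}"
        decode[candidate] = name
    return decode


# Toy lexicon: two different Kanji spellings that both read "nihon".
lexicon = {"\u65e5\u672c": "nihon", "\u4e8c\u672c": "nihon"}
decode = bind_names(["\u65e5\u672c", "\u4e8c\u672c"], lexicon.__getitem__)
print(sorted(decode))  # ['nihon', 'nihon1']
```

Here the first name keeps the plain identifier and only later clashes
get a suffix. The generated identifiers depend on the order in which
names are processed, so the decode map has to be kept with the binding.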

Received on Wednesday, 1 February 2006 02:50:11 UTC