- From: Felix Sasaki <fsasaki@w3.org>
- Date: Wed, 01 Feb 2006 11:50:01 +0900
- To: paul.downey@bt.com, public-i18n-core@w3.org
- Cc: public-xsd-databinding@w3.org
Hi Paul, all,

We discussed this problem at the i18n core call yesterday. The group saw
the same problems with romanization that Martin and I had mentioned
before. Francois Yergeau proposed using an XML-like escaping mechanism
(see http://www.w3.org/TR/REC-xml/#dt-charref ):

    [66] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'

I guess you cannot use '&#' as the opening delimiter in your usage
scenario, but if you define something else as a "marker" for the
beginning of a character reference, that would be fine, I think. This
solution would of course be reversible. Would that solve your problems?

Regards, Felix.

On Tue, 31 Jan 2006 15:41:36 +0900, <paul.downey@bt.com> wrote:

>
> Hi Felix,
>
>> I am not yet sure if I understand your problem.
>> Do you want to be able to map something like
>> <nihon>... (written with Japanese characters ??) into
>> <nihon> ... (written with latin characters only)?
>
>> This is the mapping from Kanji to Romaji you are mentioning below.
>> Unfortunately this works only with a lexicon and on a per language
>> basis.
>
> OK, understood.
>
>> It is also not reversible, e.g. "nihon" can be mapped to ?? or ??? or
>> others.
>
> understood. (curse my web mail, btw)
>
>> This kind of mapping is something I guess you don't want for your
>> tasks.
>
>
>> Currently, the names in XML Schema are defined at
>> http://www.w3.org/TR/1999/REC-xml-names-19990114/#NT-NCName as
>> NCName ::= (Letter | '_') (NCNameChar)*
>
>> I guess what you need is a mapping of "Letter" and "NCNameChar" to a
>> subset of these character ranges, which fits programming language
>> requirements. Is that right?
>
> Exactly! Please note we're not expecting to find anything
> definitive here, but would welcome hearing about existing
> works in this area we could possibly reference.
>
>> Then the next question would be if you have
>> the need in your scenario to go back to the original XML name. If the
>> answer is "yes", you will have the same ambiguity as with the mapping
>> from "nihon" to "??" (or "??").
>
> That might not be required, since a databinding could hold a map
> for 'decoding' and resolve clashes by adding a prefix or a suffix,
> nihon1, nihon2, etc.
>
>> If you could give more details on your requirements, me and the i18n
>> core working group will take a closer look at possible solutions.
>
> thanks!
>
> Paul
>
Received on Wednesday, 1 February 2006 02:50:11 UTC
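
A rough sketch of how the escaping idea in the message above might look in practice. The message only says that some marker other than '&#' would be needed; the "_x...;"-style marker chosen here ("_x" + hex code point + "_") and the function names encode_name / decode_name are assumptions made for illustration, not something the thread specifies. To keep the mapping reversible, the sketch also escapes every literal underscore, so an underscore in the output can only ever belong to an escape.

```python
import re

# Hypothetical marker syntax (an assumption, not from the thread):
#   "_x" + uppercase hex code point + "_"
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]+)_")


def encode_name(name: str) -> str:
    """Map an XML name to an ASCII-only identifier, reversibly."""
    out = []
    for i, ch in enumerate(name):
        is_ascii_alnum = ch.isascii() and ch.isalnum()
        # Most programming languages reject a leading digit, so escape it too.
        if is_ascii_alnum and not (i == 0 and ch.isdigit()):
            out.append(ch)
        else:
            # Everything else, including '_', '-', '.' and all non-ASCII
            # characters, becomes a character-reference-like escape.
            out.append(f"_x{ord(ch):X}_")
    return "".join(out)


def decode_name(identifier: str) -> str:
    """Invert encode_name by expanding each escape back to its character."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), identifier)


if __name__ == "__main__":
    for original in ["nihon", "日本", "my_name", "a-b.c"]:
        encoded = encode_name(original)
        print(original, "->", encoded, "->", decode_name(encoded))
```

Unlike romanization, this needs no lexicon and no per-language rules, and two different XML names can never collide, so the prefix/suffix clash handling (nihon1, nihon2) mentioned in the quoted text would not be needed. The cost is readability: names made entirely of non-ASCII characters turn into strings of hex escapes.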