XML name to C-like identifier mapping (Was: Re: ISSUE-5: Open Enumerated Type)

----- Original Message From: "Ed Day" <edday@obj-sys.com>

> Another issue related to enumerations that I would like to see the group
> consider is how to map enumerations that contain non alpha-numeric
> characters.  For example, I have seen schemas that contain enumerations of
> things like #, @, $, etc. in various combinations.  The typical response
> by many binding tools is to replace these characters with underscores (_)
> which leads to some very cryptic and confusing names.  If the group could
> come up with some standard way of dealing with this situation, it would be
> great.

Related to Ed's message, the character space for XML names is much richer 
than that of many programming language identifiers.  The disparity is not 
too bad for languages like Java, but there are still various punctuation 
characters (such as '.' and '-') that are allowed in XML names but not in 
Java names.  The mismatch for C/C++ is even bigger.  JAXB offers rules for 
mapping XML names to Java, so maybe that is not a concern, but to my 
knowledge nothing formal exists for C/C++.

So the questions for C/C++ and other languages with similar limitations 
become:  is this an issue?  Should it be left to vendors to sort out? 
Should a mapping procedure be specified that ends up with only valid C/C++ 
characters?  Should developers be advised that for maximum portability the 
character set used for XML names should be limited to the C/C++ set?
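
To make the "mapping procedure" option concrete, here is a minimal sketch of 
one possible scheme.  It is not taken from JAXB or from any existing binding 
tool.  The function name to_c_identifier, the _xHH escape format and the "v" 
prefix are all assumptions made up for illustration.  ASCII letters pass 
through, as do digits after the first position, and every other byte 
(including '_' itself) becomes _x followed by its two-digit hex code, so 
the punctuation is still recoverable from the name rather than collapsing 
into bare underscores.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of any standard or tool): maps an XML name
// or enumeration value, treated as raw UTF-8 bytes, to a C/C++ identifier.
std::string to_c_identifier(const std::string& xml_name)
{
    std::string out;
    for (std::size_t i = 0; i < xml_name.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(xml_name[i]);
        const bool is_letter =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
        const bool is_digit = (c >= '0' && c <= '9');

        if (is_letter || (is_digit && i > 0)) {
            // Valid at this position in a C identifier: copy unchanged.
            out += static_cast<char>(c);
        } else {
            // Everything else, including '_' and a leading digit, is
            // written as _xHH so no raw underscore appears in the output.
            char buf[8];
            std::snprintf(buf, sizeof buf, "_x%02X",
                          static_cast<unsigned>(c));
            out += buf;
        }
    }
    // Assumed convention: avoid an empty name or a leading underscore,
    // since such names are reserved at global scope in C and C++.
    if (out.empty() || out[0] == '_') {
        out.insert(0, "v");
    }
    return out;
}
```

With this sketch, "my-name.v2" becomes my_x2Dname_x2Ev2 and the enumeration 
value "#" becomes v_x23.  It has known gaps: the "v" prefix can collide 
with a name that really starts with "v" (both "#" and "v#" give v_x23), 
non-ASCII characters are escaped one UTF-8 byte at a time, and there is no 
check against C/C++ keywords.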

I'm not sure what I think at this stage!

Pete
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
                         for XML to C++ data binding visit
                         http://www.tech-know-ware.com/lmx
                         (or http://www.xml2cpp.com)
=============================================

Received on Thursday, 12 January 2006 09:58:22 UTC