Re: ISSUE-10: Mappings - proposed text from Pete Cordell on 2006-09-07 (public-xsd-databinding@w3.org from September 2006)

From: Pete Cordell <petexmldev@tech-know-ware.com>
Date: Thu, 7 Sep 2006 09:48:04 +0100
To: <jon.calladine@bt.com>, <paul.downey@bt.com>, <public-xsd-databinding@w3.org>
Message-ID: <036601c6d25a$bd362650$5000a8c0@RW>

I have some sympathy for Jon's POV.  I think in standards, things of MUST 
strength are usually fairly obvious as to why they are what they are. 
Things of SHOULD strength often are less obvious.  There's been some 
discussion in the VoIP parts of the IETF saying that if you make something 
SHOULD strength then you need some discussion about when you should do 
something and when you shouldn't.  As the spec being developed here is 
mostly SHOULD strength, then I think some debate would be helpful.

I also think that some rationale in standards would not only be helpful for 
implementers, but for the standards writers themselves.  I've seen a number 
of times standards writers looking back to older versions (or dependent 
standards) and saying "Why did we do it like that?"  It's even worse if 
there's churn of the people writing the standard.  I suppose it's a bit like 
commenting your code!

I suppose such rationale could be in an informative appendix rather than 
in-line.

Anyway, enough mumbling from me for one morning!

Pete.
--
=============================================
Pete Cordell
Tech-Know-Ware Ltd
                         for XML to C++ data binding visit
                         http://www.tech-know-ware.com/lmx
                         (or http://www.xml2cpp.com)
=============================================

----- Original Message ----- 
From: <jon.calladine@bt.com>
To: <paul.downey@bt.com>; <petexmldev@tech-know-ware.com>; 
<public-xsd-databinding@w3.org>
Sent: Wednesday, September 06, 2006 5:44 PM
Subject: RE: ISSUE-10: Mappings - proposed text


Paul, you are right of course, as a formal spec such verbiage is out of
place. As I have said on the calls though I am aware/concerned that many
readers of the spec won't 'get' what we are doing or what the reason is
behind such decisions.

When this document goes beyond last call I think there will be a need
for an annotated version of the spec or similar summarising the thinking
in the issues list for some of the more contentious policies we have.

JonC

-----Original Message-----
From: Downey,P,Paul,XSB2 R
Sent: 06 September 2006 14:14
To: Pete Cordell; Calladine,J,Jon,XSE6 R; public-xsd-databinding@w3.org
Subject: RE: ISSUE-10: Mappings - proposed text

Thanks Jon for working on this, I know it hasn't been easy!

The length of the text worries me. I'd be happier with concrete
assertions in our spec than risk drifting into justifying our
decisions and opening up holes by hand-waving.

This is mostly my fault for asking us NOT to rule out characters useful for
internationalisation, but after our discussions and Pete's useful links
I'm now of the opinion that mapping to strange escape sequences or
requiring the developer to assert a Schema/WSDL specific manual mapping
step doesn't make for a "good user experience". That's our criterion for
marking a Schema as "Basic".

However tools may be able to do better with internationalisation, so I
think we should make any valid ncname an Advanced pattern.

Proposal:

"""
//xs:schema//@name values which conform to the following pattern are
"Basic":

  identifier  ::=  (letter|"_") (letter | digit | "_"){1,30}
  letter ::= ("a".."z") | ("A".."Z")
  digit ::= "0".."9"

Any other @name is marked as "Advanced".
"""

Note: case is significant, and identifiers may be at most 31 characters
long, based on the limits in C/C++ and Fortran 90.
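
A minimal sketch of how the proposed pattern could be checked, in Python.
The regular expression is a literal transcription of the grammar above;
the names BASIC_NAME and classify_name are made up for illustration and
are not part of the proposal. Note that, transcribed literally, the
{1,30} repetition means a one-character name such as "x" falls under
"Advanced".

```python
import re

# Literal transcription of the proposed grammar:
#   identifier ::= (letter|"_") (letter | digit | "_"){1,30}
#   letter     ::= ("a".."z") | ("A".."Z")
#   digit      ::= "0".."9"
# BASIC_NAME and classify_name are hypothetical names for this sketch.
BASIC_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{1,30}")


def classify_name(name: str) -> str:
    """Return "Basic" if the whole name fits the pattern, else "Advanced"."""
    return "Basic" if BASIC_NAME.fullmatch(name) else "Advanced"


if __name__ == "__main__":
    for candidate in ["purchaseOrder", "_id", "order-id", "9lives", "x"]:
        print(candidate, classify_name(candidate))
```

fullmatch is used so that the whole @name value has to fit the pattern,
not just a prefix of it.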

We can probably expect i18n comments, but given the position of Basic as
the "state of the art" and i18n in programming languages sucks ..

Paul

-----Original Message-----
From: public-xsd-databinding-request@w3.org on behalf of Pete Cordell
Sent: Wed 9/6/2006 10:00 AM
To: Calladine,J,Jon,XSE6 R; public-xsd-databinding@w3.org
Subject: Re: ISSUE-10: Mappings - proposed text


Hi Jon,

This looks good to me, although in the case of C++ and its ilk, it's
more restrictive than just US ASCII.  Perhaps it ought to be "Where any
character other than alphanumeric US ASCII characters or the
underscore...".

I know this is sounding a bit C/C++-ish, but I think C/C++ is the bottom
of the pile in this respect!  It's OK from a C/C++, Java, C#, Perl[1],
PHP[2], and Python [3] POV.  Don't know about VB.  What other languages
should we care about?

Pete.


[1] http://search.cpan.org/dist/perl/pod/perldata.pod
[2] PHP manual says '[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*'
(http://www.php.net/manual/en/language.variables.php)
[3] http://docs.python.org/ref/identifiers.html
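
To illustrate the difference between the wording suggested above and a
more permissive language, here is a rough Python sketch. It compares the
"alphanumeric US ASCII characters or the underscore" rule with the PHP
pattern quoted in [2]. The function names are invented for this example,
and applying the PHP pattern to the UTF-8 bytes of a name is an
assumption made here, not something stated in the thread.

```python
import re

# Character-set rule from the suggested wording: only ASCII letters,
# digits and the underscore. It checks the character set only and says
# nothing about whether a name may start with a digit.
ASCII_ALNUM_UNDERSCORE = re.compile(r"[A-Za-z0-9_]+")

# PHP pattern as quoted in footnote [2], applied to bytes.
PHP_VARIABLE = re.compile(rb"[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*")


def is_ascii_alnum_underscore(name: str) -> bool:
    """True if every character is an ASCII letter, digit or underscore."""
    return ASCII_ALNUM_UNDERSCORE.fullmatch(name) is not None


def fits_php_pattern(name: str) -> bool:
    """True if the UTF-8 bytes of the name fit the quoted PHP pattern.

    Assumption: the name is encoded as UTF-8 before matching.
    """
    return PHP_VARIABLE.fullmatch(name.encode("utf-8")) is not None


if __name__ == "__main__":
    for candidate in ["order_id", "caf\u00e9", "order-id"]:
        print(candidate,
              is_ascii_alnum_underscore(candidate),
              fits_php_pattern(candidate))
```

A name containing an accented letter fits the PHP pattern under that
encoding assumption but fails the stricter ASCII rule, which is the
"bottom of the pile" effect described above.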

----- Original Message -----
From: <jon.calladine@bt.com>
To: <public-xsd-databinding@w3.org>
Sent: Tuesday, September 05, 2006 3:35 PM
Subject: RE: ISSUE-10: Mappings - proposed text



Here is the text as discussed on the call.



Design Consideration.

The naming of elements and types remains a problematic area for
databinding tools. Since names are the fundamental building blocks of an
XML document, tools *should* be able to support *any* valid XML element
name. This is still not the case, however.

Historically, early versions of tools would not cope with the more
unusual characters available to the schema author, and these tools would
refuse to generate code. All the modern tools we have experience of now
have excellent coverage of XML element names, in so far as they will
generate the necessary serialisation/deserialisation code. This remains
a problem area because mapping valid XML names into programming language
specific environments often results in 'unpalatable' translations.

In many tools (but not all) it is possible to manually map the names to
something that is more acceptable to the developers but it must be
emphasised this is a manual step and will be very much dependent on the
specific programming language being used.

We have stopped short of giving language specific guidelines in this
basic patterns document because our aim is to provide generic guidance
to the schema author on what will work well. Our approach in this area
is as follows.

Where any character other than US ASCII is used in a schema document the
basic patterns validation rules will generate a *single* Information
message as follows:

Information: Element names in the schema have been constructed using
characters that will not map directly into all programming language
character sets for variables. The use of these element names will not
prevent databinding tools from generating mappings for these names but
the mapped names may not be 'meaningful' to the developer or may require
a manual reconfiguration of the code to make it so. For ultimate
interoperability use only the US ASCII character set.
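
The "*single* Information message" behaviour could be sketched roughly
as follows in Python. This is only an illustration: the function name
and the idea of passing in a list of names are assumptions, and the
message is shortened to its first sentence.

```python
# First sentence of the proposed Information message, shortened for the sketch.
INFO_MESSAGE = (
    "Information: Element names in the schema have been constructed using "
    "characters that will not map directly into all programming language "
    "character sets for variables."
)


def information_messages(names):
    """Return at most one Information message for a schema's names.

    Hypothetical helper: 'names' is assumed to be an iterable of the
    element and type names found in the schema document.
    """
    # One message for the whole schema, however many names are affected.
    if any(not name.isascii() for name in names):
        return [INFO_MESSAGE]
    return []


if __name__ == "__main__":
    print(information_messages(["order", "caf\u00e9", "stra\u00dfe"]))
    print(information_messages(["order", "item_1"]))
```

Reporting once per schema rather than once per name keeps the output
short even when many names use non-ASCII characters.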
Received on Thursday, 7 September 2006 08:51:20 UTC