- From: Sandy Gao <sandygao@ca.ibm.com>
- Date: Tue, 1 Nov 2005 09:32:05 -0500
- To: "Alessandro Triglia" <sandro@mclink.it>
- Cc: "[Public XML Schema-DEV]" <xmlschema-dev@w3.org>
- Message-ID: <OFA798F42B.1F7FC99E-ON852570AC.004E8A7D-852570AC.004FD8BB@ca.ibm.com>
Sounds like some valid concerns. I would suggest you open a bug [1] against schema spec version 1.0 part 2. Alternatively you can send an email to www-xml-schema-comments@w3.org, where all comments on the schema spec should go. But it'll probably take longer (than Bugzilla) to get the WG's attention.

[1] http://www.w3.org/Bugs/Public/enter_bug.cgi?product=XML%20Schema

Thanks,
Sandy Gao
XML Parser Development, IBM Canada
(1-905) 413-3255
sandygao@ca.ibm.com

"Alessandro Triglia" <sandro@mclink.it>
Sent by: xmlschema-dev-request@w3.org
10/28/2005 01:00 PM

To: "[Public XML Schema-DEV]" <xmlschema-dev@w3.org>
cc:
Subject: Lexical representation of xsd:decimal and xsd:integer

Hi

There is a problematic area in Part 2. Some of the related questions are:

- Is ".5" a valid lexical representation for xsd:decimal?
- Is "" a valid lexical representation for xsd:decimal?
- Is "" a valid lexical representation for xsd:integer?
- Is "0" or "" the canonical lexical representation of the integer value 0?

Part 2 says:

--------------------------------
decimal has a lexical representation consisting of a finite-length sequence of decimal digits (#x30-#x39) separated by a period as a decimal indicator. An optional leading sign is allowed. If the sign is omitted, "+" is assumed. Leading and trailing zeroes are optional. If the fractional part is zero, the period and following zero(es) can be omitted. For example: -1.23, 12678967.543233, +100000.00, 210.
--------------------------------

There are three problematic terms in this paragraph: "finite-length sequence", "separated", and "leading zeros". I am trying to understand what these terms mean by looking in other parts of the document, because they are all ambiguous.

"Finite-length sequence" is used in many other places for the length of: lists, strings, binary octets of hexBinary, binary octets of base64Binary, etc. Obviously, lists, strings, and binary octets must be allowed to have a zero length.
Therefore, at least in these cases (and possibly in all cases), "finite-length" includes zero-length. This is supported by the use of the phrase "finite, non-zero-length" for NMTOKENS, IDREFS, and ENTITIES, and by the addition of "(possibly empty)" after "finite-length" in the definition of list.

So if "finite-length sequence of digits" in xsd:decimal includes zero digits, then all of the following are valid lexical representations for this type: "", "+", ".", "3.", ".3". This has several other implications:

-- "" is a valid lexical representation of the integer value 0
-- "" and "E" are valid lexical representations of the float value 0

Is all of the above intended? Is this the common understanding? If not, there is a defect in the specification of the lexical representation of xsd:decimal and xsd:integer (it should not say "finite-length").

Actually, I doubt that the above was intended. One of the reasons is that none of the examples in Part 2 shows a decimal number with an empty integer part, or an "empty" integer, but a stronger reason comes from the definition of the canonical lex rep of xsd:integer. While the definition of the canonical lex rep of xsd:decimal requires the presence of at least one digit in the integer part, the definition of the canonical lex rep of xsd:integer does not include the same requirement -- it says that leading zeros are prohibited, full stop.

Is the single "0" in the representation of the integer value 0 a "leading zero" or not? Depending on the answer, the canonical lex rep of the integer value 0 will be either "" or "0". Which is it? If the latter is intended, then "finite-length sequence" for xsd:integer apparently does not include a zero length, and "leading zeros" apparently does not include the single zero digit representing the integer value zero.
But if the former is intended (and so the canonical lex rep of the integer 0 is ""), then I wonder why "0.x" (rather than ".x") was chosen as the canonical lex rep of decimals with a null integer part.

Should "finite-length sequence of digits" have been in both cases "finite, but non-zero-length, sequence of digits"?

Also, should "separated by a period" have been "with a period optionally inserted at any point in the sequence of digits"? Or perhaps "with a period optionally inserted at any point in the sequence of digits except before the first digit and after the last digit"? (or equivalently "between any two digits"?) Or perhaps "with a period optionally inserted at any point in the sequence of digits except after the last digit"?

I think the first big question is, What was intended? And the second is, Is there anything to be mended in the text, and how?

Alessandro Triglia
OSS Nokalva
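
To make the competing readings concrete, here is a minimal Python sketch that expresses each one as a regular expression and tests the strings discussed above. The pattern names and the `classify` helper are hypothetical, introduced only for illustration; none of these patterns is taken from Part 2, and the sketch does not claim to show which reading the WG intended.

```python
import re

# Hypothetical patterns, one per reading discussed in the message.
# None of them is quoted from the spec; they only encode the prose.
READINGS = {
    # "finite-length" includes zero-length on both sides of the period
    "zero_length_allowed": re.compile(r"[+-]?[0-9]*(\.[0-9]*)?"),
    # at least one digit overall, period at any point in the sequence
    "period_anywhere": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"),
    # period only between two digits
    "period_between_digits": re.compile(r"[+-]?[0-9]+(\.[0-9]+)?"),
    # period at any point except after the last digit
    "period_not_last": re.compile(r"[+-]?([0-9]*\.)?[0-9]+"),
}


def classify(text: str) -> dict[str, bool]:
    """Report, for each reading, whether the whole string is accepted."""
    return {
        name: pattern.fullmatch(text) is not None
        for name, pattern in READINGS.items()
    }


if __name__ == "__main__":
    # The problematic strings from the message plus the spec's own examples.
    samples = ["", "+", ".", "3.", ".3", "0",
               "-1.23", "12678967.543233", "+100000.00", "210"]
    for sample in samples:
        accepted = [name for name, ok in classify(sample).items() if ok]
        print(f"{sample!r:20} accepted by: {', '.join(accepted) or 'none'}")
```

Under these assumptions, the four examples given in Part 2 are accepted by every reading, so the examples alone cannot settle the question; only the edge cases "", "+", ".", "3." and ".3" tell the readings apart.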
Received on Tuesday, 1 November 2005 14:32:20 UTC