
Lexical representation of xsd:decimal and xsd:integer

From: Alessandro Triglia <sandro@mclink.it>
Date: Fri, 28 Oct 2005 13:00:07 -0400
To: "[Public XML Schema-DEV]" <xmlschema-dev@w3.org>
Message-ID: <!~!AAAAADAdZm9acJxBst5oez/oaBME23QA@mclink.it>
Hi

There is a problematic area in XML Schema Part 2 (Datatypes).  Some of the related questions are:

- Is ".5" a valid lexical representation for xsd:decimal?

- Is "" a valid lexical representation for xsd:decimal?

- Is "" a valid lexical representation for xsd:integer?

- Is "0" or "" the canonical lexical representation of the integer value 0?


Part 2 says:

--------------------------------
decimal has a lexical representation consisting of a finite-length sequence
of decimal digits (#x30-#x39) separated by a period as a decimal indicator.
An optional leading sign is allowed. If the sign is omitted, "+" is assumed.
Leading and trailing zeroes are optional. If the fractional part is zero,
the period and following zero(es) can be omitted. For example: -1.23,
12678967.543233, +100000.00, 210.
--------------------------------

Three terms in this paragraph are ambiguous:  "finite-length sequence",
"separated", and "leading zeroes".  I have tried to work out what they mean
by looking at how they are used in other parts of the document.

"Finite-length sequence" is used in many other places for the length of:
lists, strings, binary octets of hexBinary, binary octets of base64Binary,
etc.  Obviously, lists, strings, and binary octets must be allowed to have a
zero length.  Therefore, at least in these cases (and possibly in all
cases), "finite-length" includes zero-length.   This is supported by the use
of the phrase "finite, non-zero-length" for NMTOKENS, IDREFS, and ENTITIES,
and by the addition of "(possibly empty)" after "finite-length" in the
definition of list.

So if "finite-length sequence of digits" in xsd:decimal includes zero
digits, then all of the following are valid lexical representations for this
type:  "", "+", ".", "3.", ".3".
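
To make the two readings concrete, here is a rough sketch in Python.  Both
regular expressions are paraphrases of the prose, not something taken from
Part 2, and the names PERMISSIVE, STRICT and accepts are only labels
invented for this illustration.

```python
import re

# Assumption: "finite-length" includes zero length, so either run of digits
# may be empty, and the period itself is optional.
PERMISSIVE = re.compile(r"[+-]?[0-9]*(\.[0-9]*)?")

# Assumption: at least one digit is required, and a period (if present)
# must have at least one digit on each side.
STRICT = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")


def accepts(pattern, text):
    """Return True if the whole string matches the given pattern."""
    return pattern.fullmatch(text) is not None


# Compare the two readings on the strings discussed above.
for s in ["", "+", ".", "3.", ".3", "-1.23", "210"]:
    print(repr(s), accepts(PERMISSIVE, s), accepts(STRICT, s))
```

Under the permissive reading every string in the loop is accepted; under
the strict reading only "-1.23" and "210" are.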

This has several other implications:

-- "" is a valid lexical representation of the integer value 0

-- "" and "E" are valid lexical representations of the float value 0

Is all of the above intended?  Is this the common understanding?  If not,
there is a defect in the specification of the lexical representation of
xsd:decimal and xsd:integer (it should not say "finite-length").

Actually, I doubt that the above was intended.  One of the reasons is that
none of the examples in Part 2 shows a decimal number with an empty integer
part, or an "empty" integer, but a stronger reason comes from the definition
of the canonical lex rep of xsd:integer.  While the definition of the
canonical lex rep of xsd:decimal requires the presence of at least one digit
in the integer part, the definition of the canonical lex rep of xsd:integer
does not include the same requirement -- it says that leading zeros are
prohibited, full stop.  Is the single "0" in the representation of the
integer value 0 a "leading zero" or not?  Depending on the answer, the
canonical lex rep of the integer value 0 will be either "" or "0".  Which is
it?

If the latter is intended, then "finite-length sequence" for xsd:integer
apparently does not include a zero length, and "leading zeros" apparently
does not include the single zero digit representing the integer value zero.
But if the former is intended (and so the canonical lex rep of the integer 0
is ""), then I wonder why "0.x" (rather than ".x") was chosen as the
canonical lex rep of decimals with a null integer part.
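
The difference between the two answers can be sketched in a few lines of
Python.  The function names are hypothetical, signs are ignored for
simplicity, and neither function is claimed to be what Part 2 intends; they
only spell out the two readings of "leading zeros are prohibited".

```python
def canonical_strip_all(digits):
    # Reading A (assumption): every "0" at the front counts as a leading
    # zero and is removed, even if nothing is left afterwards.
    return digits.lstrip("0")


def canonical_keep_one(digits):
    # Reading B (assumption): leading zeros are removed, but at least one
    # digit always remains, so the value zero is written "0".
    return digits.lstrip("0") or "0"


# Show both readings side by side for a few unsigned digit strings.
for d in ["0", "000", "007", "120"]:
    print(repr(d), repr(canonical_strip_all(d)), repr(canonical_keep_one(d)))
```

The two functions agree on every non-zero value and differ only on zero,
which is exactly the case the text leaves open.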

Should "finite-length sequence of digits" have been in both cases "finite,
but non-zero-length, sequence of digits"?

Also, should "separated by a period" have been "with a period optionally
inserted at any point in the sequence of digits"?

Or perhaps "with a period optionally inserted at any point in the sequence
of digits except before the first digit and after the last digit"?  (or
equivalently "between any two digits"?)

Or perhaps "with a period optionally inserted at any point in the sequence
of digits except after the last digit"?
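
The three candidate wordings for the period can be compared with a small
Python sketch.  The patterns are one possible translation of each wording
into a regular expression, all of them assume a non-zero-length sequence of
digits, and the dictionary keys and the function name verdicts are invented
for this illustration.

```python
import re

SIGN = r"[+-]?"

CANDIDATES = {
    # Period optionally inserted at any point in the sequence of digits.
    "anywhere": re.compile(SIGN + r"([0-9]+\.?[0-9]*|\.[0-9]+)"),
    # Period only between two digits.
    "between_digits": re.compile(SIGN + r"[0-9]+(\.[0-9]+)?"),
    # Period at any point except after the last digit.
    "not_after_last": re.compile(SIGN + r"([0-9]*\.)?[0-9]+"),
}


def verdicts(text):
    """Map each candidate wording to True/False for the given string."""
    return {name: p.fullmatch(text) is not None
            for name, p in CANDIDATES.items()}


# The interesting cases are a missing integer part and a trailing period.
for s in ["5", "1.5", ".5", "5.", "."]:
    print(repr(s), verdicts(s))
```

All three candidates accept "5" and "1.5" and reject ".", so the choice
between them comes down to whether ".5" and "5." should be valid.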

I think the first big question is, What was intended?  And the second is, Is
there anything to be mended in the text, and how?

Alessandro Triglia
OSS Nokalva





Received on Friday, 28 October 2005 17:01:40 GMT
