- From: <noah_mendelsohn@us.ibm.com>
- Date: Mon, 12 May 2003 13:34:46 -0400
- To: "Roger L. Costello" <costello@mitre.org>
- Cc: "Costello,Roger L." <costello@mitre.org>, xmlschema-dev@w3.org
Roger Costello:

>> is it correct that num's value (32) is always represented as a "string", regardless of how num is declared? That is, are all values just strings, with a "datatype label" associated with the string?

I think it's fair to say that the Schema recommendation doesn't tell you how to "represent" things, any more than the XML recommendation tells you whether to use SAX or DOM, or whether to use UTF-8 or UTF-16 for the strings in your API. The schema recommendation defines a relation on schemas and instances: it basically tells you some information that you can discover in the course of an assessment. In your example, some of the things you can discover include:

* That the character children of <num> are the characters "3", "2".

* That the element has been validated by the type unsignedByte (Example 1, your Version #1) or numType (Example 2, your Version #2) respectively.

* In the case of Example 1, the recommendation tells you that xsd:unsignedByte is ultimately derived from xsd:decimal. Crucially, it tells you that for a lexical form such as "3", "2" there is a corresponding abstract decimal value in the value space, which is the decimal number 32. So, the recommendation is very clear that after the assessment you know both the characters and the corresponding value. Whether you expose either or both in any particular API is up to you. Note that the XML Query language (working drafts) lets you deal with either or both.

* The story on your numType (Example 2) follows a similar analysis. In this case, the base type is xsd:string. While there is also a value space for this type, it is essentially in 1-to-1 correspondence with the lexical space. If we consider the input documents:

  <num>32</num>

  and

  <num>032</num>

  they have different values in the value space for the string-like types, but the same value in the decimal-based type.
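As a rough illustration of the lexical-space versus value-space distinction, here is a minimal Python sketch. It is not how any particular validator works; the helper names `decimal_value` and `string_value` are hypothetical, and Python's `Decimal` is only a stand-in for the xsd:decimal value space (it accepts forms such as exponents that xsd:decimal does not).

```python
from decimal import Decimal


def decimal_value(lexical: str) -> Decimal:
    """Map a lexical form to a value in a decimal-like value space.

    Simplification: Decimal() accepts more forms than xsd:decimal does.
    """
    return Decimal(lexical.strip())


def string_value(lexical: str) -> str:
    """For a string-based type, the value is essentially the lexical form."""
    return lexical


first, second = "32", "032"

# Same abstract value when the type is decimal-based...
same_as_decimal = decimal_value(first) == decimal_value(second)

# ...but different values when the type is string-based.
same_as_string = string_value(first) == string_value(second)

print(same_as_decimal, same_as_string)
```

An API built on top of a validator could expose either function's result, or both, which is the choice the recommendation leaves open.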
>> Let me ask it another way, is the value (32) represented by an XML Schema validator as this:

Again, the recommendation doesn't tell a validator how to optimize its representations of anything. We note that the integer, decimal, float, etc. types allow bounds checks such as maxInclusive. If your integers are small enough, and the validator knows this, you can try storing them in a 32-bit (or 16-bit or whatever) binary integer. In the case of types like integer and decimal, you might also get away with bounds checks on the lexical forms. In the case of float, this is unlikely. Most validators will use IEEE binary representations to implement bounds checks on floats and doubles, and lexical forms to implement the pattern facet on floats and doubles. If your validator can find a better way, that's fine, as long as your results are as described by the recommendation.

While we're at it, note that <xsd:enumeration> is on the value space. For integer types:

  <xsd:enumeration>32</xsd:enumeration>

also matches 032. If your base type is string, then "032" does not match. Same for key/keyRef. In general, the representations a processor needs internally are likely to be determined by the features actually used. And so on.

Hope this helps.

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

"Roger L. Costello" <costello@mitre.org>
Sent by: xmlschema-dev-request@w3.org
05/09/2003 04:50 PM

To: xmlschema-dev@w3.org, "Costello,Roger L." <costello@mitre.org>
cc: (bcc: Noah Mendelsohn/Cambridge/IBM)
Subject: Are all values stored as strings?
Hi Folks,

Consider these two ways of defining an element called "num":

-----------------------------------------------------------------
Version #1:

<xsd:element name="num" type="xsd:unsignedByte"/>

-----------------------------------------------------------------
Version #2:

<xsd:simpleType name="numType">
    <xsd:restriction base="xsd:string">
        <xsd:pattern value="[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]"/>
    </xsd:restriction>
</xsd:simpleType>

<xsd:element name="num" type="numType"/>

-----------------------------------------------------------------

Now, here is an example of an instance of "num":

<num>32</num>

Question: is it correct that num's value (32) is always represented as a "string", regardless of how num is declared? That is, are all values just strings, with a "datatype label" associated with the string?

Let me ask it another way, is the value (32) represented by an XML Schema validator as this:

    0010 0000

if "num" is declared using Version #1 and like this:

    0011 0010

if "num" is declared using Version #2?

/Roger
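For readers who want to see the two candidate representations and the enumeration behaviour from the reply side by side, here is a small Python sketch. It is only an illustration under stated assumptions: `matches_enumeration` and `bits` are hypothetical helpers, `Decimal` stands in loosely for a decimal-derived value space, and a real validator is free to store values however it likes as long as its results agree with the recommendation.

```python
import struct
from decimal import Decimal


def matches_enumeration(lexical, allowed, to_value):
    """Compare in the value space: map both sides through to_value first."""
    candidates = {to_value(item) for item in allowed}
    return to_value(lexical) in candidates


def bits(data: bytes) -> str:
    """Render each byte as eight binary digits, separated by spaces."""
    return " ".join(f"{byte:08b}" for byte in data)


# Enumeration of 32 on a decimal-derived type also accepts "032"...
decimal_match = matches_enumeration("032", ["32"], Decimal)
# ...while the same enumeration on a string-based type does not.
string_match = matches_enumeration("032", ["32"], str)

# One possible storage for the value 32: a single unsigned byte.
as_unsigned_byte = struct.pack("B", 32)
# One possible storage for the characters "3", "2": their ASCII codes.
as_characters = "32".encode("ascii")

print(decimal_match, string_match)
print(bits(as_unsigned_byte))
print(bits(as_characters))
```

Either storage could sit behind a conforming validator; what matters is that comparisons such as enumeration are carried out on values, not on raw characters, when the type is numeric.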
Received on Monday, 12 May 2003 13:44:15 UTC