- From: Falk, Alexander <falk@icon.at>
- Date: Mon, 10 Apr 2000 14:03:36 +0200
- To: "'www-xml-schema-comments@w3.org'" <www-xml-schema-comments@w3.org>
- Message-ID: <0FED160BABE4D311AD2E0050DA4657850226BE@MEDUSA>
Hi,

I was studying the new April 7 version of the XML Schema working draft throughout the weekend, as we are in the process of finalizing the beta 3 version of XML Spy 3.0 (see http://www.xmlspy.com/version30.asp), and I have a first list of comments and questions - especially regarding the changes to the datatypes (part 2).

Part 1 - Structures

A. Schema for Schemas

Why does the Public Identifier URN for the DOCTYPE statement still use 19991216 as its date, when the DTD for Schemas (Appendix B) is v1.1 dated 2000/04/06? This Public Identifier URN seems to imply that the Schema for Schemas is itself written in compliance with the old December 1999 XML Schema draft, which it is not.

Along the same lines: the year in the XML Schema namespace URI is also still fixed at 1999 - is that going to change for the final recommendation? While it is understandable from an implementor's point of view that the URN should remain constant while the draft and recommendation are being created, it would IMHO be rather confusing for all future schema authors if the date given here is not identical to the date of the final recommendation.

G. Tabulation of Changes

The comments in this list are not very useful at all. Compared with "H Revisions from previous draft" in Part 2, which is ideal for implementors and saves us the burden of re-reading the entire specs again and again, the list of changes in Part 1 is too minimal. Comments like "Lots of edits" or "more from Noah" are simply not comprehensible without the background that only insiders of the WG can have. Please provide a more meaningful change history in the future (or none at all).

Part 2 - Datatypes

3.2.2.2 Constraining facets on boolean datatype

Other than specifically restricting the lexical space to either {0,1} or {true, false} for a certain schema, what is the intention of allowing a pattern facet for booleans?
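To make the one use case mentioned for 3.2.2.2 concrete, here is a minimal Python sketch of how a pattern facet could narrow the boolean lexical space. It assumes that the lexical space is exactly the four literals named above; the names BOOLEAN_LEXICAL_SPACE, is_valid_boolean, WORDS_ONLY and DIGITS_ONLY are hypothetical and not taken from the draft, and Python's re syntax only stands in for the draft's regular expression language for these trivial patterns.

```python
import re

# Assumption: the boolean lexical space is the union of the two
# literal sets named in the comment on 3.2.2.2.
BOOLEAN_LEXICAL_SPACE = {"true", "false", "1", "0"}


def is_valid_boolean(literal, pattern=None):
    """Check a literal against the lexical space and an optional pattern facet."""
    if literal not in BOOLEAN_LEXICAL_SPACE:
        return False
    if pattern is None:
        return True
    # A pattern facet must match the whole literal, hence fullmatch.
    return re.fullmatch(pattern, literal) is not None


# Two hypothetical restrictions a schema author might want.
WORDS_ONLY = r"true|false"
DIGITS_ONLY = r"[01]"

print(is_valid_boolean("1"))               # unrestricted boolean
print(is_valid_boolean("1", WORDS_ONLY))   # restricted to the words
print(is_valid_boolean("true", WORDS_ONLY))
```

With such a restriction the value space is unchanged; only the accepted spellings differ, which is why the facet appears to have little other use for this datatype.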

3.2.8.1 Constraining facets on binary datatype

As binary currently only offers two different encodings that specify the respective lexical spaces, defining a pattern facet on binary doesn't make much sense - other than e.g. restricting the letters a-f to upper-case only or lower-case only. With base64, however, the alphabet is strictly defined in the RFC. To answer the question contained in the Ed. Note of this chapter, I would therefore suggest omitting the pattern facet here: from an implementor's standpoint its benefits are rather limited and the potential for confusion outweighs them.

3.2.3 - 3.2.5 Lexical notation of floating-point numbers

While it is very nice from an implementor's standpoint to know that all sorts of float, double, or decimal numbers will only use the period as a decimal separator, I wonder if this is really satisfying for many European and other non-US users. Specifically, when XML is being used to supplant existing systems, it is often necessary to interpret floating-point or decimal numbers with other decimal separators (most notably ',') and in some cases also with thousands separators (e.g. 4,560,758.99 vs. 4.560.758,99).

Why is there no means provided to support these formatting styles in the XML Schema draft? Just like the encoding facet for binaries, a "formatting" or "picture" facet (to use an old COBOL-coined term that was also suggested in the DCD submission to the W3C in July 1998) could be used to specify the various aspects of the lexical space of these datatypes. If we were to consider XML schemas for B2B e-Commerce scenarios only, it would be understandable to only allow one format that can be easily processed - but XML schemas should be thought of in much broader terms.

3.3 A general question concerning constraining facets in derived types

Most of the derived datatypes have certain facets that distinguish them from the primitive types.
However, each one of the derived types still lists, in its list of applicable constraining facets, the very facets that were used to generate it from the primitive types. Consider the case of recurringDay, which is derived from recurringDuration by fixing the duration facet to "PT24H" and the period facet to "P1M". This type still lists duration and period as possible constraining facets - yet they are absolutely fixed by the very definition of recurringDay. How should a validating processor treat a new type derived from recurringDay that actually tries to use one of these facets in its definition?

I see two possible solutions to this dilemma:

a) you integrate some kind of "final" method to fix constraining facets (e.g. the definition of recurringDay would use the period and duration facets with this "final" mechanism to explicitly forbid any further attempts at adding additional constraints through the same facets).

b) if this seems to be too complicated, it would also be possible to make the above mechanism mandatory for ANY kind of facet (e.g. once a derived type was generated by using any one facet, that facet cannot be used anymore to further derive from that derived type). This would, perhaps, require some of the currently defined derived built-in types to be redefined as primitive types, but would resolve all potential ambiguities arising from multiple use of the same facet for any sort of grandchildren-derived type.

3.3.29.1 Lexical representation of recurringDay

If this is a left-truncated ISO 8601 day, then it should be ----DD, not ---DD

A. Schema for Datatype Definitions

The part.xsd schema document includes the namespace "http://www.w3.org/XML/1998/namespace" from a schemaLocation "../structures/xml.xsd", yet I was unable to locate this file on the W3C web server. Can you please provide a URL that will allow me to access the xml.xsd file?
Furthermore, would it be possible for a future draft or the final recommendation to include one downloadable archive file (ZIP, gzip, or any other common format) that includes all required files in one neat package (i.e. the specs and their respective DTDs and XSL files plus the non-normative Schema DTDs, XSDs, and any other required files)?

E. Regular Expressions

From an implementor's position I don't see why defining {,m} as a shorthand form of {0,m} would be a problem. It would seem logical to add this, now that {n,} is allowed. I don't think it is relevant whether or not Perl includes such a quantifier. If it is more consistent and could potentially help schema authors, then it should be added.

Along these same lines: I doubt that there is any meaningful use for {0,0} apart from effectively "commenting out" the preceding atom. Furthermore, {0,0} could then potentially be written as {,}, which is even more confusing. Apart from being a logical consequence of the {n,m} quantifier, what was the reason for adding {0,0} to the table as a separate line?

Another problem: it is currently impossible to define a pattern that uses the vertical bar '|' as a literal character, because it is defined as the separator between branches, and there is no single character escape defined for \|. The only workaround is to put the vertical bar inside a positive character group in a character class expression: [|]. Wouldn't it be better (i.e. more consistent) to add \| as a single char escape?

Sincerely,
Alexander Falk

... Icon Information-Systems ... ALEXANDER FALK ... President, CEO ... http://www.icon-is.com/falk
Received on Monday, 10 April 2000 08:03:39 UTC