- From: <rjelliffe@allette.com.au>
- Date: Sun, 3 May 2009 15:59:11 +1000 (EST)
- To: xmlschema-dev@w3.org
>>> I think the working group felt that introducing context-dependent
>>> validation (where the validity of a document depends on factors other
>>> than the schema and the instance document) was a risky architectural
>>> innovation, and possibly a step that would be later regretted.
>>
>> So they actually had no reason? Just some vague possibility.
>
> Actually, now I recall some of the discussion: one of the concerns was
> specifically the subject of this original question. Should the referenced
> document be schema-validated? If so, how do you prevent circularities, or
> infinite regress? Some members of the WG feel strongly that validation
> should never pose any denial-of-service risks, and allowing doc() opens up
> all sorts of possibilities.

"All sorts of possibilities"? Why are any more possibilities for DOS opened up than currently exist with import/include/redefine/xsi:schemaLocation/xsi:noNamespaceSchemaLocation?

Security must be a layer above schemas. Security policy decides which features are safe or unsafe in a particular deployment; it needn't decide which features to leave in or out of the language. The DTD spec has entities, despite the security warnings about the billion laughs attack.

In practical terms, retrieval, validation and use of external files should be a command-line or invocation option, not something that needs to stymie the XSD WG: make the default, for example, that only relative or local URLs can be used, and with lax validation.

Trying to find the mythological optimum solution in the absence of information will of course lead to the conclusion that no decision can be made. But trying to find the optimal solution is itself an irrational approach in that situation; the minimum needed to declare victory is better.

> Potential problems like this can consume an immense amount of WG time, and
> when a spec is running years late already, there is a strong temptation for
> the chair to encourage people to cut a feature rather than spend time
> discussing whether or not it creates a problem.

But the bottom line is that when a committee succumbs to arguments like "there may be ramifications we have not identified", it opens itself to selecting features merely on the whims of one implementer or another. It is a get-out-of-jail-free card: while it looks like a rational argument, it cannot be argued against with any facts or specifics, because it is a flawed kind of logic. It gives the appearance of prudence, but risk assessment means looking at real problems and issues, not asserting risks in the absence of any evidence, if you see what I mean.

Is it really that XSD is such a rathole of interacting effects that issues cannot be thought through, or is it that there is a (legitimate, I would say) difference between the database/data-binding stakeholders, who only want schemas to drive their storage/CRUD requirements, and the web/messaging/publishing/QA/QC stakeholders, who need schema constraints to reflect their web-based information arrangements?

In either case, the solution has to be to take a big axe to XSD:

* XSD-lite, with no type derivation syntax. Only built-in simple types. Allow xs:choice inside a simple type as an alternative syntax for enumerations (see the sketch after this list). No list or union. No facets. No extension. Some alternative syntax for complexTypes with simpleContent, so that there is complexContent and simpleContent can be left out. UPA as a caution message, not a fatal error. Fewer restrictions (progress has been made on this, I see). PSVI as an optional feature. In fact, a reconstruction of RELAX NG in XSD syntax, and a prelude to giving up type derivation of complex types as a bad joke.

* XSD-fat, as a layer on top with type derivation, strict UPA, assertions, etc., reconstructing the existing syntax.
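To make the enumeration point concrete, here is a rough sketch. The first fragment is today's XSD syntax; the second is only an illustration of what an xs:choice-based spelling might look like; the xs:value element there is invented for the sketch and exists in no spec.

  Today:

    <xs:simpleType name="sizeType">
      <xs:restriction base="xs:string">
        <xs:enumeration value="small"/>
        <xs:enumeration value="medium"/>
        <xs:enumeration value="large"/>
      </xs:restriction>
    </xs:simpleType>

  A possible XSD-lite spelling (sketch only; xs:value is not real XSD):

    <xs:simpleType name="sizeType">
      <xs:choice>
        <!-- hypothetical: RELAX NG's value pattern in XSD clothing -->
        <xs:value>small</xs:value>
        <xs:value>medium</xs:value>
        <xs:value>large</xs:value>
      </xs:choice>
    </xs:simpleType>

The instance documents and the validation outcome would be the same either way; the point is only that the common case should not require the restriction/derivation machinery.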
XSD after 10 years is facing exactly the same situation that SGML faced after 8 years, in 1996: stakeholders saw that the revision plans were revisions in the direction of more complexity rather than simplification (with other layers left to reconstruct the culled functionality).

Back 9 years ago, when I was on the WG, there were comments from WG members against modularization or optionality, along the lines that validity should always be validity. If this were the real test, then XSD has proven itself an utter failure: witness the XSD profiles from different groups, such as the data-binding profiles at W3C. I trust that on the XSD WG nowadays, if anyone talks of reliably widespread implementation of any of the components, they get laughed back into reality: the stable door is open and the horse has bolted; it would be more prudent to close the door before putting in more horses. Of course, then comes the predictable excuse that "when we look at it, everyone has different requirements", as if doing nothing were an improvement on at least meeting the needs of some chunks of stakeholders.

People think the XSD Recommendation is horrible, and based on barmy editorial principles. But aside from any eccentricities of the specs (and the Recommendations have many virtues too), ultimately the problems with the Recommendations are caused by the complexity of the underlying technology.

If the XSD WG wanted a rule of thumb for how big a layer should be, I would suggest this: no technology should be so big that an experienced and excellent programmer would take more than a month full-time to implement the layer. I think XML Schemas Datatypes meets this goal. RELAX NG and Schematron and all the parts of DSDL do. But clearly XML Schemas Structures does not: I am not sure that someone could even become competent in the Recommendation in one month.

The XSD 1.1 revision is a great step forwards, but that is not much use when you are falling down a hole. Of course I think assertions and so on are really useful. But they are being tacked onto a Heath Robinson/Rube Goldberg machine, and it doesn't have to be that way.

Cheers
Rick Jelliffe
Received on Sunday, 3 May 2009 06:00:10 UTC