- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Thu, 08 Mar 2018 15:06:17 +0100
- To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, "Steven Pemberton" <steven.pemberton@cwi.nl>
- Cc: public-xformsusers@w3.org
But that said, the anyURI type is extremely liberal, accepting literally
*any* string of characters. The only purpose of the type seems to be
mandating transforming the characters into something acceptable to a URI
when necessary.

It would still be useful to have a type that validates according to
http://www.ietf.org/rfc/rfc3987.txt.

Steven

On Thu, 08 Mar 2018 13:46:33 +0100, Steven Pemberton
<steven.pemberton@cwi.nl> wrote:

> You are absolutely right, and I am absolutely wrong.
>
> What led me to the conclusion was writing the test suite for anyURI, and
> IRIs showing up as invalid, and me then following the wrong link.
>
> So all is well, I can breathe a sigh of relief, and carry on with the
> test suite.
>
> I'm happy that you are reading the XForms mailing list :-)
>
> Steven
>
> On Wed, 07 Mar 2018 19:37:20 +0100, C. M. Sperberg-McQueen
> <cmsmcq@blackmesatech.com> wrote:
>
>>
>>> On Mar 7, 2018, at 10:39 AM, Steven Pemberton
>>> <steven.pemberton@cwi.nl> wrote:
>>>
>>> The definition of anyURI doesn't allow IRIs, such as
>>> https://zh.wikipedia.org/wiki/Wikipedia:关于中文维基百科/en.
>>>
>>> Just as we added an iemail address type to match modern email
>>> addresses, it seems to me that we ought to also add an anyIRI type
>>> that accepts IRIs like the above.
>>
>> I am puzzled; what leads you to the conclusion that xsd:anyURI
>> does not accept IRIs?
>>
>> In XSD 1.0 [1], the value space is described as that of
>> RFC 2396, as modified by RFC 2732, and the lexical space
>> is described (roughly) as the set of strings, which after
>> escaping, turn into URIs as defined by those specs. The
>> escaping in question is the then current algorithm for IRIs,
>> as published in the XLink spec. I believe that later revisions
>> of the concept of IRI changed the rules for whitespace, but
>> I don’t recall any other changes likely to be noticeable to
>> users of the datatype. Certainly the intent of XSD 1.0
>> was to accept IRIs in the lexical space of the type anyURI.
>>
>> The spec says "This [the mapping from lexical space to value
>> space] means that a wide range of internationalized resource
>> identifiers can be specified when an anyURI is called for".
>>
>> In XSD 1.1 [2], the spec is a little more explicit, since the
>> IRI concept was a little more clearly developed by that time:
>> "anyURI represents an Internationalized Resource Identifier
>> Reference (IRI). An anyURI value can be absolute or relative,
>> and may have an optional fragment identifier (i.e., it may be
>> an IRI Reference). This type should be used when the value
>> fulfills the role of an IRI, as defined in [RFC 3987] or its
>> successor(s) in the IETF Standards Track."
>>
>> During the development of XSD 1.1 the WG responded to
>> inconsistencies in the 1.0 implementations of the anyURI
>> type (and, perhaps, to fears that future revisions of the RFCs
>> for URIs and IRIs would continue to change the set of legal
>> values) by seeking to simplify and future-proof the rules used
>> for checking schema-validity of IRIs. For reasons I do not think
>> I can successfully reconstruct (at least, not without falling
>> into depression), it chose to do so by stating clearly that the
>> grammar rules specified by the relevant RFCs are effectively
>> only advisory, and that for purposes of schema validation,
>> any sequence of XML characters constitutes a value of the
>> type.
>>
>> So in XSD 1.1 it is doubly untrue to say that IRIs are not
>> accepted as lexical representations of xsd:anyURI: not only
>> is it clearly stated that IRIs are to be accepted, but strings
>> that do not match the current definition of IRIs will *also*
>> be accepted as schema-valid.
>>
>> XForms needs its own IRI type only if stricter validation of the
>> grammar of URIs and IRIs is needed.
>>
>> If in fact stricter validation is needed, the XForms group may
>> wish to consider using the datatypes defined in “XSD datatypes
>> for strict validation of IRIs and URIs” [3].
>>
>> It would be very disappointing if the amount of work that went
>> into making xsd:anyURI accept IRIs turned out to be for
>> naught.
>>
>> [1] https://www.w3.org/TR/xmlschema-2/#anyURI
>> [2] https://www.w3.org/TR/xmlschema11-2/#anyURI
>> [3]
>> https://www.w3.org/XML/Group/2004/06/exacturi/xsd-rfc-3986-uri-3986-iri.html
>>
>> N.B. I am unable to verify URI [3], since my access privileges
>> no longer seem sufficient to retrieve the document. [3] was
>> prepared for publication as a WG note by the then XML Schema
>> WG but never published, since the WG ran out of resources and
>> time. When the XML Core WG took over responsibility for
>> XSD, they decided they didn’t have the necessary resources, either.
>> I would be glad if the work were finally published.
>>
>> ********************************************
>> C. M. Sperberg-McQueen
>> Black Mesa Technologies LLC
>> cmsmcq@blackmesatech.com
>> http://www.blackmesatech.com
>> ********************************************
>>
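
For readers following the thread, the "escaping" that maps an IRI in the lexical space to a URI can be illustrated roughly as follows. This is only a sketch under stated assumptions: `iri_to_uri` is a hypothetical helper name, it percent-encodes non-ASCII characters as UTF-8 bytes and nothing else, and it does not handle internationalized host names, disallowed ASCII characters such as spaces, or any grammar validation against RFC 3987.

```python
from urllib.parse import unquote


def iri_to_uri(iri: str) -> str:
    """Percent-encode each non-ASCII character of `iri` as its UTF-8 bytes.

    Hypothetical helper for illustration only: ASCII characters are passed
    through unchanged, and no validation of the result is attempted.
    """
    out = []
    for ch in iri:
        if ord(ch) < 128:
            # ASCII characters are copied as-is (an assumption of this sketch).
            out.append(ch)
        else:
            # Each UTF-8 byte of a non-ASCII character becomes %HH.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


# The IRI quoted earlier in the thread.
iri = "https://zh.wikipedia.org/wiki/Wikipedia:关于中文维基百科/en"
uri = iri_to_uri(iri)
print(uri)

# Decoding the percent-escapes recovers the original IRI.
print(unquote(uri) == iri)
```

The point of the sketch is that the mapping is purely mechanical, which is consistent with the observation above that the type itself rejects nothing; a stricter type would need a separate grammar check.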
Received on Thursday, 8 March 2018 14:06:52 UTC