
Re: IRIs

From: Steven Pemberton <steven.pemberton@cwi.nl>
Date: Thu, 08 Mar 2018 13:46:33 +0100
To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
Cc: public-xformsusers@w3.org
Message-ID: <op.zfj5rvwrsmjzpq@steven-xps>

You are absolutely right, and I am absolutely wrong.

What led me to the conclusion was writing the test suite for anyURI, and  
IRIs showing up as invalid, and me then following the wrong link.

So all is well, I can breathe a sigh of relief, and carry on with the test  
suite.

I'm happy that you are reading the XForms mailing list :-)

Steven

On Wed, 07 Mar 2018 19:37:20 +0100, C. M. Sperberg-McQueen  
<cmsmcq@blackmesatech.com> wrote:

>
>> On Mar 7, 2018, at 10:39 AM, Steven Pemberton <steven.pemberton@cwi.nl>  
>> wrote:
>>
>> The definition of anyURI doesn't allow IRIs, such as
>> 	https://zh.wikipedia.org/wiki/Wikipedia:关于中文维基百科/en.
>>
>> Just as we added an iemail address type to match modern email  
>> addresses, it seems to me that we ought to also add an anyIRI type that  
>> accepts IRIs like the above.
>
> I am puzzled; what leads you to the conclusion that xsd:anyURI
> does not accept IRIs?
>
> In XSD 1.0 [1], the value space is described as that of
> RFC 2396, as modified by RFC 2732, and the lexical space
> is described (roughly) as the set of strings, which after
> escaping, turn into URIs as defined by those specs.  The
> escaping in question is the then current algorithm for IRIs,
> as published in the XLink spec. I believe that later revisions
> of the concept of IRI changed the rules for whitespace, but
> I don’t recall any other changes likely to be noticeable to
> users of the datatype.  Certainly the intent of XSD 1.0
> was to accept IRIs in the lexical space of the type anyURI.
>
> The spec says "This [the mapping from lexical space to value
> space] means that a wide range of internationalized resource
> identifiers can be specified when an anyURI is called for".
>
> In XSD 1.1 [2], the spec is a little more explicit, since the
> IRI concept was a little more clearly developed by that time:
> "anyURI represents an Internationalized Resource Identifier
> Reference (IRI).  An anyURI value can be absolute or relative,
> and may have an optional fragment identifier (i.e., it may be
> an IRI Reference).  This type should be used when the value
> fulfills the role of an IRI, as defined in [RFC 3987] or its
> successor(s) in the IETF Standards Track."
>
> During the development of XSD 1.1 the WG responded to
> inconsistencies in the 1.0 implementations of the anyURI
> type (and, perhaps, to fears that future revisions of the RFCs
> for URIs and IRIs would continue to change the set of legal
> values) by seeking to simplify and future-proof the rules used
> for checking schema-validity of IRIs.  For reasons I do not think
> I can successfully reconstruct (at least, not without falling
> into depression), it chose to do so by stating clearly that the
> grammar rules specified by the relevant RFCs are effectively
> only advisory, and that for purposes of schema validation,
> any sequence of XML characters constitutes a value of the
> type.
>
> So in XSD 1.1 it is doubly untrue to say that IRIs are not
> accepted as lexical representations of xsd:anyURI:  not only
> is it clearly stated that IRIs are to be accepted, but strings
> that do not match the current definition of IRIs will *also*
> be accepted as schema-valid.
>
> XForms needs its own IRI type only if stricter validation of the
> grammar of URIs and IRIs is needed.
>
> If in fact stricter validation is needed, the XForms group may
> wish to consider using the datatypes defined in “XSD datatypes
> for strict validation of IRIs and URIs” [3].
>
> It would be very disappointing if the amount of work that went
> into making xsd:anyURI accept IRIs turned out to be for
> naught.
>
> [1] https://www.w3.org/TR/xmlschema-2/#anyURI
> [2] https://www.w3.org/TR/xmlschema11-2/#anyURI
> [3] https://www.w3.org/XML/Group/2004/06/exacturi/xsd-rfc-3986-uri-3986-iri.html
>
> N.B. I am unable to verify URI [3], since my access privileges
> no longer seem sufficient to retrieve the document.  [3] was
> prepared for publication as a WG note by the then XML Schema
> WG but never published, since the WG ran out of resources and
> time.  When the XML Core WG took over responsibility for
> XSD, they decided they didn’t have the necessary resources, either.
> I would be glad if the work were finally published.
>
> ********************************************
> C. M. Sperberg-McQueen
> Black Mesa Technologies LLC
> cmsmcq@blackmesatech.com
> http://www.blackmesatech.com
> ********************************************
>
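
As a rough illustration of the XSD 1.0 escaping step described in the
quoted message, the following Python sketch percent-escapes each
non-ASCII character of a string as its UTF-8 bytes. It is a
simplification under stated assumptions: the function name escape_iri
is hypothetical, and the sketch leaves every ASCII character untouched,
whereas the XLink rules (as I understand them) also escape some
excluded ASCII characters such as spaces. The example IRI is the one
from the original question, without the sentence-final period.

```python
def escape_iri(iri: str) -> str:
    """Percent-escape each non-ASCII character as its UTF-8 bytes.

    Hypothetical helper for illustration only; it does not handle the
    excluded ASCII characters that the full escaping rules also cover.
    """
    out = []
    for ch in iri:
        if ord(ch) < 128:
            # ASCII characters pass through unchanged in this sketch.
            out.append(ch)
        else:
            # Each UTF-8 byte of the character becomes one %HH escape.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)


iri = "https://zh.wikipedia.org/wiki/Wikipedia:关于中文维基百科/en"
uri = escape_iri(iri)
print(uri)
```

The result contains only ASCII characters, which is the sense in which
an IRI in the lexical space "turns into" a URI after escaping.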
Received on Thursday, 8 March 2018 12:47:07 UTC
