
[Bug 6089] Revise anyURI to use RFCs 3986 and 3987

From: <bugzilla@jessica.w3.org>
Date: Mon, 09 May 2011 12:48:24 +0000
To: www-xml-schema-comments@w3.org
Message-Id: <E1QJPsu-0001hv-I4@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=6089

C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

--- Comment #12 from C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com> 2011-05-09 12:48:21 UTC ---
When the WG discussed this issue again on 17 December 2010, I was asked to draft
a user-defined type definition to illustrate the possibility of using the
pattern facet to enforce the rules of RFC 3986, as suggested by Mukul Gandhi in
comment 7.  This has taken longer than hoped to reach the top of my to-do list,
but two schema documents defining URI and IRI types using patterns are now
available in the directory

  http://www.w3.org/2011/04/XMLSchema/
  (world-accessible resource)

The schema document
http://www.w3.org/2011/04/XMLSchema/TypeLibrary-URI-RFC3986.xsd defines types
based on xsd:anyURI with patterns that require the literal to be legal
according to the RFC 3986 definitions of 'URI', 'URI-reference',
'absolute-URI', and 'relative-ref'.  The schema document
http://www.w3.org/2011/04/XMLSchema/TypeLibrary-IRI-RFC3987.xsd does analogous
work based on the grammar of RFC 3987; it defines types called IRI-3987,
absolute-IRI-3987, relative-reference-3987, and IRI-reference-3987.  Most users
who want IRI validation should use the last of these, IRI-reference-3987,
unless they know that they want one of the others.

The testframe.xsd and testframe.xml documents in the same directory illustrate
the application of the datatypes to various strings, some of which match the
relevant grammars and some of which do not.  So, for example, the
IRI-reference-3987 type accepts the string

  http://r&#xE9;sum&#xE9;.example.org

and rejects the string 

  //2/-:)z254/:2a2$::25[v42.42.42.42:AA:]3 
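For readers who want to see the general idea outside a schema processor, here
is a minimal Python sketch of grammar-based checking.  It is not the pattern
used in the schema documents above; it is a simplified approximation assembled
from the ABNF rules of RFC 3986 and RFC 3987, and the function name
is_iri_reference is hypothetical.  It leaves out bracketed IP literals, the
supplementary-plane 'ucschar' ranges, and the 'iprivate' characters, and it
assumes that the character references in the first example string stand for
the letter e with acute accent (U+00E9).

```python
import re

# Character-class bodies transcribed from the ABNF of RFC 3986, extended with
# the Basic Multilingual Plane part of 'ucschar' from RFC 3987.
# Simplification: the supplementary-plane ranges of 'ucschar' are left out.
UCSCHAR = r"\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF"
IUNRESERVED = r"A-Za-z0-9\-._~" + UCSCHAR
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"
IPCHAR = rf"(?:[{IUNRESERVED}{SUB_DELIMS}:@]|{PCT_ENCODED})"

SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+\-.]*")
# Simplification: bracketed IP literals such as [::1] are not handled.
AUTHORITY = re.compile(
    rf"(?:(?:[{IUNRESERVED}{SUB_DELIMS}:]|{PCT_ENCODED})*@)?"  # iuserinfo "@"
    rf"(?:[{IUNRESERVED}{SUB_DELIMS}]|{PCT_ENCODED})*"          # ireg-name
    r"(?::[0-9]*)?"                                             # ":" port
)
PATH = re.compile(rf"(?:{IPCHAR}|/)*")
# Simplification: the 'iprivate' characters allowed in queries are left out.
QUERY_OR_FRAGMENT = re.compile(rf"(?:{IPCHAR}|[/?])*")

# Component splitter adapted from the expression in RFC 3986, Appendix B.
SPLIT = re.compile(
    r"(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?",
    re.DOTALL,
)


def is_iri_reference(text: str) -> bool:
    """Approximate check of text against the IRI-reference grammar."""
    match = SPLIT.fullmatch(text)
    if match is None:
        return False
    scheme, authority, path, query, fragment = match.groups()
    checks = [
        (SCHEME, scheme),
        (AUTHORITY, authority),
        (PATH, path),
        (QUERY_OR_FRAGMENT, query),
        (QUERY_OR_FRAGMENT, fragment),
    ]
    # A component that is absent (None) needs no check.
    return all(part is None or rx.fullmatch(part) for rx, part in checks)


print(is_iri_reference("http://r\u00e9sum\u00e9.example.org"))           # True
print(is_iri_reference("//2/-:)z254/:2a2$::25[v42.42.42.42:AA:]3"))  # False
```

Under these assumptions the second string fails because its path contains
square brackets, which the grammar allows only around an IP literal in the
authority.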

Further work, when time allows (probably not soon), may include the
definition of similar types for the grammars of RFC 2396 and other earlier
definitions of URIs and IRIs.  In due course the WG will prepare and publish a
new version of the type library at http://www.w3.org/2001/03/XMLSchema/
incorporating these new datatypes.

As the discussion of this issue has shown (both in the comments here and in the
long technical discussions summarized in the email mentioned in comment 10),
there is no unanimity in the community about what form of checking should be
done for URIs and IRIs.  Type definitions like those in the schema documents
mentioned above show how users can control their own destiny and get the
validation they need for their particular applications.  Those who want to be
careful about namespace names, for example, will want relative references to be
caught as errors, so they will want to use type absolute-IRI-3987 and not type
IRI-reference-3987.  
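The distinction that matters for namespace names can be sketched in a few
lines of Python.  This is only an illustration, not the definition of the
absolute-IRI-3987 type: the name has_scheme is hypothetical, and the check
looks only for a leading scheme as defined in RFC 3986, which is necessary
for an absolute IRI but not sufficient.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
SCHEME_PREFIX = re.compile(r"[A-Za-z][A-Za-z0-9+\-.]*:")


def has_scheme(reference: str) -> bool:
    """Return True if the reference begins with an RFC 3986 scheme and colon."""
    return SCHEME_PREFIX.match(reference) is not None


for candidate in ["http://example.org/ns", "../ns", "//example.org/ns"]:
    verdict = "has a scheme" if has_scheme(candidate) else "relative reference"
    print(f"{candidate}: {verdict}")
```

An application that cares about namespace names would reject the second and
third strings, since neither carries a scheme.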

And of course implementations can always check the syntactic correctness of
anyURI values as a service to their users; failure to match the grammar of the
RFC isn't a well-defined type error in XSD 1.0, and it's clearly defined as NOT
a type error in XSD 1.1, but that doesn't mean it can't be mentioned in a
message to the user.
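As a rough illustration of what such a service might look like, the Python
sketch below accepts every value but issues a warning when it finds a
character outside the repertoire of RFC 3986.  The function name
accept_anyuri is hypothetical, and the check is deliberately shallow: it
looks at individual characters only, not at the structure of the value.

```python
import re
import warnings

# Any character that is not unreserved, reserved, or "%" in RFC 3986.
NOT_IN_RFC3986 = re.compile(r"[^A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]")


def accept_anyuri(value: str) -> str:
    """Accept the value unchanged, warning if it strays from RFC 3986."""
    bad = NOT_IN_RFC3986.search(value)
    if bad:
        warnings.warn(
            f"anyURI value {value!r} contains {bad.group()!r}, "
            "which RFC 3986 does not allow"
        )
    # The value is still accepted: the warning is advice, not a type error.
    return value


accept_anyuri("http://example.org/a b")  # warns about the space
```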

Since the WG is firm in its decision not to change the text of XSD 1.1 as
suggested here and hopes that the user-defined types described above will show
how users can get whatever form and level of validation is appropriate to their
situation, I am marking this issue RESOLVED / WONTFIX.  I am sorry that the WG
has not found a way to make all interested parties happy.

Murata-san, as the originator of the issue, you are asked to mark this issue
CLOSED, thus indicating that you are satisfied with the working group's efforts
to resolve the issue (even if not happy with the final result) and are willing
to accept the decision.  Alternatively, you may choose to REOPEN the issue,
thus indicating that you do not believe the working group has made a sufficient
effort to resolve the issue, that you refuse to accept the outcome, and that if
necessary you wish to appeal the decision of the WG to the Director of the W3C.

If we do not hear from you within a period of two weeks, we will assume that
you are willing to accept the outcome.

Received on Monday, 9 May 2011 12:48:26 GMT
