- From: <noah_mendelsohn@us.ibm.com>
- Date: Mon, 11 Jan 2010 11:38:40 -0500
- To: Larry Masinter <masinter@adobe.com>
- Cc: "Roy T. Fielding" <fielding@gbiv.com>, "julian.reschke@gmx.de" <julian.reschke@gmx.de>, "public-iri@w3.org" <public-iri@w3.org>
Larry Masinter writes:

> I understand there are widespread implementations of URIs and
> URI processing, but what other systems implement IRIs
> according to RFC 3987 terms?

In case it's of interest, the XML Schema 1.1 anyURI datatype [1] is (intended as) an IRI. Among the pertinent parts of the CR specification are:

-----BEGIN QUOTE FROM XSD 1.1 DATATYPES -------

[Definition:] anyURI represents an Internationalized Resource Identifier Reference (IRI). An anyURI value can be absolute or relative, and may have an optional fragment identifier (i.e., it may be an IRI Reference). This type should be used when the value fulfills the role of an IRI, as defined in [RFC 3987] or its successor(s) in the IETF Standards Track.

[...]

The ·lexical space· of anyURI is the set of finite-length sequences of zero or more characters (as defined in [XML]) that ·match· the Char production from [XML].

Note: For an anyURI value to be usable in practice as an IRI, the result of applying to it the algorithm defined in Section 3.1 of [RFC 3987] should be a string which is a legal URI according to [RFC 3986]. (This is true at the time this document is published; if in the future [RFC 3987] and [RFC 3986] are replaced by other specifications in the IETF Standards Track, the relevant constraints will be those imposed by those successor specifications.)

Each URI scheme imposes specialized syntax rules for URIs in that scheme, including restrictions on the syntax of allowed fragment identifiers. Because it is impractical for processors to check that a value is a context-appropriate URI reference, neither the syntactic constraints defined by the definitions of individual schemes nor the generic syntactic constraints defined by [RFC 3987] and [RFC 3986] and their successors are part of this datatype as defined here. Applications which depend on anyURI values being legal according to the rules of the relevant specifications should make arrangements to check values against the appropriate definitions of IRI, URI, and specific schemes.

-----END QUOTE FROM XSD 1.1 DATATYPES -------

Also, there is the following note about space characters. Before getting too upset about it, please note that using this type for IRIs at all is a "should", and there is in fact no required normative content checking except that the characters match the XML Char production. So, the following is not a backhanded way of saying that space characters >should< be allowed in IRIs; rather it is an acknowledgement that, with no normative prohibition, a health warning is in order.

-----BEGIN QUOTE FROM XSD 1.1 DATATYPES -------

Note: Spaces are, in principle, allowed in the ·lexical space· of anyURI, however, their use is highly discouraged (unless they are encoded by '%20').

-----END QUOTE FROM XSD 1.1 DATATYPES -------

Noah

[1] http://www.w3.org/TR/xmlschema11-2/#anyURI

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Larry Masinter <masinter@adobe.com>
Sent by: public-iri-request@w3.org
12/30/2009 02:02 PM

To: "Roy T. Fielding" <fielding@gbiv.com>
cc: "julian.reschke@gmx.de" <julian.reschke@gmx.de>, "public-iri@w3.org" <public-iri@w3.org>, (bcc: Noah Mendelsohn/Cambridge/IBM)
Subject: What standards and implementations use IRIs?

I think it would be helpful if we could be more explicit about which standards and implementations of IRIs are better served by the current normative definition of IRIs vs. a more liberal specification, closer to what browsers, operating systems, and common URL-parsing libraries accept and process?

Outside of XML's LEIRI, which is itself an extension of what RFC 3987 allows, or URL-parsing libraries, which seem to have parameters or options letting the caller determine which syntax they want to process against?

I understand there are widespread implementations of URIs and URI processing, but what other systems implement IRIs according to RFC 3987 terms?

Larry
--
http://larry.masinter.net
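
Illustration (not part of the original message): the XSD note quoted above refers to the IRI-to-URI mapping in Section 3.1 of RFC 3987 and to encoding spaces as '%20'. The following is a minimal Python sketch of that idea, under stated assumptions. The function name iri_to_uri_sketch is hypothetical, every character above U+007F is treated as needing escaping (an approximation of the ucschar and iprivate ranges), and the alternative ToASCII handling of host names is not attempted. Escaping the space character is not part of the RFC 3987 mapping; it is included only to mirror the '%20' remark in the XSD note.

```python
def iri_to_uri_sketch(value: str) -> str:
    """Rough approximation of the RFC 3987 Section 3.1 mapping.

    Assumptions (not taken from the message above):
    - any character above U+007F is percent-encoded as UTF-8 octets,
      which approximates the ucschar/iprivate ranges;
    - a literal space is also escaped as %20, following the XSD note;
    - host names are not converted with ToASCII (Punycode).
    """
    out = []
    for ch in value:
        if ch == " " or ord(ch) > 0x7F:
            # Percent-encode each UTF-8 octet with uppercase hex digits.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            # ASCII characters, including existing %XX escapes, pass through.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(iri_to_uri_sketch("http://example.org/r\u00e9sum\u00e9 draft"))
```

Because the anyURI datatype itself performs no such check, an application that depends on the result being a legal URI would still need to validate the output against RFC 3986 and the rules of the specific scheme.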
Received on Monday, 11 January 2010 16:36:56 UTC