RE: What standards and implementations use IRIs?

Well, that's interesting, because the definition explicitly allows
RFC 3987 to be updated ("or its successor(s)"). So I wouldn't read
compatibility with the quoted XML Schema 1.1 anyURI spec
as being, by itself, a reason for not making changes
that bring the IRI spec closer to how IRIs are implemented.

I did ask what other systems *IMPLEMENT* IRIs
according to RFC 3987, in the course of discussing 
whether what section 7.2 currently calls "pre-processing"
should be folded into the definition of what 
an IRI is.

So the question remains: are there any other systems
that actually implement RFC 3987 *as it is written*
that we should take into account when we consider
changes to RFC 3987?

Larry
--
http://larry.masinter.net


-----Original Message-----
From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com] 
Sent: Monday, January 11, 2010 8:39 AM
To: Larry Masinter
Cc: Roy T. Fielding; julian.reschke@gmx.de; public-iri@w3.org
Subject: Re: What standards and implementations use IRIs?

Larry Masinter writes:

> I understand there are widespread implementations of URIs and 
> URI processing, but what other systems implement IRIs
> according to RFC 3987 terms?

In case it's of interest, the XML Schema 1.1 anyURI datatype [1] is 
(intended as) an IRI.  Among the pertinent parts of the CR specification 
are:

-----BEGIN QUOTE FROM XSD 1.1 DATATYPES -------
[Definition:]   anyURI represents an Internationalized Resource Identifier 
Reference (IRI).  An anyURI value can be absolute or relative, and may 
have an optional fragment identifier (i.e., it may be an IRI Reference). 
This type should be used when the value fulfills the role of an IRI, as 
defined in [RFC 3987] or its successor(s) in the IETF Standards Track.

[...]

The *lexical space* of anyURI is the set of finite-length sequences of 
zero or more characters (as defined in [XML]) that *match* the Char 
production from [XML].

Note: For an anyURI value to be usable in practice as an IRI, the result 
of applying to it the algorithm defined in Section 3.1 of [RFC 3987] 
should be a string which is a legal URI according to [RFC 3986]. (This is 
true at the time this document is published; if in the future [RFC 3987] 
and [RFC 3986] are replaced by other specifications in the IETF Standards 
Track, the relevant constraints will be those imposed by those successor 
specifications.)

Each URI scheme imposes specialized syntax rules for URIs in that scheme, 
including restrictions on the syntax of allowed fragment identifiers. 
Because it is impractical for processors to check that a value is a 
context-appropriate URI reference, neither the syntactic constraints 
defined by the definitions of individual schemes nor the generic syntactic 
constraints defined by [RFC 3987] and [RFC 3986] and their successors are 
part of this datatype as defined here. Applications which depend on anyURI 
values being legal according to the rules of the relevant specifications 
should make arrangements to check values against the appropriate 
definitions of IRI, URI, and specific schemes.
-----END QUOTE FROM XSD 1.1 DATATYPES -------
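
To make the first note in the quote concrete, here is a minimal Python
sketch of the core of the IRI-to-URI mapping in Section 3.1 of RFC 3987,
followed by a rough character-level check on the result. It is a
simplification under stated assumptions, not a conforming implementation:
it percent-encodes every non-ASCII character (rather than only ucschar
and iprivate), it skips the normalization step and any special handling
of ireg-name, and looks_like_uri() is only a necessary test against the
RFC 3986 character repertoire, not a full grammar check. The names
iri_to_uri and looks_like_uri are hypothetical.

```python
import re

# Assumption: a necessary (not sufficient) test. Every character must be
# an RFC 3986 unreserved or reserved character, or part of a "%XX" triplet.
_URI_CHARS = re.compile(r"(?:[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=]|%[0-9A-Fa-f]{2})*")


def iri_to_uri(iri: str) -> str:
    """Percent-encode each non-ASCII character as its UTF-8 octets."""
    out = []
    for ch in iri:
        if ord(ch) > 0x7F:
            # One "%XX" triplet per UTF-8 octet, uppercase hex digits.
            out.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
        else:
            # ASCII characters, including existing "%XX" escapes, pass through.
            out.append(ch)
    return "".join(out)


def looks_like_uri(text: str) -> bool:
    """Return True if text uses only characters RFC 3986 allows in a URI."""
    return _URI_CHARS.fullmatch(text) is not None
```

With this sketch, iri_to_uri("http://example.org/r\u00e9sum\u00e9")
yields "http://example.org/r%C3%A9sum%C3%A9", which passes
looks_like_uri(). A value containing a literal space is left unchanged by
the mapping and therefore fails the check.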

Also, there is the following note about space characters.  Before getting 
too upset about it, please note that using this type for IRIs at all is a 
"should", and there is in fact no required normative content checking 
except that the characters match the XML Char production.  So, the 
following is not a backhanded way of saying that space characters >should< 
be allowed in IRIs;  rather it is an acknowledgement that, with no 
normative prohibition, a health warning is in order. 

-----BEGIN QUOTE FROM XSD 1.1 DATATYPES -------
Note: Spaces are, in principle, allowed in the *lexical space* of anyURI, 
however, their use is highly discouraged (unless they are encoded by 
'%20').
-----END QUOTE FROM XSD 1.1 DATATYPES -------
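
As an illustration of how little is actually checked, the following
Python sketch tests a value against the Char production and separately
flags literal spaces as a warning rather than an error. It assumes the
XML 1.0 form of the Char production; the function names and the wording
of the warning are hypothetical, and this is not taken from any schema
processor.

```python
from typing import List, Tuple


def is_xml_char(ch: str) -> bool:
    """True if ch matches the XML 1.0 Char production (assumed here)."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def check_any_uri_lexical(value: str) -> Tuple[bool, List[str]]:
    """Return (in_lexical_space, warnings) for a candidate anyURI value."""
    in_lexical_space = all(is_xml_char(ch) for ch in value)
    warnings = []
    if " " in value:
        # Allowed by the lexical space, but discouraged by the note above.
        warnings.append("contains a literal space; consider encoding it as %20")
    return in_lexical_space, warnings
```

For example, check_any_uri_lexical("http://example.org/a b") reports the
value as inside the lexical space but returns one warning, while the
same value written with %20 produces no warning.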

Noah

[1] http://www.w3.org/TR/xmlschema11-2/#anyURI

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Larry Masinter <masinter@adobe.com>
Sent by: public-iri-request@w3.org
12/30/2009 02:02 PM
 
        To:     "Roy T. Fielding" <fielding@gbiv.com>
        cc:     "julian.reschke@gmx.de" <julian.reschke@gmx.de>,
                "public-iri@w3.org" <public-iri@w3.org>,
                (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        What standards and implementations use IRIs?


I think it would be helpful if we could be more explicit about
which standards and implementations of IRIs are better served
by the current normative definition of IRIs vs. a more liberal
specification, closer to what browsers, operating systems, and
common URL-parsing libraries accept and process.

What is there outside of XML's LEIRI, which is itself an extension
of what RFC 3987 allows, and URL-parsing libraries, which seem to have
parameters or options letting the caller determine which
syntax they want to process against?

I understand there are widespread implementations of URIs
and URI processing, but what other systems implement IRIs
according to RFC 3987 terms?

Larry
--
http://larry.masinter.net

Received on Tuesday, 19 January 2010 09:34:25 UTC