Re: xml:base (was Re: IRI meets RDF meets HTTP redirect) from John Cowan on 2007-04-20 (semantic-web@w3.org from April 2007)

From: John Cowan <cowan@ccil.org>
Date: Thu, 19 Apr 2007 20:22:40 -0400
To: Sandro Hawke <sandro@w3.org>
Cc: John Cowan <cowan@ccil.org>, Jeremy Carroll <jjc@hpl.hp.com>, semantic-web@w3.org, www-international@w3.org
Message-ID: <20070420002240.GM1262@mercury.ccil.org>

Sandro Hawke scripsit:

> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:foaf="http://xmlns.com/foaf/0.1/"
>          xml:base="http://www.w3.org/International/articles/idn-and-iri/JP????/????????????">
>   <rdf:Description rdf:about="http://www.w3.org/">
>     <foaf:likes rdf:resource="" />
>   </rdf:Description>
> </rdf:RDF>
> ================================================================

Unfortunately, the above showed up in my mailer with a pile of question
marks, but I'll pretend it didn't.  I'm also reversing the order of your
options for rhetorical purposes.

> ** Option 2:
> 
> This input is not well formed XML.

In no case would an erroneous value of the xml:base attribute (for
example, "%%%" would be such a value) make an XML document *not well
formed*.  It would, however, make the document not conform to the XML
Base recommendation.
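
The distinction between well-formedness and XML Base conformance can be illustrated with a minimal sketch. It assumes Python's standard xml.etree.ElementTree parser as a stand-in for any XML processor; the helper name xml_base_of and the sample document are hypothetical. The parser accepts "%%%" as an ordinary attribute value, because well-formedness says nothing about whether the value is a usable URI reference.

```python
import xml.etree.ElementTree as ET

# The "xml" prefix is permanently bound to this namespace name.
XML_NS = "http://www.w3.org/XML/1998/namespace"

def xml_base_of(document: str):
    """Parse a document and return the raw xml:base value of its root, if any.

    ET.fromstring raises ET.ParseError only when the input is not
    well-formed XML; it does not validate the attribute against XML Base.
    """
    root = ET.fromstring(document)
    return root.get(f"{{{XML_NS}}}base")

# Hypothetical sample: well-formed XML, but not a conforming xml:base value.
doc = '<doc xml:base="%%%"><child/></doc>'
print(xml_base_of(doc))
```

A separate layer that implements XML Base would be the one to reject "%%%"; the XML parser itself has no reason to.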

However, your example is *not* such a case.

> ** Option 1:
> 
> This is perfectly decent XML.

Yes, it is, and what's more it conforms to XML Base.

> It parses to this N-Triple:
> 
>    <http://www.w3.org/> <http://xmlns.com/foaf/0.1/likes> <http://www.w3.org/International/articles/idn-and-iri/JP????/????????????>.

Almost.  It parses to the N-Triple that results when
you %-escape the above non-ASCII characters.  If you
read either http://www.w3.org/2001/sw/RDFCore/ntriples/
or http://www.w3.org/TR/rdf-testcases/#ntriples , you
will find that non-ASCII characters are not permitted
in N-Triples files.  Furthermore, they are not required,
because N-Triples express relations between resources
named by URIs, not by IRIs.
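
A rough way to check a line of N-Triples output against this restriction is sketched below. The function name non_ascii_positions is hypothetical, and the check covers only the "no non-ASCII characters" rule discussed here, not the full N-Triples grammar.

```python
def non_ascii_positions(line: str) -> list[tuple[int, str]]:
    """Return (index, character) pairs for every non-ASCII character in line."""
    return [(i, ch) for i, ch in enumerate(line) if ord(ch) > 0x7F]

# Hypothetical triple whose object has already been %-escaped.
triple = "<http://www.w3.org/> <http://xmlns.com/foaf/0.1/likes> <http://example.org/r%C3%A9sum%C3%A9> ."
print(non_ascii_positions(triple))
```

An empty result means the line passes this particular check; any entry points at a character that should have been %-escaped before the triple was written out.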

I repeat:  the value of an xml:base attribute may contain
non-ASCII (and non-URI) characters.  The resulting base URI
does not; it contains their %-escaped equivalents.
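
As a sketch of that conversion, the following assumes the usual procedure of encoding each non-ASCII character as UTF-8 and writing each resulting byte as %HH. The function name escape_non_ascii and the example.org address are hypothetical, and the sketch deliberately ignores the ASCII characters (such as spaces) that XML Base also requires to be escaped.

```python
def escape_non_ascii(iri: str) -> str:
    """Percent-escape every non-ASCII character as its UTF-8 bytes (%HH)."""
    out = []
    for ch in iri:
        if ord(ch) < 128:
            # ASCII passes through untouched in this simplified sketch.
            out.append(ch)
        else:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)

# Hypothetical xml:base attribute value containing non-ASCII characters.
attribute_value = "http://example.org/résumé"
print(escape_non_ascii(attribute_value))
# prints http://example.org/r%C3%A9sum%C3%A9
```

The attribute value keeps its non-ASCII characters in the document; only the base URI derived from it carries the escaped form.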

> I'm happy with this option, and I understood Jeremy and Chris to be as
> well.  FWIW, the W3C RDF validator (using Jeremy's parser) does this.

If the validator returns that N-Triple then it is broken.

-- 
Is not a patron, my Lord [Chesterfield],        John Cowan
one who looks with unconcern on a man           http://www.ccil.org/~cowan
struggling for life in the water, and when      cowan@ccil.org
he has reached ground encumbers him with help?
        --Samuel Johnson

Received on Friday, 20 April 2007 00:22:59 UTC