Re: Question about "paths as URIs" in the BBC RDF from Ross Singer on 2010-01-28 (public-lod@w3.org from January 2010)

From: Ross Singer <rossfsinger@gmail.com>
Date: Thu, 28 Jan 2010 15:56:49 -0500
To: Dan Brickley <danbri@danbri.org>
Cc: Linked Data community <public-lod@w3.org>
Message-ID: <23b83f161001281256n170c7b5cnc191fa62ef248024@mail.gmail.com>

Thanks, Dan.  Simon Spero pointed me towards this as well (so I would
like to publicly thank him, too).

My takeaway is that both parties are doing something wrong here:

1) My parser needs to be aware of the context of the resource it is
parsing (whether that be the URI it is being retrieved from or an
explicitly set base URI) - I'm refactoring it as we speak to take this
into account.
2) The BBC should set an xml:base attribute since that would alleviate
any ambiguity should the RDF not be retrieved from their site.
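
A minimal sketch of why the retrieval context matters, using Python's
standard urllib.parse.urljoin. The relative reference below is an
assumption inferred from the rapper output quoted later in this thread
(the BBC document itself is not reproduced here), and the local file
path is hypothetical.

```python
from urllib.parse import urljoin

# Assumed relative reference, inferred from the rapper output in this thread.
RELATIVE_REF = "/music/artists/72c536dc-7137-4477-a521-567eeb840fa8#artist"

# Base URI when the document is fetched over HTTP (URL from the thread).
RETRIEVED_FROM = (
    "http://www.bbc.co.uk/music/artists/"
    "72c536dc-7137-4477-a521-567eeb840fa8.rdf"
)

# Hypothetical location of a cached copy on the local filesystem.
LOCAL_COPY = "file:///tmp/72c536dc-7137-4477-a521-567eeb840fa8.rdf"

# The same reference resolves differently depending on the base URI.
from_web = urljoin(RETRIEVED_FROM, RELATIVE_REF)
from_disk = urljoin(LOCAL_COPY, RELATIVE_REF)

print(from_web)
print(from_disk)
```

With no xml:base in the document, the only thing separating the two
results is the base URI the parser was handed.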

For the historical record (should anyone else try this and run across
this thread), Simon broke down section 5.1 of RFC 2396 thusly:

The places to get your bases are (in order):
5.1.1: explicit base in document
5.1.2: base of encapsulating document
5.1.3: URI used to retrieve document
5.1.4: guess

I had 5.1.1 and 5.1.2 (mostly) covered.  I just didn't take 5.1.3 into
account.
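
The precedence order above could be sketched roughly as follows. This
is an illustration only, not the parser discussed in this thread: the
function and parameter names (choose_base_uri, resolve, explicit_base,
and so on) are hypothetical, and the sketch ignores the case where an
xml:base value is itself relative.

```python
from typing import Optional
from urllib.parse import urljoin


def choose_base_uri(
    explicit_base: Optional[str] = None,       # 5.1.1: e.g. xml:base in the document
    encapsulating_base: Optional[str] = None,  # 5.1.2: base of an enclosing document
    retrieval_uri: Optional[str] = None,       # 5.1.3: URI the document was fetched from
    default_base: Optional[str] = None,        # 5.1.4: application-supplied fallback
) -> Optional[str]:
    """Return the first available base URI, in RFC 2396 section 5.1 order."""
    for candidate in (explicit_base, encapsulating_base, retrieval_uri, default_base):
        if candidate:
            return candidate
    return None


def resolve(reference: str, **sources: Optional[str]) -> str:
    """Resolve a (possibly relative) reference against the chosen base URI."""
    base = choose_base_uri(**sources)
    if base is None:
        raise ValueError("no base URI available to resolve %r" % reference)
    return urljoin(base, reference)


if __name__ == "__main__":
    # Only the retrieval URI is known, as with the BBC document.
    print(resolve(
        "/music/artists/72c536dc-7137-4477-a521-567eeb840fa8#artist",
        retrieval_uri="http://www.bbc.co.uk/music/artists/"
                      "72c536dc-7137-4477-a521-567eeb840fa8.rdf",
    ))
```

An explicit base, when present, wins over the retrieval URI, which is
why setting xml:base keeps a cached copy resolving to the same URIs.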

Thanks again,
-Ross.

On Thu, Jan 28, 2010 at 3:36 PM, Dan Brickley <danbri@danbri.org> wrote:
> On Thu, Jan 28, 2010 at 7:56 PM, Ross Singer <rossfsinger@gmail.com> wrote:
>> Hi, I have a question about something I've run across when trying to
>> parse the RDF coming from the BBC.  If you take a document like:
>>
>> http://www.bbc.co.uk/music/artists/72c536dc-7137-4477-a521-567eeb840fa8.rdf
>>
>> notice how all of the URIs are paths, but there's no xml:base to
>> declare where these actual paths may reside.
>>
>> If I point rapper at that URI, it brings me back fully qualified URIs:
>> <http://www.bbc.co.uk/music/artists/72c536dc-7137-4477-a521-567eeb840fa8#artist>
>>
>> but the only way I can figure it's able to do that is for the parser
>> and the HTTP agent to be in cahoots somehow, which seems like a
>> breakdown in the separation of concerns -- this document is useless,
>> except in the context of living on www.bbc.co.uk.  The moment I cache
>> it to my local system, if I'm understanding it correctly, it's now
>> asserting these things about my filesystem (effectively).  Rapper now
>> says:
>> <file:///music/artists/72c536dc-7137-4477-a521-567eeb840fa8#artist>
>>
>> So my questions would be:
>> 1) Is this "valid"?
>> 2) If so, is there an expectation of the parser being aware of the URI
>> of retrieval? (I have written my own set of parsers, so I'd need to
>> rethink this assumption, if so)
>> 3) How do other client libraries handle this?
>
> Hi Ross,
>
> The relevant specs are
>
> http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/#section-Syntax-ID-xml-base
>
> "The XML Infoset provides a base URI attribute xml:base that sets the
> base URI for resolving relative RDF URI references, otherwise the base
> URI is that of the document. The base URI applies to all RDF/XML
> attributes that deal with RDF URI references which are rdf:about,
> rdf:resource, rdf:ID and rdf:datatype."
>
> http://www.faqs.org/rfcs/rfc2396.html which specifies relative URI
> processing given a base URI.
>
> I think most of what you need is in "5.1. Establishing a Base URI" there.
>
> cheers,
>
> Dan
>

Received on Thursday, 28 January 2010 20:57:23 UTC