Re: What are proper URIs for RDF representations of real existing content from Richard Cyganiak on 2008-04-04 (public-lod@w3.org from April 2008)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Fri, 4 Apr 2008 15:51:39 +0100
To: Mark Diggory <mdiggory@MIT.EDU>
Cc: public-lod@w3.org
Message-Id: <3663F081-2CCD-460D-973E-A7F2E2DD5AD4@cyganiak.de>
On 4 Apr 2008, at 02:41, Mark Diggory wrote:
> But to take this to the point of describing an actual "file", if I
> have a file (let's say a pdf) at /path/to/my.pdf and I'm using
> content negotiation... I suppose I could have a unique rdf  
> representation for that pdf that describes it, then /path/to/my.pdf  
> would return that rdf to rdf browsers.  But what if I'm asking the  
> browser to also render the pdf? then the Accept header needs to  
> adjust to negotiate only the pdf.

Remember that the Web doesn't really have a concept of “files”, it has  
resources which can have zero or more representations. Hence if you  
start out with a bunch of files, you first have to make a decision on  
how to model the files as resources and representations.

If you use Apache to serve static files for example, then Apache will  
do that modelling for you automatically, in the very simple way where  
you end up with one resource per file, the path of the file directly  
corresponds to the resource's URI, and the resource has exactly one  
representation. That's a sensible modelling, but it's not the only  
thing possible and it's not set in stone!

So here's how you could treat this.

/path/to/my.pdf doesn't have content negotiation, it's just the PDF.
/path/to/my.rdf has the RDF version of the PDF.
/path/to/my is a “generic document” which content-negotiates to PDF or  
RDF (or perhaps also HTML).
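
A minimal sketch of how the generic document could pick a variant from
the Accept header. The variant table, the function names and the
tie-breaking rule (first listed variant wins) are assumptions for
illustration only; a real server such as Apache has its own
negotiation machinery.

```python
# Hypothetical variant table for the generic resource /path/to/my.
VARIANTS = {
    "application/pdf": "/path/to/my.pdf",
    "application/rdf+xml": "/path/to/my.rdf",
}


def parse_accept(header):
    """Return (media_range, q) pairs from an Accept header value."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when the client gives none
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def quality(media_type, ranges):
    """q the client assigns to media_type; the most specific range wins."""
    major = media_type.partition("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == media_type:
            specificity = 2
        elif media_range == major + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(accept_header, variants=VARIANTS):
    """Return the variant URI to serve, or None if nothing is acceptable."""
    ranges = parse_accept(accept_header or "*/*")
    best_uri, best_q = None, 0.0
    for media_type, uri in variants.items():
        q = quality(media_type, ranges)
        if q > best_q:  # strict comparison: first listed variant wins ties
            best_uri, best_q = uri, q
    return best_uri
```

When negotiate() returns None, the server would typically answer
406 Not Acceptable. Requests for /path/to/my.pdf and /path/to/my.rdf
never go through this step; they always return their one format.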

When you pass around links to this bunch of resources, you would  
usually pass around /path/to/my, because it's generic and provides  
access to whatever format is most appropriate for the client. But if  
you definitely want the client to see the PDF, tell him about
/path/to/my.pdf instead.

See also http://www.w3.org/2001/tag/doc/alternatives-discovery.html  
which describes this approach.
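
From the client side, the difference between the generic URI and the
format-specific one might look like the sketch below. The host
example.org and the helper build_request are hypothetical; the
requests are only constructed here, not sent.

```python
import urllib.request

BASE = "http://example.org"  # hypothetical host, not from the thread


def build_request(path, accept=None):
    """Build (but do not send) a GET request, optionally with an Accept header."""
    headers = {"Accept": accept} if accept else {}
    return urllib.request.Request(BASE + path, headers=headers)


# An RDF-aware client asks the generic document for RDF/XML.
generic_rdf = build_request("/path/to/my", accept="application/rdf+xml")

# A link that must always yield the PDF points at the specific resource,
# so no Accept header is needed.
pdf_only = build_request("/path/to/my.pdf")

# urllib.request.urlopen(generic_rdf) would send the request and follow
# any redirects by default.
```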

(Note that if the PDF and RDF contain very different information, e.g.
the PDF is a 100-page document but the RDF is just a few triples with
name, author and date, then this is not really appropriate. In that
case the two should be treated as different resources, connected via
links rather than content negotiation. Content negotiation is best
suited for the case where all the different variants, e.g. HTML and
RDF, have more or less the same information content, and it's just a
question of selecting the variant that the client can most easily
process. Content negotiation is about different formats or languages
of the same information.)

Richard


>
>
>> Serving different document formats from the same URI (content  
>> negotiation) has been a feature of the basic Web protocols for  
>> many, many years.
>
> And with that and the Semantic Web effort, seems that if OAI-ORE is  
> about being able to encode descriptions of complex composite digital  
> objects, they best account for content negotiation in their spec.
>
>>
>>> Because I feel this is a description of that resource, not a  
>>> description of a description of the resource. I'd like to be able  
>>> to say...
>>>
>>>> <rdf:RDF ... >
>>>>   <rdf:Description rdf:about="http://dspace-test.mit.edu/handle/1721.1/36383">
>>>>       <dc:creator>Abelson, Harold</dc:creator>
>>>>       <dc:creator>Zittrain, Jonathan</dc:creator>
>>>>       <ore:describes rdf:resource="http://dspace-test.mit.edu/handle/1721.1/36383#aggregation"/>
>>>>   </rdf:Description>
>>>>   <rdf:Description rdf:about="http://dspace-test.mit.edu/handle/1721.1/36383#aggregation">
>>>>       <ore:aggregates rdf:resource="...."/>
>>>>       <ore:aggregates rdf:resource="...."/>
>>>>   </rdf:Description>
>>>> </rdf:RDF>
>>
>> This looks good to me.
>>
>>> But doing this requires that the resolving tool follow 303
>>> redirects, or parse the HTML and extract the location of the RDF
>>> from there; otherwise it always resolves to the HTML rather than
>>> the RDF whenever it attempts to follow the URI.  Can anyone
>>> recommend what a best practice would be in this case?
>>
>> Not sure I understand the problem. RDF-aware tools need to send a  
>> proper Accept header anyways or they won't get any RDF out of many  
>> Semantic Web sites. And practically all Web tools follow redirects  
>> transparently unless you explicitly tell them not to.
>>
>> I would actually propose this slightly different setup:
>>
>> /handle/1721.1/36383 serves either HTML or RDF/XML, based on the  
>> Accept header (content negotiation), directly without a redirect.
>
> Yes that can be done.
>
>> /handle/1721.1/36383.html serves only HTML.
>>
>> /handle/1721.1/36383.rdf serves only RDF/XML.
>
> We currently have a special path for representations because the
> /handle/ space is rather controlled in our application...but
>
> /metadata/handle/1721.1/36383.html
> /metadata/handle/1721.1/36383.rdf
> /metadata/handle/1721.1/36383.xxx
>
> should be workable for the moment.  Eventually, hoping we do get to  
> reusing the same namespace to serve out the different  
> representations...
>
>>
>> This is the approach described here:
>> http://www.w3.org/TR/cooluris/#hashuri
>
> Thanks, it looks to be a good resource.
>
> -Mark
>
> ~~~~~~~~~~~~~
> Mark R. Diggory - DSpace Developer and Systems Manager
> MIT Libraries, Systems and Technology Services
> Massachusetts Institute of Technology
>
Received on Friday, 4 April 2008 14:52:26 UTC