Re: Expressing metadata using IRW from Nathan on 2011-01-24 (public-awwsw@w3.org from January 2011)

From: Nathan <nathan@webr3.org>
Date: Mon, 24 Jan 2011 23:05:48 +0000
To: Jonathan Rees <jar@creativecommons.org>
CC: Harry Halpin <hhalpin@w3.org>, AWWSW TF <public-awwsw@w3.org>
Message-ID: <4D3E05CC.1010106@webr3.org>
Jonathan Rees wrote:
> On Thu, Jan 20, 2011 at 5:54 PM, Harry Halpin <hhalpin@w3.org> wrote:
>> Anyways, my two cents, and I'm happy to stop by and discuss IRW at some
>> point.
> 
> Harry,
> 
> There are many documents out on the web containing assertions similar
> to the following:
> 
> @prefix xhtml: <http://www.w3.org/1999/xhtml/vocab#> .
> <http://lessig.org/blog/>
>    xhtml:license
>    <http://creativecommons.org/licenses/by/3.0/> .
> 
> The above assumes that the subject and object URIs refer to documents
> (or similar). It therefore depends on the httpRange-14 rule in order
> to be understood.
> 
> What would you suggest as an httpRange-14-free way to express the above?

Will chip in with a personal opinion:

Exactly the same way. The license (hopefully) only applies to what's 
there in the present, and copies thereof. For instance, if Lawrence 
dropped the domain and I picked it up, the license would not apply to 
my own "blog".

Anybody looking at the page, or this one:

   http://lessig.org/blog/2009/08/announcing_the_hibernation_of.html

can see that it's a blog post (augmented with loads of other stuff). 
In the RDF he clearly has statements typing that URI as a cc:Work, 
describing it fully and unambiguously, and applying the license to it.

There is no issue for man or machine here.

One could start asking if the license also applies to the "CHANGE 
CONGRESS" banner, the comments, the templates and css and so forth, 
but neither the humans nor the machines are confused here at all.

If Lawrence did want to license each comment too, then he could 
identify them by their URI, for instance:
 
http://lessig.org/blog/2009/08/announcing_the_hibernation_of.html#comment-75817

But that could just as easily be:
   http://lessig.org/blog/2009/08/comments/75817

And just as easily again, both URIs could refer to the same 
"resource", the one described, which in this case would be that comment.

Again, there is no issue here for man or machine.

Now, if Lawrence gave every comment, and the blog post, the same URI, 
and used it to describe them all, then there would be a problem, for 
Lawrence, because his data would be so messed up that it would be 
scrap and nobody would care to use it.

IMO (and his O), this is the URI of a Work, a Blog Post, as described 
by Lawrence:
   http://lessig.org/blog/2009/08/announcing_the_hibernation_of.html

If you expand the use-case, such that Lawrence offered, by conneg, 
other versions (RDF/XML, Turtle, N3, a PDF and an audio file of him 
reading the blog post aloud for the blind), then this is where Lawrence 
would hit a snag and have to give each representation its own 
URI, so that he could say "this is an audio version which is 3 
minutes long" and the like. But again, even with such a pattern all 
that's needed is to have consistent, useful, unique URIs, so he could 
easily mint:

the work:
   http://lessig.org/blog/2009/08/announcing_the_hibernation_of
an HTML page containing it:
   http://lessig.org/blog/2009/08/announcing_the_hibernation_of.html
an audio version:
   http://lessig.org/blog/2009/08/announcing_the_hibernation_of.mp4

Now, you may think this brings us right back to square one, because 
we've now said that in one context the .html URI is fine to name the 
Work, and in the other it's not. But it's still fine, because the 
first URI can name the Work, the second can name the Work in an HTML 
format (dc:hasFormat), and likewise the third.
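
A minimal sketch of that pattern in Python, with plain tuples standing 
in for RDF triples. The three URIs are the ones minted above; the 
triples, the media types and the formats_of helper are made up for 
illustration, and I'm assuming "dc:hasFormat" means the DC Terms 
property:

```python
# Hypothetical data: the URIs come from the message above, but these
# triples are illustrative and not taken from lessig.org.
DCT_HAS_FORMAT = "http://purl.org/dc/terms/hasFormat"
DCT_FORMAT = "http://purl.org/dc/terms/format"

BASE = "http://lessig.org/blog/2009/08/announcing_the_hibernation_of"
WORK = BASE             # names the Work itself
HTML = BASE + ".html"   # names the Work in HTML format
AUDIO = BASE + ".mp4"   # names the Work in audio format

triples = {
    (WORK, DCT_HAS_FORMAT, HTML),
    (WORK, DCT_HAS_FORMAT, AUDIO),
    (HTML, DCT_FORMAT, "text/html"),
    (AUDIO, DCT_FORMAT, "audio/mp4"),
}


def formats_of(work, graph):
    """Return {format URI: media type or None} for each format of `work`."""
    result = {}
    for s, p, o in graph:
        if s == work and p == DCT_HAS_FORMAT:
            media = [o2 for s2, p2, o2 in graph
                     if s2 == o and p2 == DCT_FORMAT]
            result[o] = media[0] if media else None
    return result


if __name__ == "__main__":
    for uri, media_type in sorted(formats_of(WORK, triples).items()):
        print(uri, media_type)
```

Each URI names exactly one thing, so statements like "3 minutes long" 
can be attached to the audio URI without touching the Work.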

Further, Lawrence could easily /not/ have different URIs to access 
each representation and allow the ambiguous HTTP conneg to do the job, 
simply giving back HTML or Audio or PDF based on client Accept 
headers, each GET 200, and each without its own URI in a 
Content-Location. In this case, Lawrence would have overloaded the 
URI, but could still easily mint #frag ids for each specific format, 
if he so wished, and remain unambiguous in his own data. However, this 
is overloading, is ambiguous, and would cause unexpected 
behaviour even outside the realms of the sem web, because nobody 
could click on the single URI and be guaranteed to get back the audio 
version each time, or the html version, and so forth. So we should 
advise against this case. Luckily it's not very common, and not 
really supported by any servers I know of (unless somebody has custom 
coded it this way; if so, raise it as a bug).
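
To make the Content-Location point concrete, here is a rough Python 
sketch of conneg that does advertise a URI for the chosen variant. The 
parse_accept and negotiate helpers and the VARIANTS table are 
hypothetical, the Accept parsing is deliberately simplified (no 
"text/*" ranges, no specificity ordering), and none of this describes 
how lessig.org actually behaves:

```python
BASE = "http://lessig.org/blog/2009/08/announcing_the_hibernation_of"

# Hypothetical table: media type -> URI naming that specific variant.
VARIANTS = {
    "text/html": BASE + ".html",
    "audio/mp4": BASE + ".mp4",
}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate(accept_header, variants):
    """Return (status, headers) for a GET on the generic URI."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in variants:
            chosen = media_type
        elif media_type == "*/*":
            chosen = next(iter(variants))  # first listed variant
        else:
            continue
        return 200, {
            "Content-Type": chosen,
            # Tells the client which URI names the variant it received.
            "Content-Location": variants[chosen],
            "Vary": "Accept",
        }
    return 406, {}


if __name__ == "__main__":
    print(negotiate("audio/mp4, text/html;q=0.5", VARIANTS))
```

Dropping the Content-Location header from that response gives the 
overloaded case described above: same URI, different bytes, and no 
name for what came back.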

Another example:

   http://tools.ietf.org/html/rfc2616#section-14
   http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

They both refer to the same thing, yeah? But if you conneg them both 
to RDF, one has to 200 OK and the other has to 303 See Other?

What if one used this instead (which also works)? Then the 200 would 
be okay?
   http://www.w3.org/Protocols/rfc2616/rfc2616-sec14#thing

In reality, there's no ambiguity there, and no problem for man or 
machine. RDF could use all three URIs and assert that all three 
referred to the same thing, apart from the fact that httpRange-14 
blocks it.

I find it strange saying this, as I've always been very "you MUST use 
frag URIs" at all times, but now I fail to see why. Just ensure that 
the different things you're referring to have different names.
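
That rule is easy to check mechanically. A small Python sketch, 
assuming we already have (URI, thing) pairs from somewhere; the 
overloaded_uris helper and the labels for the things are invented for 
the example:

```python
from collections import defaultdict


def overloaded_uris(naming):
    """Return {uri: sorted things} for every URI naming more than one thing."""
    things_by_uri = defaultdict(set)
    for uri, thing in naming:
        things_by_uri[uri].add(thing)
    return {uri: sorted(things)
            for uri, things in things_by_uri.items()
            if len(things) > 1}


# Several names for one thing: fine.
SECTION_14 = "RFC 2616 section 14"
good = [
    ("http://tools.ietf.org/html/rfc2616#section-14", SECTION_14),
    ("http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html", SECTION_14),
    ("http://www.w3.org/Protocols/rfc2616/rfc2616-sec14#thing", SECTION_14),
]

# One name for several things: the messed-up case.
POST = "http://lessig.org/blog/2009/08/announcing_the_hibernation_of.html"
bad = [
    (POST, "the blog post"),
    (POST, "comment 75817"),
]

if __name__ == "__main__":
    print(overloaded_uris(good))
    print(overloaded_uris(bad))
```

Many URIs for one thing pass the check; one URI for many things is 
the only pattern it flags.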

A final thought, typically we'd say:

   http://example.org/foo.html (an IR)
   http://example.org/foo.html#post (a post described by the IR)

but why not:

   http://example.org/foo.html (the post)
   http://example.org/foo.html#format (details of the format/"IR")

or:
   http://example.org/foo (the post)
   http://example.org/foo.html (the post)

or in one case:
   http://example.org/foo (the post)
   http://example.org/foo#post (the post)
   http://example.org/foo.html (the post)

and in another:

   http://example.org/foo (the post)
   http://example.org/foo.html (the html formatted post)

It all seems fine to me, where's the problem?

To return right back to the start, perhaps the proof is in the 
pudding: can any of us find a fault with Lawrence's URIs? I know I can 
understand them, and so can my tools, no ambiguity. Do any of us have 
a problem with them (they 200 OK on slashes)?

Best,

Nathan
Received on Monday, 24 January 2011 23:06:47 UTC