Re: another crack at 'information resource'

On Fri, 2010-11-19 at 09:24 -0500, Jonathan Rees wrote:
> To date the main effect of the httpRange-14 resolution has been a
> cascade of messages on various LOD lists bullying newcomers to RDF
> with taunts like "you're not a web page, are you?"  

Yes, well-intentioned but misguided taunts, IMO.

> We have the stick
> of public censure on pedantic-web and such places, but, as has been
> pointed out with some justification, no carrot, no clear practical
> benefit that ensues from adhering to the
> 2xx-means-information-resource rule. 

True, and for good reason: most applications to date have no *need* to
distinguish between the toucan and its web page.

> Furthermore the rule gets
> rightfully ridiculed for not being accompanied by any sensible
> definitions of "information resource" or "is a representation of". 

Agreed.

> It
> feels arbitrary and pointless to those who have not already bought
> into it - a sort of cult.

Agreed.

> 
> The carrot I want to offer is that there *are* valuable things that
> you can do if the rule is followed, but that you can't do in the
> intended way and with confidence if it's not. 

That sounds like an excellent goal.

> And in fact these are
> things that are already being done on the web, with no particular
> normative justification. So, we can think of the rule as being
> something that preserves and encourages a particular existing salutary
> practice.

Ok.

> 
> What I have in mind are metadata assertions of the kind that RDF (the
> 'Resource Description Framework' of the W3C 'Metadata Initiative',
> remember?  back when 'resource' meant what we now call 'information
> resource'?) was originally designed to encode, things like Dublin Core
> and BIBO and FRBR properties, or more generally any property that is
> true or false on the basis of a resource's "representations". TimBL's
> genont ontology has properties of this sort too, even though they're
> not what one would usually call metadata.

Sounds good.

> 
> We can talk in practical terms about what happens with and without the
> 2xx/IR rule. With the rule, I can use my knowledge of a resource's
> representations to make assertions about the resource. I can encode my
> knowledge of GET U/200 Z exchange patterns as RDF statements with
> subject <U>, just as the Metadata Initiative meant for me to do. I can
> write such statements with confidence regardless of whether the URI
> owner has ever heard of RDF or cares about ontology or
> "identification" or anything else. 

Sounds great.
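
To make the encoding step concrete, here is a minimal sketch in Python
of turning observed GET/200 exchanges into statements whose subject is
the URI itself.  The observation log, the example.org URIs, and the
property name :mediaTypeSoFar are all hypothetical; I use "SoFar"
because a finite set of observations cannot by itself establish
"always".

```python
from collections import defaultdict

# Hypothetical log of observed exchanges: (uri, status, media_type).
observations = [
    ("http://example.org/doc", 200, "application/rdf+xml"),
    ("http://example.org/doc", 200, "application/rdf+xml"),
    ("http://example.org/page", 200, "text/html"),
    ("http://example.org/page", 200, "application/xhtml+xml"),
    ("http://example.org/gone", 404, "text/html"),
]

def metadata_triples(observations):
    """Emit one triple per URI whose 200 responses all shared one media type."""
    seen = defaultdict(set)
    for uri, status, media_type in observations:
        if status == 200:  # only 200 responses are treated as representations
            seen[uri].add(media_type)
    return sorted(
        (f"<{uri}>", ":mediaTypeSoFar", f'"{next(iter(types))}"')
        for uri, types in seen.items()
        if len(types) == 1
    )

triples = metadata_triples(observations)
for t in triples:
    print(" ".join(t), ".")
```

Note that nothing here consults the URI owner: the statements are
derived purely from what HTTP returned, which is the point of the
quoted paragraph.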

> Without the rule, I am constantly
> in doubt - I have to look over my shoulder and ask, does this URI mean
> the thing that we observe via HTTP, or does it mean some other entity?
>  How would I even find out? Yes, you can invent answers to these
> questions, but the answers are ad hoc, complex, brittle, unreliable,
> and incompatible with current metadata practice.  The effect of
> detaching the use of the subject <U> from HTTP would be a chill, the
> injection of FUD, in our declarative treatment of what's on the web.

Agreed.

> 
> For example, suppose I know that 200 responses from URI U will always
> give me responses with media type RDF/XML. I might say so using
> something like <U> :mediaTypeAlways media:application-rdf-xml. In any
> reasonable interpretation this would contradict <U> rdf:type
> foaf:Person because it's nonsense (i.e. highly undesirable from an
> engineering viewpoint) to say that a person has a media type. 

Whoa!  Major problem here.  You are *assuming* that the class of
foaf:Persons is disjoint from the class of web pages.  If those classes
were *defined* to be disjoint, then yes there would be a contradiction.
And those classes *could* be defined as disjoint, but doing so would not
be a good idea, as it would dramatically limit their usability.
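
A minimal sketch of the point, in Python with plain tuples rather than
an RDF library.  The property :mediaTypeAlways is from the quoted
example; the class ex:WebPage, the domain declaration, and the helper
names are assumptions made up for illustration.  The check reports a
clash only when a disjointness axiom is actually supplied.

```python
# Triples are plain (subject, predicate, object) tuples; all names are illustrative.
U = "<U>"
data = {
    (U, ":mediaTypeAlways", "media:application-rdf-xml"),
    (U, "rdf:type", "foaf:Person"),
}

# Assumed schema: :mediaTypeAlways has domain ex:WebPage (hypothetical class).
domains = {":mediaTypeAlways": "ex:WebPage"}

def inferred_types(triples, domains):
    """Collect asserted rdf:type values plus those implied by property domains."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
        if p in domains:
            types.setdefault(s, set()).add(domains[p])
    return types

def clashes(triples, domains, disjoint_pairs):
    """Return (subject, class_a, class_b) for each violated disjointness axiom."""
    found = []
    for s, classes in inferred_types(triples, domains).items():
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                found.append((s, a, b))
    return found

no_axiom = clashes(data, domains, disjoint_pairs=[])
with_axiom = clashes(data, domains, disjoint_pairs=[("foaf:Person", "ex:WebPage")])
print(no_axiom)
print(with_axiom)
```

Without the axiom, <U> is simply both a foaf:Person and an ex:WebPage,
and nothing is inconsistent.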

You seem to be assuming that these classes are being used to make
absolute statements about the real world.  I think that's the wrong way
to think about it.  These statements are designed to be processed by
*applications* to perform useful tasks.  All that matters is that the
application produces the right answers.  The fact that the app confuses
the toucan with its web page makes no difference as long as it gives the
right answers.

URIs in RDF *can* be interpreted as mapping to real world things, but
that is only one of many possible RDF interpretations (in the sense of
RDF Semantics).  Applications that process RDF are not in any way
limited to interpretations that map URIs to real world things.  They may
perfectly well use interpretations that involve imaginary entities like
a thing that is part toucan, part web page.  All that matters is that
the app produces the right answers.

> If the
> Person assertion were found in the RDF delivered at U and given
> credence, we'd have a contradiction, and a perfectly good metadata
> assertion would be under siege.
> 
> If you find the Person example unconvincing, consider the more direct
> case where the fetched RDF uses <U> to designate an "information
> resource" that is observably different from the one that has the RDF
> representation, e.g. <U> :mediaTypeAlways media:text-html.  Since
> :mediaTypeAlways is functional this would be a contradiction not
> requiring a judgment of nonsensicalness. (This is pretty much the same
> as the Communist Manifesto vs. Wikipedia article example I gave
> before.)

Yes, this sounds feasible.
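
As a rough sketch of how that case could be detected mechanically
(Python, plain tuples, helper name hypothetical): group the values of
each functional property by subject and flag any subject with more
than one.  This assumes the two media-type terms are known to denote
different things; without that assumption, a functional property
would merely imply that the two terms are the same.

```python
def functional_violations(triples, functional_props):
    """Map (subject, property) to its values when a functional property has several."""
    seen = {}
    for s, p, o in triples:
        if p in functional_props:
            seen.setdefault((s, p), set()).add(o)
    return {key: sorted(vals) for key, vals in seen.items() if len(vals) > 1}

triples = [
    ("<U>", ":mediaTypeAlways", "media:application-rdf-xml"),  # from observation
    ("<U>", ":mediaTypeAlways", "media:text-html"),            # from the RDF at U
    ("<U>", "dc:title", "Some title"),                         # not functional here
]

violations = functional_violations(triples, {":mediaTypeAlways"})
print(violations)
```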

> 
> If you find media type unconvincing, it can be replaced by any other
> similar property such as dc:creator or dc:title.
> 
> Note I'm not presupposing any particular meaning for "information
> resource" or "representation of" but rather would try to figure out
> what definitions these words would need to have in order to make this
> kind of use case work.
> 
> The 200/IR rule is not logically necessary. There are other possible
> architectures. E.g. we could have - and I think we probably should
> have - a property "isLocatedAt" that connects an IR to a URI where
> it's deployed, 

I think the log:uri relationship is already adequate for this.

> and instead of saying <U> we could say [:isLocatedAt
> "U"], the IR located at U, and use that as the subject in metadata
> assertions. But this is not how people currently write DC and BIBO and
> genont (etc), and it is so awful that no one ever would. We've already
> gone down the 200/IR path; I think it's just a matter of publicizing
> the reason why. It's not that the rule necessarily benefits the URI
> owner. It's because general respect for the rule benefits those who
> are doing metadata curation, and those who would make use of such
> metadata.
> 
> I'm looking for general encouragement or other feedback, not of the
> details but of the general approach.

I think it has some merit.

thanks,
David


> 
> Jonathan
> 
> On Wed, Nov 17, 2010 at 4:51 PM, David Booth <david@dbooth.org> wrote:
> > On Thu, 2010-11-11 at 17:23 -0500, Jonathan Rees wrote:
> >> [ . . . ] could enumerate lots of properties ("content properties") that I
> >> think should follow the  universal quantification rule - e.g. most or
> >> all of the DC and BIBO properties - but (in light of the negations of
> >> these properties) I don't know how to enable someone else to
> >> generalize to additional properties, i.e. what the boundaries of the
> >> meta-category of content properties are.
> >>
> >> Before spending a lot of time trying to figure that out, though, I
> >> want to convince at least one other person that this idea (universal
> >> quantification to extend representation properties to IR properties)
> >> has promise as a possible way to motivate the httpRange-14 rule.
> >
> > I don't follow this at all.  How would it motivate the httpRange-14
> > rule?

-- 
David Booth, Ph.D.
Cleveland Clinic (contractor)
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of Cleveland Clinic.

Received on Friday, 19 November 2010 15:24:59 UTC