Re: Issue-57

On Tue, 2011-06-14 at 05:05 +0100, Alan Ruttenberg wrote:
> On Mon, Jun 13, 2011 at 9:50 PM, David Booth <david@dbooth.org> wrote:
> > I do not think that is a fair characterization.  Richard's example is
> > *not* opting out of machine inference.  It is merely opting out of
> > certain inferences that *some* applications need but others do *not*
> > need.  And that is as it *should* be, as it is not possible to cater to
> > *all* applications.
> >
> > The subtle mistake that is being made repeatedly here is in assuming
> > that someone's data is *wrong* (or socially irresponsible) if it
> > conflates two things that we humans find useful to distinguish, such as
> > people versus web pages -- *even* if the class of applications for which
> > that data is intended have no need to make such a distinction!
> 
> Pat has it right:
> 
> On Tue, Jun 14, 2011 at 4:33 AM, Pat Hayes <phayes@ihmc.us> wrote:
> > Bear in mind that the very first principle of the Web is that the
> > *publisher* of the data, who asserts these things about dogs or
> > pictures of dogs, cannot possibly know what 'context of use' is
> > going to be relevant to the *user* of the published content
> 
> http://lists.w3.org/Archives/Public/public-lod/2011Jun/0199.html

I agree with the above comment: data publishers cannot know how their
data will be used.  However, that is *not* the same as saying that their
data must be usable in all possible applications.  Any given dataset
supports a particular class of applications and will be unsuitable for
others.  For example, a dataset that models the world as flat may be
fine for computing driving directions but would be unsuitable for
aircraft applications.

> 
> David, we are not aiming for application developers to use the web as
> scratchpad instead of a relational database with the mind of then
> sucking it back in for their proprietary application. We don't *need*
> the web for that. The idea that data publishers should have in mind
> exactly how their data is supposed to be used, and then choose to use
> public vocabulary however they feel like it is just broken. It is
> missing the point.

I agree, and that is *not* what I am advocating or condoning.  Except
for the rare case of community expropriation, I think URIs should be
used strictly in accordance with their URI declarations (i.e., the URI
owner's published definitions):
http://dbooth.org/2009/lifecycle/#event2
[[
An RDF statement author has a choice about whether to use a given URI in
a statement. The guiding principle is:

  Statement author responsibility 3: Use of a URI implies agreement 
  with the core assertions of its URI declaration.

Hence, the statement author is responsible for ensuring that he/she does
indeed agree with those assertions and must NOT use the URI if he/she
does not agree.
]]

HOWEVER, a URI declaration or definition cannot remove all possible
ambiguity, no matter how precise or well considered it is.  All it can
do is to *bound* the ambiguity.  For *some* applications, the ambiguity
will be bounded enough that the URI appears unambiguous, whereas for
other applications requiring finer distinctions, that same URI will be
hopelessly ambiguous.

There certainly *are* cases where people are just plain sloppy or
erroneous in their data or definitions.  But one should not assume that
ambiguity is *necessarily* the result of sloppiness or error.  And in
the schema.org case, it appears to have been a conscious choice to avoid
additional complexity -- complexity that may have hindered their target
applications, even if it would have helped other applications.


-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.

Received on Tuesday, 14 June 2011 20:42:18 UTC