Re: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...] from William Waites on 2011-06-14 (public-lod@w3.org from June 2011)

From: William Waites <ww@styx.org>
Date: Tue, 14 Jun 2011 10:54:36 +0200
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>
Message-ID: <20110614085436.GX42832@styx.org>

* [2011-06-13 20:33:47 -0700] Pat Hayes <phayes@ihmc.us> wrote:

] > So there is some relationship between a description of the Eiffel
] > tower and the tower itself. The relationship is akin to similarity in
] > a very specific way - they are similar enough that someone thought it
] > made sense to write down that the tower was 356m tall.
] 
] What has that got to do with the tower being similar to its description? 

Simply that they are similar enough (in the relevant respects etc)
that one can write ":eiffel :height 324" for either and (reasonably?)
expect the reader not to be confused.

] First, you seem to be assuming here that the tower and its description
] are NOT similar, contrary to what you said earlier and Danny seems to 
] be insisting upon. Second, this hypothetical person is, we both agree,
] confused. They made a mistake, what they said was wrong. Correct? I ask,
] because many people seem to want to say that they were NOT confused or
] wrong, just kind of less correct than if they used the right URI. 

Confused or speaking loosely, not bothering to make the distinction
because it seems to them that they are being clear enough that any
reader will understand what they mean. If you call them on it they
will probably agree that, yes, "what I really meant was ... but to
have written that out would have seemed excessively pedantic" in
exactly the same way that I wasn't confused when I wrote "confused"
but I admit to being inexact :)

So I agree with these many people who want to say that there are a lot
of inexact statements that are made not by confused people, just by
people with perhaps unreasonably high expectations that the readers of
their statements will be able to figure out what they meant, if not
strictly what they said.

] Third, and most important, anyone interested is unlikely to be confused,
] yes indeed. But any piece of software or inference engine is not
] unlikely to be confused. 

So this is the mismatch. Publishers write things down with some
assumptions of what is likely to cause confusion that are probably
based largely on their interactions with other humans, not with
inference engines.

Writing things down exactly is incredibly difficult. A large part of
almost every discussion or disagreement comes down to someone
understanding what was said differently from what the person who said
it meant. It can often take a lot of discussion before this becomes
apparent. And that's between humans!

So we want to get people to publish linked or structured data that is
as exact as possible. Each step in that direction is a little bit more
burdensome for the publisher, feels a little bit more pedantic and
verbose to write down, and means the publisher needs to know a little
more about the kinds of things a reader can handle. At the same time,
each step makes it easier to write software that can use the data
with the simpler and more general algorithms that we know.

Some people seem to be saying that range-14 is a step too far. Other
people seem to be saying that without that step it's impossible to
write software in a general way to work with the data. If both are
correct then we're stuck.

The perception of RDF as complicated, verbose and pedantic is common
and is something we cannot afford. Personally I don't think the range-14
arrangement is too burdensome but outside this community this is a
minority viewpoint. We cannot throw up extra barriers to publishers.
So we need better software that can handle this kind of inexact data.

] When you are the agent who is using this information, sure. But when
] you are the one publishing it or asserting it, you cannot do this.
] And when you are the one writing the rules to determine a globally
] accepted notion of entailment, you cannot do it.

Publishers will always make assumptions about how the information
will be used. The assumptions will usually not be explicit. Even
humans don't have a globally accepted notion of entailment; it's
all about context and intent on the part of the agent doing the
reasoning. That agent will just have to deal with the fact that the
publisher may not have anticipated their use.

Since range-14 seems to be a sticking point, we can try to address
that particular kind of ambiguity with guidance about how to reason
about information and non-information resources. This guidance
won't be general; it will have to do with particular classes and
predicates and how they should be interpreted in the local (graph)
context.
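
To make that a bit more concrete, here is a rough sketch of what
per-predicate guidance might look like to a consumer. Everything in
it is an assumption made up for illustration: the PREDICATE_KIND
table, the :modified predicate, the :eiffel_doc name and the
interpret() helper are hypothetical, not anything proposed in this
thread or defined by any vocabulary.

```python
# Hypothetical guidance table: which kind of resource each predicate is
# normally meant to describe. The entries are illustrative assumptions.
PREDICATE_KIND = {
    ":height": "thing",        # physical things have heights
    ":modified": "document",   # documents have modification dates
}


def interpret(triples, document_of):
    """Re-read loosely written triples using the guidance table.

    triples     -- iterable of (subject, predicate, object) strings
    document_of -- maps a thing's name to the name of the document
                   that describes it (the local graph context)
    """
    thing_of = {doc: thing for thing, doc in document_of.items()}
    result = []
    for s, p, o in triples:
        kind = PREDICATE_KIND.get(p)
        if kind == "thing" and s in thing_of:
            # Said of the document, but the predicate is about things:
            # move the statement to the thing the document describes.
            s = thing_of[s]
        elif kind == "document" and s in document_of:
            # Said of the thing, but the predicate is about documents:
            # move the statement to the describing document.
            s = document_of[s]
        # Predicates without guidance are passed through untouched.
        result.append((s, p, o))
    return result


loose = [
    (":eiffel_doc", ":height", "324"),
    (":eiffel", ":modified", "2011-06-14"),
]
fixed = interpret(loose, {":eiffel": ":eiffel_doc"})
```

The point of the sketch is only that the repair is local: it depends
on a small table for particular predicates and on knowing which
document describes which thing in the graph at hand, not on a global
rule about all URIs.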

] Well, now you are stepping into an ocean of cans of worms. 

Oh, well aware of that :)

Cheers,
-w

-- 
William Waites                <mailto:ww@styx.org>
http://river.styx.org/ww/        <sip:ww@styx.org>
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45

Received on Tuesday, 14 June 2011 08:55:10 UTC