Fwd: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...] from Danny Ayers on 2011-06-19 (public-lod@w3.org from June 2011)

From: Danny Ayers <danny.ayers@gmail.com>
Date: Sun, 19 Jun 2011 20:22:30 +0200
To: public-lod@w3.org
Message-ID: <BANLkTinFpLJ7X+iVYvtd=kcKLq4frBAU5w@mail.gmail.com>
I feel very guilty being in threads like this. Shit fuck smarter people than
me.

Can we now close this trench down and move elsewhere?

Forwarded conversation
Subject: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]
------------------------

From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 12 June 2011 14:40
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


On 12 June 2011 01:51, Pat Hayes <phayes@ihmc.us> wrote:
>
> On Jun 11, 2011, at 12:20 PM, Richard Cyganiak wrote:
>
>> ...
>>>> It's just that the schema.org designers don't seem to care much about
>>>> the distinction between information resources and angels and pinheads. This
>>>> is the prevalent attitude outside of this mailing list and we should come to
>>>> terms with this.
>>>
>>> I think we should foster a greater level of respect for representation
>>> choices here. Your dismissal of the distinction between information
>>> resources and what they are about insults the efforts of many
>>> researchers and practitioners and their efforts in domains where such
>>> a distinction is quite important. Let's try not to alienate part of
>>> this community in order to interoperate with another.
>>
>> Look, Alan. I've wasted eight years arguing about that shit and defending
>> httpRange-14, and I'm sick and tired of it. Google, Yahoo, Bing, Facebook,
>> Freebase and the New York Times are violating httpRange-14. I consider that
>> battle lost. I recanted. I've come to embrace agnosticism and I am not
>> planning to waste any more time discussing these issues.
>
>
> Well, I am sympathetic to not defending HTTP-range-14 and nobody ever,
> ever again even mentioning "information resource", but I don't think we can
> just make this go away by ignoring it. What do we say when a URI is used
> both to retrieve, um sorry, identify, a Web page but is also used to refer
> to something which is quite definitely not a web page? What do we say when
> the range of a property is supposed to be, say, people, but it's considered
> OK to insert a string to stand in place of the person? In the first case we
> can just say that identifying and reference are distinct, and that one
> expects the web page to provide information about the referent, which is a
> nice comfortable doctrine but has some holes in it. (Chiefly, how then do we
> actually refer to a web page?) But the second is more serious, seems to me,
> as it violates the basic semantic model underlying all of RDF through OWL
> and beyond. Maybe we need to re-think this model, but if so then we really
> ought to be doing that re-thinking in the RDF WG right now, surely? Just
> declaring an impatient agnosticism and refusing to discuss these issues does
> not get things actually fixed here.

For pragmatic reasons I'm inclined towards Richard's pov, but it would
be nice for the model to make sense.

Pat, how does this sound:

From HTTP we get the notions of resources and representations. The
resource is the conceptual entity, the representations are concrete
expressions of the resource. So take a photo of my dog -

<http://example.org/sasha-photo> foaf:depicts <http://example.org/Sasha> .

If we deref http://example.org/sasha-photo then we would expect to get
a bunch of bits that can be displayed as an image.

But that bunch of bits may be returned with HTTP header -

Content-Type: image/jpeg

or

Content-Type: image/gif

Which, for convenience, let's say correspond to files on the server
called sasha-photo.jpg and sasha-photo.gif

Aside from containing a different bunch of bits because of the
encoding, sasha-photo.jpg could be a lossy-compressed version of
sasha-photo.gif, containing less pixel information yet sharing many
characteristics.

All ok so far..?

If so, from this we can determine that a representation of a resource
need not be "complete" in terms of the information it contains to
fulfill the RDF statement and the HTTP contract.
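
A minimal sketch of the point so far, assuming a hypothetical in-memory table in place of a real server (the byte strings and the negotiate helper are invented for illustration): one URI, two representations, and the client's media type preference decides which bunch of bits comes back.

```python
# Hypothetical stand-in for a server: one resource, two representations.
REPRESENTATIONS = {
    "http://example.org/sasha-photo": {
        "image/jpeg": b"<lossy jpeg bits>",
        "image/gif": b"<gif bits>",
    },
}


def negotiate(uri, accept):
    """Return (content_type, body) for the first acceptable media type, or None."""
    available = REPRESENTATIONS.get(uri, {})
    for media_type in accept:
        if media_type in available:
            return media_type, available[media_type]
    return None


jpeg = negotiate("http://example.org/sasha-photo", ["image/jpeg"])
gif = negotiate("http://example.org/sasha-photo", ["image/gif", "image/jpeg"])
html = negotiate("http://example.org/sasha-photo", ["text/html"])
```

Both jpeg and gif are answers for the same URI even though their bodies differ, which is all the argument above relies on.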

Now turning to http://example.org/Sasha, what happens if we deref that?

Sasha isn't an information resource, so following HTTP-range-14 we
would expect a redirect to (say) a text/html description of Sasha.

But what if we just got a 200 OK and some bits Content-Type: text/html ?

We are told by this that we have a representation of my dog, but from
the above, is there any reason to assume it's a complete
representation?

The information would presumably be a description, but is it such a
leap to say that because this shares many characteristics with my dog
(there will be some isomorphism between a thing and a description of a
thing, right?) that this is a legitimate, however partial,
representation?

In other words, what we are seeing of my dog with -

Content-Type: text/html.

is just a very lossy version of her representation as -

Content-Type: physical-matter/dog

Does that make (enough) sense?

Cheers,
Danny.




--
http://danny.ayers.name

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 12 June 2011 16:29
To: public-lod@w3.org


Danny,

Quite a long route to saying:

You can use hyperlinks to Name Observation Subjects.

Observation Subjects have Representations at an Address.

Actual format of Observation Subject Representation is negotiable.

The brevity challenge is a function of using hyperlinks as Names since WWW
users are only accustomed to their use as Resource Locators or Addresses
(URLs).

Graph Models for describing Observation Subjects have made sense for a long
time, pre WWW. It is only when we try to state or infer that this is an RDF
(syntax for expressing semantics) invention that all hell breaks loose, and
justifiably so.

-- 

Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen








----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 12 June 2011 19:19
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


Well, I am too. That is, I would love for this whole issue/problem to just
go away. But I don't think ignoring it will make it go away.

OK, so far. I would just note that (coming from a different, non-HTTP,
tradition) I would never have even dreamt of any representation being
"complete" in what I think is the sense you mean. So your care and emphasis
here seem odd. But OK, I am following you...

Really? I thought that HTTP-range-14 just said that if we get redirected,
all bets are off, and the URI might denote anything at all, so the thing
that gets returned might have nothing to do with the referent.

Then (again, according to doctrine) the URI denotes the information resource
which this is the HTTP-representation of. Which evidently is not Sasha.

No, but what has that got to do with anything? The key issue is that we are
told that it is an information resource and hence we know it is not a dog.
So we know, for example, that if someone asserts that some other dog is its
father, or that it had its vet shots in February, or that it is an instance
of http://sw.opencyc.org/concept/Mx4rvVjaoJwpEbGdrcN5Y29ycA , then (if we
are smart) something is wrong here, or else (if we are less smart) that
something on the Web has these properties.

Now, we could try this line, which I think is what you are suggesting. We
could say that all such 'information resources' are being used as stand-ins
for referential names themselves, i.e. they are not things (like dogs, say)
but should always be understood as referring to some other thing. There are
some technical problems with this, but I'm sure we could work around them;
but the serious problem with this idea is, that it makes it impossible to
simply refer to these information resources themselves. So we would be
unable to talk about Web pages using the Web description language RDF.
Frankly, this would not bother me personally very much, as I am not
particularly interested in describing Web pages in RDF, but I know it would
bother some other people (Tim B-L, for just one) rather a lot.

What??

Absolutely not. Descriptions are not in any way isomorphic to the things
they describe. (OK, some 'diagrammatic' representations can be claimed to
be, eg in cartography, but even those cases don't stand up to careful
analysis, in fact.)

It is a representation, sure. The question is, what is it a representation
OF? A lossy image of a lossy image of X is itself a (very) lossy image of X.
But the name of a name of X is not a name of X; and a (descriptive)
representation of a representation of X is not a representation of X. For
example, "written clumsily and with many spelling errors" describes "Ee were
real gude at mafematiks at skool", which in turn describes me; but I am not,
myself, composed of spelling errors. Reference is not transitive, in a
nutshell.

Nope, absolutely not. Reference is not like lossy imaging.

Nice try, but no cigar. Want to try again? Seriously, it is not easy to find
a coherent way to allow what one might call reference slippage - using a
name or description to stand in for the actual thing named - without the
whole semantic framework just basically collapsing**. I know we humans do it
all the time without hardly noticing, and I REALLY wish that I or someone
could figure out how to capture this facility in a formal scheme of some
kind. But I can't see how to do it.

Pat

** To illustrate. Someone goes to a website about dogs, likes one of the
dogs, and buys it on-line. He goes to collect the dog, the shopkeeper gives
him a photograph of the dog. Um, Where is the dog? Right there, says the
seller, pointing to the photograph. That isn't good enough. The seller
mutters a bit, goes into the back room, comes back with a much larger,
crisper, glossier picture, says, is that enough of the dog for you? But the
customer still isn't satisfied. The seller finds a flash card with an
hour-long HD movie of the dog, and even offers, if the customer is willing
to wait a week or two, to have a short novel written by a well-known author
entirely about the dog. But the customer still isn't happy. The seller is at
his wits' end, because he just doesn't know how to satisfy this customer.
What else can I do? He asks. I don't have any better representations of the
dog than these. So the customer says, look, I want the *actual dog*, not a
representation of a dog. It's not a matter of getting me more information
about the dog; I want the actual, smelly animal. And the seller says, what
do you mean, an "actual dog"? We just deal in **representations** of dogs.
There's no such thing as an actual dog. Surely you knew that when you looked
at our website?
------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes






----------
From: *Alan Ruttenberg* <alanruttenberg@gmail.com>
Date: 12 June 2011 19:53
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <
richard@cyganiak.de>, Linked Data community <public-lod@w3.org>, Michael
Hausenblas <michael.hausenblas@deri.org>


That seems too strong.

Just thinking about this alternative - that 200 responders (for the
purposes of linked data) are not considered IRs.
Instead 200 implies an assertion (for, say, http://www.ihmc.us/users/phayes/)

_:foo a :information-thing
_:foo :at "http://www.ihmc.us/users/phayes/"^^xsd:anyURI

(there exists an information resource accessible at
http://www.ihmc.us/users/phayes/)

to which could then be asserted in your favored syntax:

_:page a :web-page
_:page :at "http://www.ihmc.us/users/phayes/"^^xsd:anyURI
_:page dc:creator <http://www.ihmc.us/users/phayes/>

This effectively flips what is now the default (you would use, e.g.
foaf:primaryTopic to go in the opposite direction)

Not that I'm advocating this. For one thing there are many information
things that couldn't possibly be understood as designators. (well,
shouldn't ;-)

-Alan

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 13 June 2011 01:13
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


Beh! Some isomorphism is all I ask for. Take your height and shoe size
- those numeric descriptions will correspond 1:1 with aspects of the
reality. Keep going to a waxwork model of you, the path you walked in
the park this afternoon - are you suggesting there's no isomorphism?

Lovely imagery, thanks Pat.

But replace "a novel written by a dog" for "dog" in the above. Why
should the concept of a document be fundamentally any different from
the concept of a dog, hence representations of a document and
representations of a dog? Ok, you can squeeze something over the wire
that represents  "a novel written by a dog" but you (probably) can't
squeeze a "dog" over, but that's just a limitation of the protocol.
There's equally an *actual* document (as a bunch of bits) and an
*actual* dog (as a bunch of cells).

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 13 June 2011 02:28
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


Yes, in fact I am *denying* there is *any* isomorphism. What structures are
you intending to appeal to when you say 'isomorphic'? Do you see reality as
being some kind of giant category? Or what?

Let's suppose that the interpretation/denotation/semantic/reference mapping
goes from the representation to the reality. (Since it's an isomorphism, it
should be invertible, so this is an arbitrary choice, right?) Call this
mapping ref, so X ref Y means that Y is one way reality might be assuming X
is true, when X is used as a representation. First point: for descriptions,
ref is a Galois mapping, which means that when X gets larger - when the
representation says more about the reality - then Y, the number of ways that
the reality can be, gets smaller. The more you say, the more tightly you
constrain the ways the world can be. This is exactly the opposite from how
an isomorphism would behave.
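
A toy illustration of that inverse relationship, assuming an invented universe of possible worlds (the attributes, values and the models helper are hypothetical and only serve to make the counting visible): each extra statement in a description removes worlds from the set that satisfies it.

```python
from itertools import product

# Hypothetical universe: every combination of three made-up attributes.
WORLDS = [
    {"height": h, "shoe": s, "has_cat": c}
    for h, s, c in product((170, 180, 190), (42, 44), (True, False))
]


def models(description):
    """Return the worlds in which every statement of the description holds."""
    return [w for w in WORLDS if all(w[k] == v for k, v in description.items())]


says_nothing = {}
says_little = {"height": 180}
says_more = {"height": 180, "shoe": 44}

counts = [len(models(d)) for d in (says_nothing, says_little, says_more)]
```

Here counts shrinks as the description grows, which is the opposite of what a structure-preserving one-to-one mapping would give.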

Next point: there can indeed be correspondences between the syntactic
structure of a description and the aspects of reality it describes. Your
example of the path I walked would be one, if you were to draw the path on
an accurate map. But this is completely hostage to the map being
**accurate**. If I used a not-to-scale sketch map, then no, you don't get
isomorphism. Yet it seems to me that these two cases, the real map and a
sketch map, both seem to work in the same kind of semantic way. So this
explanation of how they work cannot depend on there being an isomorphism.
Maybe there is a kind of homomorphism, but even that is kind of hard to make
work. What it seems to be is more like, the map projection function is a
homomorphism of the entire mapped terrain, and then marks or symbols on the
map indicate terrain location by inverting this projection morphism and
asserting an existential to the effect that the thing described is contained
in that back-projected space in the terrain from space occupied by the mark
or symbol in the map space.

But I don't think all this is really germane to the http-range-14 issue. The
point there is, does the URI refer to something like a representation
(information resource, website, document, RDF graph, whatever) or something
which definitely canNOT be sent over a wire?

I don't follow your point here. If you mean, a document is just as real as a
dog, I agree. So? But if you mean, there is no basic difference between a
document and a dog, I disagree. And so does my cat.

So improved software engineering will enable us to teleport dogs over the
internet? Come on, you don't actually believe this.

Pat

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 13 June 2011 03:46
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


That is what I was calling isomorphism (which I still don't think was
inaccurate). But ok, say there are correspondences instead. I would
suggest that those correspondences are enough to allow the description
to take the place of a representation under HTTP definitions.

I'm saying conceptually it doesn't matter if you can put it over the
wire or not.

Difference sure, but not necessarily relevant.

It would save a lot of effort sometimes (walkies!) but all I'm
suggesting is that if, hypothetically, you could teleport matter over
the internet, all you'd be looking at as far as http-range-14 is
concerned is another media type. Working back from there, and given
correspondences as above, a descriptive document can be a valid
representation of the identified resource even if it happens to be an
actual thing, given that there isn't necessarily any "one true"
representation. We don't need the Information Resource distinction
here (useful elsewhere maybe).

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 13 June 2011 07:52
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


OK, I am now completely and utterly lost. I have no idea what you are saying
or how any of it is relevant to the http-range-14 issue. Want to try running
it past me again? Bear in mind that I do not accept your claim that a
description of something is in any useful sense isomorphic to the thing it
describes. As in, some RDF describing, say, the Eiffel tower is not in any
way isomorphic to the actual tower. (I also do not understand why you think
this claim matters, by the way.)

Perhaps we are understanding the meaning of http-range-14 differently. My
understanding of it is as follows: if an HTTP GET applied to a bare URI
http:x returns a 200 response, then http:x is understood to refer to (to be
a name for, to denote) the resource that emitted the response. Hence, it
follows that if a URI is intended to refer to something else, it has to emit
a different response, and a 303 redirect is appropriate. It also follows
that in the 200 case, the thing denoted has to be the kind of thing that can
possibly emit an HTTP response, thereby excluding a whole lot of things,
such as dogs, from being the referent in such cases.
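
The reading above can be sketched as a small decision function. This is only an illustration of the understanding stated in this message, not a normative statement of the rule; the function names and return labels are invented here.

```python
def referent_kind(status):
    """Map the status of an HTTP GET on a bare URI to what the URI may denote."""
    if status == 200:
        # The URI names the resource that emitted the response.
        return "information resource"
    if status == 303:
        # Redirected: the URI may refer to something else entirely.
        return "anything"
    # Other status codes are outside the reading sketched here.
    return "unspecified"


def can_denote_dog(status):
    """A dog cannot emit an HTTP response, so a 200 rules it out as referent."""
    return referent_kind(status) != "information resource"
```

For example, can_denote_dog(200) is False while can_denote_dog(303) is True.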

Pat

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 13 June 2011 10:16
To: public-lod@w3.org


The Referent of a URI re. http-range-14 is the observation (or description)
subject. In this context the subject may or may not be a real world object
or entity.

In the context of Linked Data, the observation (or description) subject URI
resolves to a Representation of its Referent. Actual representation is
accessible via an Address. Data representation formats are *optionally*
negotiable e.g., via content negotiation, and ultimately varied i.e., many
serialization formats for the byte stream that actually transmits data from its
source to its consumers.

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 13 June 2011 10:25
To: public-lod@w3.org


No, 200 OK means this URI is functionally an Address i.e., a place that's
ready to transmit the byte stream associated with the Address. When the
functionality of the URI changes i.e., it's a Name rather than an Address,
courtesy of de-reference (indirection), there is a 303 redirect (an act of
indirection). Yes, a data server indicates to a client that a given Address
is functional i.e., I'll transmit you a byte stream from this place which I
crafted for this specific purpose. Yes, if the response is 200 OK since the
URI is an Address. No if the response is a 303 since the URI is a Name.

It still boils down to the URI abstraction which ingeniously caters for two
vital data access by reference operations: Name (for de-reference and
indirection) and Address (for Data Access).


Kingsley

----------
From: *Christopher Gutteridge* <cjg@ecs.soton.ac.uk>
Date: 13 June 2011 10:59
To: public-lod@w3.org
Cc: public-lod@w3.org


Before I comment, I just want to summarise my understanding because
http-range-14 is a weird term;

I understand it as the range-14 issue that when you use 302 to redirect from
a URI-A to a URL-B we have a convention that URL-B has some relationship to
URI-A but it's not defined, we don't treat this as semantic information and
tend to throw it away.
(stated to make sure I've understood correctly)

This bit a chap working with some of my data;
* he loaded some data from <URI-A> using a library
* URI-A did a nice content-negotiated 302 to URL-B (an RDF document)
* URL-B had a description of <URI-A>
* The problem was he also wanted to auto extract the license for this data,
but the triples gave the license as a relation to <URL-B>, but the system
treated the data as loaded from <URI-A>

At the most simple level, we could add some triples when loading a graph via
redirection...
<URI-A> myprefix:http302redirect <URL-B>
or something richer with dates, http options etc.
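
A rough sketch of that idea, with a hypothetical redirect table and document store standing in for real HTTP (the ex: predicates, the license value and both helper functions are made up; myprefix:http302redirect is the placeholder predicate from above):

```python
# Hypothetical stand-ins for the network: one 302 and one RDF document.
REDIRECTS = {"URI-A": "URL-B"}
DOCUMENTS = {
    "URL-B": [
        ("URI-A", "ex:name", "Scooby"),
        ("URL-B", "ex:license", "ex:SomeLicense"),
    ],
}


def load_graph(uri):
    """Follow redirects from uri, recording one triple per hop, then load the data."""
    triples = []
    current = uri
    visited = set()
    while current in REDIRECTS and current not in visited:
        visited.add(current)
        target = REDIRECTS[current]
        triples.append((current, "myprefix:http302redirect", target))
        current = target
    triples.extend(DOCUMENTS.get(current, []))
    return triples


def license_for(uri, triples):
    """Find the license of the document that uri redirected to, if recorded."""
    documents = {
        o for s, p, o in triples
        if s == uri and p == "myprefix:http302redirect"
    }
    for s, p, o in triples:
        if s in documents and p == "ex:license":
            return o
    return None


graph = load_graph("URI-A")
found = license_for("URI-A", graph)
```

With the redirect kept as a triple, the license attached to URL-B can be reached again starting from URI-A.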

You could do something even fussier with http headers stating an explicit
relationship with the 302, but all of this is very nice but the main problem
seems to be that it's hard and doesn't benefit someone who just wants to
knock something up quickly.

The real problem seems to me that making resolvable, HTTP URIs for real
world things was a clever but dirty hack and does not make any semantic
sense. We should use thing://data.totl.net/scooby to refer to the dog and
have a convention that http://data.totl.net/scooby will refer to some
content about my dog. This URL can of course then content negotiate as
normal. You could also use this in reverse. thing://www.imdb.com/title/tt0910554/
is the primary topic of http://www.imdb.com/title/tt0910554/

Yes, you could end up with a whole bunch of URIs for the same thing;
thing://data.totl.net/scooby thing://data.totl.net/scooby.html
thing://data.totl.net/scooby.pdf thing://data.totl.net/scooby.csv all are the same
thing, but big deal.

The only tricky thing would be people may get confused about the "thing" URI
related to a document. For example, given a document in pdf, word and html,
you might need a separate thing:// URI to describe the abstract concept of
the document, but that's not the primary topic of any of the documents. Such
fiddling details are more the province of people with experience, so I'm not
too worried. What we should be doing is making the common garden data really
easy to produce.

I've spent a lot of time trying to teach these concepts to people at
hackdays & barcamps, plus in a professional context. http:// URIs for real
world things clearly make it harder to learn. The follow-your-nose gimmick is
cool, but we could do that with a changed convention, and a trivial update to
existing libraries (just resolve thing:// via http://)
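
As a sketch of how small that library change might be, assuming the proposed thing:// scheme (which is only a suggestion in this mail, not an existing standard) and two hypothetical helper names:

```python
THING = "thing://"
HTTP = "http://"


def document_url(thing_uri):
    """Map a thing:// URI to the http:// URL of a document about that thing."""
    if not thing_uri.startswith(THING):
        raise ValueError("not a thing:// URI: " + thing_uri)
    return HTTP + thing_uri[len(THING):]


def primary_topic(doc_url):
    """Reverse direction: the thing:// URI for the primary topic of a document."""
    if not doc_url.startswith(HTTP):
        raise ValueError("not an http:// URL: " + doc_url)
    return THING + doc_url[len(HTTP):]


scooby_doc = document_url("thing://data.totl.net/scooby")
film = primary_topic("http://www.imdb.com/title/tt0910554/")
```

A client would dereference scooby_doc as usual and still keep the dog and the document about the dog under different URIs.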

I expect the answer is "it's too late to change now". To which I am tempted
to say "change or die".

(again, another Monday morning ranty mail! but I feel like someone should be
commenting on the emperor's URI convention. If there's a cheat sheet I
should read before continuing commenting on these subjects, please point me
to it.)

-- 
Christopher Gutteridge -- http://id.ecs.soton.ac.uk/person/1248

You should read the ECS Web Team blog: http://blogs.ecs.soton.ac.uk/webteam/


----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 13 June 2011 11:21
To: public-lod@w3.org


I think it's an ingenious tweak, but easily perceived as a "clever but dirty
hack".

As you know, the problem with HTTP URI based Names is that they are
unintuitive. Thus, the entire narrative re. Linked Data should never have
been built solely around use of HTTP scheme based URIs for Names. It could have
just started with URIs and worked its way toward the benefits inherent in
using HTTP scheme URIs due to encapsulation of de-reference (indirection)
and address-of operations. Instead, as I've stated repeatedly, we oscillate
between use of URI and URL for a concept that leverages all aspects of the
URI abstraction.

HTTP URI based Names ultimately deliver the least disruptive path to a
global data space of data objects represented by linked data graphs. We
just need to fix the narrative, and that starts by decoupling the concept of
Linked Data from RDF. RDF is but an option, if you choose to use RDF in a
particular way.

But that won't work in any of today's Web Browsers off the bat. Thus, it
doesn't solve the need for the transition to be non-disruptive to user
experience. It potentially works one way i.e., introspectively (to a point)
from the resource at: http://www.imdb.com/title/tt0910554/, if so crafted by
the publisher. It won't work from the Address bar of a Web Browser. It won't
work with cURL or wget etc. It just won't work from the client side.

----------
From: *William Waites* <ww@styx.org>
Date: 13 June 2011 22:51
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <
richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked
Data community <public-lod@w3.org>, Michael Hausenblas <
michael.hausenblas@deri.org>


* [2011-06-12 22:52:18 -0700] Pat Hayes <phayes@ihmc.us> écrit:

] OK, I am now completely and utterly lost. I have no idea what you
So in the previous email, Danny used the important word - relevant.
Let's unpack that a little bit. Suppose we have no range-14 and all
these RDF statements out there are all mixed up about what they refer
to. Well, not completely mixed up. They're kind of clumped together,
web pages and the things they are about tend to get confused but
probably the chain of inferences that lead you to believe that the
Eiffel tower is a dog is pretty unlikely.

So there is some relationship between a description of the Eiffel
tower and the tower itself. The relationship is akin to similarity in
a very specific way - they are similar enough that someone thought it
made sense to write down that the tower was 356m tall. Unfortunately
they got confused and wrote down that the web page was 356m tall. No
matter, they are still different enough in the relevant ways that
anyone interested in heights on the order of hundreds of meters is
unlikely to be confused.

Same with the dog. Is the distinction between the dog and the picture
important to me? Maybe, maybe not. It depends what I'm trying to do.
If I want to make sure that I can recognise the dog when I meet her,
a picture or the actual dog might do equally well.

So that's the thing, similar or different in the relevant respects for
the purpose at hand. The purpose at hand is necessary to figure out
relevance. Just deriving all the possible things that can be entailed
from the information you have is no good. You have to derive the
relevant things in a particular context. You have to throw out givens
that are irrelevant to you or that lead you to irrelevant or
nonsensical entailments.

In the general case this is hard. It's not even clear if
relevance understood like this is computable. The intent of the user
is so clearly in the loop providing a reference frame for evaluating
relevance and capturing and representing a user's intent is not
something we have a good way of doing apart from hand-crafting
interactions.

It is doable in simple cases (with rules programmed by humans) like
figuring out the foaf:knows graph where people and their homepages
can just be merged without too many bad side-effects.

We need a different kind of rule here - a cut rule. That says if
some condition obtains, *remove* some statements. For example,
remove all { ?doc a foaf:Document } before running the productive
rules might be a common one where we know that we aren't interested
in information resources.
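
A minimal sketch of such a cut rule over triples held as plain tuples. The sample graph, the ex: names and the helpers are invented for illustration, and None plays the role of a variable like ?doc:

```python
RDF_TYPE = "rdf:type"


def matches(triple, pattern):
    """True if the triple fits the pattern; None in the pattern is a wildcard."""
    return all(want is None or want == got for got, want in zip(triple, pattern))


def cut(triples, pattern):
    """Cut rule: drop every statement that matches the pattern."""
    return [t for t in triples if not matches(t, pattern)]


graph = [
    ("ex:page", RDF_TYPE, "foaf:Document"),
    ("ex:page", "ex:height", "356m"),
    ("ex:tower", RDF_TYPE, "ex:Tower"),
]

# Equivalent of: remove all { ?doc a foaf:Document }
remaining = cut(graph, (None, RDF_TYPE, "foaf:Document"))
```

The productive rules would then run over remaining, which no longer says that anything is a foaf:Document.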

Cheers,
-w

--
William Waites                <mailto:ww@styx.org>
http://river.styx.org/ww/        <sip:ww@styx.org>
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45

----------
From: *Christopher Gutteridge* <cjg@ecs.soton.ac.uk>
Date: 13 June 2011 23:17
To: William Waites <ww@styx.org>
Cc: Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, Richard
Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>,
Linked Data community <public-lod@w3.org>, Michael Hausenblas <
michael.hausenblas@deri.org>


Perhaps what we need to start worrying about is getting some test cases --
or a big pile of real (shonky) data to extract useful facts from...

Would it be worth starting a collection of data which makes sense to humans
but isn't strictly semantically clear?

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 14 June 2011 05:33
To: William Waites <ww@styx.org>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <
richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked
Data community <public-lod@w3.org>, Michael Hausenblas <
michael.hausenblas@deri.org>


What has that got to do with the tower being similar to its description?
First, you seem to be assuming here that the tower and its description are
NOT similar, contrary to what you said earlier and Danny seems to be
insisting upon. Second, this hypothetical person is, we both agree,
confused. They made a mistake, what they said was wrong. Correct? I ask,
because many people seem to want to say that they were NOT confused or
wrong, just kind of less correct than if they used the right URI. Third, and
most important, anyone interested is unlikely to be confused, yes indeed.
But any piece of software or inference engine is not unlikely to be
confused. In fact, it is virtually guaranteed to be in the position of
generating absolute nonsense.  If all the inference software was as smart as
the average ten-year-old human, we wouldn't even need the semantic web
because the software would be able to read the text on Web pages. But it
isn't, and we do (need it, that is.)

But if you are a semantic inference engine, and you get the dog and its
picture muddled, will you likely generate a lot of nonsensical assertions?
Answer, Yes, you will. Which is the key point at issue here.

Yeh, yeh. Contexts, local purpose, pragmatism. Now, make this happy thought
cash out in an actual logic for use on the Web. Bear in mind that the very
first principle of the Web is that the *publisher* of the data, who asserts
these things about dogs or pictures of dogs, cannot possibly know what
'context of use' is going to be relevant to the *user* of the published
content. So I say that my picture of Fido has had its rabies shots, and what
will you make of this information, for your purposes, on the other side of
the planet in a foreign city years after Fido has died? And what about all
the other people who will use this misinformation for their different
purposes? How am I going to keep them ALL happy?

When you are the agent who is using this information, sure. But when you are
the one publishing it or asserting it, you cannot do this. And when you are
the one writing the rules to determine a globally accepted notion of
entailment, you cannot do it.
Well, now you are stepping into an ocean of cans of worms. Relevance logics,
paraconsistent logics, etc. ad nauseam. But I don't think it's our business
to even go there. The Web logics don't give instruction on how to use
information rationally in the face of uncertainty. Their purpose is much
less ambitious and more restricted: just give entailment conditions which
are universally correct, so that *whenever* you believe (for whatever
reason) the premise, you are committed to believing the conclusion. Strict
classical entailment works for everyone, and it's about the only thing that
does. So that is what we should be capturing in RDF and OWL, etc. So, to go
back to the http-range-14 issue, what are the *universal* principles that
allow everyone to make the same valid entailments involving URI retrievals?
AFAIKS, Danny is saying that there aren't any (?) Which is a reasonable
answer, but is rather defeatist. I think http-range-14 is more useful than
this.

Pat

----------
From: *Michael Brunnbauer* <brunni@netestate.de>
Date: 14 June 2011 10:45
To: Pat Hayes <phayes@ihmc.us>
Cc: public-lod@w3.org



re
We should be able to present the user a lot of sensical assertions (and
maybe some nonsensical ones) if we know he is concerned with information
about dogs instead of information about pictures.

Anyway - I think special purpose reasoners will play a much bigger role in
the near future than general purpose reasoners because they perform better
with big and messy data.

And publishers will start to differentiate between dogs and pictures of dogs
as soon as it provides them added value. Until that day, we will have to live
with the situation and try to nudge people in the right direction (which
includes httprange-14). But mass adoption means messy data in any case.

Regards,

Michael Brunnbauer

--
++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89
++  E-Mail brunni@netestate.de
++  http://www.netestate.de/
++
++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
++  USt-IdNr. DE221033342
++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel


----------
From: *Richard Cyganiak* <richard@cyganiak.de>
Date: 14 June 2011 10:53
To: Christopher Gutteridge <cjg@ecs.soton.ac.uk>
Cc: William Waites <ww@styx.org>, Pat Hayes <phayes@ihmc.us>, Danny Ayers <
danny.ayers@gmail.com>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked
Data community <public-lod@w3.org>, Michael Hausenblas <
michael.hausenblas@deri.org>


Define “strictly semantically clear”. Good luck!

Best,
Richard
----------
From: *William Waites* <ww@styx.org>
Date: 14 June 2011 10:54
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <
richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked
Data community <public-lod@w3.org>, Michael Hausenblas <
michael.hausenblas@deri.org>


* [2011-06-13 20:33:47 -0700] Pat Hayes <phayes@ihmc.us> wrote:

] > So there is some relationship between a description of the Eiffel
Simply that they are similar enough (in the relevant respects etc)
that one can write ":eiffel :height 324" for either and (reasonably?)
expect the reader not to be confused.
Confused or speaking loosely, not bothering to make the distinction
because it seems to them that they are being clear enough that any
reader will understand what they mean. If you call them on it they
will probably agree that, yes, "what I really meant was ... but to
have written that out would have seemed excessively pedantic" in
exactly the same way that I wasn't confused when I wrote "confused"
but I admit to being inexact :)

So I agree with these many people who want to say that there are a lot
of inexact statements that are not made by confused people just by
people with perhaps unreasonably high expectations that the readers of
their statements will be able to figure out what they meant if not
strictly what they said.
So this is the mismatch. Publishers write things down with some
assumptions of what is likely to cause confusion that are probably
based largely on their interactions with other humans, not with
inference engines.

Writing things down exactly is incredibly difficult. A very large part
of almost every discussion or disagreement usually comes down to
someone understanding what was said differently than the person who
said it meant. It can often take a lot of discussion before this
becomes apparent. And that's between humans!

So we want to get people to publish linked or structured data that is
as exact as possible. Each step in that direction is a little bit more
burdensome for the publisher, feels a little bit more pedantic and
verbose to write down, means the publisher needs to know a little more
about the kinds of things a reader can handle, but at the same time makes
it easier to write software that can use the data, using simpler and more
general algorithms that we know.

Some people seem to be saying that range-14 is a step too far. Other
people seem to be saying that without that step it's impossible to
write software in a general way to work with the data. If both are
correct then we're stuck.

The perception of RDF as complicated, verbose and pedantic is common
and is something we cannot afford. Personally I don't think the range-14
arrangement is too burdensome but outside this community this is a
minority viewpoint. We cannot throw up extra barriers to publishers.
So we need better software that can handle this kind of inexact data.

] When you are the agent who is using this information, sure. But when
Publishers will always make assumptions about how the information
will be used. The assumptions will usually not be explicit. Even
humans don't have a globally accepted notion of entailment, it's
all about context and intent on the part of the agent doing the
reasoning. They will just have to deal with the fact that the
publisher may not have anticipated their use.

Since range-14 seems to be a sticking point, we can try to address
that particular kind of ambiguity with guidance about how to reason
about information and non-information resources, and this guidance
won't be general, it will have to do with particular classes and
predicates and how they should be interpreted in the local (graph)
context.

] Well, now you are stepping into an ocean of cans of worms.

Oh, well aware of that :)

----------
From: *Richard Cyganiak* <richard@cyganiak.de>
Date: 14 June 2011 10:57
To: Michael Brunnbauer <brunni@netestate.de>
Cc: Pat Hayes <phayes@ihmc.us>, public-lod@w3.org


Yes. It's certainly true in the case of the web -- you cannot apply
off-the-shelf standard OWL reasoners on web data, because of its messiness.
This is quite well-documented in the literature.
That's spot-on.

Best,
Richard

----------
From: *Christopher Gutteridge* <cjg@ecs.soton.ac.uk>
Date: 14 June 2011 11:07
To: Michael Brunnbauer <brunni@netestate.de>
Cc: Pat Hayes <phayes@ihmc.us>, public-lod@w3.org


I think in this lies a key issue. Much of my experience of producing linked
data has been anticlimactic. If I had to justify every coding hour, it
would be hard to say that providing public open data really gained much
value for our business.

What would be really nice is some public services which consume RDF and
produce something useful, so that people actually get a direct value out of
putting out linked data.

One of my unfunded, background projects is programme.ecs.soton.ac.uk  -- I'm
working on a PHP library which will consume the RDF and produce a nice big
part of an HTML website for a conference from it, along with mobile
interfaces & tools like a "print out a schedule to go on the door of each
room each day" and "check you didn't double book a speaker". Plus a tool to
author the data in a spreadsheet and convert that into RDF.

The goal is to get people creating nice RDF data for their conferences
because it makes their lives easier, not because it's the right thing.
Hopefully in the next year or two it'll hit a tipping point and we'll get
some third party tools working with the data and it'll be a really useful
format.

You can see a prototype of the PHP library in action on this conference
site:
http://data.dev8d.org/2011/programme/

I'd encourage the community to build more tools for webmasters, not for the
linked data community!

----------
From: *Alan Ruttenberg* <alanruttenberg@gmail.com>
Date: 14 June 2011 11:47
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Christopher Gutteridge <cjg@ecs.soton.ac.uk>, William Waites <
ww@styx.org>, Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>,
Linked Data community <public-lod@w3.org>, Michael Hausenblas <
michael.hausenblas@deri.org>


Why don't we start with the following:

Message sender has some statements they want to communicate. They
encode their statements into the language. The encoding is sent. The
receiver examines the encoding and constructs an understanding
consisting of some statements. Key is that the construction and
interpretation of the message are isolated events - the first
communication between the parties is via the message.

Now the parties meet and compare the statements intended with the
statements understood. Note that the parties might be humans or
machines, without prejudice.

Repeat.

If, reliably (which doesn't mean *always*, but does mean more often
than not) the comparison is favorable, then the messages are
semantically clear. The "strictly" word is superfluous.
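
The procedure above can be sketched in a few lines, as a rough illustration only. It assumes that statements can be modelled as sets of strings and that a "favorable" comparison means the two sets are equal; both are simplifications introduced here, and the function name is hypothetical.

```python
def semantically_clear(trials):
    """Decide clarity from repeated send/compare rounds.

    trials: list of (intended, understood) pairs, each a set of statements.
    Returns True when the comparison is favorable more often than not.
    """
    if not trials:
        raise ValueError("need at least one trial")
    # Assumption: a round is favorable when the receiver reconstructs
    # exactly the statements the sender intended.
    favorable = sum(1 for intended, understood in trials if intended == understood)
    return favorable > len(trials) / 2

trials = [
    ({"fido is a dog"}, {"fido is a dog"}),
    ({"fido is a dog"}, {"fido is a picture"}),
    ({"fido had rabies shots"}, {"fido had rabies shots"}),
]
result = semantically_clear(trials)  # two of three rounds match
print(result)
```

A looser comparison (overlap, or checking the actions taken) would slot in where the equality test is.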

We can design various protocols for doing the comparison, which does
not have to be a discussion. For example the message might specify
some actions and we can check whether the actions taken after
interpreting the message match the intention of the sender, or whether
the receiver has confidence enough in their understanding of the
message.

What we have seen is that for some of the messages being discussed in
this thread, there have been raised a number of concerns about whether
that process will work under various of the assumptions and assertions
made by the participants in the thread. My assessment is that, at the
moment, the messaging that has been proposed is not semantically
clear.

-Alan

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 14 June 2011 17:55
To: William Waites <ww@styx.org>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <
richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked
Data community <public-lod@w3.org>, Michael Hausenblas <
michael.hausenblas@deri.org>


Well, you have got me confused. Are you saying here that it does in fact
make sense to say that a description of the eiffel tower is 356M tall? So
that your triple here is actually ambiguous, but one can rely on the reader's
common sense to figure out which one is meant? I had always thought that
when people used a name of a name instead of the name of a thing, they were
usually just blurring the use/reference distinction, not that they genuinely
weren't sure whether they were talking about things or names.
Maybe we just produced the Web situation in miniature, because whether or
not you were confused, I certainly was (and still am) trying to figure out
what you are saying here.
If they do this when writing, say, Javascript or PERL, things will go badly
wrong. If they do it when writing RDF, things will also go wrong. Even when
writing English in a non-conversational situation where reading is separated
from writing (eg road signs, email), things will go wrong surprisingly
often. I am not sure how much we can expect to be responsible for people
saying garbage because they are too lazy or incompetent to learn how to use
a language.
You seem to be making my point for me here :-)
No, what will happen is that a class of people will arise who *do*
understand http-range-14 (and other issues that are perceived as 'hard') and
they will for a short while be able to earn a living writing (or writing
code which generates) this stuff properly. This situation will last at most
a decade, because by then a new generation of people will have educated
themselves to 'speak' correctly in this new style without apparent effort,
and all the whining about how terribly hard it was will be the stuff of
nerdish jokes on XKCD.
All we really need is enough people who can see through this mist of fear
and actually get RDF written. On the whole, looking at the way the linked
data is being created, I think we are doing quite well. Once stuff starts
working and doing something useful, all this fear of formalism will melt
away.
Sure, and to solve global warming, we need better power sources that don't
emit CO2. Your move. We aren't going to get this magic software any time
soon. The inference software in the semantic web engines behind RDF and OWL
and RIF is the state of the art. If people can't write data that doesn't
break these, we are in trouble. But I think they can: after all, they write
RDB data out the wazoo. Anyone who can understand SQL can surely get their
head around the distinction between the eiffel tower and a web page.
Ah, but they do. That is exactly why inference engines work.
No, it's not all about context. There really are non-contextual logics. If it
really were ALL about context, the Web itself would not work.
It will be (well, it can be) general for the entire Web. Why not? The
http-range-14 rule is actually pretty simple and intuitive. In my
experience, most people kind of assume it without thinking about it,
actually. (**Of course** the URI of a web page is the name of the web
page...) So why not just say it, loud and clear, until people get it?

Pat

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 14 June 2011 17:56
To: Christopher Gutteridge <cjg@ecs.soton.ac.uk>
Cc: Michael Brunnbauer <brunni@netestate.de>, public-lod@w3.org


Well, +1 to that :-)

Pat

----------
From: *Richard Cyganiak* <richard@cyganiak.de>
Date: 14 June 2011 20:19
To: Alan Ruttenberg <alanruttenberg@gmail.com>
Cc: Linked Data community <public-lod@w3.org>


Alan,
Google won't scrap schema.org because your thought experiment proved that
it's not “semantically clear.”

I think that we are beyond the point where that kind of extremely idealised
account is useful for evaluating web technologies.

But just to stay in the spirit of your proposal:

1. The sender may not care that certain receivers be able to understand
their message
2. The message cannot strictly be the first communication -- there always
has to be prior agreement on protocols, formats, languages, vocabularies
3. Both parties will already share certain context that is outside of the
message, otherwise why would they be communicating
4. Depending on the value of the communication to the receiver, they may or
may not be willing to go to certain lengths in order to interpret the
message, including the application of heuristics, studying the sender's
documentation, dereferencing their schema and applying reasoning etc
5. The receiver may want to use the information for purposes not intended by
the sender

So this is all rather subjective and context-dependent. I'm extremely
skeptical of generic claims about the “strict semantic clarity” of a certain
way of publishing data, especially if it is claimed to be a binary
black-and-white thing.

Best,
Richard

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 14 June 2011 21:37
To: public-lod@w3.org


Yep!

+1000

We just have to accept that the Web is an ocean liner scale space, things
are going to happen slowly, courtesy of opportunity costs materialization.
Unfortunately, people don't like prescriptions that are preventative in
nature, they simply like to have cures for problems as they arise. Sad but
true, at least in my years of experience.

As stated repeatedly, we should never scorn or take issue with any entity
that contributes structured data (in any format) to the Web. Half bread is
better than none :-)

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 14 June 2011 21:41
To: public-lod@w3.org


The community should build and embrace solutions for a broad range of user
and developer profiles. The Web isn't about developers solely, neither is
computing in general.

As I recall, Apple's success isn't driven by its sole preoccupation with
developers, ditto Google, Facebook etc. All of these organizations provide
palpable solutions that deliver tangible value off the bat, no code required
whatsoever.

Developers are but one community, focusing on them solely isn't going to
change much, really!

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 14 June 2011 22:03
To: public-lod@w3.org


Yes!

Serving folks their lunch works better than warning them about the effects
of your pending actions or the ill effects of their current action. This is
why the solution to the problem lies in delivering applications and services
that provide value to a broad spectrum of consumer profiles, courtesy of
their fundamental understanding that Generic Names and Location specific
Names (Addresses)  != same thing.

----------
From: *Michael Brunnbauer* <brunni@netestate.de>
Date: 14 June 2011 23:37
To: public-lod@w3.org



re

as I was talking about "messy" data, some anecdotes from our work with
foaf-search.net:

-Want to see some people and groups that are an owl:Ontology ?

http://www.foaf-search.net/SearchRDFType?type=http%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23Ontology
 Thank god everyone using our website either knows instantly that this is
 wrong or does not have a clue what owl:Ontology is.

-Today, our website spent hours merging thousands of different people into
 one person because our Java developer made an update and forgot the code to
 check the inverse functional property foaf:mbox_sha1sum (SHA1-hash of
 mailbox URI) for bad values like 08445a31a78661b5c746feff39a9db6e4e2cc5cf
 (SHA1-hash of "mailto:"). We need these kinds of hacks to keep everything
 running.

-foaf:homepage and foaf:weblog are inverse functional properties in the
 foaf ontology. We excluded them in our reasoners in fear of users having
 shared pages or being sloppy about what to fill in when asked for their
 homepage or weblog. But the very popular livejournal blog software only
 uses foaf:weblog to identify your friends so we had to accept at least
 foaf:weblog.

-This is something I found before our crawler found it - fortunately:
 http://data.totl.net/dave.rdf

-From the same website comes a huge database of many of the world's obscure
 industrial bands. Cool - except they are endless and made up on the fly :)
 http://data.totl.net/musicdb/music.cgi/bands?page=1

-Speaking about fakes: http://fakefriends.me/ makes up fake identities
 including crawlable FOAF RDF data on the fly. And almost every elgg blog
 our FOAF crawler gets to crawl has been taken over by spammers or was
 installed by them in the first place.

-Things can have so many different foaf:names. What is the canonical one ?
 We are currently using the one with the most quads but this is surely not
 the best possible solution.

This list will probably grow much larger in the near future.
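
The foaf:mbox_sha1sum anecdote above can be illustrated with a small sketch. This is not the foaf-search.net code; the function names and the sample identifiers are hypothetical, and the only junk values handled are the hash of an empty string and the hash of a bare "mailto:" prefix.

```python
import hashlib
from collections import defaultdict

def sha1_hex(text):
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

# Values that say nothing about identity: hashes of an empty mailbox string
# and of a bare "mailto:" prefix. Computed here rather than hard-coded.
BAD_MBOX_SHA1SUMS = {sha1_hex(""), sha1_hex("mailto:")}

def smush(people):
    """Group person ids by foaf:mbox_sha1sum, skipping blacklisted values.

    people: iterable of (person_id, mbox_sha1sum) pairs.
    Returns a list of sets; each set is one merged identity.
    """
    by_hash = defaultdict(set)
    singletons = []
    for person_id, mbox_hash in people:
        if mbox_hash in BAD_MBOX_SHA1SUMS:
            singletons.append({person_id})  # never merge on a junk value
        else:
            by_hash[mbox_hash].add(person_id)
    return list(by_hash.values()) + singletons

alice = sha1_hex("mailto:alice@example.org")
people = [
    ("doc1#alice", alice),
    ("doc2#me", alice),
    ("doc3#bob", sha1_hex("mailto:")),
    ("doc4#carol", sha1_hex("mailto:")),
]
merged = smush(people)
print(merged)
```

Without the blacklist check, bob and carol would collapse into one person simply because both files contain an empty mailbox.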

----------
From: *Alan Ruttenberg* <alanruttenberg@gmail.com>
Date: 15 June 2011 02:07
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Linked Data community <public-lod@w3.org>


Richard, that wasn't the point. You mocked the idea that "semantically
clear" could be defined. I responded with an attempt.
We will agree to disagree then. Perhaps in another thread you will say
what *will* be useful for evaluating web technologies. Or do you think
they are above evaluation?
ah, good!
Not relevant to this piece of the thread. The goal was to have a go at
defining "semantically clear". But in the spirit of responding I will
grant you that some people may not care. However I'm pretty sure that
the people we care about using schema.org will care. There will be
others who use schema.org not to communicate but to try to game the
google ranking system, and for such people, whether there is a message
conveyed or not may not matter. However I don't think we are
interested in considering their needs.
Granted. I don't think that this affects the substance of the
proposal, but if you say how it would I will try to address it.

> 3. Both parties will already share certain context that is outside of the
> message, otherwise why would they be communicating.

I have not said that they are intentionally communicating - that the
message was intended for a specific person. This removes the support
for the assumption of the first clause. But to address it: that they
will share a certain context outside the message may or may not
obtain. For instance sender may be a person, and receiver a machine,
and it's not clear what shared context they could have given the
current state of machine technology. However if you think the shared
context somehow undermines the proposal, please say how.
Again, this is outside the scope of my proposal, which is in response to
your skepticism about whether "semantically clear" could be defined.
ditto.
You have not demonstrated subjectivity or context-dependency in my
proposal. However I will be interested if you attempt to.
You may be skeptical that semantic clarity (again, I don't think
"strict" brings anything) is *relevant* in some or all cases.  I may
engage you on that issue separately. However I don't see that you have
succeeded in finding a flaw in my proposal for how one might go about
defining it operationally.

Regards,
Alan


----------
From: *Richard Cyganiak* <richard@cyganiak.de>
Date: 15 June 2011 12:11
To: Michael Brunnbauer <brunni@netestate.de>
Cc: public-lod@w3.org


Another anecdote, I don't remember whom I heard this from: From FOAF data
you can see that a lot of people say that their homepage is … "Google".

Best,
Richard

----------
From: *Richard Cyganiak* <richard@cyganiak.de>
Date: 15 June 2011 12:24
To: Alan Ruttenberg <alanruttenberg@gmail.com>
Cc: Linked Data community <public-lod@w3.org>


I have no interest in theoretical discussions that are detached from
application.
Adoption trends, ergonomics, fit with the existing technology ecosystem,
existence of migration paths, marketability, potential of network effects.

Best,
Richard

----------
From: *Mischa Tuffield* <mmt04r@ecs.soton.ac.uk>
Date: 15 June 2011 12:24
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Michael Brunnbauer <brunni@netestate.de>, public-lod@w3.org


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

<snip/>
I am not sure this is on-topic anymore but, these are the following values I
blacklisted and flagged when used as an IFP in the FOAF validator I wrote on
foaf.qdos.com (I know it is currently down, we are repurposing hardware at
the mo - so sorry!).

$ifpblacklist = array(
    "<mailto:>",
    '"da39a3ee5e6b4b0d3255bfef95601890afd80709"',
    '"08445a31a78661b5c746feff39a9db6e4e2cc5cf"',
    '"20cb76cb42b39df43cb616fffdda22dbb5ebba32"',
    '<http://www.google.com/>', '<http://www.google.com>',
    '<http://www.bbc.co.uk/>', '<http://bbc.co.uk>',
    '"02085a0d20a5f574c1ce6cfe42bba6e85cfe07cf"');

Some of the hashes in the blacklist were added due to copy and pasting
errors when people were knocking together handwritten FOAF files, iirc John
Domingue shared one of the foaf:mbox_sha1sum's with Tom Heath (probably from
the time when they both worked at KMI).

Mischa
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.12 (Darwin)

iQIcBAEBAgAGBQJN+IhOAAoJEJ7QsE5R8vfvVTsP/0kx9/spxqLciwUAWCRHPT3V
SWgsl/Rlk0i4SDOvBcyAdXpuOxQfB06nuY5Bps4RrfZWb5Q5AwYMThGmEDeXq1n+
STlD3eNsXBscaF5Yocnxp22Z2t98d3bNB8Lia5uuEJmq28mG+H3ijNqcDq7+ztnp
f/XG+DV5ONXsE2XRmfQ8nFTKm/6Rkaylg49Ndjx0xcybEUXWthpBxdVprsKXHdq7
lIZ4/TtF5i/B37sIx5yOUhXs1d0wR+D+hkOIk0vBHoCbvcOhutE3LjanNAPK/B+f
HWG2AAhc3w+syeXs2noABabCO+1Ac2CkKGfA4F2rhdD5xnk/tCEkwZGrqhb4W61k
eOYdU1OI9epbayhVTimfRn28/I4/mwNmhuevQYNGmt3DuC7RrgPiH0OOqCuu+Cp3
Aed/lVt4lSyeHNQQCLBy8ZPDTfdPbXL449Dvsz6i/2fwFtFjHmTF/Z0Ac0HOiV0y
eqxL+FOb3Qt0VAQ/Abklii282jwC91Wlb+TIifPjF9xD9aUzndbBxBNlPe7mtrIy
QMNwgTerGlJx2FX+81v8EvmzjKuolVeMq+NzYA5ohiUZtiSWa7eJwms28aOCWj50
OOz+QTo4VaCcI0UVrWUcAeNHAfKgNV7eKX2wycPOPnjta/DHYAIuzvoTm3WLShSL
YT+NT4LxkoRf9u26PRRA
=ENLb
-----END PGP SIGNATURE-----


----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 15 June 2011 17:27
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


Even with information resources there's a lot of flexibility in what
HTTP can legitimately respond with, there needn't be bitwise identity
across representations of an identified resource. Given this, I'm
proposing a description can be considered a good-enough substitute for
an identified thing. Bearing in mind it's entirely up to the publisher
if they wish to conflate things, and up to the consumer to try and
make sense of it.

As a last attempt - this is a tar pit! - doing my best to take on
board your (and others') comments, I've wrapped up my claims in a blog
post: http://dannyayers.com/2011/06/15/httpRange-14-Reflux

----------
From: *Jason Borro* <jason@openguid.net>
Date: 15 June 2011 20:35
To: Linked Data community <public-lod@w3.org>


I agree with your sentiments Danny, fwiw.  The current scheme is a burden on
publishers for the sake of a handful of applications that wish to "refer to
these information resources themselves", making them "unable to talk about
Web pages using the Web description language RDF".

What about minting a new URI at "http://information.resourcifier.net/encodedURI"
or similar for talking about such things?  The service could even add value
by tracking last update times, content types, encodings, etc.
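
A rough sketch of what such minting could look like. The service host is the hypothetical one named above, and percent-encoding the whole original URI into the path is an assumption made here for illustration; the post does not specify an encoding.

```python
from urllib.parse import quote, unquote

# Hypothetical service prefix taken from the suggestion above.
SERVICE = "http://information.resourcifier.net/"

def page_uri_for(uri):
    """Mint a URI that names the Web page served at `uri`."""
    # safe="" so that ":" and "/" in the original URI are escaped too.
    return SERVICE + quote(uri, safe="")

def original_uri(page_uri):
    """Recover the URI that a minted page URI was derived from."""
    return unquote(page_uri[len(SERVICE):])

minted = page_uri_for("http://example.org/car")
print(minted)
print(original_uri(minted))
```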

Jason

p.s. Don't bother criticizing the half baked idea, I thought about it for
< 10 seconds.  The point is 100 alternatives could have been hashed out in the
time spent discussing and implementing http-range-14.  Kudos to google et al
for ignoring it.

----------
From: *William Waites* <ww@styx.org>
Date: 15 June 2011 23:24
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <
richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked
Data community <public-lod@w3.org>, Michael Hausenblas <
michael.hausenblas@deri.org>


* [2011-06-14 08:55:09 -0700] Pat Hayes <phayes@ihmc.us> wrote:

] Well, you have got me confused. Are you saying here that it does
I'm just saying that things like this will be published because the
publisher is confused, or mistaken or doesn't think that making the
distinction is important or convenient and consumers of the data have
to deal with it.

We should encourage the publishers to do a better job but some of them
will balk and sometimes, like with the schema.org that started this
thread, big, important publishers with a lot of influence will balk.
If we're lucky we can convince them to fix it, otherwise writers of
software that consumes the data and tries to reason with it have to
work out a way to be robust in the face of this kind of ambiguity.

That's all.

-w

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 15 June 2011 23:46
To: public-lod@w3.org


Danny,

This is part of the problem:

TBL's argument: the HTTP URIs (without "#") should be understood as
referring to documents, not cars.

It assumes that the audience doesn't have a clue, so the description ends
up condescending, albeit inadvertently.

How about:
TBL's argument: the HTTP URIs (without "#") should be understood as
referring to an Address. A Data Source Name. What a data publisher provides
to user agents for accessing specific data in a given format, courtesy of
content negotiation or lack thereof etc.

The confusion is a self inflicted one courtesy of narrative style and tone,
methinks.

URIs abstract Names and Addresses. This whole thing isn't unlike DNS. Points
of presence on TCP/IP networks have NIC addresses and cnames, courtesy of
DNS. Spreadsheets have offered cell addresses and cell names since forever.
Programmers have worked with de-reference (indirection) and address-of
operators forever. Most of the time when they encounter the: "... is a
document, not cars ... " style narrative, it throws them for a loop!

As you know, a Document == Data Container that's projected to users via user
agents (typically browsers) using a specific presentation oriented metaphor.

Using 303 to deliver indirection is an accurate reflection of the required
heuristic for implementing de-reference (indirection) via HTTP URI based
Names. Otherwise, use a # terminated URI and get similar (but ultimately
limited) effects without an actual 303.
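
As an illustration of the 200/303/hash distinction, here is a small sketch of how a client might read a response under the httpRange-14 resolution. The function name and the returned labels are hypothetical, and the "unspecified" fallback for codes the resolution does not mention is an assumption made here.

```python
from urllib.parse import urldefrag

def what_may_uri_denote(uri, status):
    """Rough reading of httpRange-14 for a GET on `uri` answered with `status`."""
    _base, fragment = urldefrag(uri)
    if fragment:
        # Hash URI: the fragment never reaches the server, so the response
        # code says nothing about what the full URI names.
        return "any resource"
    if 200 <= status < 300:
        return "information resource"
    if status == 303:
        # Follow the Location header to find a description of the thing.
        return "any resource"
    if 400 <= status < 500:
        return "unknown"
    return "unspecified"  # assumption: not covered by the resolution

print(what_may_uri_denote("http://example.org/doc", 200))
print(what_may_uri_denote("http://example.org/id/eiffel", 303))
print(what_may_uri_denote("http://example.org/about#eiffel", 200))
```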

Web users started off using Addresses as Names for Resources (Web Docs). Now
we're introducing a new abstraction where Name and Address are Distinct
(i.e., we have Named Objects and Object Representation Addresses,
interwoven), thus we need to find a variety of ways to explain and
demonstrate this new abstraction generally known as Linked Data. One size
never fits all, and http-range-14 is certainly not going to be the narrative
that breaks that age-old mold :-)

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 15 June 2011 18:30
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


I'm sure you are right, but I have no idea why you think this fact is
remotely relevant to the issue.
Boy, that is a humdinger of a non sequitur. Given that HTTP has
flexibility, it is OK to identify a description of a thing with the actual
thing? To me that sounds like saying, given that movies are projected, it is
OK to say that fish are bicycles.

AFAIKS, the details of HTTP really have nothing at all to do with this
issue, ironically enough. The only thing that HTTP does is to closely
associate rather a lot of URIs to things like Web pages. The *nature* of the
http 'association' really is irrelevant to this issue, which has to do with
when it is legitimate to infer a denotation relation from this association
relation. The question at issue here is what URIs are said to denote. It is
very natural and intuitive to say that a URI which is http-associated with X
also denotes X. Hence the 200 convention; but  we want some URIs to denote
things that are definitely not the kind of thing that HTTP (or any other
XXTP) can possibly associate a URI to. Hence the 303 work-around.
Well, if the publisher wants to say that a web page actually is Sherlock
Holmes, or my pet cat Marco Polo, then that publisher is bat-shit crazy, and
I will ignore them.
OK, thanks. Here is your argument, as far as I can understand it.

1. HTTP representations may be partial or incomplete. (Agreed.)
2. HTTP reps can have many different media types, and this is OK. (Agreed,
though I can't see what relevance this has to anything.)
3. A description is a kind of representation. (Agreed, and there was no need
to get into the 'isomorphism' trap. We in KRep have been calling
descriptions "representations" for decades now.)

4. Therefore, a HTTP URI can simultaneously be understood as referring to a
document and a car.

Whaaat? How in God's name can you derive this conclusion from those premises?

Pat

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 16 June 2011 02:26
To: Jason Borro <jason@openguid.net>
Cc: Linked Data community <public-lod@w3.org>


I confess to finding this kind of sneering remark rather annoying. If you
think it is this trivial to work out some 'alternative', why don't you come
up with a few actual ideas and see what happens when they get a little peer
review? Your idea, above, hardly makes first base, as I'm sure you already
realized when you added the p.s. So why not try inventing one that has a
snowball's chance in hell of actually working? I'm sure that the world would
be delighted if you could solve this trivial problem in 5 ways, let alone a
hundred.

If you agree with Danny that a description can be a substitute for the thing
it describes, then I am waiting to hear how one of you will re-write
classical model theory to accommodate this classical use/mention error. You
might want to start by reading Korzybski's 'General Semantics'.

Pat

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 16 June 2011 02:36
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


Not that I think I did a non sequitur, it is totally ok to say that
fish are bicycles, if that's what you want to say.

[snip]
my wording could be better, but I stand by it...  a document
describing the car, through HTTP, can be an equally valid
representation of the named car resource as the car itself (as long as
it's qualified by media type)

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 16 June 2011 03:27
To: Pat Hayes <phayes@ihmc.us>
Cc: Jason Borro <jason@openguid.net>, Linked Data community <
public-lod@w3.org>


IANAL, but I have heard of the use/mention thing, quite often. I don't
honestly know whether classical model theory needs a rewrite, but I'm
sure it doesn't on the basis of this thread. I also don't know enough
to know whether it's applicable - from your reaction, I suspect not.

As a publisher of information on the Web, I'm pretty much free to say
what I like (cf. Tim's Design Notes). Fish are bicycles. But that
isn't very useful.

But if I say Sasha is some kind of weird Collie-German Shepherd cross,
that has direct relevance to Sasha herself. More, the arcs in my
description between Sasha and her parents have direct correspondence
with the arcs between Sasha and her parents. There is information
common to the reality and the description (at least in human terms).
The description may, when you stand back, be very different in its
nature to the reality, but if you wish to make use of the information,
such common aspects are valuable. We've already established that HTTP
doesn't deal with any kind of "one true" representation. Data about
Sasha's parentage isn't Sasha, but it's closer than a non-committal
303 or rdfs:seeAlso. There's nothing around HTTP that says it can't be
given the same name, and it's a darn sight more useful than a
wave-over-there redirect or a random fish/bike association. I can't
see anything it breaks either.

----------
From: *Jason Borro* <jason@openguid.net>
Date: 16 June 2011 05:04
To: Linked Data community <public-lod@w3.org>


Apologies if my keyboard sneered at you, though comparing an application
problem to 1% of hr14 at web scale hardly trivializes it; certainly it does
the opposite.  Good luck preserving your mental model if you require
webmasters to spell Korzybski.

----------
From: *Alan Ruttenberg* <alanruttenberg@gmail.com>
Date: 16 June 2011 07:53
To: Jason Borro <jason@openguid.net>
Cc: Linked Data community <public-lod@w3.org>




This is an odd comment. It's like saying good luck preserving your model of
TCP if you require network developers to know where Postel worked.

TCP has to work, whether or not webmasters know the intellectual history of its
development, and the same will be true of whatever eventually becomes what
the semweb ideas are aiming at. Pat knows something about the history of
what's known to work and what isn't. You ignore that history at the peril of
your ideas simply not working.

-Alan


----------
From: *Alan Ruttenberg* <alanruttenberg@gmail.com>
Date: 16 June 2011 08:05
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Linked Data community <public-lod@w3.org>




I assume you mean you are not interested in discussions of theory that are
detached from application.

In any case this is a non sequitur. The definition is offered because some,
including myself, think that there are important classes of applications for
which it is an essential ingredient of success (like some of the ones I need
to build), and because you implied that defining what we meant was not
feasible.
Does what the technology *accomplishes* fit in there somewhere? Looking at
the above, one might conclude that a successful Ponzi scheme of some sort
would score well.

Regards,
Alan



----------
From: *Richard Cyganiak* <richard@cyganiak.de>
Date: 16 June 2011 11:38
To: Alan Ruttenberg <alanruttenberg@gmail.com>
Cc: Linked Data community <public-lod@w3.org>


Web technologies are never about accomplishing anything new; they are about
taking something that already works on a small and local scale, and making
it work across the internet with its loosely coordinated actors.
:-)

If you want to look at it that way, standards, like anything that exhibits
network effects, are a bit like a ponzi scheme: once you're inside, you
benefit from getting others in your vicinity on board. The difference is
that “late adopters” in a ponzi scheme are the suckers who lose their
investment; while late adopters of a standard get the largest benefit at the
smallest cost.

Best,
Richard

----------
From: *Jonathan Rees* <jar@creativecommons.org>
Date: 16 June 2011 17:46
To: Linked Data community <public-lod@w3.org>


In case anyone's not aware, the TAG is working in the area being
discussed on this thread - i.e. on deployment and performance of
linked data nose-following and the possible conflict with current
metadata practices - as its issue 57,
http://www.w3.org/2001/tag/group/track/issues/57 . In my analysis it's
a notational and protocol issue, not a logical or philosophical one.
To frame the discussion I'm preparing a document that collects
multiple complete solution proposals in what I hope is a neutral form.
The idea Richard C puts forth, as well as the one advanced last fall
by Ian Davis, and something equivalent to the :at idea from Alan's
message, are all included, among others.

I have been waiting to announce this work on public-lod until I can
prepare a new draft incorporating feedback I've received over the past
few weeks, but given the volume of email on this thread I felt I had
to say something about it now. If you are impatient and can't wait for
the next draft, take a look at the current one, which you can find
linked from the issue page named above. I invite discussion of it on
the www-tag@w3.org list.

Best
Jonathan


----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 16 June 2011 22:39
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


Not only do I not follow your reasoning, I don't even know what it is you
are saying. The document is a valid *representation* of the car, yes of
course. But as valid as the car itself? So you think a car is a
representation of itself? Or are you drawing a contrast between the 'named
car resource' and the car itself? ???

Maybe it would be best if we just dropped this now. I gather that you were
offering me a way to make semantic sense of something, but I'm not getting
any sense at all out of this discussion, I am afraid.

Pat

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 16 June 2011 23:38
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Jason Borro <jason@openguid.net>, Linked Data community <
public-lod@w3.org>


True.
Sasha and her parents are not themselves in your description. I presume you
mean, the arcs between the terms you use, in your description, to refer to
Sasha and her parents.
Sasha and her parents don't have arcs between them (unless you are indulging
in some cruel treatment of animals.) I presume you mean to refer to certain
relationships which hold between Sasha and her parents.

In this simple case (explicitly named relationships, explicit referring
names) there is a kind of structural correspondence between the description
and the reality, indeed. But as soon as you make the descriptive language
even slightly more expressive, this breaks down. (Try adding negation or
disjunction or even blank nodes.) And as soon as you admit that reality is
more complex than any description of it, it breaks down. So it's not a very
good foundation to build a semantic theory upon.
No. The reality is what it is; the information is held in the description
(the one with the arcs and the names in it.) The information is ABOUT Sasha
and her parents (and the relationship of parenthood and various categories
of doggitude, and so forth.)
You betcha.
What common aspects? If you mean to refer to the fact that a description
with arcs and names can be TRUE OF some aspect of reality, you are talking
about classical model-theoretic semantics, which is based on the idea of
reference (AKA denotation) at its root; it is the interpretation mapping
from names to the things they are interpreted to refer to (eg between
"Sasha" and Sasha.) But the truth-in-an-interpretation relationship is not
similarity or isomorphism, and it certainly does not warrant identifying the
name with the thing named. Quite the contrary, it relies upon keeping this
distinction clear. As Korzybski famously said, the map is not the territory.
"Closer"? In what metric? I would say it is about as different as anything
can get.
OF COURSE it breaks things. It might be true to say that Sasha is a
Collie-German Shepherd cross, but Sasha's description or web page certainly
isn't. It might be true to say that the description is written in RDF, but
Sasha isn't.

Pat

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 16 June 2011 23:40
To: Jason Borro <jason@openguid.net>
Cc: Linked Data community <public-lod@w3.org>


I'd prefer they actually read him, though I won't hold my breath. Sorry to
bother you by using a very long foreign name.

Pat

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 16 June 2011 23:41
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <
public-lod@w3.org>


LOL

Pat

----------
From: *David Booth* <david@dbooth.org>
Date: 17 June 2011 02:46
To: Linked Data community <public-lod@w3.org>
Cc: Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, Jason
Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org>


[ . . . ]
Let's go further and clarify exactly what breaks: Using the same URI
both for Sasha and Sasha's web page breaks *some* applications and not
others.  Applications that need to distinguish between dogs and web
pages will find the URI ambiguous; applications that do not will be
perfectly happy.  This state of affairs is a universal fact of life that
is true of *all* possible distinctions that may be made, regardless of
whether the distinction is between web pages and dogs, or between
different kinds of dogs, or between different kinds of proteins or
anything else.

Except in the absurdly reductionist sense that *every* URI is ambiguous
(because finer distinctions can always be made), whether a URI is
ambiguous or unambiguous is *not* a fundamental property of the URI:
ambiguity is relative to the *application* that is using that URI.
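
A small sketch of the claim that ambiguity is relative to the application,
under the assumption that an application can be summarised by the
distinctions it needs to make. The function name, the labels and the two
example applications are hypothetical.

```python
def is_ambiguous_for(needed_distinctions, uri_denotes):
    """A URI is ambiguous for an application only if the things it is used
    to denote fall on both sides of a distinction that application needs."""
    return any(len(set(axis) & uri_denotes) > 1 for axis in needed_distinctions)


# One URI used both for Sasha and for Sasha's web page.
sasha_uri_denotes = {"dog", "web page"}

# An application that must tell dogs from web pages.
page_vs_dog_app = [("dog", "web page")]
# An application that only cares about breeds.
breed_app = [("collie", "german shepherd")]

print(is_ambiguous_for(page_vs_dog_app, sasha_uri_denotes))
print(is_ambiguous_for(breed_app, sasha_uri_denotes))
```

The same URI usage gives a different answer for each application, so the
ambiguity is not a property of the URI alone.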

Given this fact of life, I maintain that permitting the same URI to
denote both a web page and a dog does *not* break the architecture of
the web.

I agree with TimBL that this is a design choice about the architecture
of the web, and a clean, extensible architecture is needed.

I agree with TimBL that 303 (and hash URIs) are useful for those who
*choose* to distinguish between the web page and something else.

I agree with TimBL that the httpRange-14 rule is very useful, even if it
was not ideally stated, and should *not* be abandoned.  However, the
major flaw lies not in the httpRange-14 rule itself, but in the
associated assumption that a URI cannot sensibly denote both an
"information resource" and a dog:
http://www.w3.org/TR/webarch/#def-information-resource
This assumption is fatally flawed because: (a) it attempts to make an
IR/non-IR distinction that can never be nailed down precisely (as
several people have pointed out); and (b) it unnecessarily elevates one
particular axis of ambiguity over all others.  It is analogous to a rule
that says "all URIs for dogs MUST distinguish between male dogs and
female dogs": the only applications that break without this rule are the
ones that *need* to distinguish between male dogs and female dogs.  All
other applications will continue to work just fine without it.   And
that is exactly the way it should be for *any* axis of ambiguity.

I agree with TimBL that it is *good* to distinguish between web pages
and dogs -- and we should encourage folks to do so -- because doing so
*does* help applications that need this distinction.  But the failure to
make this distinction does *not* break the web architecture any more
than a failure to distinguish between male dogs and female dogs.


--
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.


----------
From: *Christopher Gutteridge* <cjg@ecs.soton.ac.uk>
Date: 17 June 2011 11:21
To: David Booth <david@dbooth.org>
Cc: Linked Data community <public-lod@w3.org>, Pat Hayes <phayes@ihmc.us>,
Danny Ayers <danny.ayers@gmail.com>, Jason Borro <jason@openguid.net>, Tim
Berners-Lee <timbl@w3.org>


We've been encouraging people to do so. Most do not have the time to invest
in complexity that they perceive no benefit from adding.

We need to reward people for good semantics by making sure there are tools and
apps which add value for their business and activities.

/ Lead Developer, EPrints Project, http://eprints.org/
/ Web Projects Manager, ECS, University of Southampton,
http://www.ecs.soton.ac.uk/
/ Webmaster, Web Science Trust, http://www.webscience.org/


----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 17 June 2011 14:13
To: public-lod@w3.org


Instead of *break* what about compromising or undermining flexibility
implicit in AWWW? This is tantamount to obscuring the WWW potential relative
to its broad user constituency.

Re. schema.org, I don't regard their effort as breaking, compromising, or
undermining AWWW. I simply believe they are taking baby steps that are 100%
defined by their current business models. Rightly or wrongly so, they have
to protect their business models. In a sense, the same applies to academia
and its model where grant funding is vital to research projects.

What is dangerous though, is encouraging people to misuse and misunderstand
AWWW. Names and Addresses are distinct items. AWWW essence depends on
preserving this vital distinction.

When there are more applications (+1 to Henry's comment about focusing on
Linked Data apps and viral patterns) this lower level matter will vapourize.


Although not present (I am too young) I am certain similar arguments arose
during the early days of silicon based computing between OS developers and
programming language developers. I certainly know these conversations did
arise when Spreadsheets vendors tackled Cell Reference functionality.

There are many useful cases in plain sight that many overlook re. power of
URIs as data conductors, integrators, and access mechanisms. I think (based
on my experience with this community and industry at large) that there is
too much focus on reinventing too many parts of the consumption stack, from
scratch. The key is to be "useful" but introduce "usefulness" unobtrusively
if you really seek uptake. Naturally, this requires understanding of what
already exists (i.e., domain and subject matter knowledge) and functionality
areas addressed by existing solutions. Sorry, but if all you do is program,
you cannot really understand the reality of end-users.

I like to make reference to Apple as a great anecdote because they've risen
from near demise to the vanguard of modern computing by exploiting the
InterWeb from the inside out, they don't see the Web as simply being about
HTML. They understand that it's a linked information space and future data
space. They utilize this insight internally in a manner that just manifests
as being "useful" to its ever growing customer base.

Remember, there's a lot of old NeXTStep still underlying what Apple does.
Also remember, the WWW was built on a NeXT machine with a lot of
inspiration from how its innards worked. Believe it or not, we are still
playing catch up (circa 2011) with NeXTStep and Unix in general re.
really smart and useful Linked Data apps :-)

Embrace history and the future gets clearer and much more exciting. We have
an unbelievable opportunity within grasp. We can embrace and extend (in a
good way) what we may perceive as imperfections by others (e.g. schema.org).
As Pat stated in an earlier post, these imperfections present opportunities
that might even span decades before the behemoths out there hit their
respective opportunity cost thresholds. Once said thresholds are hit they
will respond accordingly via product fixes and/or enterprise acquisitions
etc.

Contrary to popular belief, I will state once again that HTTP 303 is the
poster child for ingenuity inherent in the HTTP protocol and the AWWW.  Yes,
we could also up the semantic smarts on clients and let a retrieved resource
disambiguate Names and Addresses, but that only adds a burden to a target
audience that's already challenged re:

1. recognizing linked data structures via directed graphs
2. recognizing that linked data structures have always been about links and
that HTTP URIs are a powerful vehicle for expanding this concept to InterWeb
scales
3. recognizing that de-reference (indirection) and address-of operations are
achievable via URIs and cost-effectively so via HTTP URIs due to WWW
ubiquity
4. understanding that RDF is *an option* for linked data structures at
InterWeb scales, you can use other syntaxes without losing access to really
useful stuff like RDFS and OWL semantics (which also suffers from over
emphasis on RDF at expense of core syntax agnostic concepts).


Links:

1. http://en.wikipedia.org/wiki/Spreadsheet#Cells
2. http://en.wikipedia.org/wiki/Spreadsheet#Named_cells .

----------
From: *Nathan* <nathan@webr3.org>
Date: 17 June 2011 22:42
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Pat Hayes <phayes@ihmc.us>, Jason Borro <jason@openguid.net>, Linked
Data community <public-lod@w3.org>


You could use the same name for both if each name was always coupled to a
universe, specified by the predicate, and you cut out type information from
data, such that:

 <x-sasha> :animalname "sasha" ; :created "2011...." .

was read as:

 Animal(<x-sasha>) :animalname "sasha" .
 Document(<x-sasha>) :created "2011...." .

the ability to do this could be pushed on to ontologies, with domain and
range and restrictions specifying universes and boundaries - but it's a big
change.
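
A minimal sketch of that reading, assuming a hand-written table that assigns
each predicate to a universe. The table, the function name and the tuple
representation of triples are all hypothetical; in the proposal above the
table would come from the ontology (domain, range, restrictions).

```python
# Hypothetical predicate-to-universe table.
PREDICATE_UNIVERSE = {
    ":animalname": "Animal",
    ":created": "Document",
}


def split_by_universe(triples):
    """Group statements by (universe, name), so that one name is read as an
    animal under some predicates and as a document under others."""
    views = {}
    for subject, predicate, obj in triples:
        universe = PREDICATE_UNIVERSE[predicate]
        views.setdefault((universe, subject), []).append((predicate, obj))
    return views


triples = [
    ("<x-sasha>", ":animalname", "sasha"),
    ("<x-sasha>", ":created", "2011...."),
]

views = split_by_universe(triples)
for (universe, name), statements in views.items():
    print(f"{universe}({name})", statements)
```

Every consumer has to agree on the table for this to work, which is one way
to see why it is a big change.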

really, different names for different things is quite simple to stick to,
and considering most (virtually all) documents on the web have several
different elements and identifiable things, the one page one subject thing
isn't worth spending too much time focusing on as a generic use case, as any
solution based on it won't apply to the web at large which is very diverse
and packed full of lots of potentially identifiable things.

best, nathan

----------
From: *Nathan* <nathan@webr3.org>
Date: 17 June 2011 22:43
To: Alan Ruttenberg <alanruttenberg@gmail.com>
Cc: Jason Borro <jason@openguid.net>, Linked Data community <
public-lod@w3.org>


well said, although I think we could bracket yourself in that category too
:)



----------
From: *Henry Story* <henry.story@bblfish.net>
Date: 17 June 2011 23:17
To: nathan@webr3.org
Cc: Danny Ayers <danny.ayers@gmail.com>, Pat Hayes <phayes@ihmc.us>, Jason
Borro <jason@openguid.net>, Linked Data community <public-lod@w3.org>


No, it's quite simple in fact, as I pointed out in a couple of e-mails in this
thread. You just need to be careful when creating relations that certain
relations are in fact inferred relations between primary topics.
yes, but there are a lot of people who say it is too complicated. I don't
find it so, but perhaps it is for their use cases. I say that we describe
the option they like, find out what limitations they will run into,
and document it. Then next time we can refer others to that discovery.

So limitations to look for would be limitations as to the complexity of the
data created. The other limitation is that even on simple blog pages there
are at least three or four things on the page.
indeed.
agree. But it is one of those things that newbies feel the urge to do, and
will keep on wanting to do. So perhaps for them one should have special
simple ontologies or guides for how to build these ObjectDocument
ontologies. In any case this seems to be the type of thing the microformats
people were (are?) doing.

Henry


>
> best, nathan

Social Web Architect
http://bblfish.net/


----------
From: *Nathan* <nathan@webr3.org>
Date: 17 June 2011 23:27
To: Henry Story <henry.story@bblfish.net>
Cc: Danny Ayers <danny.ayers@gmail.com>, Pat Hayes <phayes@ihmc.us>, Jason
Borro <jason@openguid.net>, Linked Data community <public-lod@w3.org>


I'd agree, but anything that involves being careful is pretty much doomed to
failure on the web :p there's also a primary limitation of the programming
languages developers are using, if they've got locked in stone classes and
objects, or even just structures, then the dynamics of RDF can be pretty
hard to both understand mentally, and use practically. hmm.. microformats
seems to be pretty focussed on describing multiple items on one page,
however the singularity is present in that they focussed on being described
using a single Class Blueprint style, one class, a predetermined set of
properties belonging to the class, and a simple chained hierarchy - this
stems from most OO based languages.

With a bit of trickery you can use RDF and OWL the same way, it just means
you have different "views" over the data, where you can see Human(x) with a
set of properties, or Male(x) with another set, or Administrator(x) with yet
another set. This is less about the data published and more about how it's
consumed, viewed and processed though.
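
A rough sketch of the "views" idea, assuming each class is simply a named set
of properties that the consumer picks out of whatever data was published. The
class-to-property table and the sample record are hypothetical.

```python
# Hypothetical: which properties each class "view" exposes.
VIEW_PROPERTIES = {
    "Human": {"name"},
    "Male": {"name", "gender"},
    "Administrator": {"name", "manages"},
}


def view(class_name, record):
    """Project one published record onto the properties of a single class,
    the way a fixed-class OO consumer would want to see it."""
    wanted = VIEW_PROPERTIES[class_name]
    return {prop: value for prop, value in record.items() if prop in wanted}


# The published data is not tied to any one class.
x = {"name": "x", "gender": "male", "manages": "wiki"}

print(view("Human", x))
print(view("Administrator", x))
```

The published record stays the same; only the consumer's projection changes,
which matches the point that this is about consumption rather than
publication.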

Quite sure something can be done with that, where the simple version of the
data uses a basic schema.org like ontology, and advanced usage is more RDF
like using multiple ontologies. The "views" thing would be a way to merge
the two approaches..

Best,

Nathan

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 18 June 2011 20:40
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <
alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>,
Michael Hausenblas <michael.hausenblas@deri.org>


That's all that's necessary to square this circle.
All HTTP delivers is representations of named resources. (I very much
do think a car is a representation of itself in HTTP terms, in the
same way a document is, but it isn't necessary here).
I'll be delighted to drop it, I thought we were getting stuck in a tar
pit but your statement above is the er, oil, that gets us out.

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 18 June 2011 20:51
To: David Booth <david@dbooth.org>
Cc: Linked Data community <public-lod@w3.org>, Pat Hayes <phayes@ihmc.us>,
Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org>


Thanks David, a nice summary of the most important point IMHO.

Ok, I've been trying to rationalize the case where there is a failure
to make the distinction, but that's very much secondary to the fact
that nothing really gets broken.

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 19 June 2011 06:05
To: Danny Ayers <danny.ayers@gmail.com>
Cc: David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>,
Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org>


Really (sorry to keep raining on the parade, but) it is not as simple as
this. Look, it is indeed easy to not bother distinguishing male from female
dogs. One simply talks of dogs without mentioning gender, and there is a lot
that can be said about dogs without getting into that second topic. But
confusing web pages, or documents more generally, with the things the
documents are about, now that does matter a lot more, simply because it is
virtually impossible to say *anything* about documents-or-things without
immediately being clear which of them - documents or things - one is talking
about. And there is a good reason why this particular confusion is so
destructive. Unlike the dogs-vs-bitches case, the difference between the
document and its topic, the thing, is that one is ABOUT the other. This is
not simply a matter of ignoring some potentially relevant information (the
gender of the dog) because one is temporarily not concerned with it: it is
two different ways of using the very names that are the fabric of the
descriptive representations themselves. It confuses language with language
use, confuses language with meta-language. It is like saying giraffe has
seven letters rather than "giraffe" has seven letters. Maybe this does not
break Web architecture, but it certainly breaks **semantic** architecture.
It completely destroys any semantic coherence we might, in some perhaps
impossibly optimistic vision of the future, manage to create within the
semantic web. So yes indeed, the Web will go on happily confusing things
with documents, partly because the Web really has no actual contact with
things at all: it is entirely constructed from documents (in a wide sense).
But the SEMANTIC Web will wither and die, or perhaps be still-born, if it
cannot find some way to keep use and mention separate and coherent. So far,
http-range-14 is the only viable suggestion I have seen for how to do this.
If anyone has a better one, let us discuss it. But just blandly assuming
that it will all come out in the wash is a bad idea. It won't.

Pat

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 19 June 2011 08:43
To: Pat Hayes <phayes@ihmc.us>
Cc: David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>,
Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org>


Point taken Pat but I have been in the same ring as you for many
years, but to progress the Web ---- can't we just take our hands off
the wheel, let it go where it wants. (Not that I have any influence,
and realistically you neither Pat). I'm now just back from a
sabbatical, but right now would probably be a good time to take one.
If these big companies do engage on the "microdata" front, it's great.
I'm sure it's been said before, why don't we get pornographers working
hard on their metadata on visuals, because they work for Google/Bing
whatever. The motivation right now might not be towards Tim's day one
goals of sharing some stuff between departments at CERN, but that's
irrelevant in the longer term. Getting the Web as an
infrastructure for data seems like a significant step in human
evolution. And it's a no-brainer. But getting from where we are to
there is tricky. Honestly, I don't care. It'll happen, my remaining
lifespan or about 50 on top, there will be another, big, revolution.

Society is already so different, just with little mobile phones.

/gak I'm not going to speculate, we're heading for a major change.

Cheers,
Danny.
--
http://danny.ayers.name

----------
From: *Henry Story* <henry.story@bblfish.net>
Date: 19 June 2011 12:37
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>,
Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>,
Tim Berners-Lee <timbl@w3.org>


The way to do this is to build applications where this thing matters. So for
example in the social web we could build
a slightly more evolved "like" protocol/ontology, which would be
decentralised for one, but would also allow one to distinguish documents,
from other parts of documents and things. So one could then say that one
wishes to bring people's attention to a well written article on a rape,
rather than having to "like" the rape. Or that one wishes to bring people's
attention to the content of an article without having to "like" the style
the article is written in.

If such applications take hold, and there is a way the logic of using these
applications is made to work where these distinctions become useful and
visible to the end user, then there will be millions of vocal supporters of
this distinction - which we know exists, which programmers know exists,
which pretty much everyone knows exists, but which people new to the semweb,
like the early questioners of the viability of the "mouse" and the
endless debates about that animal, will question because they can't feel in
their bones the reality of this thing.
Well hash uris are of course a lot easier to understand. http-range-14 is
clearly a solution which is good to know about but that will have an
adoption problem.
I am of the view that this has been discussed to death, and that any mailing
list that discusses this is short of real things to do.

One could argue much more fruitfully on DocumentObject ontologies, and it
would be interesting to see where that leads one.
Well these are logical necessities you are speaking of. So it will come out
in the wash. Just like 2+2=4, those who wish to ignore it will lose out in
a number of transactions.

So the fun thing is that we can find completely coherent ontologies that
don't break the semweb and that would allow Richard Cyganiak to write

> <http://richard.cyganiak.de/> a foaf:Document;
>   dofoaf:name "Richard Cyganiak";
>   dc:title "Richard Cyganiak's homepage";
>   dofoaf:knows <http://bblfish.net/> .

It looks like here that the document has been confused with the object, but
in fact the relations are designed so that they indirectly refer to
something else. Now it is not clear that this is easier or less confusing to
write than pure foaf. But it does make it look like what Danny wants to have
is happening, namely that the document refers to the thing too - assuming a
document only refers to one thing. But that is already the main problem.
Even an image never refers to one thing only. Take a simple image of the
eiffel tower: there can be cars in it, there can be birds, mice, rats
(ratatouille), and many other creatures jumping around on people's heads.
The higher the resolution the more things that picture can be said to refer
to. So to know which is the primary topic of an image one would nearly need
to add a new relation to express that.
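
A sketch of how such indirect relations could be unpacked, assuming that
dofoaf:name and dofoaf:knows are shorthand for foaf:name and foaf:knows
applied to each document's foaf:primaryTopic. The dofoaf vocabulary is the
hypothetical one from the example above; the blank-node labels and the tuple
representation are made up for the illustration.

```python
def expand(triples):
    """Rewrite document-level dofoaf:* statements into plain FOAF statements
    about the documents' primary topics."""
    topics = {}  # document URI -> blank node standing for its primary topic
    out = []

    def topic_of(doc):
        if doc not in topics:
            topics[doc] = f"_:t{len(topics) + 1}"
        return topics[doc]

    def emit(triple):
        if triple not in out:  # avoid repeating primaryTopic statements
            out.append(triple)

    for s, p, o in triples:
        if p == "dofoaf:name":
            ts = topic_of(s)
            emit((s, "foaf:primaryTopic", ts))
            emit((ts, "foaf:name", o))
        elif p == "dofoaf:knows":
            ts = topic_of(s)
            to = topic_of(o)
            emit((s, "foaf:primaryTopic", ts))
            emit((o, "foaf:primaryTopic", to))
            emit((ts, "foaf:knows", to))
        else:
            emit((s, p, o))  # ordinary statements about the document itself
    return out


R = "<http://richard.cyganiak.de/>"
H = "<http://bblfish.net/>"
result = expand([
    (R, "rdf:type", "foaf:Document"),
    (R, "dofoaf:name", "Richard Cyganiak"),
    (R, "dc:title", "Richard Cyganiak's homepage"),
    (R, "dofoaf:knows", H),
])
for triple in result:
    print(triple)
```

The sketch gives each document exactly one topic node, which is precisely the
single-topic assumption questioned above.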

Henry

----------
From: *Hugh Glaser* <hg@ecs.soton.ac.uk>
Date: 19 June 2011 13:05
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>,
Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>,
Tim Berners-Lee <timbl@w3.org>


"A step too far"?

Hi.
I've sort of been waiting for someone to say:
"I have a system that consumes RDF from the world out there (eg dbpedia),
and it would break and be unfixable if the sources didn't do 303 or #."
Plenty of people saying they can't express what they want without it.
And plenty of people saying that some code they write might not be able to
understand some RDF it receives properly.
But no actual examples in the wild (at least as far as I can tell in a lot
of messages).

This might be for quite a few reasons, such as:
1) There are no such consuming systems;
2) The existing consuming systems would not break.

Number (1) would be too embarrassing, and is wrong because I have some, so
I'll think about number (2).

There seem to be some axes in the discussion:
publish / consume
long/medium term / shorter term
ideal / pragmatic
Interestingly, we don't seem to have a strong theory / practice axis, which
is great.

As a publisher, I/we have had to work pretty hard to conform to really quite
complex requirements for publishing RDF as Linked Data; not just Range-14,
but voiD, sitemaps and various bits and pieces that Kingsley always tells me
to do in the RDF.
As a consumer, it has been pretty simple: "Well guv, thanks for the URI,
here's some RDF."
It has always been something of a source of angst (if not actual pain) to me
that none of the extra work I put into publishing RDF is ever used by me or
anyone else, as far as I know.
In fact, some of the sites I consume actually don't do things "properly" - I
might have had to change my consuming systems to cope with this, but I
don't, because they already cope fine.
Why is it not a problem? One obvious reason is that the consuming
application is actually looking for specific knowledge about things.
I don't have a consuming system that is considering both lexical and animal
subjects, and so confusion does not arise.
In fact, it is the predicates that tend to distinguish satisfactorily for me
(as has been pointed out by some people).
Thus, if I get a triple that says the URI that would resolve to my Facebook
page foaf:knows the URI that would resolve to your Facebook page, I (my
system) will happily interpret that as one person (or whatever) foaf:knows
the other. I certainly don't want to go and resolve these to find out to
what the URIs actually resolve. And if I did, what would I do about it?
Ignore it?
In fact, as has also been mentioned, you can define domains, ranges and
restrictions for as long as you like, but it is quite possible and likely
that the users of URIs will continue blissfully unaware of any of this, in
exactly the same way that they continue unaware that there might be
something ambiguous about the URIs they are using.
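
A minimal sketch of the predicate-driven consumption described above. The
facebook.example URIs and the sample triples are made up for illustration;
only the foaf:knows and dcterms:modified predicate URIs are real. The
consumer never dereferences anything: the predicate alone decides how the
subject and object are read.

```python
# Real predicate URIs; everything under facebook.example is hypothetical.
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
DC_MODIFIED = "http://purl.org/dc/terms/modified"

triples = [
    ("http://facebook.example/hugh", FOAF_KNOWS, "http://facebook.example/pat"),
    ("http://facebook.example/hugh", DC_MODIFIED, "2011-06-19"),
]


def acquaintances(triples):
    """Collect (person, person) pairs from foaf:knows triples.

    Both ends of foaf:knows are read as people, whatever the URIs would
    resolve to. Triples with other predicates are simply ignored.
    """
    return [(s, o) for s, p, o in triples if p == FOAF_KNOWS]


pairs = acquaintances(triples)
```

Under these assumptions the page-level statement (the modification date)
never reaches the social-network code, so the ambiguity has nowhere to bite.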

By the way, as is well-known I think, a lot of people use and therefore must
be happy with URIs that are not Range-14 compliant, such as
http://www.w3.org/2000/01/rdf-schema .

When we help people publish, it really is tough to engage them long enough
to care about the complex issues, and they often get it wrong - I am engaged
with quite a few people who are now publishing serious amounts of
interesting RDF where I have contacted them to try to help. The status of
the conversations is that they have fixed what they can, and are now
thinking (for a long time) about how they might configure their systems to
do it properly - but they may never get there. I will still want to use
their RDF.

So, trying to be a little brief:
I have always felt that the full Range-14 distinction was in danger of being
a Step Too Far.
Yes, it does matter, and it is likely (or at least possible) we will pay a
price in the end.
But the world is trying to pass us by - it has at least pulled alongside.
We must work out why we seem to have lost any lead we had, because it is
likely to be the same reason we will get left behind.
And I happen to believe that what we have can be better than the
alternatives.

Sorry Pat, I don't actually have a proposal.
But I do know we need to be liberal in what we consume.
And we might need to be a bit more liberal in what we praise, or at least be
nicer to people who want to publish RDF and don't do Range-14.

Best
Hugh
--
Hugh Glaser,
             Intelligence, Agents, Multimedia
             School of Electronics and Computer Science,
             University of Southampton,
             Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/



----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 19 June 2011 13:23
To: public-lod@w3.org


Danny,

Do you agree with HTTP-range-14 finding or not?

My gripe with HTTP-range-14 is all about aesthetic matters re. language and
anecdote choices, not the core concept it attempts to articulate. If you
clearly state your gripe in similar terms there could be a chance of
yourself and Pat actually realizing that you are in agreement. Personally,
I've always assumed you clearly grokked why Name and Address disambiguation
is vital re. Web's data space dimension. I am suspecting that you are
saying: we should find ways to co-exist with initiatives (e.g. schema.org)
that haven't addressed these matters, just yet etc..

Note: many are grappling with how to construct viable business models from
Linked Data, thus in some cases you will have services that look like they
don't care about Name and Address disambiguation on the outside, courtesy of
their publicly accessible resources, while in reality they understand these
matters very well and have put them to use for a while. Remember, a URI
doesn't have to be public :-)  I think the debate will ultimately be more
about getting these big players to share their more powerful URIs with the
public via services and apps from communities like this that make the
opportunity costs of these big players palpable :-)


Kingsley

----------
From: *Henry Story* <henry.story@bblfish.net>
Date: 19 June 2011 13:44
To: Hugh Glaser <hg@ecs.soton.ac.uk>
Cc: Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David
Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason
Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org>


As you point out there are some consuming systems but they are not very
distributed: you know ahead of time what you will find there, and so you can
adapt your parsing for the few special cases. At that level the XML
crowd/JSON crowd are right - rdf does not give you much. In fact it makes it
easier to do things wrong. So we should be supporting more RESTful XML that
can be GRDDLed with X-SPARQL.

The semweb gives you a lot more when things get even more distributed, such
as when everyone starts having foaf files on billions of computers. At that
point nobody will want to tweak their app for the specific data at one site.
Also one will want to be careful of the difference between documents and
things, for the same reason I pointed out with the "like" button in
Facebook. So for the moment the errors don't appear, because we are few
consumers and few producers, and we can work around mistakes manually on a
case by case basis.

To get a real linked data application you need:
 1- data that is produced in a completely decentralised way
 2- data that is linked between those decentralised nodes
 3- data that is consumed, and where the consumption has real world effects


Number 3 is the recursive feedback piece that will make 1 and 2 come to a
point of stability, or meta-stability, as we are dealing with self
organising systems here.

This can be done with the social web. We need systems where you publishing
data means that I can do something, learn something about you, and so on...
but without you ever knowing ahead of time what software or services we are
using.
(( The Twitters and other Web2.0 folks have made their life easy by
centralising data publishing and consumption as much as possible. For
systems like that there is no real communication problem: there is a central
dictator and he says what the meaning of the terms is. As things evolve that
part even escapes him - the way office document formats escaped M$ - because
of the huge number of people and software dependent on the initial meaning
produced.))

If I write things out wrong, your software should be able to let me know
about it. Just as if we organise to meet but we give each other the wrong
address, we will end up missing the meeting. If this were not so then giving
out addresses and organising meetings would be a very different exercise.

Yes, my point has been we need to work on small vocabularies, widely
distributed, widely used, to kick start the rest of the system.

And as pointed out above they are not that distributed, and the consequences
of things going wrong on a lot of the open data stack are not that big yet.
Also you are probably not putting up reasoners yet.

Yes, one can do a lot with incoherent data if one ignores the incoherence,
or just follows through some networks like that. I try to follow these
guidelines more for reasons of sanity. They are simple to follow, and help
one think about the issues.

That is the problem that will only appear if people don't consume the data,
or if the data is known ahead of time to be pretty inconsistent, as dbpedia
data probably is.

Yes, in these case by case scenarios it is easy for you to write special
case filters. And we could do the same thing with HTML whenever we browse
the web too. But the web had an application, the browser, that led to
feedback effects that increased the coherence of the system.

It has not passed by, it is not building for the distributed data. The big
players are creating silos of information and getting rich off that. But the
value of distributed information is much greater than what they are building
- even if it is hard to believe. In any case we have no choice: the big guys
are rich already. We can either be their slaves or be free by working
together, and grow so big together that we tie them into our much larger
system :-)

We are not behind. We are way ahead. The arrows in your back are a testament
to that :-)

----------
From: *Dave Reynolds* <dave.e.reynolds@gmail.com>
Date: 19 June 2011 13:45
To: Hugh Glaser <hg@ecs.soton.ac.uk>
Cc: Linked Data community <public-lod@w3.org>


Hi Hugh,
Your general point that there is non-compliant data out there that
people are still able to make use of is probably right, but that
specific example is compliant - those are all (even the ontology URI)
hash-URIs.

Dave




----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 19 June 2011 14:04
To: public-lod@w3.org


Er. we use it :-)

The problem with this whole Linked Data thing is that it's truly Ninja tech.

The killer conductor of value is the LINK. This lethal weapon applies to all
dimensions of the Web:

1. Information Space
2. Data Space
3. Knowledge Space.

Trouble is, where do we find strong anecdotes for a cross dimensional lethal
weapon? I try to use Star Wars and the FORCE at times, but even that
doesn't quite nail what we are dealing with here. Thus, we could take
another approach i.e., embrace and extend what we know is anomalous since
the AWWW architecture (FORCE) actually lets us do this anyway.

Exactly! You are using the FORCE :-) You have a Data Space dimension app.
The Information Space dimension doesn't interfere with your world view. This
is key in many ways. For instance, imagine if your app was of the
Information Space dimension instead, the effect would be very close to what
we see today re. those that see Name and Address disambiguation as
impractical overkill since nothing breaks in the world they experience.

Yep! The Data Space realm lets you Describe anything with clarity, and even
when unclear, agents can ultimately agree to disagree without obliteration.
As you would in code generally, encounter an exception, and decide if you
avoid making it a critical fault :-)

Yes, when they operate in the Information Space dimension.

In the Information Space dimension, yes. In that dimension it doesn't
matter.

Yes, and all you do is show them a tweaked version of their RDF, should they
wander by your data space (which is grounded in the Data Space realm).

It's fine, we just can't present it in edict form to people experiencing and
operating with the Information Space dimension of the WWW.

You betcha! IMHO. People are doing what they always do: ignore warnings and
scramble desperately for cures, post calamity. Note, in most cases, using
the industry behemoths as examples, calamity == business model erosion
courtesy of exponentially increasing opportunity costs.

We need to accept that the WWW has many dimensions to it, Information, Data,
and Knowledge. Thus, we can't speak from the Data Space dimension to folks
in the Information Space dimension and expect immediate comprehension. We
could (hence power of HTTP 200 OK) operate within the Information Space
dimension and unveil the Data Space dimension. Like all contextual matters,
we have to align "context lenses" in order for us to develop constructive
dialog. This is why "embrace and extend" (not the way Microsoft did it many
years ago) is the way to go re. unveiling Data Space dimension from the
Information Space dimension.

My proposal is this: we just need to be more accommodating of what we may
perceive as imperfections, in our data space oriented context. We should
always embrace structured data contributions in any form. We can transform
structured data to high fidelity linked data in a myriad of ways that
ultimately help others comprehend what's taking shape re. the WWW as a
Global Data Space.

+1000

+1000


Kingsley

----------
From: *Henry Story* <henry.story@bblfish.net>
Date: 19 June 2011 14:39
To: Kingsley Idehen <kidehen@openlinksw.com>
Cc: public-lod@w3.org


That's a fun way of describing things. But we have to be careful not to hype
things too much, or we risk being tied into the 1980s AI hype space, and then
nobody will listen anymore.

Perhaps a more scientific way to express this is within the language of
self-organising systems. There is a lot of research there which is relevant
to us.

 http://en.wikipedia.org/wiki/Self_organising_systems

I am a bit new to this area. Any books I must read?

Henry

----------
From: *Hugh Glaser* <hg@ecs.soton.ac.uk>
Date: 19 June 2011 15:09
To: Henry Story <henry.story@bblfish.net>
Cc: Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David
Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason
Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org>


Thanks Henry.
Just to be clear on one point:

On 19 Jun 2011, at 12:44, Henry Story wrote:
<snip />
<snip />
But I don't write special case filters - if I did, I would not consider it
Semantic Web.
I simply follow my nose to use the URI (or in fact usually via an owl:sameAs
in a sameas store), and they work.
It all works because my code that consumes the retrieved RDF to build the
data enrichment by inference (things like the communities of practice), and
things like my fresnel lenses, restrict any ambiguity by looking for the
predicates, etc. they care about.
RDF can be a long way short of what we want it to be without having to treat
it as special cases.
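
A rough sketch of that follow-your-nose pattern, assuming the sameas store
can be modelled as a list of URI pairs and a lens as a set of wanted
predicates. The function names `same_as_bundle` and `lens` and all the
*.example URIs are hypothetical; this is not the actual sameas store or
Fresnel API.

```python
from collections import defaultdict

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"


def same_as_bundle(uri, same_as_pairs):
    """Return every URI reachable from `uri` through owl:sameAs links."""
    neighbours = defaultdict(set)
    for a, b in same_as_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, todo = {uri}, [uri]
    while todo:
        for nxt in neighbours[todo.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


def lens(uri, same_as_pairs, triples, wanted_predicates):
    """Keep only (predicate, object) pairs the lens cares about."""
    bundle = same_as_bundle(uri, same_as_pairs)
    return [(p, o) for s, p, o in triples
            if s in bundle and p in wanted_predicates]


# Hypothetical data: two URIs for one person, published by different sites.
same_as = [("http://a.example/hugh", "http://b.example/person/42")]
triples = [
    ("http://a.example/hugh", FOAF_NAME, "Hugh"),
    ("http://b.example/person/42", FOAF_KNOWS, "http://c.example/pat"),
    ("http://b.example/person/42", "http://purl.org/dc/terms/modified", "2011-06-19"),
]
view = lens("http://a.example/hugh", same_as, triples, {FOAF_NAME, FOAF_KNOWS})
```

No site-specific filter appears anywhere: the only selection is by
predicate, which is the sense in which nothing is treated as a special case.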
----------
From: *Henry Story* <henry.story@bblfish.net>
Date: 19 June 2011 15:25
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Pat Hayes <phayes@ihmc.us>, Richard Cyganiak <richard@cyganiak.de>, Alan
Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <
public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>



On 12 Jun 2011, at 14:40, Danny Ayers wrote:

> [snip]
A photo and a graph work in essentially the same way. They both set
restrictions on possible worlds of which they are true. A photo restricts
the number of possible worlds to those that are visually equivalent to the
picture taken. A graph is true of all the possible worlds where those
relations hold - a set which is usually infinitely large.

In either case the meaning of a graph or document is a set of possible
worlds. A set is an object - one can speak of it - but a very different kind
of object from what you may think of as what appears in the picture. As such
there is indeed a fundamental logical difference between a document and
objects in the world. And that also explains why a photo is not clearly
about one thing or another - though of course given that it is a restriction
on the way things can be, it limits the things the document could be about.
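
The possible-worlds reading can be illustrated with a toy model, assuming a
world is just a set of atomic facts and a description (photo or graph) is
true of every world that contains all of its facts. The three facts and the
helper names are invented for the illustration.

```python
from itertools import combinations

# Hypothetical atomic facts; a real world would have unboundedly many.
FACTS = ["tower in frame", "car in frame", "bird in frame"]


def all_worlds(facts):
    """Every subset of `facts` counts as one toy possible world."""
    return [frozenset(c)
            for n in range(len(facts) + 1)
            for c in combinations(facts, n)]


def worlds_satisfying(description, worlds):
    """The meaning of a description: the set of worlds it is true of."""
    return [w for w in worlds if description <= w]


worlds = all_worlds(FACTS)
loose = worlds_satisfying({"tower in frame"}, worlds)
tight = worlds_satisfying({"tower in frame", "car in frame"}, worlds)
```

Here `loose` has four worlds and `tight` has two: saying more rules more
worlds out. Neither result is the tower itself, which is the logical gap
between a document and the objects it is about.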

As stated in a previous mail, the same photo can be about the Eiffel Tower,
a sunset, a beautiful view of Paris, a vacation experience, a friend that
appears in the picture, a murder that was committed at that moment,... The
photo remains the same in all those descriptions, and it can be tagged in
all those ways, which is why it is good to have names for each of those
things that are different from the photo. Each of those should have definite
descriptions to help identify the referents from the description.

----------
From: *Hugh Glaser* <hg@ecs.soton.ac.uk>
Date: 19 June 2011 15:26
To: Kingsley Idehen <kidehen@openlinksw.com>
Cc: "<public-lod@w3.org>" <public-lod@w3.org>


Er, I'm not sure you do :-)
You certainly consume it, and a very nice job you do too.
But the "use" is more than generic browsers - it suggests to me that
something useful might happen as a result of the consumption (perhaps I
learn that I can ask Jim to introduce me to Mary, as he knows her better
than anyone else I know).
These things are usually called applications, or possibly services.
They tend to be reasonably domain-specific, as generic things tend not to be
easy to use, or even fit for purpose for end users.
Sorry if I have missed stuff.

----------
From: *Hugh Glaser* <hg@ecs.soton.ac.uk>
Date: 19 June 2011 15:38
To: Dave Reynolds <dave.e.reynolds@gmail.com>
Cc: Linked Data community <public-lod@w3.org>


I know, I know - as I pressed the send button I thought uh-uh :-)
Sorry.
Mind you, I deliberately left the # off the URI and I think I got confused
about ...
Oh never mind - sorry.
>
> Dave

----------
From: *Henry Story* <henry.story@bblfish.net>
Date: 19 June 2011 15:48
To: Hugh Glaser <hg@ecs.soton.ac.uk>
Cc: Kingsley Idehen <kidehen@openlinksw.com>, "<public-lod@w3.org>" <
public-lod@w3.org>


exactly.  At that level you start using the specific logic of some
relations, here perhaps the foaf:knows relation, which goes beyond the very
lightly constraining high-level rdfs or owl framework. One might say that
one only really uses foaf:knows when one has software that understands the
specific intension of that relationship.
> They tend to be reasonably domain-specific, as generic things tend not to
be easy to use, or even fit for purpose for end users.

yes. And since we are working in a self organising system, these
applications have to be designed so that every use grows the value of the
network, and creates incentives for correct data to be published, and
maintained.
In a recent e-mail Hugh also wrote in reply to me:
> RDF can be a long way short of what we want it to be without having to
treat it as special cases.


yes, we will be dealing with inconsistent data whatever we do. But we need
ways of telling when things are inconsistent so that we can then recognise
when this is the case and find ways around things. As I mentioned, I think
we don't recognise inconsistency much because few people use inferencing.
And inferencing need not just be owl inferencing, it can be the type of
inferencing that comes from human understanding of what it means to
foaf:know someone, or other terms with particular complex intentions.

----------
From: *Tim Berners-Lee* <timbl@w3.org>
Date: 19 June 2011 17:13
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>,
Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>


Absolutely, Pat. Well said.
This is really important.

Can we please stop the madness of confusing things with documents about them
and do what we want to do cleanly and in an efficient way.

Tim

----------
From: *Nathan* <nathan@webr3.org>
Date: 19 June 2011 17:33
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>,
Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>,
Tim Berners-Lee <timbl@w3.org>


Exactly,

Things become even clearer when you add in a messenger.

A messenger carried a message about an erupting volcano. To conflate the
message and the subject of the message is to say that a messenger carried an
erupting volcano, which is nonsense.

We've long since known not to conflate the Messenger with the Message; this
is why we don't shoot the messenger. However, I think this is possibly the
first time in history where we've questioned whether the message and the
subject(s) of the message were different things or not.

Best,

Nathan

----------
From: *Giovanni Tummarello* <giovanni.tummarello@deri.org>
Date: 19 June 2011 18:27
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>,
Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>,
Tim Berners-Lee <timbl@w3.org>


Could it be exactly the other way around? That documents and the things
described in them are easy to distinguish EXACTLY because one is about the
other, so no one can possibly mess them up, except for idiotic computer
algorithms from the 70s that limit themselves to symbolic AI techniques.

Otherwise you seem to say that it's more difficult to distinguish between a
dog and a bitch than it is to distinguish between a dog and a stream of
bytes returned in response to an HTTP request, and that seems a bit funny?

Look, if someone points me at a Facebook URL I know it's about a person and
not about the damn page (which has 2000 ways to change every time that URL
is resolved anyway).
I mean, we can go on and tell ourselves we can't possibly write applications
that know or understand what a Facebook URL is about.

But don't be surprised as fewer and fewer people will be willing to listen
as more and more applications (e.g. all the stuff based on schema.org) pop
up never knowing there was this problem... (not in general. Of course there
is in general, but for their specific use cases)

Gio





----------
From: *Henry Story* <henry.story@bblfish.net>
Date: 19 June 2011 18:35
To: Giovanni Tummarello <giovanni.tummarello@deri.org>
Cc: Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David
Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason
Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org>



The question is if schema.org makes the confusion, or if the schemas
published there use a DocumentObject ontology where the distinctions are
clear but the rule is that object relationships are in fact going via the
primary topic of the document. I have not looked at the schema, but it seems
that before arguing that they are inconsistent one should see if there is
not a consistent interpretation of what they are doing.


Henry



Gio


----------
From: *Nathan* <nathan@webr3.org>
Date: 19 June 2011 18:56
To: Henry Story <henry.story@bblfish.net>
Cc: Giovanni Tummarello <giovanni.tummarello@deri.org>, Pat Hayes <
phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David Booth <
david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <
jason@openguid.net>, Tim Berners-Lee <timbl@w3.org>


Sorry, I'm missing something - from what I can see, each document has a
number of items, potentially in a hierarchy, and each item is either
anonymous, or has an @itemid.

Where's the confusion between Document and Primary Subject?

----------
From: *Nathan* <nathan@webr3.org>
Date: 19 June 2011 18:58
To: Henry Story <henry.story@bblfish.net>
Cc: Giovanni Tummarello <giovanni.tummarello@deri.org>, Pat Hayes <
phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David Booth <
david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <
jason@openguid.net>, Tim Berners-Lee <timbl@w3.org>


Or do you mean from the Schema.org side, where each Type and Property has a
dereferenceable URI, which currently happens to also be used for the document
describing the Type/Property?

----------
From: *Henry Story* <henry.story@bblfish.net>
Date: 19 June 2011 19:36
To: nathan@webr3.org
Cc: Giovanni Tummarello <giovanni.tummarello@deri.org>, Pat Hayes <
phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David Booth <
david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <
jason@openguid.net>, Tim Berners-Lee <timbl@w3.org>


Well I can't really tell because I don't know what the semantics of those
annotations are, or how they function. Without those it is difficult to tell
if they have made a mistake. If there is no way of translating what they are
doing into a system that does not make the confusion, then one could explore
what the cost of that will be to them. If the confusion is strong then there
will be limitations in what they can express that way. It will then be a
matter of working out what those limitations are and then offering services
that allow one to go further than what they are proposing. At the very least
the good thing is that they are not bringing the confusion into the RDF
space, since they are using their own syntax and ontologies.

There may also be another way to fix this: they could return a 20x (x being
some new number) which points to the document URL but returns the
representation immediately, a kind of efficient HTTP-range-14 version. So
there are a lot of options. Currently their objects are tied to an html
document. What are the json crowd going to think?
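
For reference, the existing rule that such a 20x code would shortcut can be
sketched as a small decision function. This is an illustrative reading of
the httpRange-14 resolution plus the usual hash-URI convention, not a
normative implementation; the function name, the return strings and the
example.org URIs are assumptions made for the sketch.

```python
INFORMATION_RESOURCE = "information resource"
ANY_RESOURCE_HASH = "any resource (hash URI, fragment never sent to server)"
ANY_RESOURCE_303 = "any resource (303 See Other points at a description)"
UNKNOWN = "unknown"


def range14_reading(uri, status=None):
    """Return what a client may conclude about the resource `uri` names.

    `status` is the HTTP status code of a GET on the URI. It is ignored
    for hash URIs, because only the part before '#' is ever requested.
    """
    if "#" in uri:
        return ANY_RESOURCE_HASH
    if status is not None and 200 <= status < 300:
        return INFORMATION_RESOURCE
    if status == 303:
        return ANY_RESOURCE_303
    return UNKNOWN


# Hypothetical lookups; no network access is performed.
readings = [
    range14_reading("http://www.w3.org/2000/01/rdf-schema#Class"),
    range14_reading("http://example.org/doc/alice", 200),
    range14_reading("http://example.org/id/alice", 303),
    range14_reading("http://example.org/missing", 404),
]
```

A new 20x code of the kind suggested above would add a branch that gives the
303 reading while still delivering the representation in the same response.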

In any case there is a problem of translation that has to be dealt with
first.

Henry

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 19 June 2011 19:44
To: Henry Story <henry.story@bblfish.net>
Cc: Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, Linked Data
community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim
Berners-Lee <timbl@w3.org>


On 19 June 2011 12:37, Henry Story <henry.story@bblfish.net> wrote:
>
[snip pat]
I would have come down on you like a ton of bricks for that Henry, if
it wasn't for seeing to-and-fro on Facebook about some Nazi-inspired
club (Slimelight, for the record). On FB there is no way to express
your sentiments. Like/blow to smithereens.
I confess to talking bollocks when I should be coding.

----------
From: *Henry Story* <henry.story@bblfish.net>
Date: 19 June 2011 19:52
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, Linked Data
community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim
Berners-Lee <timbl@w3.org>


yeah, me too. Though now you folks managed to get me interested in this
problem! (sigh)

Henry

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 19 June 2011 20:03
To: Henry Story <henry.story@bblfish.net>
Cc: nathan@webr3.org, Giovanni Tummarello <giovanni.tummarello@deri.org>,
Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, Linked Data
community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim
Berners-Lee <timbl@w3.org>


I thought forever that if we see iniquities we are duty-bound to stand
in the way.

But that don't seem to change anything.

Let the crap rain forth, if you really need to make sense of it the
blokes on this list will do it.

Activity is GOOD, no matter how idiotic.

Decisions made on very different premises than anyone around here would
promote.

Sorry, I'm of the opinion that the Web approach is the winner. Alas it
also seems lowest common denominator.

Cheers,
Danny.
--
http://danny.ayers.name

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 19 June 2011 20:15
To: Henry Story <henry.story@bblfish.net>
Cc: Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, Linked Data
community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim
Berners-Lee <timbl@w3.org>


Only personal Henry, but have you tried the Myers-Briggs thing - I
think you used to be classic INTP/INTF - but once you got WebID in
your sails it's very different. These things don't really allow for
change.

Only slightly off-topic, very relevant here, need to pin down WebID in
a sense my dogs can understand.

The Myers-Briggs thing is intuitively rubbish. But with only one or
two posts in the ground, it does seem you can extrapolate.
--
http://danny.ayers.name




-- 
http://danny.ayers.name
Received on Sunday, 19 June 2011 18:23:19 UTC