- From: Danny Ayers <danny.ayers@gmail.com>
- Date: Sun, 19 Jun 2011 20:22:30 +0200
- To: public-lod@w3.org
- Message-ID: <BANLkTinFpLJ7X+iVYvtd=kcKLq4frBAU5w@mail.gmail.com>
I feel very guilty being in threads like this. Shit fuck smarter people than me. Can we now close this trench down and move elsewhere?

Forwarded conversation
Subject: Squaring the HTTP-range-14 circle [was Re: Schema.org in RDF ...]
------------------------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 12 June 2011 14:40
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

On 12 June 2011 01:51, Pat Hayes <phayes@ihmc.us> wrote:
>
> On Jun 11, 2011, at 12:20 PM, Richard Cyganiak wrote:
>
>> ...
>>>> It's just that the schema.org designers don't seem to care much about the distinction between information resources and angels and pinheads. This is the prevalent attitude outside of this mailing list and we should come to terms with this.
>>>
>>> I think we should foster a greater level of respect for representation choices here. Your dismissal of the distinction between information resources and what they are about insults the efforts of many researchers and practitioners and their efforts in domains where such a distinction is quite important. Let's try not to alienate part of this community in order to interoperate with another.
>>
>> Look, Alan. I've wasted eight years arguing about that shit and defending httpRange-14, and I'm sick and tired of it. Google, Yahoo, Bing, Facebook, Freebase and the New York Times are violating httpRange-14. I consider that battle lost. I recanted. I've come to embrace agnosticism and I am not planning to waste any more time discussing these issues.
>
>
> Well, I am sympathetic to not defending HTTP-range-14 and nobody ever, ever again even mentioning "information resource", but I don't think we can just make this go away by ignoring it.
> What do we say when a URI is used both to retrieve, um sorry, identify, a Web page but is also used to refer to something which is quite definitely not a web page? What do we say when the range of a property is supposed to be, say, people, but it's considered OK to insert a string to stand in place of the person? In the first case we can just say that identifying and reference are distinct, and that one expects the web page to provide information about the referent, which is a nice comfortable doctrine but has some holes in it. (Chiefly, how then do we actually refer to a web page?) But the second is more serious, seems to me, as it violates the basic semantic model underlying all of RDF through OWL and beyond. Maybe we need to re-think this model, but if so then we really ought to be doing that re-thinking in the RDF WG right now, surely? Just declaring an impatient agnosticism and refusing to discuss these issues does not get things actually fixed here.

For pragmatic reasons I'm inclined towards Richard's pov, but it would be nice for the model to make sense.

Pat, how does this sound:

From HTTP we get the notions of resources and representations. The resource is the conceptual entity, the representations are concrete expressions of the resource.

So take a photo of my dog -

<http://example.org/sasha-photo> foaf:depicts <http://example.org/Sasha> .

If we deref http://example.org/sasha-photo then we would expect to get a bunch of bits that can be displayed as an image. But that bunch of bits may be returned with HTTP header -

Content-Type: image/jpeg

or

Content-Type: image/gif

which, for convenience, let's say correspond to files on the server called sasha-photo.jpg and sasha-photo.gif.

Aside from containing a different bunch of bits because of the encoding, sasha-photo.jpg could be a lossy-compressed version of sasha-photo.gif, containing less pixel information yet sharing many characteristics.

All ok so far..?
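[Editorial illustration, not part of the original message.] The resource/representation split described above can be sketched as a small content-negotiation routine. This is only a rough sketch under stated assumptions: the `REPRESENTATIONS` table and the helper names `parse_accept` and `negotiate` are hypothetical, and the Accept parsing is deliberately simplified compared with what a real HTTP server does.

```python
# Hypothetical table: one resource URI, two stored representations of it.
REPRESENTATIONS = {
    "http://example.org/sasha-photo": {
        "image/jpeg": "sasha-photo.jpg",
        "image/gif": "sasha-photo.gif",
    }
}


def parse_accept(header):
    """Return the media types in an Accept header, highest q-value first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0], 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((q, media_type))
    # The sort is stable, so header order is kept among equal q-values.
    prefs.sort(key=lambda p: p[0], reverse=True)
    return [m for q, m in prefs if q > 0]


def negotiate(uri, accept_header):
    """Pick (content_type, filename) for the URI, or None if nothing matches."""
    available = REPRESENTATIONS[uri]
    for wanted in parse_accept(accept_header):
        for content_type, filename in available.items():
            type_wildcard = content_type.split("/")[0] + "/*"
            if wanted in (content_type, type_wildcard, "*/*"):
                return content_type, filename
    return None


if __name__ == "__main__":
    uri = "http://example.org/sasha-photo"
    print(negotiate(uri, "image/gif;q=0.5, image/jpeg"))
    print(negotiate(uri, "image/gif"))
```

In both calls the URI stays the same; only the bunch of bits chosen for it differs, which is the point being made about representations.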
If so, from this we can determine that a representation of a resource need not be "complete" in terms of the information it contains to fulfill the RDF statement and the HTTP contract.

Now turning to http://example.org/Sasha, what happens if we deref that? Sasha isn't an information resource, so following HTTP-range-14 we would expect a redirect to (say) a text/html description of Sasha.

But what if we just got a 200 OK and some bits Content-Type: text/html ?

We are told by this that we have a representation of my dog, but from the above, is there any reason to assume it's a complete representation?

The information would presumably be a description, but is it such a leap to say that because this shares many characteristics with my dog (there will be some isomorphism between a thing and a description of a thing, right?) that this is a legitimate, however partial, representation?

In other words, what we are seeing of my dog with -

Content-Type: text/html

is just a very lossy version of her representation as -

Content-Type: physical-matter/dog

Does that make (enough) sense?

Cheers,
Danny.

--
http://danny.ayers.name

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 12 June 2011 16:29
To: public-lod@w3.org

Danny,

Quite a long route to saying:

You can use hyperlinks to Name Observation Subjects. Observation Subjects have Representations at an Address. The actual format of an Observation Subject Representation is negotiable.

The brevity challenge is a function of using hyperlinks as Names since WWW users are only accustomed to their use as Resource Locators or Addresses (URLs).

Graph Models for describing Observation Subjects have made sense for a long time, pre WWW. It's only when we try to state or infer that this is an RDF (syntax for expressing semantics) invention that all hell breaks loose, and justifiably so.
--
Regards,

Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 12 June 2011 19:19
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

Well, I am too. That is, I would love for this whole issue/problem to just go away. But I don't think ignoring it will make it go away.

OK, so far.

I would just note that (coming from a different, non-HTTP, tradition) I would never have even dreamt of any representation being "complete" in what I think is the sense you mean. So your care and emphasis here seem odd. But OK, I am following you...

Really? I thought that HTTP-range-14 just said that if we get redirected, all bets are off, and the URI might denote anything at all, so the thing that gets returned might have nothing to do with the referent.

Then (again, according to doctrine) the URI denotes the information resource which this is the HTTP-representation of. Which evidently is not Sasha.

No, but what has that got to do with anything? The key issue is that we are told that it is an information resource and hence we know it is not a dog. So we know, for example, that if someone asserts that some other dog is its father, or that it had its vet shots in February, or that it is an instance of http://sw.opencyc.org/concept/Mx4rvVjaoJwpEbGdrcN5Y29ycA , then (if we are smart) something is wrong here, or else (if we are less smart) that something on the Web has these properties.

Now, we could try this line, which I think is what you are suggesting. We could say that all such 'information resources' are being used as stand-ins for referential names themselves, i.e.
they are not things (like dogs, say) but should always be understood as referring to some other thing. There are some technical problems with this, but I'm sure we could work around them; but the serious problem with this idea is that it makes it impossible to simply refer to these information resources themselves. So we would be unable to talk about Web pages using the Web description language RDF. Frankly, this would not bother me personally very much, as I am not particularly interested in describing Web pages in RDF, but I know it would bother some other people (Tim B-L, for just one) rather a lot.

What?? Absolutely not. Descriptions are not in any way isomorphic to the things they describe. (OK, some 'diagrammatic' representations can be claimed to be, eg in cartography, but even those cases don't stand up to careful analysis, in fact.)

It is a representation, sure. The question is, what is it a representation OF?

A lossy image of a lossy image of X is itself a (very) lossy image of X. But the name of a name of X is not a name of X; and a (descriptive) representation of a representation of X is not a representation of X. For example, "written clumsily and with many spelling errors" describes "Ee were real gude at mafematiks at skool", which in turn describes me; but I am not, myself, composed of spelling errors. Reference is not transitive, in a nutshell.

Nope, absolutely not. Reference is not like lossy imaging.

Nice try, but no cigar. Want to try again?

Seriously, it is not easy to find a coherent way to allow what one might call reference slippage - using a name or description to stand in for the actual thing named - without the whole semantic framework just basically collapsing**. I know we humans do it all the time without hardly noticing, and I REALLY wish that I or someone could figure out how to capture this facility in a formal scheme of some kind. But I can't see how to do it.

Pat

** To illustrate.
Someone goes to a website about dogs, likes one of the dogs, and buys it on-line. He goes to collect the dog, the shopkeeper gives him a photograph of the dog. Um, Where is the dog? Right there, says the seller, pointing to the photograph. That isn't good enough. The seller mutters a bit, goes into the back room, comes back with a much larger, crisper, glossier picture, says, is that enough of the dog for you? But the customer still isn't satisfied. The seller finds a flash card with an hour-long HD movie of the dog, and even offers, if the customer is willing to wait a week or two, to have a short novel written by a well-known author entirely about the dog. But the customer still isn't happy. The seller is at his wits' end, because he just doesn't know how to satisfy this customer. What else can I do? He asks. I don't have any better representations of the dog than these. So the customer says, look, I want the *actual dog*, not a representation of a dog. It's not a matter of getting me more information about the dog; I want the actual, smelly animal. And the seller says, what do you mean, an "actual dog"? We just deal in **representations** of dogs. There's no such thing as an actual dog. Surely you knew that when you looked at our website?

------------------------------------------------------------
IHMC                   (850)434 8903 or (650)494 3973
40 South Alcaniz St.   (850)202 4416   office
Pensacola              (850)202 4440   fax
FL 32502               (850)291 0667   mobile
phayesAT-SIGNihmc.us   http://www.ihmc.us/users/phayes

----------
From: *Alan Ruttenberg* <alanruttenberg@gmail.com>
Date: 12 June 2011 19:53
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <richard@cyganiak.de>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

That seems too strong. Just thinking about this alternative - that 200 responders (for the purposes of linked data) are not considered IRs.
Instead 200 implies an assertion (for, say, http://www.ihmc.us/users/phayes/ )

_:foo a :information-thing
_:foo :at "http://www.ihmc.us/users/phayes/"^^xsd:anyURI

(there exists an information resource accessible at http://www.ihmc.us/users/phayes/)

to which could then be asserted in your favored syntax:

_:page a :web-page
_:page :at "http://www.ihmc.us/users/phayes/"^^xsd:anyURI
_:page dc:creator <http://www.ihmc.us/users/phayes/>

This effectively flips what is now the default (you would use, e.g. foaf:primaryTopic to go in the opposite direction)

Not that I'm advocating this. For one thing there are many information things that couldn't possibly be understood as designators. (well, shouldn't ;-)

-Alan

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 13 June 2011 01:13
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

Beh! Some isomorphism is all I ask for. Take your height and shoe size - those numeric descriptions will correspond 1:1 with aspects of the reality. Keep going to a waxwork model of you, the path you walked in the park this afternoon - are you suggesting there's no isomorphism?

Lovely imagery, thanks Pat. But replace "a novel written by a dog" for "dog" in the above. Why should the concept of a document be fundamentally any different from the concept of a dog, hence representations of a document and representations of a dog? Ok, you can squeeze something over the wire that represents "a novel written by a dog" but you (probably) can't squeeze a "dog" over, but that's just a limitation of the protocol. There's equally an *actual* document (as a bunch of bits) and an *actual* dog (as a bunch of cells).

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 13 June 2011 02:28
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

Yes, in fact I am *denying* there is *any* isomorphism. What structures are you intending to appeal to when you say 'isomorphic'? Do you see reality as being some kind of giant category? Or what?

Let's suppose that the interpretation/denotation/semantic/reference mapping goes from the representation to the reality. (Since it's an isomorphism, it should be invertible, so this is an arbitrary choice, right?) Call this mapping ref, so X ref Y means that Y is one way reality might be assuming X is true, when X is used as a representation.

First point: for descriptions, ref is a Galois mapping, which means that when X gets larger - when the representation says more about the reality - then Y, the number of ways that the reality can be, gets smaller. The more you say, the more tightly you constrain the ways the world can be. This is exactly the opposite from how an isomorphism would behave.

Next point: there can indeed be correspondences between the syntactic structure of a description and the aspects of reality it describes. Your example of the path I walked would be one, if you were to draw the path on an accurate map. But this is completely hostage to the map being **accurate**. If I used a not-to-scale sketch map, then no, you don't get isomorphism. Yet it seems to me that these two cases, the real map and a sketch map, both seem to work in the same kind of semantic way. So this explanation of how they work cannot depend on there being an isomorphism. Maybe there is a kind of homomorphism, but even that is kind of hard to make work.
What it seems to be is more like, the map projection function is a homomorphism of the entire mapped terrain, and then marks or symbols on the map indicate terrain location by inverting this projection morphism and asserting an existential to the effect that the thing described is contained in that back-projected space in the terrain from space occupied by the mark or symbol in the map space.

But I don't think all this is really germane to the http-range-14 issue. The point there is, does the URI refer to something like a representation (information resource, website, document, RDF graph, whatever) or something which definitely canNOT be sent over a wire?

I don't follow your point here. If you mean, a document is just as real as a dog, I agree. So? But if you mean, there is no basic difference between a document and a dog, I disagree. And so does my cat.

So improved software engineering will enable us to teleport dogs over the internet? Come on, you don't actually believe this.

Pat

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 13 June 2011 03:46
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

That is what I was calling isomorphism (which I still don't think was inaccurate). But ok, say there are correspondences instead. I would suggest that those correspondences are enough to allow the description to take the place of a representation under HTTP definitions.

I'm saying conceptually it doesn't matter if you can put it over the wire or not.

Difference sure, but not necessarily relevant.

It would save a lot of effort sometimes (walkies!) but all I'm suggesting is that if, hypothetically, you could teleport matter over the internet, all you'd be looking at as far as http-range-14 is concerned is another media type.
Working back from there, and given correspondences as above, a descriptive document can be a valid representation of the identified resource even if it happens to be an actual thing, given that there isn't necessarily any "one true" representation. We don't need the Information Resource distinction here (useful elsewhere maybe).

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 13 June 2011 07:52
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

OK, I am now completely and utterly lost. I have no idea what you are saying or how any of it is relevant to the http-range-14 issue. Want to try running it past me again? Bear in mind that I do not accept your claim that a description of something is in any useful sense isomorphic to the thing it describes. As in, some RDF describing, say, the Eiffel tower is not in any way isomorphic to the actual tower. (I also do not understand why you think this claim matters, by the way.)

Perhaps we are understanding the meaning of http-range-14 differently. My understanding of it is as follows: if an HTTP GET applied to a bare URI http:x returns a 200 response, then http:x is understood to refer to (to be a name for, to denote) the resource that emitted the response. Hence, it follows that if a URI is intended to refer to something else, it has to emit a different response, and a 303 redirect is appropriate. It also follows that in the 200 case, the thing denoted has to be the kind of thing that can possibly emit an HTTP response, thereby excluding a whole lot of things, such as dogs, from being the referent in such cases.

Pat

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 13 June 2011 10:16
To: public-lod@w3.org

The Referent of a URI re. http-range-14 is the observation (or description) subject.
In this context the subject may or may not be a real world object or entity.

In the context of Linked Data, the observation (or description) subject URI resolves to a Representation of its Referent. The actual representation is accessible via an Address. Data representation formats are *optionally* negotiable e.g., via content negotiation, and ultimately varied i.e., there are many serialization formats for the byte stream that actually transmits data from its source to its consumers.

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 13 June 2011 10:25
To: public-lod@w3.org

No, 200 OK means this URI is functionally an Address i.e., a place that's ready to transmit the byte stream associated with the Address. When the functionality of the URI changes i.e., it's a Name rather than an Address, courtesy of de-reference (indirection), there is a 303 redirect (an act of indirection).

Yes, a data server indicates to a client that a given Address is functional i.e., I'll transmit you a byte stream from this place which I crafted for this specific purpose.

Yes, if the response is 200 OK since the URI is an Address. No if the response is a 303 since the URI is a Name.

It still boils down to the URI abstraction which ingeniously caters for two vital data access by reference operations: Name (for de-reference and indirection) and Address (for Data Access).

Kingsley

----------
From: *Christopher Gutteridge* <cjg@ecs.soton.ac.uk>
Date: 13 June 2011 10:59
To: public-lod@w3.org
Cc: public-lod@w3.org

Before I comment, I just want to summarise my understanding because http-range-14 is a weird term; I understand it as the range-14 issue that when you use 302 to redirect from a URI-A to a URL-B we have a convention that URL-B has some relationship to URI-A but it's not defined, we don't treat this as semantic information and tend to throw it away.
(stated to make sure I've understood correctly)

This bit a chap working with some of my data;

* he loaded some data from <URI-A> using a library
* URI-A did a nice content-negotiated 302 to URL-B (an RDF document)
* URL-B had a description of <URI-A>
* The problem was he also wanted to auto extract the license for this data, but the triples gave the license as a relation to <URL-B>, but the system treated the data as loaded from <URI-A>

At the most simple level, we could add some triples when loading a graph via redirection...

<URI-A> myprefix:http302redirect <URL-B>

or something richer with dates, http options etc.

You could do something even fussier with http headers stating an explicit relationship with the 302, but all of this is very nice but the main problem seems to be that it's hard and doesn't benefit someone who just wants to knock something up quickly.

The real problem seems to me that making resolvable, HTTP URIs for real world things was a clever but dirty hack and does not make any semantic sense.

We should use thing://data.totl.net/scooby to refer to the dog and have a convention that http://data.totl.net/scooby will refer to some content about my dog. This URL can of course then content negotiate as normal.

You could also use this in reverse. thing://www.imdb.com/title/tt0910554/ is the primary topic of http://www.imdb.com/title/tt0910554/

Yes, you could end up with a whole bunch of URIs for the same thing;

thing://data.totl.net/scooby
thing://data.totl.net/scooby.html
thing://data.totl.net/scooby.pdf
thing://data.totl.net/scooby.csv

all are the same thing, but big deal.

The only tricky thing would be people may get confused about the "thing" URI related to a document. For example, given a document in pdf, word and html, you might need a separate thing:// URI to describe the abstract concept of the document, but that's not the primary topic of any of the documents.
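[Editorial illustration, not part of the original message.] The two conventions suggested above (recording the redirect as a triple, and resolving a thing:// URI through its http:// counterpart) might look roughly like this. It is only a sketch: `thing://` is the scheme proposed in this message rather than a registered one, and the helper names `thing_to_http` and `redirect_triple` are hypothetical.

```python
from urllib.parse import urlsplit


def thing_to_http(uri):
    """Map thing://host/path to the http:// URL that serves content about it."""
    parts = urlsplit(uri)
    if parts.scheme != "thing":
        raise ValueError("not a thing:// URI: " + uri)
    # Only the scheme changes; host, path, query and fragment are kept.
    return parts._replace(scheme="http").geturl()


def redirect_triple(requested_uri, final_url):
    """Return a triple recording a redirect, or None if there was none.

    The predicate is the illustrative myprefix:http302redirect from the
    message above, not a term from any published vocabulary.
    """
    if requested_uri == final_url:
        return None
    return ("<" + requested_uri + ">", "myprefix:http302redirect", "<" + final_url + ">")


if __name__ == "__main__":
    print(thing_to_http("thing://data.totl.net/scooby"))
    print(redirect_triple("http://example.org/URI-A", "http://example.org/URL-B"))
```

A loader that kept such a triple next to the retrieved graph would let the license statement about URL-B be traced back to the URI that was originally requested.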
Such fiddling details are more the province of people with experience, so I'm not too worried. What we should be doing is making the common garden data really easy to produce.

I've spent a lot of time trying to teach these concepts to people at hackdays & barcamps, plus in a professional context. http:// URIs for real world things clearly make it harder to learn. The follow-your-nose gimmick is cool, but we could do that with a change of convention, and a trivial update to existing libraries (just resolve thing:// via http://)

I expect the answer is "it's too late to change now". To which I am tempted to say "change or die".

(again, another Monday morning ranty mail! but I feel like someone should be commenting on the emperor's URI convention. If there's a cheat sheet I should read before continuing commenting on these subjects, please point me to it.)

--
Christopher Gutteridge -- http://id.ecs.soton.ac.uk/person/1248

You should read the ECS Web Team blog: http://blogs.ecs.soton.ac.uk/webteam/

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 13 June 2011 11:21
To: public-lod@w3.org

I think it's an ingenious tweak, but easily perceived as a "clever but dirty hack". As you know, the problem with HTTP URI based Names is that they are unintuitive. Thus, the entire narrative re. Linked Data should never have been built solely around use of HTTP scheme based URIs for Names. It could have just started with URIs and worked its way toward the benefits inherent in using HTTP scheme URIs due to encapsulation of de-reference (indirection) and address-of operations. Instead, as I've stated repeatedly, we oscillate between use of URI and URL for a concept that leverages all aspects of the URI abstraction.

HTTP URI based Names ultimately deliver the least disruptive path to a global data space of data objects represented by linked data graphs. We just need to fix the narrative, and that starts by decoupling the concept of Linked Data from RDF.
RDF is but an option, if you choose to use RDF in a particular way.

But that won't work in any of today's Web Browsers off the bat. Thus, it doesn't solve the need for the transition to be non-disruptive to user experience.

It potentially works one way i.e., introspectively (to a point) from the resource at: http://www.imdb.com/title/tt0910554/, if so crafted by the publisher. It won't work from the Address bar of a Web Browser. It won't work with cURL or wget etc. It just won't work from the client side.

----------
From: *William Waites* <ww@styx.org>
Date: 13 June 2011 22:51
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

* [2011-06-12 22:52:18 -0700] Pat Hayes <phayes@ihmc.us> wrote:

] OK, I am now completely and utterly lost. I have no idea what you

So in the previous email, Danny used the important word - relevant. Let's unpack that a little bit.

Suppose we have no range-14 and all these RDF statements out there are all mixed up about what they refer to. Well, not completely mixed up. They're kind of clumped together, web pages and the things they are about tend to get confused but probably the chain of inferences that lead you to believe that the Eiffel tower is a dog is pretty unlikely.

So there is some relationship between a description of the Eiffel tower and the tower itself. The relationship is akin to similarity in a very specific way - they are similar enough that someone thought it made sense to write down that the tower was 356m tall. Unfortunately they got confused and wrote down that the web page was 356m tall. No matter, they are still different enough in the relevant ways that anyone interested in heights on the order of hundreds of meters is unlikely to be confused.

Same with the dog.
Is the distinction between the dog and the picture important to me? Maybe, maybe not. It depends what I'm trying to do. If I want to make sure that I can recognise the dog when I meet her, a picture or the actual dog might do equally well.

So that's the thing, similar or different in the relevant respects for the purpose at hand. The purpose at hand is necessary to figure out relevance.

Just deriving all the possible things that can be entailed from the information you have is no good. You have to derive the relevant things in a particular context. You have to throw out givens that are irrelevant to you or that lead you to irrelevant or nonsensical entailments.

In the general case this is hard. It's not even clear if relevance understood like this is computable. The intent of the user is so clearly in the loop providing a reference frame for evaluating relevance, and capturing and representing a user's intent is not something we have a good way of doing apart from hand-crafting interactions.

It is doable in simple cases (with rules programmed by humans) like figuring out the foaf:knows graph where people and their homepages can just be merged without too many bad side-effects.

We need a different kind of rule here - a cut rule. That says if some condition obtains, *remove* some statements. For example, remove all { ?doc a foaf:Document } before running the productive rules might be a common one where we know that we aren't interested in information resources.
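[Editorial illustration, not part of the original message.] A cut rule of the kind described above can be sketched over plain Python tuples. Everything here is an assumption made for illustration: the triples in `EXAMPLE`, the productive rule `merge_homepages` (which merges a person with their homepage, as in the foaf:knows case), and the function names. A real system would apply such rules inside an RDF store or rule engine.

```python
RDF_TYPE = "rdf:type"
FOAF_DOCUMENT = "foaf:Document"

# Hypothetical messy data: a foaf:knows statement is attached to the
# homepage rather than to the person it is about.
EXAMPLE = {
    (":alice", "foaf:homepage", "<http://example.org/alice>"),
    ("<http://example.org/alice>", RDF_TYPE, FOAF_DOCUMENT),
    ("<http://example.org/alice>", "foaf:knows", ":bob"),
    (":alice", RDF_TYPE, "foaf:Person"),
}


def cut_documents(triples):
    """Cut rule: remove every { ?doc a foaf:Document } statement."""
    return {t for t in triples
            if not (t[1] == RDF_TYPE and t[2] == FOAF_DOCUMENT)}


def merge_homepages(triples):
    """Productive rule: treat a person and their homepage as one node."""
    alias = {o: s for s, p, o in triples if p == "foaf:homepage"}
    merged = set()
    for s, p, o in triples:
        if p == "foaf:homepage":
            continue  # the link itself collapses once both ends are merged
        merged.add((alias.get(s, s), p, alias.get(o, o)))
    return merged


if __name__ == "__main__":
    for triple in sorted(merge_homepages(cut_documents(EXAMPLE))):
        print(triple)
```

Run without the cut, the merge also concludes that :alice is a foaf:Document; with the cut applied first, only the statements about the person survive.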

Cheers,
-w
--
William Waites <mailto:ww@styx.org>
http://river.styx.org/ww/ <sip:ww@styx.org>
F4B3 39BF E775 CF42 0BAB 3DF0 BE40 A6DF B06F FD45

----------
From: *Christopher Gutteridge* <cjg@ecs.soton.ac.uk>
Date: 13 June 2011 23:17
To: William Waites <ww@styx.org>
Cc: Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

Perhaps what we need to start worrying about is getting some test cases -- or a big pile of real (shonky) data to extract useful facts from... Would it be worth starting a collection of data which makes sense to humans but isn't strictly semantically clear?

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 14 June 2011 05:33
To: William Waites <ww@styx.org>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

What has that got to do with the tower being similar to its description?

First, you seem to be assuming here that the tower and its description are NOT similar, contrary to what you said earlier and Danny seems to be insisting upon. Second, this hypothetical person is, we both agree, confused. They made a mistake, what they said was wrong. Correct? I ask, because many people seem to want to say that they were NOT confused or wrong, just kind of less correct than if they used the right URI. Third, and most important, anyone interested is unlikely to be confused, yes indeed. But any piece of software or inference engine is not unlikely to be confused. In fact, it is virtually guaranteed to be in the position of generating absolute nonsense.
If all the inference software was as smart as the average ten-year-old human, we wouldn't even need the semantic web because the software would be able to read the text on Web pages. But it isn't, and we do (need it, that is.)

But if you are a semantic inference engine, and you get the dog and its picture muddled, will you likely generate a lot of nonsensical assertions? Answer, Yes, you will. Which is the key point at issue here.

Yeh, yeh. Contexts, local purpose, pragmatism. Now, make this happy thought cash out in an actual logic for use on the Web. Bear in mind that the very first principle of the Web is that the *publisher* of the data, who asserts these things about dogs or pictures of dogs, cannot possibly know what 'context of use' is going to be relevant to the *user* of the published content. So I say that my picture of Fido has had its rabies shots, and what will you make of this information, for your purposes, on the other side of the planet in a foreign city years after Fido has died? And what about all the other people who will use this misinformation for their different purposes? How am I going to keep them ALL happy?

When you are the agent who is using this information, sure. But when you are the one publishing it or asserting it, you cannot do this. And when you are the one writing the rules to determine a globally accepted notion of entailment, you cannot do it.

Well, now you are stepping into an ocean of cans of worms. Relevance logics, paraconsistent logics, etc. ad nauseam. But I don't think it's our business to even go there. The Web logics don't give instruction on how to use information rationally in the face of uncertainty. Their purpose is much less ambitious and more restricted: just give entailment conditions which are universally correct, so that *whenever* you believe (for whatever reason) the premise, you are committed to believing the conclusion. Strict classical entailment works for everyone, and it's about the only thing that does.
So that is what we should be capturing in RDF and OWL, etc.

So, to go back to the http-range-14 issue, what are the *universal* principles that allow everyone to make the same valid entailments involving URI retrievals? AFAIKS, Danny is saying that there aren't any (?) Which is a reasonable answer, but is rather defeatist. I think http-range-14 is more useful than this.

Pat

----------
From: *Michael Brunnbauer* <brunni@netestate.de>
Date: 14 June 2011 10:45
To: Pat Hayes <phayes@ihmc.us>
Cc: public-lod@w3.org

re

We should be able to present the user a lot of sensical assertions (and maybe some nonsensical ones) if we know he is concerned with information about dogs instead of information about pictures.

Anyway - I think special purpose reasoners will play a much bigger role in the near future than general purpose reasoners because they perform better with big and messy data.

And publishers will start to differentiate between dogs and pictures of dogs as soon as it provides them added value. Until that day, we will have to live with the situation and try to nudge people in the right direction (which includes httprange-14). But mass adoption means messy data in any case.

Regards,
Michael Brunnbauer

--
++ Michael Brunnbauer
++ netEstate GmbH
++ Geisenhausener Straße 11a
++ 81379 München
++ Tel +49 89 32 19 77 80
++ Fax +49 89 32 19 77 89
++ E-Mail brunni@netestate.de
++ http://www.netestate.de/
++
++ Sitz: München, HRB Nr.142452 (Handelsregister B München)
++ USt-IdNr. DE221033342
++ Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++ Prokurist: Dipl. Kfm. (Univ.)
Markus Hendel ---------- From: *Richard Cyganiak* <richard@cyganiak.de> Date: 14 June 2011 10:53 To: Christopher Gutteridge <cjg@ecs.soton.ac.uk> Cc: William Waites <ww@styx.org>, Pat Hayes <phayes@ihmc.us>, Danny Ayers < danny.ayers@gmail.com>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas < michael.hausenblas@deri.org> Define “strictly semantically clear”. Good luck! Best, Richard ---------- From: *William Waites* <ww@styx.org> Date: 14 June 2011 10:54 To: Pat Hayes <phayes@ihmc.us> Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak < richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas < michael.hausenblas@deri.org> * [2011-06-13 20:33:47 -0700] Pat Hayes <phayes@ihmc.us> écrit: ] > So there is some relationship between a description of the Eiffel Simply that they are similar enough (in the relevant respects etc) that one can write ":eiffel :height 324" for either and (reasonably?) expect the reader not to be confused. Confused or speaking loosely, not bothering to make the distinction because it seems to them that they are being clear enough that any reader will understand what they mean. If you call them on it they will probably agree that, yes, "what I really meant was ... but to have written that out would have seemed excessively pedantic" in exactly the same way that I wasn't confused when I wrote "confused" but I admit to being inexact :) So I agree with these many people who want to say that there are a lot of inexact statements that are not made by confused people just by people with perhaps unreasonably high expectations that the readers of their statements will be able to figure out what they meant if not strictly what they said. So this is the mismatch. 
Publishers write things down with assumptions about what is likely to cause confusion, and those assumptions are probably based largely on their interactions with other humans, not with inference engines.

Writing things down exactly is incredibly difficult. A very large part of almost every discussion or disagreement usually comes down to someone understanding what was said differently from what the person who said it meant. It can often take a lot of discussion before this becomes apparent. And that's between humans!

So we want to get people to publish linked or structured data that is as exact as possible. Each step in that direction is a little bit more burdensome for the publisher, feels a little bit more pedantic and verbose to write down, and means the publisher needs to know a little more about the kinds of things a reader can handle, but at the same time makes it easier to write software that can use the data with simpler and more general algorithms that we know.

Some people seem to be saying that range-14 is a step too far. Other people seem to be saying that without that step it's impossible to write software in a general way to work with the data. If both are correct then we're stuck.

The perception of RDF as complicated, verbose and pedantic is common and is something we cannot afford. Personally I don't think the range-14 arrangement is too burdensome, but outside this community this is a minority viewpoint. We cannot throw up extra barriers to publishers. So we need better software that can handle this kind of inexact data.

] When you are the agent who is using this information, sure. But when

Publishers will always make assumptions about how the information will be used. The assumptions will usually not be explicit. Even humans don't have a globally accepted notion of entailment; it's all about context and intent on the part of the agent doing the reasoning. They will just have to deal with the fact that the publisher may not have anticipated their use.
Since range-14 seems to be a sticking point, we can try to address that particular kind of ambiguity with guidance about how to reason about information and non-information resources. This guidance won't be general; it will have to do with particular classes and predicates and how they should be interpreted in the local (graph) context.

] Well, now you are stepping into an ocean of cans of worms.

Oh, well aware of that :)

----------
From: *Richard Cyganiak* <richard@cyganiak.de>
Date: 14 June 2011 10:57
To: Michael Brunnbauer <brunni@netestate.de>
Cc: Pat Hayes <phayes@ihmc.us>, public-lod@w3.org

Yes. It's certainly true in the case of the web -- you cannot apply off-the-shelf standard OWL reasoners on web data, because of its messiness. This is quite well-documented in the literature.

That's spot-on.

Best,
Richard

----------
From: *Christopher Gutteridge* <cjg@ecs.soton.ac.uk>
Date: 14 June 2011 11:07
To: Michael Brunnbauer <brunni@netestate.de>
Cc: Pat Hayes <phayes@ihmc.us>, public-lod@w3.org

I think in this lies a key issue. Much of my experience of producing linked data has been anti-climactic. If I was having to justify every coding hour then it's hard to say that providing public open data really gained much value for our business.

What would be really nice is some public services which consume RDF and produce something useful, so that people actually get a direct value out of putting out linked data.

One of my unfunded, background projects is programme.ecs.soton.ac.uk -- I'm working on a PHP library which will consume the RDF and produce a nice big part of an HTML website for a conference from it, along with mobile interfaces & tools like a "print out a schedule to go on the door of each room each day" and "check you didn't double book a speaker". Plus a tool to author the data in a spreadsheet and convert that into RDF.
The goal is to get people creating nice RDF data for their conferences because it makes their lives easier, not because it's the right thing. Hopefully in the next year or two it'll hit a tipping point and we'll get some third party tools working with the data and it'll be a really useful format.

You can see a prototype of the PHP library in action on this conference site: http://data.dev8d.org/2011/programme/

I'd encourage the community to build more tools for webmasters, not for the linked data community!

----------
From: *Alan Ruttenberg* <alanruttenberg@gmail.com>
Date: 14 June 2011 11:47
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Christopher Gutteridge <cjg@ecs.soton.ac.uk>, William Waites <ww@styx.org>, Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

Why don't we start with the following:

Message sender has some statements they want to communicate. They encode their statements into the language. The encoding is sent. The receiver examines the encoding and constructs an understanding consisting of some statements. Key is that the construction and interpretation of the message are isolated events - the first communication between the parties is via the message. Now the parties meet and compare the statements intended with the statements understood. Note that the parties might be humans or machines, without prejudice. Repeat.

If, reliably (which doesn't mean *always*, but does mean more often than not) the comparison is favorable, then the messages are semantically clear. The "strictly" word is superfluous.

We can design various protocols for doing the comparison, which does not have to be a discussion.
For example the message might specify some actions and we can check whether the actions taken after interpreting the message match the intention of the sender, or whether the receiver has confidence enough in their understanding of the message.

What we have seen is that for some of the messages being discussed in this thread, a number of concerns have been raised about whether that process will work under various of the assumptions and assertions made by the participants in the thread. My assessment is that, at the moment, the messaging that has been proposed is not semantically clear.

-Alan

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 14 June 2011 17:55
To: William Waites <ww@styx.org>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

Well, you have got me confused. Are you saying here that it does in fact make sense to say that a description of the eiffel tower is 356M tall? So that your triple here is actually ambiguous, but one can rely on the reader's common sense to figure out which one is meant?

I had always thought that when people used a name of a name instead of the name of a thing, they were usually just blurring the use/reference distinction, not that they genuinely weren't sure whether they were talking about things or names.

Maybe we just produced the Web situation in miniature, because whether or not you were confused, I certainly was (and still am) trying to figure out what you are saying here.

If they do this when writing, say, Javascript or PERL, things will go badly wrong. If they do it when writing RDF, things will also go wrong. Even when writing English in a non-conversational situation where reading is separated from writing (eg road signs, email), things will go wrong surprisingly often.
I am not sure how much we can expect to be responsible for people saying garbage because they are too lazy or incompetent to learn how to use a language. You seem to be making my point for me here :-) No, what will happen is that a class of people will arise who *do* understand http-range-14 (and other issues that are perceived as 'hard') and they will for a short while be able to earn a living writing (or writing code which generates) this stuff properly. This situation will last at most a decade, because by then a new generation of people will have educated themselves to 'speak' correctly in this new style without apparent effort, and all the whining about how terribly hard it was will be the stuff of nerdish jokes on XKCD. All we really need is enough people who can see through this mist of fear and actually get RDF written. On the whole, looking at the way the linked data is being created, I think we are doing quite well. Once stuff starts working and doing something useful, all this fear of formalism will melt away. Sure, and to solve global warming, we need better power sources that don't emit CO2. Your move. We aren't going to get this magic software any time soon. The inference software in the semantic web engines behind RDF and OWL and RIF are the state of the art. If people can't write data that doesn't break these, we are in trouble. But I think they can: after all, they write RDB data out the wazoo. Anyone who can understand SQL can surely get their head around the distinction between the eiffel tower and a web page. Ah, but they do. That is exactly why inference engines work. No, its not all about context. There really are non-contextual logics. If it really were ALL about context, the Web itself would not work. It will be (well, it can be) general for the entire Web. Why not? The http-range-14 rule is actually pretty simple and intuitive. In my experience, most people kind of assume it without thinking about it, actually. 
(**Of course** the URI of a web page is the name of the web page...) So why not just say it, loud and clear, until people get it?

Pat

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 14 June 2011 17:56
To: Christopher Gutteridge <cjg@ecs.soton.ac.uk>
Cc: Michael Brunnbauer <brunni@netestate.de>, public-lod@w3.org

Well, +1 to that :-)

Pat

----------
From: *Richard Cyganiak* <richard@cyganiak.de>
Date: 14 June 2011 20:19
To: Alan Ruttenberg <alanruttenberg@gmail.com>
Cc: Linked Data community <public-lod@w3.org>

Alan,

Google won't scrap schema.org because your thought experiment proved that it's not “semantically clear.” I think that we are beyond the point where that kind of extremely idealised account is useful for evaluating web technologies.

But just to stay in the spirit of your proposal:

1. The sender may not care that certain receivers be able to understand their message
2. The message cannot strictly be the first communication -- there always has to be prior agreement on protocols, formats, languages, vocabularies
3. Both parties will already share certain context that is outside of the message, otherwise why would they be communicating
4. Depending on the value of the communication to the receiver, they may or may not be willing to go to certain lengths in order to interpret the message, including the application of heuristics, studying the sender's documentation, dereferencing their schema and applying reasoning etc
5. The receiver may want to use the information for purposes not intended by the sender

So this is all rather subjective and context-dependent. I'm extremely skeptical of generic claims about the “strict semantic clarity” of a certain way of publishing data, especially if it is claimed to be a binary black-and-white thing.

Best,
Richard

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 14 June 2011 21:37
To: public-lod@w3.org

Yep!
+1000

We just have to accept that the Web is an ocean liner scale space, things are going to happen slowly, courtesy of opportunity costs materialization.

Unfortunately, people don't like prescriptions that are preventative in nature, they simply like to have cures for problems as they arise. Sad but true, at least in my years of experience.

As stated repeatedly, we should never scorn or take issue with any entity that contributes structured data (in any format) to the Web. Half bread is better than none :-)

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 14 June 2011 21:41
To: public-lod@w3.org

The community should build and embrace solutions for a broad range of user and developer profiles. The Web isn't about developers solely, neither is computing in general. As I recall, Apple's success isn't driven by its sole preoccupation with developers, ditto Google, Facebook etc.. All of these organizations provide palpable solutions that deliver tangible value off the bat, no code required whatsoever.

Developers are but one community, focusing on them solely isn't going to change much, really!

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 14 June 2011 22:03
To: public-lod@w3.org

Yes!

Serving folks their lunch works better than warning them about the effects of your pending actions or the ill effects of their current action.

This is why the solution to the problem lies in delivering applications and services that provide value to a broad spectrum of consumer profiles, courtesy of their fundamental understanding that Generic Names and Location specific Names (Addresses) != same thing.

----------
From: *Michael Brunnbauer* <brunni@netestate.de>
Date: 14 June 2011 23:37
To: public-lod@w3.org

re

as I was talking about "messy" data, some anecdotes from our work with foaf-search.net:

-Want to see some people and groups that are an owl:Ontology ?
http://www.foaf-search.net/SearchRDFType?type=http%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23Ontology
Thank god everyone using our website either knows instantly that this is wrong or does not have a clue what owl:Ontology is.

-Today, our website spent hours merging thousands of different people into one person because our Java developer made an update and forgot the code to check the inverse functional property foaf:mbox_sha1sum (SHA1-hash of mailbox URI) for bad values like 08445a31a78661b5c746feff39a9db6e4e2cc5cf (SHA1-hash of "mailto:"). We need these kinds of hacks to keep everything running.

-foaf:homepage and foaf:weblog are inverse functional properties in the foaf ontology. We excluded them in our reasoners for fear of users having shared pages or being sloppy about what to fill in when asked for their homepage or weblog. But the very popular livejournal blog software only uses foaf:weblog to identify your friends so we had to accept at least foaf:weblog.

-This is something I found before our crawler found it - fortunately: http://data.totl.net/dave.rdf

-From the same website comes a huge database of many of the world's obscure industrial bands. Cool - except they are endless and made up on the fly :) http://data.totl.net/musicdb/music.cgi/bands?page=1

-Speaking about fakes: http://fakefriends.me/ makes up fake identities including crawlable FOAF RDF data on the fly. And almost every elgg blog our FOAF crawler gets to crawl has been taken over by spammers or was installed by them in the first place.

-Things can have so many different foaf:names. What is the canonical one ? We are currently using the one with the most quads but this is surely not the best possible solution.

This list will probably grow much larger in the near future.

----------
From: *Alan Ruttenberg* <alanruttenberg@gmail.com>
Date: 15 June 2011 02:07
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Linked Data community <public-lod@w3.org>

Richard, that wasn't the point.
You mocked the idea that "semantically clear" could be defined. I responded with an attempt.

We will agree to disagree then. Perhaps in another thread you will say what *will* be useful for evaluating web technologies. Or do you think they are above evaluation?

ah, good!

Not relevant to this piece of the thread. The goal was to have a go at defining "semantically clear". But in the spirit of responding I will grant you that some people may not care. However I'm pretty sure that the people we care about using schema.org will care. There will be others who use schema.org not to communicate but to try to game the google ranking system, and for such people, whether there is a message conveyed or not may not matter. However I don't think we are interested in considering their needs.

Granted. I don't think that this affects the substance of the proposal, but if you say how it would I will try to address it.

> 3. Both parties will already share certain context that is outside of the message, otherwise why would they be communicating.

I have not said that they are intentionally communicating - that the message was intended for a specific person. This removes the support for the assumption of the first clause. But to address it: that they will share a certain context outside the message may or may not obtain. For instance sender may be a person, and receiver a machine, and it's not clear what shared context they could have given the current state of machine technology. However if you think the shared context somehow undermines the proposal, please say how.

Again, this is outside the scope of my proposal, which is in response to your skepticism about whether "semantically clear" could be defined.

ditto.

You have not demonstrated subjectivity or context-dependency in my proposal. However I will be interested if you attempt to.

You may be skeptical that semantic clarity (again, I don't think "strict" brings anything) is *relevant* in some or all cases.
I may engage you on that issue separately. However I don't see that you have succeeded in finding a flaw in my proposal for how one might go about defining it operationally. Regards, Alan ---------- From: *Richard Cyganiak* <richard@cyganiak.de> Date: 15 June 2011 12:11 To: Michael Brunnbauer <brunni@netestate.de> Cc: public-lod@w3.org Another anecdote, I don't remember whom I heard this from: From FOAF data you can see that a lot of people say that their homepage is … "Google". Best, Richard ---------- From: *Richard Cyganiak* <richard@cyganiak.de> Date: 15 June 2011 12:24 To: Alan Ruttenberg <alanruttenberg@gmail.com> Cc: Linked Data community <public-lod@w3.org> I have no interest in theoretical discussions that are detached from application. Adoption trends, ergonomics, fit with the existing technology ecosystem, existence of migration paths, marketability, potential of network effects. Best, Richard ---------- From: *Mischa Tuffield* <mmt04r@ecs.soton.ac.uk> Date: 15 June 2011 12:24 To: Richard Cyganiak <richard@cyganiak.de> Cc: Michael Brunnbauer <brunni@netestate.de>, public-lod@w3.org -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hello, <snip/> I am not sure this is on-topic anymore but, these are the following values I blacklisted and flagged when used as an IFP in the FOAF validator I wrote on foaf.qdos.com (I know it is currently down, we are repurposing hardware at the mo - so sorry!). 
$ifpblacklist = array(
    "<mailto: >",
    '"da39a3ee5e6b4b0d3255bfef95601890afd80709"', // SHA1 of the empty string
    '"08445a31a78661b5c746feff39a9db6e4e2cc5cf"', // SHA1 of "mailto:", as Michael notes above
    '"20cb76cb42b39df43cb616fffdda22dbb5ebba32"',
    '<http://www.google.com/>',
    '<http://www.google.com>',
    '<http://www.bbc.co.uk/>',
    '<http://bbc.co.uk>',
    '"02085a0d20a5f574c1ce6cfe42bba6e85cfe07cf"'
);

Some of the hashes in the blacklist were added due to copy and pasting errors when people were knocking together handwritten FOAF files, iirc John Domingue shared one of the foaf:mbox_sha1sum's with Tom Heath (probably from the time when they both worked at KMI).

Mischa

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.12 (Darwin)

iQIcBAEBAgAGBQJN+IhOAAoJEJ7QsE5R8vfvVTsP/0kx9/spxqLciwUAWCRHPT3V
SWgsl/Rlk0i4SDOvBcyAdXpuOxQfB06nuY5Bps4RrfZWb5Q5AwYMThGmEDeXq1n+
STlD3eNsXBscaF5Yocnxp22Z2t98d3bNB8Lia5uuEJmq28mG+H3ijNqcDq7+ztnp
f/XG+DV5ONXsE2XRmfQ8nFTKm/6Rkaylg49Ndjx0xcybEUXWthpBxdVprsKXHdq7
lIZ4/TtF5i/B37sIx5yOUhXs1d0wR+D+hkOIk0vBHoCbvcOhutE3LjanNAPK/B+f
HWG2AAhc3w+syeXs2noABabCO+1Ac2CkKGfA4F2rhdD5xnk/tCEkwZGrqhb4W61k
eOYdU1OI9epbayhVTimfRn28/I4/mwNmhuevQYNGmt3DuC7RrgPiH0OOqCuu+Cp3
Aed/lVt4lSyeHNQQCLBy8ZPDTfdPbXL449Dvsz6i/2fwFtFjHmTF/Z0Ac0HOiV0y
eqxL+FOb3Qt0VAQ/Abklii282jwC91Wlb+TIifPjF9xD9aUzndbBxBNlPe7mtrIy
QMNwgTerGlJx2FX+81v8EvmzjKuolVeMq+NzYA5ohiUZtiSWa7eJwms28aOCWj50
OOz+QTo4VaCcI0UVrWUcAeNHAfKgNV7eKX2wycPOPnjta/DHYAIuzvoTm3WLShSL
YT+NT4LxkoRf9u26PRRA
=ENLb
-----END PGP SIGNATURE-----

----------
From: *Danny Ayers* <danny.ayers@gmail.com>
Date: 15 June 2011 17:27
To: Pat Hayes <phayes@ihmc.us>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

Even with information resources there's a lot of flexibility in what HTTP can legitimately respond with; there needn't be bitwise identity across representations of an identified resource.
Given this, I'm proposing a description can be considered a good-enough substitute for an identified thing. Bearing in mind it's entirely up to the publisher if they wish to conflate things, and up to the consumer to try and make sense of it.

As a last attempt - this is a tar pit! - doing my best to take on board your (and others') comments, I've wrapped up my claims in a blog post: http://dannyayers.com/2011/06/15/httpRange-14-Reflux

----------
From: *Jason Borro* <jason@openguid.net>
Date: 15 June 2011 20:35
To: Linked Data community <public-lod@w3.org>

I agree with your sentiments Danny, fwiw. The current scheme is a burden on publishers for the sake of a handful of applications that wish to "refer to these information resources themselves", making them "unable to talk about Web pages using the Web description language RDF".

What about minting a new URI at "http://information.resourcifier.net/encodedURI" or similar for talking about such things? The service could even add value by tracking last update times, content types, encodings, etc.

Jason

p.s. Don't bother criticizing the half baked idea, I thought about it for < 10 seconds. The point is 100 alternatives could have been hashed out in the time spent discussing and implementing http-range-14. Kudos to google et al for ignoring it.

----------
From: *William Waites* <ww@styx.org>
Date: 15 June 2011 23:24
To: Pat Hayes <phayes@ihmc.us>
Cc: Danny Ayers <danny.ayers@gmail.com>, Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

* [2011-06-14 08:55:09 -0700] Pat Hayes <phayes@ihmc.us> écrit:

] Well, you have got me confused.
Are you saying here that it does

I'm just saying that things like this will be published because the publisher is confused, or mistaken, or doesn't think that making the distinction is important or convenient, and consumers of the data have to deal with it.

We should encourage the publishers to do a better job but some of them will balk and sometimes, like with the schema.org that started this thread, big, important publishers with a lot of influence will balk. If we're lucky we can convince them to fix it, otherwise writers of software that consumes the data and tries to reason with it have to work out a way to be robust in the face of this kind of ambiguity.

That's all.

-w

----------
From: *Kingsley Idehen* <kidehen@openlinksw.com>
Date: 15 June 2011 23:46
To: public-lod@w3.org

Danny,

This is part of the problem:

TBL's argument: the HTTP URIs (without "#") should be understood as referring to documents, not cars.

It assumes that the audience doesn't have a clue, so the description comes across as condescending, albeit inadvertently.

How about:

TBL's argument: the HTTP URIs (without "#") should be understood as referring to an Address. A Data Source Name. What a data publisher provides to user agents for accessing specific data in a given format, courtesy of content negotiation or lack thereof etc..

The confusion is a self inflicted one courtesy of narrative style and tone, methinks.

URIs abstract Names and Addresses. This whole thing isn't unlike DNS. Points of presence on TCP/IP networks have NIC addresses and cnames, courtesy of DNS. Spreadsheets have offered cell addresses and cell names since forever.

Programmers have worked with de-reference (indirection) and address-of operators forever. Most of the time when they encounter the: "... is a document, not cars ... " style narrative, it throws them for a loop! As you know, a Document == Data Container that's projected to users via user agents (typically browsers) using a specific presentation oriented metaphor.
Using 303 to deliver indirection is an accurate reflection of the required heuristic for implementing de-reference (indirection) via HTTP URI based Names. Otherwise, use a # terminated URI and get similar (but ultimately limited) effects without an actual 303.

Web users started off using Addresses as Names for Resources (Web Docs). Now we're introducing a new abstraction where Name and Address are Distinct (i.e., we have Named Objects and Object Representation Addresses, interwoven), thus we need to find a variety of ways to explain and demonstrate this new abstraction generally known as Linked Data.

One size never fits all, and http-range-14 is certainly not going to be the narrative that breaks that age-old mold :-)

----------
From: *Pat Hayes* <phayes@ihmc.us>
Date: 15 June 2011 18:30
To: Danny Ayers <danny.ayers@gmail.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org>

I'm sure you are right, but I have no idea why you think this fact is remotely relevant to the issue.

Boy, that is a humdinger of a non sequitur. Given that HTTP has flexibility, it is OK to identify a description of a thing with the actual thing? To me that sounds like saying, given that movies are projected, it is OK to say that fish are bicycles.

AFAIKS, the details of HTTP really have nothing at all to do with this issue, ironically enough. The only thing that HTTP does is to closely associate rather a lot of URIs to things like Web pages. The *nature* of the http 'association' really is irrelevant to this issue, which has to do with when it is legitimate to infer a denotation relation from this association relation. The question at issue here is what URIs are said to denote. It is very natural and intuitive to say that a URI which is http-associated with X also denotes X.
Hence the 200 convention; but we want some URIs to denote things that are definitely not the kind of thing that HTTP (or any other XXTP) can possibly associate a URI to. Hence the 303 work-around. Well, if the publisher wants to say that a web page actually is Sherlock Holmes, or my pet cat Marco Polo, then that publisher is bat-shit crazy, and I will ignore them. OK, thanks. Here is your argument, as far as I can understand it. 1. HTTP representations may be partial or incomplete. (Agreed.) 2. HTTP reps can have many different media types, and this is OK. (Agreed, though I cant see what relevance this has to anything.) 3. A description is a kind of representation. (Agreed, and there was no need to get into the 'isomorphism' trap. We in KRep have been calling descriptions "representations" for decades now.) 4. Therefore, a HTTP URI can simultaneously be understood as referring to a document and a car. Whaaat? How in Gods name can you derive this conclusion from those premises? Pat ---------- From: *Pat Hayes* <phayes@ihmc.us> Date: 16 June 2011 02:26 To: Jason Borro <jason@openguid.net> Cc: Linked Data community <public-lod@w3.org> I confess to finding this kind of sneering remark rather annoying. If you think it is this trivial to work out some 'alternative', why don't you come up with a few actual ideas and see what happens when they get a little peer review? Your idea, above, hardly makes first base, as Im sure you already realized when you added the p.s. So why not try inventing one that has a snowballs chance in hell of actually working? Im sure that the world would be delighted if you could solve this trivial problem in 5 ways, let alone a hundred. If you agree with Danny that a description can be a substitute for the thing it describes, then I am waiting to hear how one of you will re-write classical model theory to accommodate this classical use/mention error. You might want to start by reading Korzybski's 'General Semantics'. 
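For readers who want the 200/303 convention discussed above spelled out, here is a minimal sketch in Python. It reflects one reading of the httpRange-14 resolution (a 2xx answer means the URI names an information resource, a 303 or a hash URI leaves the referent open, anything else tells you nothing) and is not taken from any message in this thread. The function name `classify`, the label strings and the example URIs are all hypothetical.

```python
from urllib.parse import urldefrag


def classify(uri: str, status: int) -> str:
    """Say what a client may conclude about the thing a URI names.

    Assumption: this encodes one reading of the httpRange-14 rule;
    the function name and the returned labels are illustrative only.
    """
    _, fragment = urldefrag(uri)
    if fragment:
        # Hash URI: the fragment is never sent to the server, so the
        # status code only describes the enclosing document.
        return "any resource"
    if 200 <= status < 300:
        # The server returned a representation directly.
        return "information resource"
    if status == 303:
        # "See Other": the URI may name a tower, a dog, anything.
        return "any resource"
    # 4xx, 5xx and other codes license no conclusion.
    return "unknown"


if __name__ == "__main__":
    for uri, status in [
        ("http://example.org/eiffel.html", 200),
        ("http://example.org/id/eiffel", 303),
        ("http://example.org/about#eiffel", 200),
        ("http://example.org/missing", 404),
    ]:
        print(uri, status, classify(uri, status))
```

Note that the sketch only says what kind of thing a URI may name; it does not decide whether a description is a good-enough stand-in for the thing described, which is the point actually in dispute here.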
Pat ---------- From: *Danny Ayers* <danny.ayers@gmail.com> Date: 16 June 2011 02:36 To: Pat Hayes <phayes@ihmc.us> Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg < alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org> Not that I think I did a non-sequiteur, it is totally ok to say that fish are bicycles, if that's what you want to say. [snip] my wording could be better, but I stand by it... a document describing the car, through HTTP, can be an equally valid representation of the named car resource as the car itself (as long as it's qualified by media type) ---------- From: *Danny Ayers* <danny.ayers@gmail.com> Date: 16 June 2011 03:27 To: Pat Hayes <phayes@ihmc.us> Cc: Jason Borro <jason@openguid.net>, Linked Data community < public-lod@w3.org> IANAL, but I have heard of the use/mention thing, quite often. I don't honestly know whether classical model theory needs a rewrite, but I'm sure it doesn't on the basis of this thread. I also don't know enough to know whether it's applicable - from your reaction, I suspect not. As a publisher of information on the Web, I'm pretty much free to say what I like (cf. Tim's Design Notes). Fish are bicycles. But that isn't very useful. But if I say Sasha is some kind of weird Collie-German Shepherd cross, that has direct relevance to Sasha herself. More, the arcs in my description between Sasha and her parents have direct correspondence with the arcs between Sasha and her parents. There is information common to the reality and the description (at least in human terms). The description may, when you stand back, be very different in its nature to the reality, but if you wish to make use of the information, such common aspects are valuable. We've already established that HTTP doesn't deal with any kind of "one true" representation. Data about Sasha's parentage isn't Sasha, but it's closer than a non-committal 303 or rdfs:seeAlso. 
There's nothing around HTTP that says it can't be given the same name, and it's a darn sight more useful than a wave-over-there redirect or a random fish/bike association. I can't see anything it breaks either. ---------- From: *Jason Borro* <jason@openguid.net> Date: 16 June 2011 05:04 To: Linked Data community <public-lod@w3.org> Apologies if my keyboard sneered at you, though comparing an application problem to 1% of hr14 at web scale hardly trivializes it; certainly it does the opposite. Good luck preserving your mental model if you require webmasters to spell Korzybski. ---------- From: *Alan Ruttenberg* <alanruttenberg@gmail.com> Date: 16 June 2011 07:53 To: Jason Borro <jason@openguid.net> Cc: Linked Data community <public-lod@w3.org> This is an odd comment. It's like saying good luck preserving your model of TCP if you require network developers to know where Postel worked. TCP has to work, whether or not webmasters know the intellectual history of its development, and the same will be true of whatever eventually becomes what the semweb ideas are aiming at. Pat knows something about the history of what's known to work and what isn't. You ignore that history at the peril of your ideas simply not working. -Alan ---------- From: *Alan Ruttenberg* <alanruttenberg@gmail.com> Date: 16 June 2011 08:05 To: Richard Cyganiak <richard@cyganiak.de> Cc: Linked Data community <public-lod@w3.org> I assume you mean you are not interested in discussions of theory that are detached from application. In any case this is a non sequitur. The definition is offered because some, including myself, think that there are important classes of applications for which it is an essential ingredient of success (like some of the ones I need to build), and because you implied that defining what we meant was not feasible. Does what the technology *accomplishes* fit in there somewhere? Looking at the above, one might conclude that a successful Ponzi scheme of some sort would score well.
Regards, Alan ---------- From: *Richard Cyganiak* <richard@cyganiak.de> Date: 16 June 2011 11:38 To: Alan Ruttenberg <alanruttenberg@gmail.com> Cc: Linked Data community <public-lod@w3.org> Web technologies are never about accomplishing anything new; they are about taking something that already works on a small and local scale, and making it work across the internet with its loosely coordinated actors. :-) If you want to look at it that way, standards, like anything that exhibits network effects, are a bit like a ponzi scheme: once you're inside, you benefit from getting others in your vicinity on board. The difference is that “late adopters” in a ponzi scheme are the suckers who lose their investment; while late adopters of a standard get the largest benefit at the smallest cost. Best, Richard ---------- From: *Jonathan Rees* <jar@creativecommons.org> Date: 16 June 2011 17:46 To: Linked Data community <public-lod@w3.org> In case anyone's not aware, the TAG is working in the area being discussed on this thread - i.e. on deployment and performance of linked data nose-following and the possible conflict with current metadata practices - as its issue 57, http://www.w3.org/2001/tag/group/track/issues/57 . In my analysis it's a notational and protocol issue, not a logical or philosophical one. To frame the discussion I'm preparing a document that collects multiple complete solution proposals in what I hope is a neutral form. The idea Richard C puts forth, as well as the one advanced last fall by Ian Davis, and something equivalent to the :at idea from Alan's message, are all included, among others. I have been waiting to announce this work on public-lod until I can prepare a new draft incorporating feedback I've received over the past few weeks, but given the volume of email on this thread I felt I had to say something about it now. 
If you are impatient and can't wait for the next draft, take a look at the current one, which you can find linked from the issue page named above. I invite discussion of it on the www-tag@w3.org list. Best Jonathan ---------- From: *Pat Hayes* <phayes@ihmc.us> Date: 16 June 2011 22:39 To: Danny Ayers <danny.ayers@gmail.com> Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org> Not only do I not follow your reasoning, I don't even know what it is you are saying. The document is a valid *representation* of the car, yes of course. But as valid as the car itself? So you think a car is a representation of itself? Or are you drawing a contrast between the 'named car resource' and the car itself? ??? Maybe it would be best if we just dropped this now. I gather that you were offering me a way to make semantic sense of something, but I'm not getting any sense at all out of this discussion, I am afraid. Pat ---------- From: *Pat Hayes* <phayes@ihmc.us> Date: 16 June 2011 23:38 To: Danny Ayers <danny.ayers@gmail.com> Cc: Jason Borro <jason@openguid.net>, Linked Data community <public-lod@w3.org> True. Sasha and her parents are not themselves in your description. I presume you mean, the arcs between the terms you use, in your description, to refer to Sasha and her parents. Sasha and her parents don't have arcs between them (unless you are indulging in some cruel treatment of animals.) I presume you mean to refer to certain relationships which hold between Sasha and her parents. In this simple case (explicitly named relationships, explicit referring names) there is a kind of structural correspondence between the description and the reality, indeed. But as soon as you make the descriptive language even slightly more expressive, this breaks down. (Try adding negation or disjunction or even blank nodes.)
And as soon as you admit that reality is more complex than any description of it, it breaks down. So it's not a very good foundation to build a semantic theory upon. No. The reality is what it is; the information is held in the description (the one with the arcs and the names in it.) The information is ABOUT Sasha and her parents (and the relationship of parenthood and various categories of doggitude, and so forth.) You betcha. What common aspects? If you mean to refer to the fact that a description with arcs and names can be TRUE OF some aspect of reality, you are talking about classical model-theoretic semantics, which is based on the idea of reference (AKA denotation) at its root; it is the interpretation mapping from names to the things they are interpreted to refer to (eg between "Sasha" and Sasha.) But the truth-in-an-interpretation relationship is not similarity or isomorphism, and it certainly does not warrant identifying the name with the thing named. Quite the contrary, it relies upon keeping this distinction clear. As Korzybski famously said, the map is not the territory. "Closer"? In what metric? I would say it is about as different as anything can get. OF COURSE it breaks things. It might be true to say that Sasha is a Collie-German Shepherd cross, but Sasha's description or web page certainly isn't. It might be true to say that the description is written in RDF, but Sasha isn't. Pat ---------- From: *Pat Hayes* <phayes@ihmc.us> Date: 16 June 2011 23:40 To: Jason Borro <jason@openguid.net> Cc: Linked Data community <public-lod@w3.org> I'd prefer they actually read him, though I won't hold my breath. Sorry to bother you by using a very long foreign name.
Pat ---------- From: *Pat Hayes* <phayes@ihmc.us> Date: 16 June 2011 23:41 To: Richard Cyganiak <richard@cyganiak.de> Cc: Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community < public-lod@w3.org> LOL Pat ---------- From: *David Booth* <david@dbooth.org> Date: 17 June 2011 02:46 To: Linked Data community <public-lod@w3.org> Cc: Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> [ . . . ] Let's go further and clarify exactly what breaks: Using the same URI both for Sasha and Sasha's web page breaks *some* applications and not others. Applications that need to distinguish between dogs and web pages will find the URI ambiguous; applications that do not will be perfectly happy. This state of affairs is a universal fact of life that is true of *all* possible distinctions that may be made, regardless of whether the distinction is between web pages and dogs, or between different kinds of dogs, or between different kinds of proteins or anything else. Except in the absurdly reductionist sense that *every* URI is ambiguous (because finer distinctions can always be made), whether a URI is ambiguous or unambiguous is *not* a fundamental property of the URI: ambiguity is relative to the *application* that is using that URI. Given this fact of life, I maintain that permitting the same URI to denote both a web page and a dog does *not* break the architecture of the web. I agree with TimBL that this is a design choice about the architecture of the web, and a clean, extensible architecture is needed. I agree with TimBL that 303 (and hash URIs) are useful for those who *choose* to distinguish between the web page and something else. I agree with TimBL that the httpRange-14 rule is very useful, even if it was not ideally stated, and should *not* be abandoned. 
However, the major flaw lies not in the httpRange-14 rule itself, but in the associated assumption that a URI cannot sensibly denote both an "information resource" and a dog: http://www.w3.org/TR/webarch/#def-information-resource This assumption is fatally flawed because: (a) it attempts to make an IR/non-IR distinction that can never be nailed down precisely (as several people have pointed out); and (b) it unnecessarily elevates one particular axis of ambiguity over all others. It is analogous to a rule that says "all URIs for dogs MUST distinguish between male dogs and female dogs": the only applications that break without this rule are the ones that *need* to distinguish between male dogs and female dogs. All other applications will continue to work just fine without it. And that is exactly the way it should be for *any* axis of ambiguity. I agree with TimBL that it is *good* to distinguish between web pages and dogs -- and we should encourage folks to do so -- because doing so *does* help applications that need this distinction. But the failure to make this distinction does *not* break the web architecture any more than a failure to distinguish between male dogs and female dogs. -- David Booth, Ph.D. http://dbooth.org/ Opinions expressed herein are those of the author and do not necessarily reflect those of his employer. ---------- From: *Christopher Gutteridge* <cjg@ecs.soton.ac.uk> Date: 17 June 2011 11:21 To: David Booth <david@dbooth.org> Cc: Linked Data community <public-lod@w3.org>, Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> We've been encouraging people to do so. Most do not have the time to invest in complexity that they perceive no benefit from adding. We need to reward people for good semantics by making sure there are tools and apps which add value for their business and activities.
/ Lead Developer, EPrints Project, http://eprints.org/ / Web Projects Manager, ECS, University of Southampton, http://www.ecs.soton.ac.uk/ / Webmaster, Web Science Trust, http://www.webscience.org/ ---------- From: *Kingsley Idehen* <kidehen@openlinksw.com> Date: 17 June 2011 14:13 To: public-lod@w3.org ** Instead of *break* what about compromising or undermining flexibility implicit in AWWW? This is tantamount to obscuring the WWW potential relative to its broad user constituency. Re. schema.org, I don't regard their effort as breaking, compromising, or undermining AWWW. I simply believe they are taking baby steps that are 100% defined by their current business models. Rightly or wrongly so, they have to protect their business models. In a sense, the same applies to academia and its model where grant funding is vital to research projects. What is dangerous though, is encouraging people to misuse and misunderstand AWWW. Names and Addresses are distinct items. AWWW essence depends on preserving this vital distinction. When there are more applications (+1 to Henry's comment about focusing on Linked Data apps and viral patterns) this lower level matter will vapourize. Although not present (I am too young) I am certain similar arguments arose during the early days of silicon based computing between OS developers and programming language developers. I certainly know these conversations did arise when Spreadsheets vendors tackled Cell Reference functionality. There are many useful cases in plain sight that many overlook re. power of URIs as data conductors, integrators, and access mechanisms. I think (based on my experience with this community and industry at large) that there is too much focus on reinventing too many parts of the consumption stack, from scratch. The key is to be "useful" but introduce "usefulness" unobtrusively if you really seek uptake. 
Naturally, this requires understanding of what already exists (i.e., domain and subject matter knowledge) and functionality areas addressed by existing solutions. Sorry, but if all you do is program, you cannot really understand the reality of end-users. I like to make reference to Apple as a great anecdote because they've risen from near demise to the vanguard of modern computing by exploiting the InterWeb from the inside out, they don't see the Web as simply being about HTML. They understand that it's a linked information space and future data space. They utilize this insight internally in a manner that just manifests as being "useful" to its ever growing customer base. Remember, there's a lot of old NeXTStep still underlying what Apple does. Also remember, the WWW was built on a NeXT machine with a lot of inspiration from how its innards worked. Believe it or not, we are still playing catch up (circa 2011) with NeXTStep and Unix in general re. really smart and useful Linked Data apps :-) Embrace history and the future gets clearer and much more exciting. We have an unbelievable opportunity within grasp. We can embrace and extend (in a good way) what we may perceive as imperfections by others (e.g. schema.org). As Pat stated in an earlier post, these imperfections present opportunities that might even span decades before the behemoths out there hit their respective opportunity cost thresholds. Once said thresholds are hit they will respond accordingly via product fixes and/or enterprise acquisitions etc.. Contrary to popular belief, I will state once again that HTTP 303 is the poster child for ingenuity inherent in the HTTP protocol and the AWWW. Yes, we could also up the semantic smarts on clients and let a retrieved resource disambiguate Names and Addresses, but that only adds a burden to a target audience that's already challenged re: 1. recognizing linked data structures via directed graphs 2.
recognizing that linked data structures have always been about links and that HTTP URIs are a powerful vehicle for expanding this concept to InterWeb scales 3. recognizing that de-reference (indirection) and address-of operations are achievable via URIs and cost-effectively so via HTTP URIs due to WWW ubiquity 4. understanding that RDF is *an option* for linked data structures at InterWeb scales, you can use other syntaxes without losing access to really useful stuff like RDFS and OWL semantics (which also suffers from over emphasis on RDF at expense of core syntax agnostic concepts). Links: 1. http://en.wikipedia.org/wiki/Spreadsheet#Cells 2. http://en.wikipedia.org/wiki/Spreadsheet#Named_cells . ---------- From: *Nathan* <nathan@webr3.org> Date: 17 June 2011 22:42 To: Danny Ayers <danny.ayers@gmail.com> Cc: Pat Hayes <phayes@ihmc.us>, Jason Borro <jason@openguid.net>, Linked Data community <public-lod@w3.org> You could use the same name for both if each name was always coupled to a universe, specified by the predicate, and you cut out type information from data, such that: <x-sasha> :animalname "sasha" ; :created "2011...." . was read as: Animal(<x-sasha>) :animalname "sasha" . Document(<x-sasha>) :created "2011...." . the ability to do this could be pushed on to ontologies, with domain and range and restrictions specifying universes and boundaries - but it's a big change. really, different names for different things is quite simple to stick to, and considering most (virtually all) documents on the web have several different elements and identifiable things, the one page one subject thing isn't worth spending too much time focusing on as a generic use case, as any solution based on it won't apply to the web at large which is very diverse and packed full of lots of potentially identifiable things. 
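A minimal sketch of the reading described above, assuming a hypothetical lookup table (`PREDICATE_UNIVERSE`) that assigns each predicate to exactly one universe. The predicate names and the elided date are taken from the example as written; the table, the function name and the "Unknown" fallback are illustrative assumptions, not part of any RDF tooling.

```python
from collections import defaultdict

# Hypothetical table: each predicate speaks about exactly one universe.
PREDICATE_UNIVERSE = {
    ":animalname": "Animal",
    ":created": "Document",
}


def split_by_universe(triples):
    """Group (subject, predicate, object) triples by the universe of the predicate.

    The same name can then appear under several universes without the
    statements about the animal and the document being mixed together.
    """
    views = defaultdict(list)
    for subject, predicate, obj in triples:
        # Predicates missing from the table fall into an "Unknown" universe.
        universe = PREDICATE_UNIVERSE.get(predicate, "Unknown")
        views[(universe, subject)].append((predicate, obj))
    return dict(views)


# The two statements from the example, both using the single name <x-sasha>.
triples = [
    ("<x-sasha>", ":animalname", "sasha"),
    ("<x-sasha>", ":created", "2011...."),
]

views = split_by_universe(triples)
for (universe, subject), statements in views.items():
    for predicate, obj in statements:
        print(f'{universe}({subject}) {predicate} "{obj}" .')
```

The sketch only works because every predicate belongs to a single universe; a predicate that could apply to both the animal and the document would need the domain, range and restriction machinery mentioned above.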
best, nathan ---------- From: *Nathan* <nathan@webr3.org> Date: 17 June 2011 22:43 To: Alan Ruttenberg <alanruttenberg@gmail.com> Cc: Jason Borro <jason@openguid.net>, Linked Data community <public-lod@w3.org> well said, although I think we could bracket yourself in that category too :) ---------- From: *Henry Story* <henry.story@bblfish.net> Date: 17 June 2011 23:17 To: nathan@webr3.org Cc: Danny Ayers <danny.ayers@gmail.com>, Pat Hayes <phayes@ihmc.us>, Jason Borro <jason@openguid.net>, Linked Data community <public-lod@w3.org> No, it's quite simple in fact, as I pointed out in a couple of e-mails in this thread. You just need to be careful when creating relations that certain relations are in fact inferred relations between primary topics. yes, but there are a lot of people who say it is too complicated. I don't find it so, but perhaps it is for their use cases. I say that we describe the option they like, find out what limitations they will have, and document it. Then next time we can refer others to that discovery. So limitations to look for would be limitations as to the complexity of the data created. The other limitation is that even on simple blog pages there are at least three or four things on the page. indeed. agree. But it is one of those things that newbies feel the urge to do, and will keep on wanting to do. So perhaps for them one should have special simple ontologies or guides for how to build these ObjectDocument ontologies. In any case this seems to be the type of thing the microformats people were (are?) doing.
Henry > > best, nathan Social Web Architect http://bblfish.net/ ---------- From: *Nathan* <nathan@webr3.org> Date: 17 June 2011 23:27 To: Henry Story <henry.story@bblfish.net> Cc: Danny Ayers <danny.ayers@gmail.com>, Pat Hayes <phayes@ihmc.us>, Jason Borro <jason@openguid.net>, Linked Data community <public-lod@w3.org> I'd agree, but anything that involves being careful is pretty much doomed to failure on the web :p there's also a primary limitation of the programming languages developers are using, if they've got locked in stone classes and objects, or even just structures, then the dynamics of RDF can be pretty hard to both understand mentally, and use practically. hmm.. microformats seems to be pretty focussed on describing multiple items on one page, however the singularity is present in that they focussed on being described using a single Class Blueprint style, one class, a predetermined set of properties belonging to the class, and a simple chained hierarchy - this stems from most OO based languages. With a bit of trickery you can use RDF and OWL the same way, it just means you have different "views" over the data, where you can see Human(x) with a set of properties, or Male(x) with another set, or Administrator(x) with yet another set. This is less about the data published and more about how it's consumed viewed and processed though. Quite sure something can be done with that, where the simple version of the data uses a basic schema.org like ontology, and advanced usage is more RDF like using multiple ontologies. The "views" thing would be a way to merge the two approaches.. Best, Nathan ---------- From: *Danny Ayers* <danny.ayers@gmail.com> Date: 18 June 2011 20:40 To: Pat Hayes <phayes@ihmc.us> Cc: Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community <public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org> That's all that's necessary to square this circle.
All HTTP delivers is representations of named resources. (I very much do think a car is a representation of itself in HTTP terms, in the same way a document is, but it isn't necessary here). I'll be delighted to drop it, I thought we were getting stuck in a tar pit but your statement above is the er, oil, that gets us out. ---------- From: *Danny Ayers* <danny.ayers@gmail.com> Date: 18 June 2011 20:51 To: David Booth <david@dbooth.org> Cc: Linked Data community <public-lod@w3.org>, Pat Hayes <phayes@ihmc.us>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> Thanks David, a nice summary of the most important point IMHO. Ok, I've been trying to rationalize the case where there is a failure to make the distinction, but that's very much secondary to the fact that nothing really gets broken. ---------- From: *Pat Hayes* <phayes@ihmc.us> Date: 19 June 2011 06:05 To: Danny Ayers <danny.ayers@gmail.com> Cc: David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> Really (sorry to keep raining on the parade, but) it is not as simple as this. Look, it is indeed easy to not bother distinguishing male from female dogs. One simply talks of dogs without mentioning gender, and there is a lot that can be said about dogs without getting into that second topic. But confusing web pages, or documents more generally, with the things the documents are about, now that does matter a lot more, simply because it is virtually impossible to say *anything* about documents-or-things without immediately being clear which of them - documents or things - one is talking about. And there is a good reason why this particular confusion is so destructive. Unlike the dogs-vs-bitches case, the difference between the document and its topic, the thing, is that one is ABOUT the other. 
This is not simply a matter of ignoring some potentially relevant information (the gender of the dog) because one is temporarily not concerned with it: it is two different ways of using the very names that are the fabric of the descriptive representations themselves. It confuses language with language use, confuses language with meta-language. It is like saying giraffe has seven letters rather than "giraffe" has seven letters. Maybe this does not break Web architecture, but it certainly breaks **semantic** architecture. It completely destroys any semantic coherence we might, in some perhaps impossibly optimistic vision of the future, manage to create within the semantic web. So yes indeed, the Web will go on happily confusing things with documents, partly because the Web really has no actual contact with things at all: it is entirely constructed from documents (in a wide sense). But the SEMANTIC Web will wither and die, or perhaps be still-born, if it cannot find some way to keep use and mention separate and coherent. So far, http-range-14 is the only viable suggestion I have seen for how to do this. If anyone has a better one, let us discuss it. But just blandly assuming that it will all come out in the wash is a bad idea. It won't. Pat ---------- From: *Danny Ayers* <danny.ayers@gmail.com> Date: 19 June 2011 08:43 To: Pat Hayes <phayes@ihmc.us> Cc: David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> Point taken Pat but I have been in the same ring as you for many years, but to progress the Web ---- can't we just take our hands off the wheel, let it go where it wants. (Not that I have any influence, and realistically you neither Pat). I'm now just back from a sabbatical, but right now would probably be a good time to take one. If these big companies do engage on the "microdata" front, it's great. 
I'm sure it's been said before, why don't we get pornographers working hard on their metadata on visuals, because they work for Google/Bing whatever. The motivation right now might not be towards Tim's day one goals of sharing some stuff between departments at CERN, but that's irrelevant in the longer term. Getting the Web as an infrastructure for data seems like a significant step in human evolution. And it's a no-brainer. But getting from where we are to there is tricky. Honestly, I don't care. It'll happen, my remaining lifespan or about 50 on top, there will be another, big, revolution. Society is already so different, just with little mobile phones. /gak I'm not going to speculate, we're heading for a major change. Cheers, Danny. -- http://danny.ayers.name ---------- From: *Henry Story* <henry.story@bblfish.net> Date: 19 June 2011 12:37 To: Pat Hayes <phayes@ihmc.us> Cc: Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> The way to do this is to build applications where this thing matters. So for example in the social web we could build a slightly more evolved "like" protocol/ontology, which would be decentralised for one, but would also allow one to distinguish documents, from other parts of documents and things. So one could then say that one wishes to bring people's attention to a well written article on a rape, rather than having to "like" the rape. Or that one wishes to bring people's attention to the content of an article without having to "like" the style the article is written in.
If such applications take hold, and there is a way the logic of using these applications is made to work where these distinctions become useful and visible to the end user, then there will be millions of vocal supporters of this distinction - which we know exists, which programmers know exists, which pretty much everyone knows exists, but which people new to the semweb, like the early questioners of the viability of the "mouse" and the endless debates about that animal, will question because they can't feel in their bones the reality of this thing. Well hash uris are of course a lot easier to understand. http-range-14 is clearly a solution which is good to know about but that will have an adoption problem. I am of the view that this has been discussed to death, and that any mailing list that discusses this is short of real things to do. One could argue much more fruitfully on DocumentObject ontologies, and it would be interesting to see where that leads one. Well these are logical necessities you are speaking of. So it will come out in the wash. Just like 2+2=4, those who wish to ignore it will lose out in a number of transactions. So the fun thing is that we can find completely coherent ontologies that don't break the semweb and that would allow Richard Cyganiak to write > <http://richard.cyganiak.de/> a foaf:Document; > dofoaf:name "Richard Cyganiak"; > dc:title "Richard Cyganiak's homepage"; > dofoaf:knows <http://bblfish.net/> . It looks like here that the document has been confused with the object, but in fact the relations are designed so that they indirectly refer to something else. Now it is not clear that this is easier or less confusing to write than pure foaf. But it does make it look like what Danny wants to have is happening, namely that the document refers to the thing too - assuming a document only refers to one thing. But that is already the main problem. Even an image never refers to one thing only.
Take a simple image of the eiffel tower: there can be cars in it, there can be birds, mice, rats (ratatouille), and many other creatures jumping around on people's heads. The higher the resolution the more things that picture can be said to refer to. So to know which is the primary topic of an image one would nearly need to add a new relation to express that. Henry ---------- From: *Hugh Glaser* <hg@ecs.soton.ac.uk> Date: 19 June 2011 13:05 To: Pat Hayes <phayes@ihmc.us> Cc: Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> "A step too far"? Hi. I've sort of been waiting for someone to say: "I have a system that consumes RDF from the world out there (eg dbpedia), and it would break and be unfixable if the sources didn't do 303 or #." Plenty of people saying they can't express what they want without it. And plenty of people saying they can't write some code that they might not be able to understand some RDF they receive properly. But no actual examples in the wild (at least as far as I can tell in a lot of messages). This might be for quite a few reasons, such as: 1) There are no such consuming systems; 2) The existing consuming systems would not break. Number (1) would be too embarrassing, and is wrong because I have some, so I'll think about number (2). There seem to be some axes in the discussion: publish / consume long/medium term / shorter term ideal / pragmatic Interestingly, we don't seem to have a strong theory / practice axis, which is great. As a publisher, I/we have had to work pretty hard to conform to really quite complex requirements for publishing RDF as Linked Data; not just Range-14, but voiD, sitemaps and various bits and pieces that Kingsley always tells me to do in the RDF. As a consumer, it has been pretty simple: "Well guv, thanks for the URI, here's some RDF." 
It has always been something of a source of angst (if not actual pain) to me that none of the extra work I put into publishing RDF is ever used by me or anyone else, as far as I know. In fact, some of the sites I consume actually don't do things "properly" - I might have had to change my consuming systems to cope with this, but I don't, because they already cope fine. Why is it not a problem? One obvious reason is that the consuming application is actually looking for specific knowledge about things. I don't have a consuming system that is considering both lexical and animal subjects, and so confusion does not arise. In fact, it is the predicates that tend to distinguish satisfactorily for me (as has been pointed out by some people). Thus, if I get a triple that says the URI that would resolve to my Facebook page foaf:knows the URI that would resolve to your Facebook page, I (my system) will happily interpret that as one person (or whatever) foaf:knows the other. I certainly don't want to go and resolve these to find out to what the URIs actually resolve. And if I did, what would I do about it? Ignore it? In fact, as has also been mentioned, you can define domains, ranges and restrictions for as long as you like, but it is quite possible and likely that the users of URIs will continue blissfully unaware of any of this, in exactly the same way that they continue unaware that there might be something ambiguous about the URIs they are using. By the way, as is well-known I think, a lot of people use and therefore must be happy with URIs that are not Range-14 compliant, such as http://www.w3.org/2000/01/rdf-schema . When we help people publish, it really is tough to engage them long enough to care about the complex issues, and they often get it wrong - I am engaged with quite a few people who are now publishing serious amounts of interesting RDF where I have contacted them to try to help. 
The status of the conversations is that they have fixed what they can, and are now thinking (for a long time) about how they might configure their systems to do it properly - but they may never get there. I will still want to use their RDF. So, trying to be a little brief: I have always felt that the full Range-14 distinction was in danger of being a Step Too Far. Yes, it does matter, and it is likely (or at least possible) we will pay a price in the end. But the world is trying to pass us by - it has at least pulled alongside. We must work out why we seem to have lost any lead we had, because it is likely to be the same reason we will get left behind. And I happen to believe that what we have can be better than the alternatives. Sorry Pat, I don't actually have a proposal. But I do know we need to be liberal in what we consume. And we might need to be a bit more liberal in what we praise, or at least be nicer to people who want to publish RDF and don't do Range-14. Best Hugh -- Hugh Glaser, Intelligence, Agents, Multimedia School of Electronics and Computer Science, University of Southampton, Southampton SO17 1BJ Work: +44 23 8059 3670, Fax: +44 23 8059 3045 Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652 http://www.ecs.soton.ac.uk/~hg/ ---------- From: *Kingsley Idehen* <kidehen@openlinksw.com> Date: 19 June 2011 13:23 To: public-lod@w3.org Danny, Do you agree with HTTP-range-14 finding or not? My gripe with HTTP-range-14 is all about aesthetic matters re. language and anecdote choices, not the core concept it attempts to articulate. If you clearly state your gripe in similar terms there could be a chance of yourself and Pat actually realizing that you are in agreement. Personally, I've always assumed you clearly grokked why Name and Address disambiguation is vital re. Web's data space dimension. I am suspecting that you are saying: we should find ways to co-exist with initiatives (e.g. schema.org) that haven't addressed these matters, just yet etc..
Note: many are grappling with how to construct viable business models from Linked Data, thus in some cases you will have services that look like they don't care about Name and Address disambiguation on the outside, courtesy of their publicly accessible resources, while in reality they understand these matters very well and have put them to use for a while. Remember, a URI doesn't have to be public :-) I think the debate will ultimately be more about getting these big players to share their more powerful URIs with the public via services and apps from communities like this that make the opportunity costs of these big players palpable :-) Kingsley ---------- From: *Henry Story* <henry.story@bblfish.net> Date: 19 June 2011 13:44 To: Hugh Glaser <hg@ecs.soton.ac.uk> Cc: Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> As you point out there are some consuming systems but they are not very distributed: you know ahead of time what you will find there, and so you can adapt your parsing for the few special cases. At that level the XML crowd/JSON crowd are right - rdf does not give you much. In fact it makes it easier to do things wrong. So we should be supporting more RESTful XML that can be GRDDLed with X-SPARQL. The semweb gives you a lot more when things get even more distributed, such as when everyone starts having foaf files on billions of computers. At that point nobody will want to tweak their app for the specific data at one site. Also one will want to be careful of the difference between documents and things, for the same reason I pointed out with the "like" button in Facebook. So for the moment the errors don't appear, because we are few consumers and few producers, and we can work around mistakes manually on a case by case basis. 
To get a real linked data application you need: 1- data that is produced in a completely decentralised way 2- data that is linked between those decentralised nodes 3- data that is consumed, and where the consumption has real world effects Number 3 is the recursive feedback piece that will make 1 and 2 come to a point of stability, or meta-stability, as we are dealing with self organising systems here. This can be done with the social web. We need systems where you publishing data means that I can do something, learn something about you, and so on... but without you ever knowing ahead of time what software or services we are using. (( The Twitters and other Web2.0 folks have made their life easy by centralising data publishing and consumption as much as possible. For systems like these there is no real communication problem: there is a central dictator and he says what the meaning of the terms is. As things evolve that part even escapes him - the way office document formats escaped M$ - because of the huge number of people and software dependent on the initial meaning produced.)) If I write things out wrong, your software should be able to let me know about it. Just as if we organise to meet but we give each other the wrong address, we will end up missing the meeting. If this were not so then giving out addresses and organising meetings would be a very different exercise. yes, my point has been we need to work on small vocabularies, widely distributed, widely used, to kick start the rest of the system And as pointed out above they are not that distributed, and the consequences of things going wrong on a lot of the open data stack are not that big yet. Also you are probably not putting up reasoners yet. yes, one can do a lot with incoherent data if one ignores the incoherence, or just follows through some networks like that. I try to follow these guidelines more for reasons of sanity. They are simple to follow, and help one think about the issues. 
That is the problem that only will appear if people don't consume the data, or if the data is known ahead of time to be pretty inconsistent, as dbpedia data probably is. yes, in these case by case scenarios it is easy for you to write special case filters. And we could do the same thing with HTML whenever we browse the web too. But the web had an application: the browser that led to feedback effects that increased the coherence of the system. It has not passed by, it is not building for the distributed data. The big players are creating silos of information and getting rich off that. But the value of distributed information is much greater than what they are building - even if it is hard to believe. In any case we have no choice: the big guys are rich already. We can either be their slaves or be free by working together, and grow so big together that we tie them into our much larger system :-) We are not behind. We are way ahead. The arrows in your back are a testament to that :-) ---------- From: *Dave Reynolds* <dave.e.reynolds@gmail.com> Date: 19 June 2011 13:45 To: Hugh Glaser <hg@ecs.soton.ac.uk> Cc: Linked Data community <public-lod@w3.org> Hi Hugh, Your general point that there is non-compliant data out there that people are still able to make use of is probably right, but that specific example is compliant - those are all (even the ontology URI) hash-URIs. Dave ---------- From: *Kingsley Idehen* <kidehen@openlinksw.com> Date: 19 June 2011 14:04 To: public-lod@w3.org Er. we use it :-) The problem with this whole Linked Data thing is that it's truly Ninja tech. The killer conductor of value is the LINK. This lethal weapon applies to all dimensions of the Web: 1. Information Space 2. Data Space 3. Knowledge Space. Trouble is, where do we find strong anecdotes for a cross dimensional lethal weapon? I try to use Star Wars and the FORCE at times, but even that doesn't quite nail what we are dealing with here. 
Thus, we could take another approach i.e., embrace and extend what we know is anomalous since the AWWW architecture (FORCE) actually lets us do this anyway. Exactly! You are using the FORCE :-) You have a Data Space dimension app. The Information Space dimension doesn't interfere with your world view. This is key in many ways. For instance, imagine if your app was of the Information Space dimension instead, the effect would be very close to what we see today re. those that see Name and Address disambiguation as impractical overkill since nothing breaks in the world they experience. Yep! The Data Space realm lets you Describe anything with clarity, and even when unclear, agents can ultimately agree to disagree without obliteration. As you would in code generally, encounter an exception, and decide if you avoid making it a critical fault :-) Yes, when they operate in the Information Space dimension. In the Information Space dimension, yes. In that dimension it doesn't matter. Yes, and all you do is show them a tweaked version of their RDF, should they wander by your data space (which is grounded in the Data Space realm). It's fine, we just can't present it in edict form to people experiencing and operating with the Information Space dimension of the WWW. You betcha! IMHO. People are doing what they always do: ignore warnings and scramble desperately for cures, post calamity. Note, in most cases, using the industry behemoths as examples, calamity == business model erosion courtesy of exponentially increasing opportunity costs. We need to accept that the WWW has many dimensions to it, Information, Data, and Knowledge. Thus, we can't speak from the Data Space dimension to folks in the Information Space dimension and expect immediate comprehension. We could (hence power of HTTP 200 OK) operate within the Information Space dimension and unveil the Data Space dimension. 
Like all contextual matters, we have to align "context lenses" in order for us to develop constructive dialog. This is why "embrace and extend" (not the way Microsoft did it many years ago) is the way to go re. unveiling Data Space dimension from the Information Space dimension. My proposal is this: we just need to be more accommodating of what we may perceive as imperfections, in our data space oriented context. We should always embrace structured data contributions in any form. We can transform structured data to high fidelity linked data in a myriad of ways that ultimately help others comprehend what's taking shape re. the WWW as a Global Data Space. +1000 +1000 Kingsley ---------- From: *Henry Story* <henry.story@bblfish.net> Date: 19 June 2011 14:39 To: Kingsley Idehen <kidehen@openlinksw.com> Cc: public-lod@w3.org That's a fun way of describing things. But we have to be careful not to hype things too much, or we risk being tied into the 1980s AI hype space, and then nobody will listen anymore. Perhaps a more scientific way to express this is within the language of self-organising systems. There is a lot of research there which is relevant to us. http://en.wikipedia.org/wiki/Self_organising_systems I am a bit new to this area. Any books I must read? Henry ---------- From: *Hugh Glaser* <hg@ecs.soton.ac.uk> Date: 19 June 2011 15:09 To: Henry Story <henry.story@bblfish.net> Cc: Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> Thanks Henry. Just to be clear on one point: On 19 Jun 2011, at 12:44, Henry Story wrote: <snip /> <snip /> But I don't write special case filters - if I did I would not consider it Semantic Web. I simply follow my nose to use the URI (or in fact usually via an owl:sameAs in a sameas store), and they work. 
It all works because my code that consumes the retrieved RDF to build the data enrichment by inference (things like the communities of practice), and things like my fresnel lenses, restrict any ambiguity by looking for the predicates, etc. they care about. RDF can be a long way short of what we want it to be without having to treat it as special cases. ---------- From: *Henry Story* <henry.story@bblfish.net> Date: 19 June 2011 15:25 To: Danny Ayers <danny.ayers@gmail.com> Cc: Pat Hayes <phayes@ihmc.us>, Richard Cyganiak <richard@cyganiak.de>, Alan Ruttenberg <alanruttenberg@gmail.com>, Linked Data community < public-lod@w3.org>, Michael Hausenblas <michael.hausenblas@deri.org> On 12 Jun 2011, at 14:40, Danny Ayers wrote: > [snip] A photo and a graph work in essentially the same way. They both set restrictions on possible worlds of which they are true. A photo restricts the number of possible worlds to those that are visually equivalent to the picture taken. A graph is true of all the possible worlds where those relations hold - which is usually infinitely large. In either case the meaning of a graph or document is a set of possible worlds. A set is an object - one can speak of it - but a very different kind of object from what you may think of as what appears in the picture. As such there is indeed a fundamental logical difference between a document and objects in the world. And that also explains why a photo is not clearly about one thing or another - though of course given that it is a restriction on the way things can be, it limits the things the document could be about. As stated in a previous mail, the same photo can be about the Eiffel Tower, a sunset, a beautiful view of Paris, a vacation experience, a friend that appears in the picture, a murder that was committed at that moment,... 
The photo remains the same in all those descriptions, and it can be tagged in all those ways, which is why it is good to have names for each of those things that are different from the photo. Each of those should have definite descriptions to help identify the referents from the description. ---------- From: *Hugh Glaser* <hg@ecs.soton.ac.uk> Date: 19 June 2011 15:26 To: Kingsley Idehen <kidehen@openlinksw.com> Cc: "<public-lod@w3.org>" <public-lod@w3.org> Er, I'm not sure you do :-) You certainly consume it, and a very nice job you do too. But the "use" is more than generic browsers - it suggests to me that something useful might happen as a result of the consumption (perhaps I learn that I can ask Jim to introduce me to Mary, as he knows her better than anyone else I know). These things are usually called applications, or possibly services. They tend to be reasonably domain-specific, as generic things tend not to be easy to use, or even fit for purpose for end users. Sorry if I have missed stuff. ---------- From: *Hugh Glaser* <hg@ecs.soton.ac.uk> Date: 19 June 2011 15:38 To: Dave Reynolds <dave.e.reynolds@gmail.com> Cc: Linked Data community <public-lod@w3.org> I know, I know - as I pressed the send button I thought uh-uh :-) Sorry. Mind you, I deliberately left the # off the URI and I think I got confused about ... Oh never mind - sorry. > > Dave ---------- From: *Henry Story* <henry.story@bblfish.net> Date: 19 June 2011 15:48 To: Hugh Glaser <hg@ecs.soton.ac.uk> Cc: Kingsley Idehen <kidehen@openlinksw.com>, "<public-lod@w3.org>" < public-lod@w3.org> exactly. At that level you start using the specific logic of some relations, here perhaps the foaf:knows relation, which is other than the high-level, very lightly constraining rdfs or owl framework. One might say that one only really uses foaf:knows when one has software that understands the specific intension of that relationship. 
> They tend to be reasonably domain-specific, as generic things tend not to be easy to use, or even fit for purpose for end users. yes. And since we are working in a self organising system, these applications have to be designed so that every use grows the value of the network, and creates incentives for correct data to be published, and maintained. In recent e-mail Hugh also wrote in reply to me: > RDF can be a long way short of what we want it to be without having to treat it as special cases. yes, we will be dealing with inconsistent data whatever we do. But we need ways of telling when things are inconsistent so that we can then recognise when this is the case and find ways around things. As I mentioned, I think we don't recognise inconsistency much because few people use inferencing. And inferencing need not just be owl inferencing, it can be the type of inferencing that comes from human understanding of what it means to foaf:know someone, or other terms with particular complex intentions. ---------- From: *Tim Berners-Lee* <timbl@w3.org> Date: 19 June 2011 17:13 To: Pat Hayes <phayes@ihmc.us> Cc: Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net> Absolutely, Pat. Well said. This is really important. Can we please stop the madness of confusing things with documents about them and do what we want to do cleanly and in an efficient way. Tim ---------- From: *Nathan* <nathan@webr3.org> Date: 19 June 2011 17:33 To: Pat Hayes <phayes@ihmc.us> Cc: Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> Exactly, Things become even clearer when you add in a messenger. A messenger carried a message about an erupting volcano, to conflate the message and the subject of the message is to say that a messenger carried an erupting volcano, which is nonsense. 
We've long since known not to conflate the Messenger with the Message, this is why we don't shoot the messenger, however I think this is possibly the first time in history where we've questioned whether the message and the subject(s) of the message were different things or not. Best, Nathan ---------- From: *Giovanni Tummarello* <giovanni.tummarello@deri.org> Date: 19 June 2011 18:27 To: Pat Hayes <phayes@ihmc.us> Cc: Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> Could it be exactly the other way around? That documents and the things described in them are easy to distinguish EXACTLY because one is about the other; no one can possibly mess them up, except for idiotic computer algorithms from the 70s that limit themselves to symbolic AI techniques. Otherwise you seem to say that it's more difficult to distinguish between a dog and a bitch than it is to distinguish between a dog and a stream of bytes in return to an HTTP request, and that seems a bit funny? Look, if someone points me at a Facebook URL I know it's about a person and not about the damn page (which has 2000 ways to change every time that URL is resolved anyway). I mean we can go on and tell ourselves we can't possibly write applications that know or understand what a Facebook URL is about. But don't be surprised as less and less people will be willing to listen as more and more applications (e.g. all the stuff based on schema.org) pop up never knowing there was this problem... (not in general. 
of course there is in general, but for their specific use cases) Gio ---------- From: *Henry Story* <henry.story@bblfish.net> Date: 19 June 2011 18:35 To: Giovanni Tummarello <giovanni.tummarello@deri.org> Cc: Pat Hayes <phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> The question is if schema.org makes the confusion, or if the schemas published there use a DocumentObject ontology where the distinctions are clear but the rule is that object relationships are in fact going via the primary topic of the document. I have not looked at the schema, but it seems that before arguing that they are inconsistent one should see if there is not a consistent interpretation of what they are doing. Henry Gio ---------- From: *Nathan* <nathan@webr3.org> Date: 19 June 2011 18:56 To: Henry Story <henry.story@bblfish.net> Cc: Giovanni Tummarello <giovanni.tummarello@deri.org>, Pat Hayes < phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David Booth < david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro < jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> Sorry, I'm missing something - from what I can see, each document has a number of items, potentially in a hierarchy, and each item is either anonymous, or has an @itemid. Where's the confusion between Document and Primary Subject? 
---------- From: *Nathan* <nathan@webr3.org> Date: 19 June 2011 18:58 To: Henry Story <henry.story@bblfish.net> Cc: Giovanni Tummarello <giovanni.tummarello@deri.org>, Pat Hayes < phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David Booth < david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro < jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> Or do you mean from the Schema.org side, where each Type and Property has a dereferenceable URI, which currently happens to also be used for the document describing the Type/Property? ---------- From: *Henry Story* <henry.story@bblfish.net> Date: 19 June 2011 19:36 To: nathan@webr3.org Cc: Giovanni Tummarello <giovanni.tummarello@deri.org>, Pat Hayes < phayes@ihmc.us>, Danny Ayers <danny.ayers@gmail.com>, David Booth < david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro < jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> Well I can't really tell because I don't know what the semantics of those annotations are, or how they function. Without those it is difficult to tell if they have made a mistake. If there is no way of translating what they are doing into a system that does not make the confusion, then one could explore what the cost of that will be to them. If the confusion is strong then there will be limitations in what they can express that way. It will then be a matter of working out what those limitations are and then offering services that allow one to go further than what they are proposing. At the very least the good thing is that they are not bringing the confusion into the RDF space, since they are using their own syntax and ontologies. There may also be a higher way to fix this so that they could return a 20x (x-some new number) which points to the document URL (but returns the representation immediately, a kind of efficient HTTP-range-14 version) So there are a lot of options. Currently their objects are tied to an html document. 
What are the json crowd going to think? In any case there is a problem of translation that has to be dealt with first. Henry ---------- From: *Danny Ayers* <danny.ayers@gmail.com> Date: 19 June 2011 19:44 To: Henry Story <henry.story@bblfish.net> Cc: Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> On 19 June 2011 12:37, Henry Story <henry.story@bblfish.net> wrote: > [snip pat] I would have come down on you like a ton of bricks for that Henry, if it wasn't for seeing to-and-fro on Facebook about some Nazi-inspired club (Slimelight, for the record). On FB there is no way to express your sentiments. Like/blow to smithereens. I confess to talking bollocks when I should be coding. ---------- From: *Henry Story* <henry.story@bblfish.net> Date: 19 June 2011 19:52 To: Danny Ayers <danny.ayers@gmail.com> Cc: Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> yeah, me too. Though now you folks managed to get me interested in this problem! (sigh) Henry ---------- From: *Danny Ayers* <danny.ayers@gmail.com> Date: 19 June 2011 20:03 To: Henry Story <henry.story@bblfish.net> Cc: nathan@webr3.org, Giovanni Tummarello <giovanni.tummarello@deri.org>, Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> I thought forever that if we see iniquities we are duty-bound to stand in the way. But that don't seem to change anything. Let the crap rain forth, if you really need to make sense of it the blokes on this list will do it. Activity is GOOD, no matter how idiotic. Decisions made on very different premises than anyone around here would promote. Sorry, I'm of the opinion that the Web approach is the winner. 
Alas it also seems lowest common denominator. Cheers, Danny. -- http://danny.ayers.name ---------- From: *Danny Ayers* <danny.ayers@gmail.com> Date: 19 June 2011 20:15 To: Henry Story <henry.story@bblfish.net> Cc: Pat Hayes <phayes@ihmc.us>, David Booth <david@dbooth.org>, Linked Data community <public-lod@w3.org>, Jason Borro <jason@openguid.net>, Tim Berners-Lee <timbl@w3.org> Only personal Henry, but have you tried the Myers-Briggs thing - I think you used to be classic INTP/INTF - but once you got WebID in your sails it's very different. These things don't really allow for change. Only slightly off-topic, very relevant here, need to pin down WebID in a sense my dogs can understand. The Myers-Briggs thing is intuitively rubbish. But with only one or two posts in the ground, it does seem you can extrapolate. -- http://danny.ayers.name -- http://danny.ayers.name
Received on Sunday, 19 June 2011 18:23:19 UTC