- From: Jonathan A Rees <rees@mumble.net>
- Date: Wed, 10 Oct 2012 18:17:40 -0400
- To: www-archive@w3.org
To whom it may concern:

This is yet another assault on describing the "proposal 27" design, which I believe to be consistent with what Henry and Jeni and I have been working on; responding mainly to Alan R's criticisms of Jeni's draft http://www.w3.org/2001/tag/doc/uri-usage-primer-2012-10-03/ , see http://lists.w3.org/Archives/Public/www-tag/2012Oct/0068.html and following.

I'm getting kind of sick of this. I expect it shows.

I assert that proposal 27 (which Jeni's document valiantly attempts to capture in a way that I, JAR, have a hard time doing) can be formulated rigorously and that the result has a chance of being useful and is therefore worth testing. Formalism is a way to fight back against the ungenerous. I do not want to use it, but be warned I will if backed into a corner.

Let's make sure the mechanism is understood, before further criticizing the description of it.

Since terminology is the bugaboo (see nearly all discussions on this topic), and precipitates such boring fights, I must proceed, until sensible consensus prevails, without using any of the nasty usual words, the ones with distracting existing connotations. The labels don't matter in the end, and we can easily change them before publication. What's important is the machinery. I'll introduce suggestive but - I hope - connotation-free words for the purpose of explaining the logic.

I am the authority on the new words introduced here, so if you have a question about what they mean, ask me. Do not assume anything I wouldn't. Don't ask me to use different words like "denote" or "resource".

JUST STOP BUGGING ME ABOUT THE LABELS. THEY DON'T MATTER UNTIL WE GO TO LAST CALL.

We have URIs. I introduce two transitive verbs "wdentify" and "tdentify". Sometimes URIs wdentify, and sometimes they tdentify, and sometimes they do both or neither. So: wdentify and tdentify are relationships between URIs and other things. They will be explained further as we go along.
Each URI wdentifies at most one thing, and tdentifies at most one thing.

There is no reason to stop at wdentification and tdentification; there could be zdentification and so on. But nobody's asking for those yet.

== Semantics of wdentification ==

Short version: A URI wdentifies its webpaige. A webpaige hath zero or more rxpresentations, specifically the ones you GET when you GET using a URI that wdentifies the webpaige.

SKIP THIS WHOLE SECTION IF THAT'S GOOD ENOUGH FOR YOU.

There are rxpresentations. They can be encoded in bits. They can be carried by HTTP messages. If you look at an HTTP message (request or response), you can pick out the rxpresentation it carries: it is pretty much what RFC 2616 calls the "entity". Rxpresentations have content (perhaps null) and headers (perhaps none). We can make the ontology as detailed or vague as you like. (Do I need to go further?)

Rxpresentations have properties such as: Does the string "frog" occur in it? What RDF graph does it serialize, if any? (modulo blank node identity that is.) We could pretty easily express information of this sort in RDF, and might have to, using a standard vocabulary, if issue-57 is to have a complete solution that can satisfy all critics.

At any given time it may or may not be the case that a given webpaige "hath" a given rxpresentation. (The domain of "hath" is webpaiges and the range is rxpresentations.)

If the rxpresentation is the outcome of a successful retrieval (see RFC 3986) using a URI that wdentifies the webpaige, and the retrieval is "authoritative" per HTTPbis, and the Expires: time in the rxpresentation has not passed, then the webpaige hath the rxpresentation. Basically, the webpaige hath a rxpresentation if a cache could correctly deliver the rxpresentation in response to a retrieval request.

(An HTTP GET request is a retrieval request. Other requests are not. An HTTP 2xx response is a successful response. 4xx and 5xx responses are not.
I do not want to talk about 3xx responses yet. (Do I need to?))

(This can be made more rigorous, by going into HTTP's caching rules in more detail, diving into "correctness" and authority and speech acts and deontic logic. Dan Connolly has made a good first cut. Please don't make me do it! It's very tiresome.)

There may be other situations in which a webpaige hath a rxpresentation, in addition to the cases where the rxpresentation has been delivered in response to a retrieval request, and where it's cacheable; informally this would be if the server has the disposition, at the current time, to deliver it in response to some retrieval request. (Do I need to go into this?)

It is not clear when the relationship "hath" does *not* hold between a given webpaige and a rxpresentation. That is, we need to rule out interpretations in which "hath" is the top property. There are cases in which a person would say that displaying a particular rxpresentation where the request URI

There is no way to communicate "x does not hath" in the HTTP protocol, or otherwise to test for it objectively, although cache invalidation rules come close. This is one of the tragedies of the architecture.

And how do we know, given an arbitrary URI U and an arbitrary webpaige W, whether or not U wdentifies W? We simply don't. That is a matter of interpretation. Another tragedy of the architecture.

We can discuss properties of webpaiges by reducing them to the properties of their rxpresentations. That is: some rxpresentation of the webpaige has property P, all of them do, some will, all past ones have, etc. For example, the class of webpaiges W with the property that W "hath" some rxpresentation R that serializes an RDF graph containing the URI U, is fairly well defined (i.e. objective) given U.

OK, I hope this gives enough on these relationships and their pragmatics to proceed.

If you like the httpRange-14 resolution (not all of you do), you can take "wdentification" to be "identification".
But please don't do that at least until you've read the whole thing through.

== Semantics of tdentification ==

Put concisely: a URI tdentifies what the webpaige it wdentifies says that it tdentifies.

SKIP THIS SECTION IF THAT'S GOOD ENOUGH FOR YOU.

We can be more pedantic about this, as follows:

Let the recommendation in question fix a URI G*; for example, G* might be the URI http://www.w3.org/2001/tag/2012/09/issue57/infra#Gstar .

Suppose there exist

    U, a URI
    W, a webpaige
    R, a rxpresentation
    G and G', RDF graphs
    P, a URI
    y, an RDF term

satisfying the following:

    U wdentifies W
    W hath R
    R serializes RDF graph G
    G' is an "adequate supergraph" of G (that is, it pulls in enough additional axioms via owl:imports, follow your nose, or some other axiom source to meet the purposes at hand, especially in regard to axioms for properties)
    A statement <U> <G*> y. is entailed by G'
    I is a satisfying interpretation of G'
    I is "acceptable in context"
    I maps y to X

Then U tdentifies X.

What does "acceptable in context" mean? Basically it means the rdfs:comment and rdfs:label properties are respected; or that I is the "intended interpretation". In other words the choice of I is not completely arbitrary - we want any given URI to tdentify only one thing. Do I need to elaborate? Is the lack of objective criteria for acceptability fatal? Does the algorithm for extending G to G' need to be nailed down? (It could be the identity.)

I confess I find this confusing and unsuitable for normativity. It is related to the general problem of what vocabulary conformance is. But in practice people don't seem troubled by this. There might be ways out of the difficulty, basically by combining G' with some other RDF graph found in the immediate vicinity of the entity whose conformance to recommendation is under consideration. I am willing to work with others on making this better.
If you fight me and say it doesn't make sense, I will fight back, but you will have to start engaging me on requirements and so on.

If you hate the httpRange-14 resolution (not all of you do), you can take "tdentification" to be "identification" or some variant on it (e.g. what do you do if there's no rxpresentation that serializes a graph that contains U?). But please don't do that at least until you've read the whole thing through.

== Identification ==

So far all we have is definitions and constraints, nothing normative. Now for the normative part.

It is a premise of the proposal 27 exercise that there will never be agreement on what hashless http: URIs identify (or refer to, or denote, or name, or anything else of that sort). If you don't like that premise go prepare your own $%#& proposal!!

However, we can recommend that the community observe *constraints* on identification as follows:

    If U identifies x, and U wdentifies w, then x and w are related by F* (or I should say, by what F* identifies - give me a break already).

    If U identifies x, and U tdentifies t, then x and t are related by G*.

    F* and G* identify functional object properties.

It would be possible to formalize this, modulo squishiness of "wdentifies" and "tdentifies" as above, and the formalization hurdle can be overcome if those who complain about the squishiness will engage with me and not take potshots.

A specification that is normative on interpretations of a vocabulary MAY reference this specification normatively, in which case interpretations of the vocabulary are further constrained by the above. (Note that there are to date no common practices regarding conformance of artifacts to vocabulary specifications. But I don't think this will hold anyone up - will it?)

In writing documentation it will be useful to have a way to express the relations F* and G* both in prose and in RDF. For RDF we can just pick URIs. But the terminology is a bugaboo.
The way to make the documentation easy to write and read would be to have a role-noun form, see http://www.w3.org/wiki/RoleNoun. Suppose that "handypaige" and "handything" do not have distracting connotations. Then, using the widespread role-noun pattern, which I personally find unpleasant but use in order to flow with the way most people seem to do things in RDF, we have:

    F* rdfs:label "handypaige".
    G* rdfs:label "handything".
    Q:creator rdfs:label "handypaige's creator".
    R:creator rdfs:label "handything's creator".

- This proposal does not threaten "unique identification" in the RDF semantics sense.
- We do not have "unique identification" in the sense of global interoperability per AWWW, so this proposal doesn't make matters any worse than they already are.
- Having agreed on this proposal, we could in theory agree on even more constraints in order to approach the AWWW golden unachievable ideal of global interoperable unique so-called "identification".

== Giving it more teeth ==

Define a "disputed URI" to be a hashless http: URI.

A conforming document MUST NOT use a disputed URI in the subject position of a statement unless it is entailed, either formally or informally, that the relation in the statement factors on the left through either the "handypaige" or "handything" relation.

A conforming document MUST NOT use a disputed URI in the object position of a statement unless it is entailed, either formally or informally, that the relation in the statement factors on the right through either the "handypaige" or "handything" relation.

"Formally entailed" means entailed per applicable formal semantics (RDF or OWL). "Informally entailed" means per common sense after reading all the applicable documentation.

Please don't make me go into the gory details about factoring through. I just mean what I talked about above. I'm really tired of this.
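
For concreteness, the "disputed URI" test defined above can be sketched in a few lines of Python. This is only an illustration, not part of the proposal. The function name is_disputed is made up here, and the sketch assumes that "hashless" means "contains no # at all" and that only the literal http scheme counts; whether https: URIs should be treated the same way is not addressed above.

```python
from urllib.parse import urlsplit


def is_disputed(uri: str) -> bool:
    """Hypothetical helper: True if `uri` is a hashless http: URI."""
    parts = urlsplit(uri)
    # Assumption: only the literal "http" scheme counts.
    # urlsplit lowercases the scheme, so "HTTP://..." is treated as http.
    if parts.scheme != "http":
        return False
    # Assumption: "hashless" means there is no "#" anywhere in the URI,
    # so a URI ending in a bare "#" (empty fragment) is not disputed.
    return "#" not in uri


if __name__ == "__main__":
    for u in (
        "http://www.w3.org/2001/tag/doc/uri-usage-primer-2012-10-03/",
        "http://www.w3.org/2001/tag/2012/09/issue57/infra#Gstar",
    ):
        print(u, is_disputed(u))
```

Under these assumptions the primer's URI is disputed, while the example G* URI given earlier is not, because it carries a fragment.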
== JAR's desire ==

I seek the following additional normative constraint, but JT and HT do not yet understand it:

A conforming document with a consistent interpretation MUST NOT be inconsistent with the assumption that identification is wdentification.

(From which it would follow that F* and owl:sameAs denote equivalent properties. We're not forcing this on everyone; just permitting it for those who want it.)

A symmetric requirement for tdentification is not possible, since that would lead to contradictions.

This has nontrivial consequences for identity (i.e. owl:sameAs and owl:differentFrom), but for the audience we're concerned about the requirement won't make much of a difference (they're already hopelessly confused about owl:sameAs, and mired in inconsistencies).

It doesn't even matter that I am now confident that identification = wdentification is the only plausible interpretation of what RFCs 2616 and 3986 say. This is why Roy signed his name to the httpRange-14 resolution. NOBODY WILL EVER BELIEVE ME.

== Unfinished business ==

We would *like* for the following axioms to hold:

If U is not disputed (as defined above), then what U wdentifies = what U tdentifies = what U identifies. (This covers the case of interoperability of hash URIs with properties that factor on the left or right through handypaige or handything.)

If U wdentifies a webpaige that has no rxpresentations (as described above), then what U wdentifies = what U tdentifies = what U identifies. (This covers the case of interoperability of 303 URIs with properties that factor on the left or right through handypaige or handything.)

Before we can say this, we need to convince ourselves that these axioms are consistent with the kinds of interpretations we'd like to have. This is not obvious, but neither is it obvious that there is a problem.
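
To make the first of these wished-for axioms concrete, here is a rough Python sketch of how one might test it against a particular interpretation. Everything in it is an assumption made for illustration: the three relations are modeled as plain dictionaries from URI strings to arbitrary labels (which at least respects "at most one thing" per URI), a missing key means the URI relates to nothing, a URI that relates to nothing under all three relations is counted as satisfying the axiom, and "disputed" is approximated as "starts with http: and contains no #". It says nothing about whether such an interpretation exists, which is the open question.

```python
from typing import Dict, Optional


def is_disputed(uri: str) -> bool:
    # Assumption: "disputed" = hashless URI in the literal http scheme.
    return uri.lower().startswith("http:") and "#" not in uri


def undisputed_axiom_holds(
    uri: str,
    wdentifies: Dict[str, str],
    tdentifies: Dict[str, str],
    identifies: Dict[str, str],
) -> bool:
    """Hypothetical check of the first axiom for one URI.

    For an undisputed URI, require that what it wdentifies, tdentifies
    and identifies coincide. Disputed URIs are not constrained here.
    """
    if is_disputed(uri):
        return True  # the axiom only speaks about undisputed URIs
    w: Optional[str] = wdentifies.get(uri)
    t: Optional[str] = tdentifies.get(uri)
    i: Optional[str] = identifies.get(uri)
    return w == t == i


if __name__ == "__main__":
    hash_uri = "http://example.org/doc#thing"  # made-up example URI
    same = {hash_uri: "thing-1"}
    print(undisputed_axiom_holds(hash_uri, same, same, same))
    print(undisputed_axiom_holds(hash_uri, same, {hash_uri: "thing-2"}, same))
```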
Received on Wednesday, 10 October 2012 22:18:07 UTC