RE: Inference rules for HTTP, etc. from Booth, David (HP Software - Boston) on 2008-02-28 (public-awwsw@w3.org from February 2008)

From: Booth, David (HP Software - Boston) <dbooth@hp.com>
Date: Thu, 28 Feb 2008 06:56:26 +0000
To: Tim Berners-Lee <timbl@w3.org>
CC: "public-awwsw@w3.org" <public-awwsw@w3.org>
Message-ID: <184112FE564ADF4F8F9C3FA01AE50009E254ABD3D8@G1W0486.americas.hpqcorp.net>
Hi Tim,

Thanks for your very helpful comments.  Some initial replies below.  More when I have more time.

> From: Tim Berners-Lee [mailto:timbl@w3.org]
>
> Comments on the rules:
> http://esw.w3.org/topic/AwwswDboothsRules as of Mon Feb 25
> 09:06:15 EST 2008
>
> You ask for email, rather than annotation of the wiki, even though you
> used a wiki.

Yeah, I guess that was a little inconsistent.  :)  Email is easier, because I'm replacing the rules wholesale as I make corrections and test them.  Perhaps I should mint a proper URI for them soon.

> I have 14 comments, each preceded by "**".
>
> _______________________________________
>
> ** Minor:  [uri:hasURI ] This property should be asserted explicitly
> -- it is NOT inferred.  Well, for anything in the system identified
> by a URI, one URI can be inferred for it.
> Actually the "this should/should not be inferred" distinction is a
> funny one which I don't find useful.
> Anything can be asserted, whether or not axioms exist which allow it
> to be inferred.

Yes, I realize that properties can be used either way.  I was trying to help convey the intended usage pattern that would demonstrate the path of reasoning from coming across an unknown URI, to dereferencing it (or its racine), to parsing the representation and finally obtaining a URI declaration for it.  But if these comments aren't useful they can be ignored.  Do you think they are harmful?

>
> _______________________________________
>
> ** Minor:  The practice of putting "has" on every predicate is one I
> don't like, I prefer just "location".   This reads better in N3, makes
> a better UI in Tabulator, etc.   (N3 allows you to add "has" as an
> ignored keyword if you  have to but people don't.) Also for the HTTP
> headers it is neat to just use the headers exactly.

That seems to be a stylistic preference.  AFAICT some favor the "has" form because they prefer verbs and explicit directionality of the property, whereas others favor brevity.  I have seen evidence that the lack of explicit directionality causes some confusion: the string:concat property has a comment saying "(obsolete - (was backwards!) - use: string:concatenation)"
http://www.w3.org/2000/10/swap/string#concat
So thus far I've been swayed by the explicit directionality of the "has" form, but perhaps that's only because I'm a relative newbie.

>
> _______________________________________
>
> ** There is the Tabulator's httph: ontology
> (http://www.w3.org/2007/ont/httph#), which I think is more or less
> equivalent to your http: ontology.
> It is defined to have one predicate for any header in an HTTP message
> -- only a few are documented in the ontology file explicitly.  So you
> can write rules and code generally. There is a separate
> http://www.w3.org/2007/ont/http# (no trailing 'h') for the HTTP
> framework ontology.
>
> Dhttp:hasGetReply    rdfs:subclassof  Thttp:request.  # Maybe

Yes, those ontologies by David Sheets are pretty much equivalent to the first parts of the http ontology that I provided, though of course mine goes farther by explicitly defining properties for more headers, and defining some of the GET and response semantics.  I can switch to using David Sheets' ontologies if you think that would be better, or I can indicate the relationships between mine and David Sheets'.  For the moment I've added comments that show the relationships between mine and his.  Do you have a preference for how I should handle this?

>
> _______________________________________
>
> ** General comment on datatypes:   Most people use untyped strings in
> RDF and N3.

Sounds good to me: less work.  :)  I've made this change.

>
> _______________________________________
>
> ** Should the HTTP status code be a string or an int? Cleaner to make
> it an integer, I think.
>
>         ?reply http:status 200.
> rather than
>         ?reply http:hasStatusCode "200"^^xsd:string .
>
> so that you can do integer comparisons.

Yes, good point.  I've made this change.
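
As a small illustration of why an integer status helps, here is a minimal Python sketch (not part of the rules; the function name `is_redirect` is a hypothetical helper). A range test such as "any 3xx" is natural on integers, while a string-typed status has to be converted first.

```python
def is_redirect(status: int) -> bool:
    """Return True for any 3xx status; the range check relies on integers."""
    return 300 <= status < 400


# Integer status, as in `?reply http:status 200 .`
print(is_redirect(301))  # True
print(is_redirect(200))  # False

# A string status, as in "301"^^xsd:string, needs conversion before comparing.
print(is_redirect(int("303")))  # True
```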

>
> _______________________________________
>
> ** Minor: I suggest you separate the rules themselves from the
> debugging bits.

Not sure how to do that, but I have commented them out.  Is there a common way to see what rules have fired without adding debugging assertions to the rule?

>
> _______________________________________
>
> ** Major.  You say:
>
> {
>          ?u1 a xsd:anyURI .                      # Old URI
>          ?r1 uri:hasURI ?u1 .
>          ?u2 a xsd:anyURI .                      # New URI
>          ?r2 uri:hasURI ?u2 .
>          ?u1 http:hasGetReply ?reply1 .          # IF ?u1 derefs to ?reply1
>          ?reply1 http:hasStatusCode "301"^^xsd:string .  # ... with 301 status
>          ?reply1 http:hasLocation ?u2 .          # ... and new URI ?u2
> } => {                                          # THEN they denote
>          ?r1 = ?r2 .                             # ... the same thing.
>          } .
>
> I don't think this is correct.   After much thought.   I think we need
> a "same work as".
>
> r1 and r2 can be for example the current front page of the NYTimes and
> a permalink (as they say)  for the same page.  If you assert =
> (owl:sameAs) then anything which applies to one applies to the other.
> This includes, for example, ?r1 uri:hasURI ?u2.   I think it
> includes a lot of things one would expect to be the same, like access
> control and copyright and authorship etc .. so "Same Work As" is
> useful.  But other things are not the same   ?r1 and ?r2  may be
> content negotiated, so one is more generic than the other, for
> example, as some people do conneg on a redirect.  So some of
> the http://www.w3.org/2006/gen/ont#
>   ontology may apply between them.
>
> Because of the possibility of conneg, it is tricky to deduce many
> things.
>
> The tabulator at the moment classes as a TextDocument anything which
> has any http:getReply (through 301, 303, 307) of content-type text/*,
> and similarly for image/* assumes it is a foaf:Image.  This results in
> a different user interface, and a different icon.
>
>
> You say, pointing to some issues, that "This helps explain why these
> HTTP rules are written in
> # terms of URIs rather than awww:InformationResources."  Indeed.  You
> do have to talk about URIs for these rules.  But most of the data
> about the resources will not. And the axiom
> { ?x = ?y.  ?x ?p ?z } =>  { ?y ?p ?z }  is too strong for things
> connected by indirection.

These are very interesting comments -- more than I can address at this late hour of the night.  :)  I'll come back to them when I have more time.
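
As a rough illustration of the concern about "=" raised above, the following Python sketch applies the substitution axiom { ?x = ?y.  ?x ?p ?z } => { ?y ?p ?z } to a tiny set of triples. The tuple representation and the function name `apply_same_as` are assumptions made for illustration only; this is not how cwm evaluates rules.

```python
def apply_same_as(triples):
    """Apply { ?x = ?y . ?x ?p ?z } => { ?y ?p ?z } once, in both directions."""
    same = {(s, o) for s, p, o in triples if p == "="}
    same |= {(o, s) for s, o in same}  # treat "=" as symmetric
    inferred = set(triples)
    for x, y in same:
        for s, p, o in triples:
            if s == x and p != "=":
                inferred.add((y, p, o))
    return inferred


triples = {
    ("r1", "uri:hasURI", "u1"),
    ("r2", "uri:hasURI", "u2"),
    ("r1", "=", "r2"),  # the conclusion drawn by the 301 rule
}

result = apply_same_as(triples)
# The front page now also "has" the permalink's URI, and vice versa.
print(("r1", "uri:hasURI", "u2") in result)  # True
print(("r2", "uri:hasURI", "u1") in result)  # True
```

Every property of one resource is copied to the other, which is the behaviour described as too strong for things connected by indirection.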

>
> _______________________________________
>
> ** Your definition of hasGetReply applies to all the replies in the
> chain.  It is useful I think to have a successfulGetReply which is
> limited to the 200 case.

That sounds like it would be defined as:

  {
  ?u http:hasDirectGetReply ?reply .
  ?reply http:hasStatus 200 .
  } => { ?u http:hasSuccessfulGetReply ?reply .  } .

Right?
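
Read procedurally, the rule above is a filter over direct replies. A minimal Python sketch of that reading follows; the `Reply` type and the function name `successful_get_replies` are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Reply(NamedTuple):
    uri: str     # the URI that was dereferenced
    status: int  # HTTP status code, as an integer


def successful_get_replies(direct_replies):
    """Keep only the direct replies whose status is exactly 200."""
    return [r for r in direct_replies if r.status == 200]


# A hypothetical redirect chain: a 301 followed by the final 200.
chain = [Reply("http://example/a", 301), Reply("http://example/b", 200)]
print(successful_get_replies(chain))
```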

>
>
> _______________________________________
>
> **  Connections with cwm:  decl:parsesTo is something cwm is missing,
> though it has an n3 parser explicitly.   Cwm's log:content connects a
> resource to the entity body, but is weak as it loses the mime type.
> Possible cwm improvement there.

Yes, that would be nice.

>
> _______________________________________
>
> ** "Declaration" is an interesting term.   I can go with that.  I
> don't understand though "(i.e., it does not indicate  the formula that
> constitutes the declaration)".  ?fformula is the formula, no?

Oops!  That was an editorial error.  Fixed now.

>
> Do we also allow passage through rdfs:seeAlso?

No, as I think that corresponds to providing what I call ancillary assertions rather than core assertions, as described here:
http://dbooth.org/2007/uri-decl/#ancillary
But I have added a rule for rdfs:isDefinedBy, which I think *does* indicate core assertions for a URI declaration.

>
> {       ?r log:uri ?u.
>         ?u  decl:hasDeclaration ?f.
>         ?f log:includes {  ?r rdfs:seeAlso ?r2 }.
> }  =>  {
>         ?u decl:hasDeclaration ?r2 .   # ?
> }
>
> (Tabulator does effectively, in following its nose.)
> Or we could make a superproperty of hasDeclaration.
>
> _______________________________________
>
> ** You say, "The idea is that if we have u1 --303--> u2 --303--> u3,
> then u2 is treated as authoritative  for URI declaration of u1, and u3
> is treated as authoritative for  URI declaration of u2, but u3 is NOT
> treated as authoritative for  URI declaration of u1. I am not certain
> that this is the right  choice  "
>
> Absolutely. I think this is right.  303 separates a thing from a
> document about a thing.  While it makes perfect sense to have more
> than one 303, the semantics are not at all transitive.  An example of
> when one might 303 the URI of a document might be when the URI
> identifies a huge database and you want to respond with some RDF which
> explains to the client how to access the data through a SPARQL
> endpoint, or how to navigate the database table by table.

Good.
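
The non-transitive reading of 303 can be sketched in Python as a single-hop lookup. The dictionary of redirects and the function name `authoritative_for` are assumptions for illustration, not part of the rules.

```python
# Hypothetical observed 303 redirects: source URI -> Location URI.
see_other = {
    "http://example/u1": "http://example/u2",
    "http://example/u2": "http://example/u3",
}


def authoritative_for(uri, redirects):
    """Return the URI treated as authoritative for declaring `uri`.

    Exactly one 303 hop is followed; the chain is deliberately NOT
    followed transitively. Returns None if `uri` has no 303 redirect.
    """
    return redirects.get(uri)


print(authoritative_for("http://example/u1", see_other))  # http://example/u2
print(authoritative_for("http://example/u2", see_other))  # http://example/u3
```

Under this sketch u3 is never returned for u1, matching the choice that u3 is not authoritative for the URI declaration of u1.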

>
> _______________________________________
>
> ** You say, "For example, if dereferencing
> # http://example/foo yields a 200 response with RDF/N3 content
> # that parses to an n3 formula (i.e., a set of RDF assertions),
> # then the rules for URI declaration will not automatically
> # require everyone who writes that URI to accept those assertions."
>
> This suggests that if there *is* a declaration, then anyone using the
> URI *is* assumed to accept the declaration.   This makes sense, but
> attempts to delimit the commitment I have seen to date have found it
> hard.  Do you also commit to the declarations all the URIs used as
> predicates and types in a declaration, recursively (I call this
> "ontological closure")?  (effectively, you have to or nothing works)
> Do you commit to ANY URI used in a declaration, even as subject or
> object? (No, or you pull in the whole GGG).  I think in practice there
> will be times when people use URIs but in fact disagree with things in
> the ontological closure. Could be this is best described as an error,
> possibly uncaught.

Interesting comments.  I'll have to chew on this when I can think more clearly.

>
> _______________________________________
>
> ** Minor:  "Properties and rule for media type: text/rdf+n3. Not sure
> if this media type is registered yet, but  TimBL suggests it here..."
> During discussion around the registration process, there was strong
> push for changing this to text/n3.

Okay, I've changed it to text/n3.

>
> _______________________________________
>
> Conclusion
>
> The rules look like a great start.  It would be interesting to build
> HTTP out of TCP in fact, as well, but another project.
>
> It would be useful I think to try to find as much overlap between
> these terms and the ones the Tabulator uses.   I should make a
> function to dump the tabulator internal store so that one can grab
> real-life situations and run the rules on them.
>
> Also cwm could be expanded to put in the extra functions you need,
> like parsesTo.
>
> Tim



David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software

Opinions expressed herein are those of the author and do not represent the official views of HP unless explicitly stated otherwise.
Received on Thursday, 28 February 2008 06:57:51 UTC