Re: Inference rules for HTTP, etc. from Tim Berners-Lee on 2008-02-26 (public-awwsw@w3.org from February 2008)

From: Tim Berners-Lee <timbl@w3.org>
Date: Mon, 25 Feb 2008 18:18:21 -0800
To: "Booth, David (HP Software - Boston)" <dbooth@hp.com>
Cc: "public-awwsw@w3.org" <public-awwsw@w3.org>
Message-Id: <F59061E8-61FC-4519-8694-7258EC53F160@w3.org>
David,

Comments on the rules:
http://esw.w3.org/topic/AwwswDboothsRules#preview as of Mon Feb 25  
09:06:15 EST 2008

You ask for email, rather than annotation of the wiki, even though you  
used a wiki.
I have 14 comments, each preceded by "**".

_______________________________________

** Minor:  [uri:hasURI ] This property should be asserted explicitly
-- it is NOT inferred.  Well, for anything in the system identified
by a URI, one URI can be inferred for it.
Actually the "this should/should not be inferred" distinction is a
funny one which I don't find useful.
Anything can be asserted, whether or not axioms exist which allow it  
to be inferred.

_______________________________________

** Minor:  The practice of putting "has" on every predicate is one I
don't like; I prefer just "location".   This reads better in N3, makes
a better UI in Tabulator, etc.   (N3 allows you to add "has" as an
ignored keyword if you have to, but people don't.) Also, for the HTTP
headers it is neat to just use the header names exactly.

_______________________________________

** There is the Tabulator's httph: ontology
(http://www.w3.org/2007/ont/httph#), which I think is more or less
equivalent to your http: ontology.  It is defined to have one predicate
for any header in an HTTP message -- only a few are documented in the
ontology file explicitly.  So you can write rules and code generally.
There is a separate http://www.w3.org/2007/ont/http# (no trailing 'h')
for the HTTP framework ontology.

Dhttp:hasGetReply    rdfs:subClassOf  Thttp:request.  # Maybe

_______________________________________

** General comment on datatypes:   Most people use untyped strings in  
RDF and N3.

_______________________________________

** Should the HTTP status code be a string or an int? Cleaner to make  
it an integer, I think.

	?reply http:status 200.
rather than
	?reply http:hasStatusCode "200"^^xsd:string .

so that you can do integer comparisons.
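
A minimal sketch of such a comparison, assuming cwm's math: built-ins
and assuming that http: here is bound to the framework namespace
http://www.w3.org/2007/ont/http#.  The class ex:RedirectReply and the
ex: namespace are made up purely for illustration:

```n3
@prefix http: <http://www.w3.org/2007/ont/http#> .   # assumed binding
@prefix math: <http://www.w3.org/2000/10/swap/math#> .
@prefix ex:   <http://example.org/sketch#> .          # hypothetical namespace

# Any reply whose integer status lies in the range 300..399 is
# classed as a redirect.
{
    ?reply http:status ?code .
    ?code math:notLessThan 300 .
    ?code math:lessThan 400 .
} => {
    ?reply a ex:RedirectReply .
} .
```

With an integer object, one range test covers 301, 303 and 307 without
listing each code as a separate literal.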

_______________________________________

** Minor: I suggest you separate the rules themselves from the  
debugging bits.

_______________________________________

** Major.  You say:

{
         ?u1 a xsd:anyURI .                      # Old URI
         ?r1 uri:hasURI ?u1 .
         ?u2 a xsd:anyURI .                      # New URI
         ?r2 uri:hasURI ?u2 .
         ?u1 http:hasGetReply ?reply1 .          # IF ?u1 derefs to ?reply1
         ?reply1 http:hasStatusCode "301"^^xsd:string .  # ... with 301 status
         ?reply1 http:hasLocation ?u2 .          # ... and new URI ?u2
} => {                                          # THEN they denote
         ?r1 = ?r2 .                             # ... the same thing.
         } .

I don't think this is correct, after much thought.  I think we need
a "same work as".

r1 and r2 can be, for example, the current front page of the NYTimes
and a permalink (as they say) for the same page.  If you assert =
(owl:sameAs) then anything which applies to one applies to the other.
This includes, for example, ?r1 uri:hasURI ?u2.  I think it includes
a lot of things one would expect to be the same, like access control
and copyright and authorship etc., so "Same Work As" is useful.  But
other things are not the same: ?r1 and ?r2 may be content negotiated,
so one is more generic than the other, for example, as some people do
conneg on a redirect.  So some of the
http://www.w3.org/2006/gen/ont# ontology may apply between them.
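
A rough sketch of what a weaker "same work as" rule might look like.
The property ex:sameWorkAs and the ex: namespace are hypothetical, the
choice of dc:creator to stand for authorship is an assumption, and
http: and uri: are taken to be declared as in the rule quoted above:

```n3
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dc:  <http://purl.org/dc/elements/1.1/> .
@prefix ex:  <http://example.org/sketch#> .   # hypothetical namespace
# http: and uri: are assumed to be declared as in the rule quoted above.

# A 301 links the two resources by a weaker relation than "=".
{
    ?r1 uri:hasURI ?u1 .
    ?r2 uri:hasURI ?u2 .
    ?u1 http:hasGetReply ?reply1 .
    ?reply1 http:hasStatusCode "301"^^xsd:string .
    ?reply1 http:hasLocation ?u2 .
} => {
    ?r1 ex:sameWorkAs ?r2 .
} .

# Only properties chosen explicitly carry across, here authorship.
{ ?r1 ex:sameWorkAs ?r2 . ?r1 dc:creator ?who } => { ?r2 dc:creator ?who } .
```

Unlike "=", nothing in this sketch lets ?r1 uri:hasURI ?u2 be concluded;
each property that should be shared needs its own rule.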

Because of the possibility of conneg, it is tricky to deduce many  
things.

The tabulator at the moment classes as a TextDocument anything which
has any http:getReply (through 301, 303, 307) of content-type text/*,
and similarly for image/* assumes it is a foaf:Image.  This results in
a different user interface, and a different icon.


You say, pointing to some issues, that "This helps explain why these  
HTTP rules are written in
# terms of URIs rather than awww:InformationResources."  Indeed.  You  
do have to talk about URIs for these rules.  But most of the data  
about the resources will not. And the axiom
{ ?x = ?y.  ?x ?p ?z } =>  { ?y ?p ?z }  is too strong for things  
connected by indirection.

_______________________________________

** Your definition of hasGetReply applies to all the replies in the
chain.  It is useful I think to have a successfulGetReply which is
limited to the 200 case.
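
One way this might be written, as a sketch only.  The property
ex:successfulGetReply and the ex: namespace are hypothetical, and http:
is taken to be declared as in the rules under discussion, with the
status code still a string as those rules have it:

```n3
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/sketch#> .   # hypothetical namespace
# http: is assumed to be declared as in the rules under discussion.

# Keep only those replies in the chain that carry a 200 status.
{
    ?u http:hasGetReply ?reply .
    ?reply http:hasStatusCode "200"^^xsd:string .
} => {
    ?u ex:successfulGetReply ?reply .
} .
```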


_______________________________________

**  Connections with cwm:  decl:parsesTo is something cwm is missing,
though it has an n3 parser explicitly.   Cwm's log:content connects a
resource to the entity body, but is weak as it loses the mime type.
Possible cwm improvement there.

_______________________________________

** "Declaration" is an interesting term.   I can go with that.  I  
don't understand though "(i.e., it does not indicate  the formula that  
constitutes the declaration)".  ?fformula is the formula, no?

Do we also allow passage through rdfs:seeAlso?

{ 	?r log:uri ?u.
	?u  decl:hasDeclaration ?f.
	?f log:includes {  ?r rdfs:seeAlso ?r2 }.
}  =>  {
	?u decl:hasDeclaration ?r2 .   # ?
} .

(Tabulator does effectively, in following its nose.)
Or we could make a superproperty of hasDeclaration.

_______________________________________

** You say, "The idea is that if we have u1 --303--> u2 --303--> u3,
then u2 is treated as authoritative for URI declaration of u1, and u3
is treated as authoritative for URI declaration of u2, but u3 is NOT
treated as authoritative for URI declaration of u1. I am not certain
that this is the right choice"

Absolutely. I think this is right.  303 separates a thing from a
document about a thing.  While it makes perfect sense to have more
than one 303, the semantics are not at all transitive.  An example of
when one might 303 the URI of a document might be when the URI
identifies a huge database and you want to respond with some RDF which
explains to the client how to access the data through a SPARQL
endpoint, or how to navigate the database table by table.

_______________________________________

** You say, "For example, if dereferencing
# http://example/foo yields a 200 response with RDF/N3 content
# that parses to an n3 formula (i.e., a set of RDF assertions),
# then the rules for URI declaration will not automatically
# require everyone who writes that URI to accept those assertions."

This suggests that if there *is* a declaration, then anyone using the
URI *is* assumed to accept the declaration.   This makes sense, but
the attempts to delimit the commitment that I have seen to date have
found it hard.  Do you also commit to the declarations of all the URIs
used as predicates and types in a declaration, recursively (I call this
"ontological closure")?  (Effectively, you have to or nothing works.)
Do you commit to ANY URI used in a declaration, even as subject or
object? (No, or you pull in the whole GGG.)  I think in practice there
will be times when people use URIs but in fact disagree with things in
the ontological closure. It could be that this is best described as an
error, possibly uncaught.

_______________________________________

** Minor:  "Properties and rule for media type: text/rdf+n3. Not sure
if this media type is registered yet, but  TimBL suggests it here..."
During discussion around the registration process, there was a strong
push for changing this to text/n3.

_______________________________________

Conclusion

The rules look like a great start.  It would be interesting to build  
HTTP out of TCP in fact, as well, but another project.

It would be useful I think to try to find as much overlap between  
these terms and the ones the Tabulator uses.   I should make a  
function to dump the tabulator internal store so that one can grab  
real-life situations and run the rules on them.

Also cwm could be expanded to put in the extra functions you need,  
like parsesTo.

Tim

On 2008-02-23, at 23:33, Booth, David (HP Software - Boston) wrote:

>
> I have gotten a set of rules working, for making inferences based on  
> HTTP interactions:
> http://esw.w3.org/topic/AwwswDboothsRules
> There is also some test data.
>
> Among other things, these rules show:
>
> - how a URI is declared via "follow your nose" dereferencing;
>
> - both hash URIs and hashless (303 style) URIs; and
>
> - how two URIs can denote the same resource (in this case an  
> awww:InformationResource), but when dereferenced return different  
> responses (one returning an HTTP 301 status, the other returning a  
> 200 OK).
>
> Please take a look, try them out, make comments, etc.  I haven't  
> minted real URIs for the namespaces -- I'm just using
> http://example/... URIs -- but if people want I can mint URIs for them.
>
> Thanks
>
> David Booth, Ph.D.
> HP Software
> +1 617 629 8881 office  |  dbooth@hp.com
> http://www.hp.com/go/software
>
> Opinions expressed herein are those of the author and do not  
> represent the official views of HP unless explicitly stated otherwise.
Received on Tuesday, 26 February 2008 02:21:53 UTC