Re: Is it best practices to use a rdfs:seeAlso link to a potentially multimegabyte PDF?, existing predicate for linking to PDF? from Tim Berners-Lee on 2011-01-13 (public-lod@w3.org from January 2011)

From: Tim Berners-Lee <timbl@w3.org>
Date: Thu, 13 Jan 2011 08:02:44 -0500
To: Dave Reynolds <dave.e.reynolds@gmail.com>
Cc: Nathan Rixham <nathan@webr3.org>, Linked Open Data <public-lod@w3.org>, Ivan Herman <ivan@w3.org>, Thomas Roessler <tlr@w3.org>, Dan Brickley <danbri@danbri.org>
Message-Id: <BC5C34B9-3ED2-4E82-8EDE-CD7C16F63981@w3.org>

On 2011-01-13, at 07:23, Dave Reynolds wrote:
> 
> Where is the spec for this "engineered protocol" and where in that spec
> does it redefine rdfs:seeAlso?
> 
> [I believe I have reasonably decent understanding of, and experience
> with, linked data. It is a useful set of conventions and practices
> building on some underlying formal specifications. However, I'm not
> aware of those practices being so universally agreed and formally
> codified as to justify some of the claims being made in this thread.]
> 
> Dave
> 

Well, I looked.

The current spec, http://xmlns.com/foaf/spec/, says (my emphasis):

"Perhaps the most important use of knows is, *alongside the rdfs:seeAlso property*, to connect FOAF files together. Taken alone, a FOAF file is somewhat dull. But linked in with 1000s of other FOAF files it becomes more interesting, with each FOAF file saying a little more about people, places, documents, things... By mentioning other people (via knows or other relationships), *and by providing an rdfs:seeAlso link to their FOAF file*, you can make it easy for FOAF indexing tools ('scutters') to find your FOAF and the FOAF of the people you've mentioned. And the FOAF of the people they mention, and so on. This makes it possible to build FOAF aggregators without the need for a centrally managed directory of FOAF files..."

So here is seeAlso being set up as the target for crawlers.
This clearly breaks when rdfs:seeAlso links go to a big PDF file.

The old FOAF spec is not easy to find. The previous-version links go back to
http://xmlns.com/foaf/spec/20050403.html
which carries the notice Copyright © 2000-2004 Dan Brickley and Libby Miller.
This is all in the new flavor, in which the document is generated from small documents,
one about each term in the vocabulary, and so it does not have the general description
in which the seeAlso stuff was.

The RDFS spec
http://www.w3.org/TR/rdf-schema/#ch_seealso of course doesn't define it at all.

On the client side, the first Tabulator paper in 2006
http://swui.semanticweb.org/swui06/papers/Berners-Lee/Berners-Lee.pdf
has a section 4.2 

"4.2	What to dereference

The Tabulator automatically and recursively loads the ontology file for any term used as a predicate or type (object of rdf:type), recursively (ontological closure).
Here we consider a user is browsing or querying information about a subject x. When the user opens up a tab asking for information on x, or a query is being resolved and x is the subject or object of a statement in the query pattern, or is x is bound to a variable during the query, then x is looked up. Looking up currently involves:

– looking up the URI of x itself, and also
– looking up any y where the store includes the fact that { x rdfs:seeAlso y}.

The latter is necessary for the Friend of A Friend (FOAF) conventions. It is currently not widely used elsewhere. It can, be useful, however, to allow a third party to point out that information is available, when the owner of the URI itself has not, for whatever reason, included that information when x is dereferenced. We implemented the dereferencing of URIs using the HTTP protocol. Successful dereferencing of an HTTP URI gives a status code and either a redirection (status 300-303) or (status 200) a representation consisting of content bytes and metadata."
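
As a rough illustration of the two lookup steps quoted above, here is a minimal sketch in Python. It is not the Tabulator's code: the function name uris_to_look_up, the representation of the store as a plain list of (subject, predicate, object) string tuples, and the example.org URIs are all assumptions made for the example. It only computes which URIs would be looked up; it does no HTTP.

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def uris_to_look_up(x, store):
    """Return the URIs a client would dereference when asked about x.

    `store` is a hypothetical in-memory store: a list of
    (subject, predicate, object) tuples of plain strings.
    """
    targets = [x]  # first, the URI of x itself
    for s, p, o in store:
        # also any y where the store includes { x rdfs:seeAlso y }
        if s == x and p == RDFS_SEE_ALSO and o not in targets:
            targets.append(o)
    return targets


# Made-up example data, for illustration only.
store = [
    ("http://example.org/alice#me", RDFS_SEE_ALSO,
     "http://example.org/alice/more.rdf"),
    ("http://example.org/alice#me", "http://xmlns.com/foaf/0.1/knows",
     "http://example.org/bob#me"),
    ("http://example.org/bob#me", RDFS_SEE_ALSO,
     "http://example.org/bob/foaf.rdf"),
]

print(uris_to_look_up("http://example.org/alice#me", store))
```

Note that the sketch follows seeAlso one step only, for the subject being asked about; whatever y points to is fetched as data regardless of what it turns out to be, which is exactly where a large PDF would hurt.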

Since then, I have certainly used seeAlso as a way to distribute information between several files,
to be able to have a read-only file pull in a machine-editable file, and so on.

I suspect other client libraries have used this algorithm. 

At the time you certainly couldn't navigate the FOAF graph, which, remember, was then
the leading and dominant linked data graph, without that algorithm.
It may be that things have changed and we should turn off seeAlso handling in our clients
and see what breaks. Or better, someone could crawl a significant chunk of the
FOAF graph (not all one site!) to see how many of the links out there rely on it.
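
One way such a measurement might look, sketched in Python under stated assumptions: the crawled data is already in memory as (subject, predicate, object) string tuples, blank nodes are written as "_:" labels, and a foaf:knows link is counted as "relying on seeAlso" when its target has no HTTP URI of its own but does carry an rdfs:seeAlso. The function name see_also_reliance and the sample data are made up for the example; a real crawl would need a proper RDF parser and a more careful definition.

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"


def see_also_reliance(triples):
    """Return (relying, total) over the distinct targets of foaf:knows.

    `relying` counts targets that cannot be dereferenced themselves
    (no http/https URI, e.g. a blank node) but do have an rdfs:seeAlso.
    """
    has_see_also = {s for s, p, o in triples if p == RDFS_SEE_ALSO}
    known = {o for s, p, o in triples if p == FOAF_KNOWS}
    relying = {
        node for node in known
        if not node.startswith(("http://", "https://")) and node in has_see_also
    }
    return len(relying), len(known)


# Made-up sample: one person known only as a blank node with a seeAlso,
# one person known by a dereferenceable URI.
sample = [
    ("http://example.org/alice#me", FOAF_KNOWS, "_:b1"),
    ("_:b1", RDFS_SEE_ALSO, "http://example.org/bob/foaf.rdf"),
    ("http://example.org/alice#me", FOAF_KNOWS, "http://example.org/carol#me"),
]

relying, total = see_also_reliance(sample)
print(f"{relying} of {total} foaf:knows links rely on rdfs:seeAlso")
```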

If the number is small, then we could migrate the clients to a new property,
something like <#> link:pleaseLoadData <y>.

Or maybe FOAF people know the answer to this question.

Tim

Received on Thursday, 13 January 2011 13:02:49 UTC