URIs, used in RDF, that do not have associated documentation from Jonathan A Rees on 2012-03-26 (www-tag@w3.org from March 2012)

From: Jonathan A Rees <rees@mumble.net>
Date: Mon, 26 Mar 2012 14:03:49 -0400
To: www-tag@w3.org
Message-ID: <CAGnGFMLdxtbDksgXWpcsd-DrvY2k9e+A6yfxZczyMC52YzAhQg@mail.gmail.com>
The question arises often: What are examples of RDF in the wild, where
URIs are used that do not have associated documentation (i.e. RDF that
tells you what the URI refers to)?  That is, what are some situations
where the httpRange-14(a) rule might apply in practice - where linked
data meets the non-RDF Web, so to speak?

Remember that I've stated my dismay that httpRange-14(a) says "is an
information resource" rather than addressing the ambiguity mentioned
in Fielding's email (as illustrated by the Flickr and Jamendo cases).
httpRange-14(a) as written doesn't really help, except that its
authors and nearly everyone else have interpreted it to resolve the
ambiguity in a particular way - that the URI refers to what you
retrieve (generically if you will), not to what is described by what
you retrieve. This interpretation *has* been helpful because it lets
you use these RDF-less URIs in RDF and be understood. That the
resolution didn't say what was meant was a colossal screwup IMO. But
let's set that aside and just look at the question.
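
For reference, here is a minimal sketch of what the resolution literally licenses a client to conclude from the status code of a GET. The function name and the returned strings are made up for illustration; clauses (b) and (c) are quoted from memory of the TAG resolution, not from anything above. Note that the (a) branch only says "information resource". The helpful reading, that the URI refers to what you retrieve, is not in the text at all.

```python
def httprange14_reading(status: int) -> str:
    """Return what the httpRange-14 resolution says for a GET status code.

    Hypothetical helper: the strings are paraphrases, not normative wording.
    """
    if 200 <= status <= 299:
        # Clause (a): a 2xx response means the resource is an information resource.
        return "information resource"
    if status == 303:
        # Clause (b): 303 See Other means the resource could be anything.
        return "could be any resource"
    if 400 <= status <= 499:
        # Clause (c): a 4xx response says nothing about the resource's nature.
        return "nature unknown"
    # Other status codes are not addressed by the resolution.
    return "not covered by the resolution"


if __name__ == "__main__":
    for code in (200, 303, 404, 500):
        print(code, "->", httprange14_reading(code))
```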

I don't have the tools on hand to answer this very satisfactorily. I
hope someone with access to good infrastructure will study this
question. I will just give the examples that come to me off the top of
my head.

If you look at, say,
http://dbpedia.org/page/Paris,
you find many RDF statements in which the object of the statement is
given as a URI for which there is no descriptive RDF. Most notably we
have the target of the foaf:page relation, but also thumbnail,
wikiPageExternalLink, website, etc. Since this is true of every dbpedia
page, we immediately have quite a few such URI occurrences. If this use
of URIs were called into question, then dbpedia would have quite a bit
of rewriting to do.
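
A rough way to look for such occurrences in a batch of triples is sketched below. It rests on a loud assumption: "no descriptive RDF" is approximated as "never appears as a subject in the same batch", which is weaker than actually dereferencing the URI. The triples are hand-written stand-ins, not real dbpedia output, and the Wikipedia URL is my guess at a typical foaf:page target.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical triples (subject, predicate, object); literals are quoted strings.
TRIPLES = [
    ("http://dbpedia.org/resource/Paris", FOAF + "page",
     "http://en.wikipedia.org/wiki/Paris"),
    ("http://dbpedia.org/resource/Paris", "http://dbpedia.org/ontology/country",
     "http://dbpedia.org/resource/France"),
    ("http://dbpedia.org/resource/France", RDFS + "label", '"France"'),
]


def undescribed_objects(triples):
    """Return object URIs that never occur as the subject of any triple."""
    subjects = {s for s, _, _ in triples}
    return sorted({
        o for _, _, o in triples
        # Skip literals; keep only http(s) URIs with no statements about them.
        if o.startswith(("http://", "https://")) and o not in subjects
    })


if __name__ == "__main__":
    for uri in undescribed_objects(TRIPLES):
        print(uri)
```

With the sample data, only the foaf:page target comes back, since France is described by its label triple.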

Any FOAF page that has homepage, publications, etc. (where the target
lacks its own RDF, which is the normal case) would be affected.

License assertions are affected. It turns out the CC licenses have
embedded RDFa that could be taken as documentation of the license URI;
you might argue that it is this RDFa, not httpRange-14(a), that sort of
implies that the URI refers to the license retrieved from the URI. But
I don't buy this. The RDFa doesn't really provide enough
information to say it's the retrieved license itself, as opposed to
some other resource. That is, there's no way to distinguish the
license case from the Flickr case, so the fact that there is RDFa
doesn't really help. It's httpRange-14(a) and the (poorly justified)
resulting assumption that URIs generally refer to what is retrieved
that makes this work, not the metadata.

I could hunt around for uses of Dublin Core metadata where the subject
of the statements has no accompanying descriptive RDF. I'm sure
they're out there. Remember that this was one of the first use cases
for RDF.

POWDER should be similar to Dublin Core but I have no pointers to
POWDER deployment. (POWDER's predecessors were *the* first RDF use
case, if I understand the history correctly.)

Any use of the RDFS vocabulary is going to be full of such URIs. Look
at the target of almost any rdfs:isDefinedBy assertion and the target URI
will usually fail to have descriptive RDF. E.g. see
http://www.w3.org/1999/02/22-rdf-syntax-ns - it has a bunch of these
assertions.

I would be willing to bet that any URI used to name an RDF graph - as
one finds in, say, SPARQL - has no adequately descriptive RDF. The assumption is
always that the URI refers to the graph that a retrieved
representation serializes. (The graph/serialization confusion is a
wart, but the graph is similar enough to its serializations that I'm
willing to overlook the sloppiness here. In any case, that confusion
is a *different* question: there is no descriptive RDF for the URI, so
this is an example.)
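
To make the graph-name case concrete, here is a small sketch that models a dataset as a mapping from graph URI to triples and lists the graph names that nothing in the dataset says anything about. Everything in it is an assumption for illustration: the example.org URIs, the dict layout, and the shortcut of treating "appears as a subject somewhere in the dataset" as "has descriptive RDF".

```python
# Hypothetical dataset: graph name -> list of (subject, predicate, object).
DATASET = {
    "http://example.org/graphs/people": [
        ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", '"Alice"'),
    ],
    "http://example.org/graphs/meta": [
        # The only statement anywhere whose subject is a graph name.
        ("http://example.org/graphs/catalog",
         "http://purl.org/dc/terms/title", '"Catalog"'),
    ],
    "http://example.org/graphs/catalog": [],
}


def undescribed_graph_names(dataset):
    """Return graph names that are not the subject of any triple in any graph."""
    described = {s for triples in dataset.values() for s, _, _ in triples}
    return sorted(name for name in dataset if name not in described)


if __name__ == "__main__":
    for name in undescribed_graph_names(DATASET):
        print(name)
```

Here two of the three graph names have no statements about them, and a consumer is left with the usual assumption that each name refers to the graph you would get by retrieving it.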

Additional examples welcome in followup.

I am not aware of any instance (in the URI-refers-to-what-you-GET
case) where embedded RDF (usually RDFa), even when provided, is specific
enough to rule out the possibility that the URI refers to some
resource other than the one whose instances are retrieved. Assuming
that the URI refers to what is retrieved is usually reasonable, but any
change that says you should not make such assumptions is going to be
disruptive. It would be better, IMO, to codify the assumption
(which is currently not written down anywhere) somehow, than to negate
it.

Many of the change proposals that have come in, such as Jeni's (I
think), do *not* say that you should not assume that these URIs refer
to what is accessed. (Reread that if you have to.) But some of them
do, and many raise the spectre of the ISSUE-14 screwup, and I'm
disappointed no one's trying to fix it. (Of course I didn't ask people to
fix it, so that's my screwup.) (And I haven't read all submissions in
detail so maybe someone does propose to fix it and I haven't
discovered that yet.)

I am certainly sympathetic to the arguments against this phenomenon in
principle. Most people just write these URIs without much thought as
to what they refer to or why, and there could be cases where the
intended meaning is not correctly expressed or understood. They don't
think about content negotiation or change over time or the possibility
that the URI might be interpreted as referring to what is described
rather than the description (or whatever is retrieved) or that the
"URI owner" might think the URI refers to something else. Amazingly
the overall result is, in my opinion, quite consistent and useful, in
spite of the opportunities for failure.

Jonathan
Received on Monday, 26 March 2012 18:04:17 UTC