The RDF Approach to Indicating Language-In-Use

One issue jumped out at me during the last meeting, illuminating
perhaps the central difference between Bijan's view and Tim's view.
It was when Jim Hendler asked how one knew if an RDF document was
supposed to be understood as an OWL document.  Bijan said to use
application-specific logic [1].

Ironically, this mirrors exactly an approach Bijan and I were jokingly
advocating the night before (in response to bugs arising from
content-negotiation) that web clients should really just ignore the
Content-Type header, since it's so often wrong.  [2]

The issue in both cases is this: how is a message receiver to know
which language a message is written in?  It may have different
meanings to the receiver, or be erroneous, depending on its language.
There is a TAG finding on this issue in terms of Content-Types, which
says: 

      "It is a serious error for the response body to be inconsistent
      with the assertions made about it by the MIME headers. Web
      software SHOULD NOT attempt to recover from such errors by
      guessing, but SHOULD report the error to the user to allow
      intelligent corrective action."  [3] 

It says such behavior can be "dangerous", without quite supporting
that claim to my satisfaction, although I happen to agree.

The question here (unless people want to stop and debate that
finding) is how to achieve in RDF this functionality that mail and
web systems achieve via Content-Type headers.  In RDF, we can't use
header fields, unless we can figure out some way to have them survive
a graph merge, which seems unlikely.  To rephrase: how is some
software which is trying to act on information it receives as
"application/rdf+xml", supposed to know whether it's looking at some
OWL DL, some OWL Full, or some evil Anti-OWL where every URI means
essentially the opposite of what it means in OWL Full?

I think Bijan is suggesting that systems need to work this out on a
case-by-case basis.  That doesn't scale or support ad-hoc
interoperation like we want; it's just the fallback if we can't come
up with anything better.  I think he is advocating falling back now
out of concern that in trying to address this problem we'll end up
creating possibly bigger ones, such as by mandating "strong
ontological commitment".

I think Tim is suggesting that RDF should work just like mail and web
content, except (1) using URIs instead of a central registry, and (2)
using every URI in the content as if it were, in essence, another
content-type value.

To review the effect of Content-Type labeling: a hypothetical web
client which is a combination of user, programmer, and program, on
encountering an unknown Content-Type value can, in theory, go to the
media type registry [5] and find pointers to the documents needed to
implement an appropriate handler for that type.  So any sufficiently
motivated user/programmer/program can handle any content type, more or
less.
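
A minimal sketch of that dispatch, assuming a purely local handler
table standing in for the registry.  The names HANDLERS, handle, and
UnknownContentType are made up for illustration.  Following the TAG
finding quoted above, an unknown type is reported as an error rather
than guessed at:

```python
class UnknownContentType(Exception):
    """Raised when no handler is registered for a media type."""


# Hypothetical local handler table; a real client would consult the
# registry's documentation to implement and install new handlers.
HANDLERS = {
    "text/plain": lambda body: body.decode("utf-8"),
}


def handle(content_type, body):
    """Dispatch on the declared Content-Type; never sniff the body."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    try:
        handler = HANDLERS[media_type]
    except KeyError:
        # Report the problem so a user/programmer can install a handler.
        raise UnknownContentType(media_type) from None
    return handler(body)
```

The motivated user/programmer/program enters at the except clause:
that is where someone would go read the registry entry and add a
handler to the table.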

The TAG has proposed that URIs can be used instead of a central
registry [6].  That makes perfect sense to me if we hand-wave enough
over the persistence issues.   According to this proposal, you could
just dereference the Content-Type URI to get the necessary
documentation and probably links to available implementations.

The second half of Tim's suggestion, if I understand it right, is that
the language-in-use for an RDF graph be determined by a combination
(conjunction/intersection, I guess) of the languages defined in all
the specs the user/programmer/agent finds by dereferencing all the
URIs in the graph.
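
As a rough sketch of just the first step (an illustration only, not
something Tim has specified): collect the namespace part of every URI
in a graph, treating each namespace as one more "content-type value"
whose spec would then be dereferenced.  The triple representation, the
function names, and the rule for splitting a URI into namespace and
local name are all assumptions made for this example:

```python
def namespace_of(uri):
    """Cut after the last '#', or failing that after the last '/'."""
    for sep in ("#", "/"):
        idx = uri.rfind(sep)
        if idx != -1:
            return uri[: idx + 1]
    return uri


def languages_in_use(triples):
    """Return the set of namespaces of all URIs in (s, p, o) triples.

    Terms are plain strings; anything that is not an http(s) URI is
    treated as a literal or blank node and skipped (a simplification).
    """
    namespaces = set()
    for triple in triples:
        for term in triple:
            if term.startswith(("http://", "https://")):
                namespaces.add(namespace_of(term))
    return namespaces
```

How the specs found at those namespaces should then be combined is
exactly the open question; the sketch only enumerates them.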

It makes a certain sense.  Some of it can be automated, even, as we
did with the Dingo example.

I do feel like something needs to be done beyond leaving it up to
applications and subnets.  I imagine someone using owl:Class as a
predicate relating people to an index of how "classy" a dresser they
are .... and it seems like they are doing something wrong!  Yes, in
that case, they are violating a W3C CR.  But what if they misuse
dc:author as a synonym for rdf:type?  Even if you grant that W3C has
some authority on correct use of RDF, does Dublin Core?  Or does the
fact that they invented, published, and host that URL (the expansion
of dc:author) actually count for something?

      -- sandro

[1] http://www.w3.org/2003/10/10-sw-meaning-irc#T16-17-05
[2] http://ilrt.org/discovery/chatlogs/rdfig/2003-10-09.html#T01-35-33
[3] http://www.w3.org/2001/tag/2002/0129-mime
[5] http://www.iana.org/assignments/media-types/
[6] http://www.w3.org/2001/tag/2002/01-uriMediaType-9.html

Received on Wednesday, 29 October 2003 10:06:36 UTC