- From: Sandro Hawke <sandro@w3.org>
- Date: Fri, 24 Jan 2003 01:55:04 -0500
- To: "Roy T. Fielding" <fielding@apache.org>, Tim Berners-Lee <timbl@w3.org>, www-tag@w3.org
Maybe it'll help to revisit the old Semantic Web Requirements
Document. Oops, it hasn't been written yet. Oh well. Let's see if
I can write the relevant parts of one on the fly...
1. We need to be able to generate identification strings for
long-lived knowledge bases, and then do something like
insert(kbIdent, formula)
retract(kbIdent, formula)
query(kbIdent, formula)
from anywhere on the net and get roughly the same behavior with
the same parameters. (There are also lots of issues about
connectivity, access control, what KR language to use for the
formula, etc.)
You can think of these as database operations if you prefer.
This looks like HTTP and http: URIs if you squint a little. For
now, HTTP GET of an RDF/XML file is query-all, and HTTP PUT is
retract-all followed by insert(PUT's parameter). We can shuffle
them together better with some more work.
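
A minimal sketch of the three operations and their rough HTTP
equivalents, assuming an in-memory store and formulas represented as
plain tuples. The class name KBStore, the methods http_get and
http_put, and the example.org kbIdent are hypothetical, and query is
reduced to a membership test rather than real pattern matching or
inference.

```python
from collections import defaultdict


class KBStore:
    """Toy in-memory stand-in for network-reachable knowledge bases.

    Assumption: a formula is any hashable value (here, a triple).
    """

    def __init__(self):
        # Maps each kbIdent string to the set of formulas it holds.
        self._kbs = defaultdict(set)

    def insert(self, kb_ident, formula):
        self._kbs[kb_ident].add(formula)

    def retract(self, kb_ident, formula):
        # discard() is a no-op if the formula was never asserted.
        self._kbs[kb_ident].discard(formula)

    def query(self, kb_ident, formula):
        # Simplification: exact membership only, no variables.
        return formula in self._kbs[kb_ident]

    def http_get(self, kb_ident):
        # GET behaves like query-all: return everything in the KB.
        return frozenset(self._kbs[kb_ident])

    def http_put(self, kb_ident, formulas):
        # PUT behaves like retract-all followed by insert of the body.
        self._kbs[kb_ident] = set(formulas)


store = KBStore()
kb = "http://example.org/kb"  # hypothetical kbIdent
store.insert(kb, ("a", "b", "c"))
store.http_put(kb, [("x", "y", "z")])
```

After the PUT, the earlier triple is gone and only the PUT body
remains, which is the retract-all-then-insert behavior described
above.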
2. We need to be able to generate new unique strings, to use as
constant terms/symbols in our formulas. They have to be strings
which have never been used before in one of these formulas,
anywhere. Each time we want to talk about something for which we
don't already know a constant symbol, we'll probably want one of
these. (If we don't want other people to re-use the name we give
it, we can assign a local-scope name (an existential variable,
what RDF sometimes calls a blank node), but that's probably best
avoided since linking is good.)
This looks like UUIDs or tag: URIs. You could use something else
like http: URIs, but they give us lots of extra features and
baggage which we don't need.
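
As a rough illustration of requirement 2, here is one way to mint
such strings, assuming Python's standard uuid module and the
tag:authority,date:specific shape of tag: URIs. The function names
and the example.org authority are hypothetical. Note that a tag: URI
is only unique if the authority never reuses a local name for the
same date.

```python
import uuid
from datetime import date


def new_uuid_symbol():
    """Return a fresh random UUID as a URN, e.g. 'urn:uuid:...'."""
    return uuid.uuid4().urn


def new_tag_symbol(authority, minted_on, local_name):
    """Build a tag: URI from an authority, a date, and a local name.

    Uniqueness is the minter's job: the authority must own the
    domain on that date and must not reuse local_name.
    """
    return f"tag:{authority},{minted_on.isoformat()}:{local_name}"


symbol = new_tag_symbol("example.org", date(2003, 1, 24), "widget-435353")
```

Neither kind of string can be dereferenced, which is exactly the
property at issue in the discussion that follows.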
That's where I was two years ago, and I was happy with it (except that
I couldn't get an RFC on tag: URIs published). Then TimBL pointed out
that tag: URIs were not "clickable". You couldn't get a
representation. And I said "that's the point -- you don't need to get
a representation -- you're just trying to make a new logic symbol!"
But eventually I figured out that he wanted the strings from
requirement #2 to lead automatically to one of the KBs in requirement
#1.
So instead of saying "I'd like to buy widget-435353", we should always
say something like "I'd like to buy
widget-435353-which-you-can-learn-about-by-calling-1-800-BUY-WIDG."
Thus, a 3rd requirement:
3. There should be a cheap and fast way to get from a constant symbol
to the kbIdent of a KB which can give us some authoritative
information about it.
The cheapest/fastest way to do this is string manipulation. TimBL
proposes that URIs with a "#" in them be considered constant
symbols and URIs without a "#" are kbIdents, and you get to the
kbIdent from the constant symbol by truncating at the "#". Fast
and simple.
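
The truncation rule can be sketched in a few lines, assuming plain
string handling is enough. The function names and the example URI
are hypothetical.

```python
def is_constant_symbol(uri):
    """Under this proposal, a URI containing '#' is a constant symbol."""
    return "#" in uri


def kb_ident_for(uri):
    """Truncate at the first '#' to get the kbIdent.

    A URI without '#' is already a kbIdent and is returned unchanged.
    """
    kb_ident, _sep, _fragment = uri.partition("#")
    return kb_ident


kb = kb_ident_for("http://example.org/meeting7#item6")
```

Here kb is "http://example.org/meeting7", obtained without any
network access or lookup table.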
Another approach is to use rdfs:isDefinedBy, giving a triple for
each constant term, linking it to an authoritative kbIdent. So
you say
<a> <b> <c>.
and then, to let people know something about what those terms
mean, you add
<a> rdfs:isDefinedBy <http://....>.
<b> rdfs:isDefinedBy <http://....>.
<c> rdfs:isDefinedBy <http://....>.
That has two problems. First, the <http://....> terms look
syntactically like constant symbols, not like kbIdents. How do
you know where THEY are defined? That problem can be addressed by
redefining rdfs:isDefinedBy to have a range of uri-string instead
of resource. That gives us:
<a> <b> <c>.
<a> rdfs:isDefinedBy "http://....".
<b> rdfs:isDefinedBy "http://....".
<c> rdfs:isDefinedBy "http://....".
That works, but it feels a little as if web addresses were UUIDs,
and along with each UUID you were given a separate address where
you could get information about it. It rather misses the point of
URIs.
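
For comparison with the truncation rule, here is a sketch of the
rdfs:isDefinedBy lookup, assuming triples are held as plain tuples
and the object of rdfs:isDefinedBy is a URI string literal. The
example.org addresses are hypothetical placeholders for the elided
ones above, and kb_idents_for is a made-up helper name.

```python
IS_DEFINED_BY = "rdfs:isDefinedBy"

# Hypothetical data: one ordinary triple plus one pointer per term.
triples = [
    ("a", "b", "c"),
    ("a", IS_DEFINED_BY, "http://example.org/kb-a"),
    ("b", IS_DEFINED_BY, "http://example.org/kb-b"),
    ("c", IS_DEFINED_BY, "http://example.org/kb-c"),
]


def kb_idents_for(term, triples):
    """Return every kbIdent that the triples declare for a term."""
    return [
        obj
        for subj, pred, obj in triples
        if subj == term and pred == IS_DEFINED_BY
    ]


found = kb_idents_for("b", triples)
```

Unlike truncation, this needs the extra triples to be present and
scanned before a term can be followed to its KB.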
So I was using Tim's approach for a while, but I started to want to
make the content *really* clickable. If I give you a URI for an
action item you agreed to last week, you should be able to get some
readable information even without special RDF software.
4. One should be able to use content negotiation to offer structured
data in RDF *and* HTML at the same URI. When you ask for HTML,
you get some nice little HTML tables or diagrams saying the same
thing as the RDF. It's easy enough.
[ Why at the same URI, you ask? Why not have the HTML at the
more public URI, and have some hidden link inside the HTML
pointing machines at the RDF? Because it's really the constant
symbol we're trying to follow; we're trying to get information
about the action item you agreed to. We just want one URI for
that. We don't want http://...//meeting7.html#item6 and
http://...//meeting7.rdf#item6 !
I can almost see a RESTful solution here, but not quite. I end
up with being unable to distinguish between the kbIdent and the
constant symbol identifying the action item. But I await
suggestions here.... ]
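
A simplified sketch of the server-side choice in requirement 4,
assuming the client sends an HTTP Accept header and the server holds
exactly two representations. The function name is hypothetical, and
wildcards such as */* are deliberately ignored, so this is not a
complete implementation of HTTP content negotiation.

```python
def choose_representation(
    accept_header,
    available=("application/rdf+xml", "text/html"),
):
    """Pick the available media type with the highest q-value.

    Returns None if nothing the client listed is available.
    Ties go to whichever type appears first in the header.
    """
    best, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0]
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if media_type in available and q > best_q:
            best, best_q = media_type, q
    return best


for_browser = choose_representation("text/html,application/xhtml+xml;q=0.9")
for_agent = choose_representation("application/rdf+xml, text/html;q=0.5")
```

A browser-style header selects the HTML view and an RDF-aware agent
selects RDF/XML, both from the same URI.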
Anyway, there's a TAG finding which says I shouldn't serve HTML
and RDF at the same URI -- and with good reason, because the
fragment semantics are different.
The only solution I've seen so far is mine: when you need a
constant term, create a kbIdent for information about it, with it
distinguished as the primary subject. Now, use the same URI for
the kbIdent and the constant term, but follow precise context
rules so you always know which one you are talking about. This
meets requirements 1-4. Does anything else?
The overhead of creating a new kbIdent for each thing you want to
identify with a constant term is insignificant if we reclaim the
use of fragments. Each thing mentioned in a big KB can have a
fragment of the kb devoted to information about it. This seems
to match the use of anchors in some HTML, where each important
concept or defined term in the document gets its own fragment.
It also matches XPointer semantics for RDF/XML treated as XML, if
you use rdf:ID where feasible instead of rdf:about and think of
rdf:ID as an XML id. The semantics may not be perfect, but
they're pretty darn close.
Justifying #1 would be an interesting exercise for another day.
-- sandro
Received on Friday, 24 January 2003 01:57:31 UTC