RE: [xml-dev] RDF for unstructured databases, RDF for axiomatic from pat hayes on 2002-11-22 (www-rdf-comments@w3.org from October to December 2002)

From: pat hayes <phayes@ai.uwf.edu>
Date: Thu, 21 Nov 2002 18:07:24 -0600
To: "Shelley Powers" <shelleyp@burningbird.net>
Cc: <www-rdf-comments@w3.org>
Message-Id: <p05111b2aba0316949ebd@[10.0.100.86]>
><snip>
>
>>
>>  There is a deeper issue as well. RDF is not a final product: it is
>>  intended to be the 'base' layer of a family of more expressive
>>  languages built on top of it. Some of these already exist - RDFS - or
>>  are being produced right now, eg OWL. We expect that there will be
>>  more. And several people expect that a wide range of other SW
>>  languages might get developed for a wide variety of purposes, but all
>>  in the same general framework and all conformant to RDF. This places
>>  RDF in a particularly tricky position regarding 'ambiguity' in this
>>  sense. For some purposes, it is probably best if RDF itself
>>  under-specifies some meanings, for now, as those meanings will be
>>  given a tighter meaning in later extensions built on top of RDF.
>>  Sometimes we expect that these tighter meanings will emerge from a
>>  process of consensus, which we do not want to pre-empt or pre-guess
>>  at this stage; sometimes, we think that alternatives might emerge,
>>  and in those cases we want RDF to be consistent with all the
>>  alternatives even when they are incompatible with each other. (The
>>  slightly odd treatment of rdfs:range in the current semantics is
>>  partly motivated by this way of thinking, for example. For extensions
>>  in the OWL style, a simpler definition of rdfs:range meaning would
>>  have been preferable; but there are other use cases for RDF,
>>  involving compatibility with strongly typed systems, where that
>>  simpler semantics would have been a problem. So we backed the RDF
>>  meaning off slightly to try to preserve future compatibility. )
>>
>
>Pat, you're the ultimate researcher, and I'm the ultimate applied engineer.
>As such we have to discover each other's language. When you say that RDF is
>not a final product and that it is the base of more expressive languages, do
>you see RDF/XML as one of the more 'expressive' languages?

No, RDF/XML is part of the RDF spec. I meant things like, well, RDFS, 
but more especially DAML+OIL and its descendant OWL, already 
announced by the Webont WG, which are consciously built 'on top' of 
RDF, more or less in the way that Tim B-L represented in his 'layer 
cake' diagram. It was a lot of work getting them to fit there, as 
well.

>>  >
>>  >Your charter does not preclude you using the deprecation marker. From the
>>  >HTML document:
>>  >
>>  >"A deprecated element or attribute is one that has been outdated by newer
>>  >constructs. Deprecated elements are defined in the reference manual in
>>  >appropriate locations, but are clearly marked as deprecated. Deprecated
>>  >elements may become obsolete in future versions of HTML. User agents
>>  >should continue to support deprecated elements for reasons of backward
>>  >compatibility.
>>  >
>>  >Definitions of elements and attributes clearly indicate which are
>>  >deprecated."
>>
>>  Right, I understand. But we (deliberately) aren't deprecating
>>  containers and reification in this sense, we are just clarifying
>>  their semantic boundaries and drawing them rather close to the chest,
>>  as it were. Similarly for containers.
>>
>>  Overall, this is an issue best brought up to the WG rather than to me
>>  personally. But it's getting VERY close to final call, and I don't
>>  think we will want to make any large changes at this stage.
>>
>
>Personally, I don't want to see anything hold this delivery up. But when I
>write about RDF, I have to understand who RDF is targeted to. I'm writing a
>book, "Practical RDF". The longer this discussion continues, the more I
>think that title is a heavily ironic oxymoron.
>
>I see RDF as comparable to the relational model, and RDF/XML as a generic
>RDBMS.

Hmm, not sure I follow that. I see RDF/XML as just an interchange 
syntax for squirting RDF along wires. But one can live in the XML 
universe (lots of people do) and then the RDF graph looks more like 
an abstract relational model, right. The MT is just a semantics for a 
relational model, by the way, except it takes a rather non-DB stance 
on some issues, like closed world assumptions.

>You can build on an RDBMS and make more sophisticated products such
>as SAP, PeopleSoft, and Oracle Financials (which I would equate to your
>ontologies), but you can also use the 'RDBMS' directly for your
>applications.

Right, exactly. I didn't mean to imply that RDF had no uses alone. It 
certainly does. In many cases it's just like you say: the RDF is the
instance data (in triple-tables, in effect) and the 'higher' stuff - 
the 'ontology' - in RDFS or OWL is more like data constraints, 
encoding the general conditions that all data is supposed to satisfy. 
(These languages don't usually *impose* constraints, however, but 
rather allow them to be inferred. )  But already people are doing 
things like using useful OWL constructs (notably owl:sameAs, which 
used to be called owl:sameIndividualAs) inside what would otherwise 
be entirely RDF docs - which is legal RDF and also legal OWL, which 
is why we went to so much trouble to keep the layering relationship 
intact, but means 'more' in OWL than it does in RDF. The distinction 
between data and data-model is sharp in the RDB world but much 
blurrier here.
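
To make the 'inferred rather than imposed' point concrete, here is a rough
sketch in Python, with triples as plain tuples and no RDF library assumed.
The ex: names and the infer_range_types function are invented for
illustration; only rdf:type and rdfs:range are real vocabulary. A range
declaration never rejects data here; it just licenses one more conclusion.

```python
# Triples are plain (subject, predicate, object) tuples.
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"


def infer_range_types(triples):
    """Return the new rdf:type triples licensed by rdfs:range declarations.

    If property p has range C and (s, p, o) is asserted, then o is inferred
    to be a C. Nothing is ever rejected as 'invalid'.
    """
    # Collect (property, class) pairs from the range declarations.
    ranges = {(s, o) for (s, p, o) in triples if p == RDFS_RANGE}
    inferred = set()
    for (s, p, o) in triples:
        for (prop, cls) in ranges:
            if p == prop:
                inferred.add((o, RDF_TYPE, cls))
    # Report only what was not already stated.
    return inferred - set(triples)


data = {
    ("ex:author", RDFS_RANGE, "ex:Person"),   # the 'ontology' part
    ("ex:book1", "ex:author", "ex:shelley"),  # the instance data
}
new_facts = infer_range_types(data)
assert new_facts == {("ex:shelley", RDF_TYPE, "ex:Person")}
```

Note that the 'ontology' triple and the instance triple sit in the same set,
which is the blurriness between data and data-model in miniature.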

>
>Forgive me trying to find a common analogy, but this is again trying to
>discover a shared language. If this viewpoint is wrong, then I have made a
>serious disconnect with the RDF documents at some point.
>
>>  >
>>  ><snip>
>>  >
>>  >>
>>  >>  >Processing -- each item in a sequence is related to every other item
>>  >>  >in this way -- that is what is known in the computer world as
>>  >>  >'processing' information. As in, when you see this, tool makers, this
>>  >>  >is how you process it.
>>  >>
>>  >>  Sure, but be careful about the phrase 'processing semantics', which
>>  >>  is often used to convey the idea or claim that the *meaning* of the
>>  >>  language is to be found in the processing done to the expressions of
>>  >>  the language. This is true, or at least plausible, when it's a
>>  >>  programming language, particularly an interpreted programming
>>  >>  language; but it's not a good way to think about an
>>  >>  assertional/descriptive language like RDF. Of course, RDF gets
>>  >>  processed (hopefully) but the point is that in this case, the
>>  >>  processing should follow the meaning, in the sense that it should
>>  >>  constitute valid inferences, rather than defining the meaning.
>>  >>
>>  >
>>  >But Pat, there is a processing semantic attached to containers. That
>>  >aspect of containers that represents a data structure -- a descriptive
>>  >structure if you will -- a grouping of related items has no implied
>>  >processing other than the relationship. However, when you give a Bag,
>>  >Seq, and Alt type, you're attaching processing semantics to the
>>  >construct. This is no different than attaching conditional processing
>>  >semantics to 'if' in most programming languages.
>>
>>  Sorry, I disagree, and think that your view is a misunderstanding of
>>  RDF. RDF really, really, really is not a programming language. It has
>>  nothing like a programming language's semantics: it doesn't assume
>>  that its domains are recursive or computable. It is more like a
>>  simple assertional logic, or a notation for a database, than it is
>>  like a programming language. So when you see rdf:Bag, that does *not*
>>  mean that RDF is constructing a bag, or defining a bag, or that an
>>  RDF processor is obliged to construct a bag-like datastructure which
>>  conforms to baggish behavior. (It MAY do that, but that goes beyond
>>  what the RDF actually says. On the other hand, if it does create such
>>  a datastructure, then it really would be a good idea to have it be
>>  treated baggishly rather than, say, listishly or settishly, if you
>>  see what I mean...)
>>
>
>Then why does the RDF documentation mention Bag, Seq, and Alt? Why
>differentiate between types of container? What possible reason and _use_
>would this be?

Good question. Suppose you have some information encoded in some way 
that actually does use bags, say.  And you want to put this into RDF, 
ie to express the same information, as far as possible, in RDF. RDF 
itself can't, by its very nature, actually provide you with actual 
bags (it has no internal state to encode things like permutations),
but what it CAN do is provide you with a kind of generic 
container-member vocabulary that you can use to say what things were 
in your bags, and a way to say that this thing they are all in, is 
supposed to be a bag. That is, in a sense it provides primitive 
CONCEPTS of a bag and a seq and an alt, but the actual way of saying
what's in the container is just the same in all three cases. But that
is OK, since RDF itself isn't going to DO anything to your containers,
other than record what's in them and what type they are; it's not
really going to do anything at all; its only purpose is to encode
information about things. So it lets you encode information about 
your bags.

In some ways it's rather like what we do when we specify the members
of a bag in program text, eg by writing something like
bg :=  mkBag(a, b, c, d).
Since its a bag, the ordering of the items is irrelevant; 
nevertheless, you did actually write them in an order on the page, 
because you have to choose some order to type things in. The RDF 
description is more like the thing you type than like the actual bag 
itself.
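
The same point as a rough Python sketch, again with triples as plain tuples.
The describe_container helper and the ex: names are made up for
illustration; rdf:Bag, rdf:Seq and the rdf:_1, rdf:_2, ... membership
properties are the real vocabulary. Whichever type you name, the description
has to list the members in some order.

```python
def describe_container(node, container_type, members):
    """Return triples saying `node` is a `container_type` holding `members`.

    Members are attached with rdf:_1, rdf:_2, ... in the order given,
    because a written description has to list them in some order.
    """
    triples = [(node, "rdf:type", container_type)]
    for i, member in enumerate(members, start=1):
        triples.append((node, f"rdf:_{i}", member))
    return triples


bag = describe_container("_:c", "rdf:Bag", ["ex:a", "ex:b", "ex:c"])
seq = describe_container("_:c", "rdf:Seq", ["ex:a", "ex:b", "ex:c"])

# Only the type triple differs; the membership triples are identical.
assert bag[0] != seq[0]
assert bag[1:] == seq[1:]
```

Nothing in this sketch builds a bag or a sequence; it only produces
statements about one.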

Why bags/seqs/alts in particular and nothing else? Well, I guess 
people thought these would be a useful small set of container-types 
that would cover a lot of cases. We added LISP-like lists, called 
collections, because Webont wanted them badly. Nobody else requested 
any others. But someone could invent a new category of rdf containers 
and call them foodles, if they wanted to:

ex:foodle rdfs:subClassOf rdfs:Container .

and then the range constraints on the existing RDFS vocab will ensure 
that the container membership properties will apply to foodles. Of
course, they will just be things like the others, with members in 
places identified by integers, as far as RDF knows; but you can now 
record the fact that one of them is a foodle and if you and the 
people you are talking to know what foodles are, why then you will be 
able to communicate with one another via RDF.  And if I don't know 
what a foodle is, I at least know it's some kind of container and what
things are in it, and that alone might enable me to draw *some* 
useful conclusions. And maybe later I will find out what a foodle is, 
and then I can draw some more.

If you were to use OWL you might even be able to *describe* what a 
foodle is, but that gets hairier. And maybe you wouldn't, in any case.

For me, this freedom that RDF provides for you to invent your own 
categories and immediately use them, is its central feature. And to 
emphasize, there are no miracles going on here: you are just writing 
descriptions. The RDF engine might draw some conclusions, but all it 
knows about foodles is what you tell it. It never has to *construct* 
a foodle: it couldn't, since it doesn't know how to. But that's fine, 
nobody is expecting it to actually construct anything: it just has to 
*remember* that this thing you said was a foodle, is indeed in the 
class of things called 'foodles', and be able to draw any valid 
conclusions that this might entail.
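
A rough sketch of what 'drawing valid conclusions' amounts to here, in
Python with triples as plain tuples. It applies just one rule (if x is in
class C and C is a subclass of D, then x is in D); a real RDFS reasoner has
more rules than this, and infer_types and the ex: names are invented for the
example.

```python
def infer_types(triples):
    """Close a set of triples under one rule, repeated until nothing changes:
    x rdf:type C  and  C rdfs:subClassOf D  gives  x rdf:type D.
    """
    facts = set(triples)
    while True:
        sub = {(s, o) for (s, p, o) in facts if p == "rdfs:subClassOf"}
        new = {
            (x, "rdf:type", d)
            for (x, p, c) in facts if p == "rdf:type"
            for (c2, d) in sub if c2 == c
        }
        if new <= facts:  # nothing new to add, so we are done
            return facts
        facts |= new


told = {
    ("ex:foodle", "rdfs:subClassOf", "rdfs:Container"),
    ("ex:thing1", "rdf:type", "ex:foodle"),
    ("ex:thing1", "rdf:_1", "ex:x"),
}
known = infer_types(told)

# The engine still has no idea what a foodle is, but it can conclude
# that ex:thing1 is some kind of container.
assert ("ex:thing1", "rdf:type", "rdfs:Container") in known
```

Everything the engine 'knows' about foodles is in the three triples it was
told, plus that one conclusion.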

>
>>  What it actually means, by the way, is that some thing exists which
>>  is classified as an rdf:Bag and which has some other things in
>>  various 'positions' in it. That is all; and what being an rdf:Bag
>>  *really means* is not specified. But then what it *really means* to
>>  be in most RDFS classes is not specified, so what's new?
>>
>>  Now, if you were to say, then it's a damn shame that RDF uses names
>>  that strongly suggest programming language constructs, like 'bag' and
>>  'alt', then I would heartily agree. But those came with the package,
>>  and our charter required us not to make merely cosmetic changes.
>>
>>  >
>>  ><snip>
>>  >
>>  >>  >
>>  >>  >Actually, it has a lot of problems. It was created by a group of
>>  >>  >smart, wonderful people who really care about making RDF work BUT who
>>  >>  >have a difficult time understanding that not all of us have PhDs in
>>  >>  >linguistics and mathematics. Or Philosophy.
>>  >>
>>  >>  Well, I did try to write the semantics doc so that it didn't
>>  >>  presuppose having technical qualifications in logic or philosophy,
>>  >>  and explained its ideas as it goes along. If it was written for a
>>  >>  technical/mathematical audience it would probably be about 1/3 the
>>  >>  length and have hardly any English words in it. The Webont WG OWL
>>  >>  semantics are written more in this style, you might find the contrast
>>  >>  amusing.
>>  >>
>>  >
>>  >I'm actually not as concerned about the semantics document as I am the
>>  >other. I think the biggest problem with the documents is difficulty
>>  >understanding who your audience is.
>>
>>  Yeh, we have the same problem. :-)
>>
>>  >Is it the tools developers? The language
>>  >semantician? The RDF end user? Rather than break across functional lines,
>>  >perhaps the documents should have broken along audience lines. (Or did
>>  >they? Is the RDF Primer the document for the end user?)
>>  >
>>  >>  >They add references to containers in the primer and the syntax, but
>>  >>  >in the semantics document, add this statement basically forcing
>>  >>  >interpretation about 'containers' back on the user.
>>  >>
>>  >>  Have you read the newest draft? It tries to give a better exposition
>>  >>  of RDF containers. The key point is that RDF *describes* containers
>>  >>  rather than *constructing* them. Then the rather sparse semantics
>>  >>  makes more sense, I think: it's not saying that the containers
>>  >>  themselves are 'thin' or ambiguous; it's just that RDF doesn't say a
>>  >>  whole lot about them.
>>  >>
>>  >
>>  >Enough to possibly damage the credibility of the release.
>>
>>  If you think of RDF containers as RDF-defined datastructures, then
>>  your comment would be justified; but that is not the right way to
>>  think about them.
>>
>>  >
>>  >>  >In this case, they definitely
>>  >>  >re-introduced ambiguity. Why? Because a lot of them don't like
>>  >>  >containers, they wanted to get rid of containers, they think
>>  >>  >containers are redundant.
>>  >>
>>  >>  No, not at all. The problems run deeper. The real problem is that if
>>  >>  containers are unordered (like rdf:Bag) and you use an ordered set of
>>  >>  selection functions on them, then you are kind of imposing an order
>>  >>  on what is conceptually an unordered thing. So your description in a
>>  >>  sense says *too much* about it. So if we allow RDF to make any formal
>>  >>  entailments about bags, they are almost all going to be wrong, in
>>  >>  fact, so we had to block them. For example suppose you say that A is
>>  >>  a bag and its first item is X and its second item is Y. If you allow
>>  >>  RDF to permute the items, then you can infer that Y is the first
>>  >>  item. But now *both* X and Y are the first item...
>>  >>  There are other problems, notably with there being no way to say that
>>  >>  a container is 'closed', ie has no more elements.
>>  >>
>>  >
>
>Didn't Jonathan put gun to his head and go 'bang!' at about this point?

Yes, it was a rather notorious screw-up in the original spec. With 
list-style collections you can do this, which is largely why DAML 
uses them.
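
Roughly, the difference looks like this. A list-style collection is a chain
of rdf:first / rdf:rest links ending in rdf:nil, so the description itself
says where the members stop. The Python below is only a sketch: list_members
is an invented helper, triples are plain tuples, and it assumes a
well-formed chain (exactly one rdf:first and one rdf:rest per node), which
RDF itself does not enforce.

```python
def list_members(triples, head):
    """Follow rdf:first / rdf:rest from `head` until rdf:nil; return the items."""
    first = {s: o for (s, p, o) in triples if p == "rdf:first"}
    rest = {s: o for (s, p, o) in triples if p == "rdf:rest"}
    items = []
    node = head
    while node != "rdf:nil":  # rdf:nil marks the end: the list is 'closed'
        items.append(first[node])
        node = rest[node]
    return items


collection = [
    ("_:l1", "rdf:first", "ex:x"),
    ("_:l1", "rdf:rest", "_:l2"),
    ("_:l2", "rdf:first", "ex:y"),
    ("_:l2", "rdf:rest", "rdf:nil"),
]
assert list_members(collection, "_:l1") == ["ex:x", "ex:y"]
```

With rdf:_1, rdf:_2, ... there is no counterpart to the rdf:nil step:
nothing in the description rules out an rdf:_3 turning up later.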

>
>>  >And that's why I don't like containers in this type of model.
>>
>>  What do you mean by 'type of model'? (I think we may agree, if you
>>  mean what I think you mean. But I'm in the minority in the WG on that
>>  point.)
>>
>>  >
>>  >>  >Worse, containers add processing semantics to what is a data model. I
>>  >>  >happen to agree with them -- containers are redundant. They were, at
>>  >>  >one point, actually pulled from RDF. Or at least there was a WG note
>>  >>  >for this at one point.
>>  >>
>>  >>  Yes, that might have been one option. Another would have been to
>>  >>  redesign the containers from the ground up; we could have done a much
>>  >>  more elegant and formally tight version. But the old container
>>  >>  vocabulary would then be deprecated, and we felt this was needlessly
>>  >>  drastic, particularly as we were adding collections to overcome many
>>  >>  of the problems. Making something retrospectively illegal is not an
>>  >>  action to be taken lightly.
>>  >>
>>  >
>>  >As stated previously, deprecation is good and doesn't necessarily hurt
>>  >your existing tool and RDF users. W3C has used deprecation with HTML,
>>  >and it has a much wider user base. Deprecation would not have violated
>>  >your charter.
>>  >
>>  >As for not doing it lightly -- leaving in vaguely defined semantic
>>  >constructs strikes me as a bit more serious. Wouldn't you think?
>>
>>  Well, no; because in this sense of 'vaguely', *all* assertional
>>  semantics are 'vague'. One only gets non-vagueness in this sense when
>>  describing domains which satisfy the recursion theorems, which
>>  guarantee that recursively described domains are unique. Programming
>>  languages do that. Most of the worlds that RDF will be used to
>>  describe (worlds containing things like people, wines, works of art,
>>  aircraft) are not recursive (computable) and cannot be tied down
>>  uniquely.
>>
>
>I'll drop these questions back and forth before someone comes along and says
>they're outside of the scope of the mailing list. I am concerned, though,
>about who is the target audience for RDF (documents, RDF/XML, and all).
>However, I must be too dense because every time we cycle through one of
>these question/response, I feel less and less that RDF will ever have a
>directly 'practical' use.

Think of it as text markup which is readable by an inference engine. 
That really is all it is. All the complication is an artifact of all
the XML/RDB/logic/unicode/CC-PP/Dublin-core/DAML hoops we have to 
make it fit under.

>However, as I said earlier, Pat, your experience and mine are vastly
>different. That would account for much confusion.

Sure, I understand. It's hard to know how to say things right when you
can't see the audience.

Pat

-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola              			(850)202 4440   fax
FL 32501           				(850)291 0667    cell
phayes@ai.uwf.edu	          http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam
Received on Thursday, 21 November 2002 19:07:35 UTC