Re: RDF semantics: applications, formalism and education

Pat,

Thanks for your comments.  At this point, I wish to stand back a little 
and, rather than defend my suggested "stack", to focus on the issues I was 
hoping to draw out...

... I think I'm getting an inkling of the added dimension of 
"semantics".  It appears I might have been confusing semantics with 
structure (a rather elementary error, it now seems).


Let's see if I'm getting even remotely close here:  the "lower" layers of 
the stack I mentioned previously are almost entirely concerned with 
structure.  Referring to the Formal Systems Definitions page that Dan C 
cited a while ago 
(http://www-rci.rutgers.edu/~cfs/305_html/Deduction/FormalSystemDefs.html), 
I tried to fit the rather trivial case of bits and octets into this 
framework.  What I noticed was that nothing distinguishes one well-formed 
formula from another (for octets, a wff is simply a sequence of 8 
bits):  there is no further notion of axioms or inference.  The closest I 
could get was to say that *all* wffs are also axioms, which leaves nothing 
to infer.
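
To make that degenerate case concrete, here's a minimal sketch in Python 
(my own illustration -- the encoding is invented):  the wffs are exactly 
the 8-bit strings, every wff counts as an axiom, and with no rules of 
inference the theorems are just the wffs, so there is nothing to derive.

    # A degenerate "formal system" for octets: purely structural.
    # Alphabet: the symbols '0' and '1'.
    # Well-formed formula (wff): any string of exactly 8 bits.
    def is_wff(s: str) -> bool:
        return len(s) == 8 and all(c in "01" for c in s)

    # Every wff is an axiom and there are no inference rules, so the
    # set of theorems coincides with the set of wffs: nothing to infer.
    def is_theorem(s: str) -> bool:
        return is_wff(s)

    assert is_wff("01000001")        # 8 bits: well formed ('A' in ASCII)
    assert not is_wff("0100001")     # only 7 bits: not well formed
    assert is_theorem("01000001")    # trivially, since axioms = wffs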

So, if I'm on the right track, semantics only come into play when one:
- can reasonably/justifiably designate some subset of the wffs as axiomatic;
- can introduce rules of inference that relate consequent wffs to 
antecedent wffs.

Then there is the assignment of concepts to symbols, so that there is some 
kind of correspondence between the theory and some states of affairs -- 
the "model theory"?

[[[
QUESTION:  how are the axioms and rules of inference different from the 
symbols and rules of construction for wffs?
   symbols + syntax rules -> wffs
   axioms + inference rules -> theorems
]]]
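
The best answer I can construct for myself (a toy sketch, with an invented 
encoding) is that both are generative machinery, but at different 
levels:  the syntax rules generate the whole set of wffs, while the axioms 
and inference rules pick out a distinguished subset of those wffs, the 
theorems.

    # Level 1: symbols + syntax rules -> wffs.
    # An atom is a wff; if p and q are wffs, so is ("->", p, q).
    def is_wff(x):
        if isinstance(x, str):
            return True
        return (isinstance(x, tuple) and len(x) == 3 and x[0] == "->"
                and is_wff(x[1]) and is_wff(x[2]))

    # Level 2: axioms + inference rules -> theorems (a subset of wffs).
    axioms = {"p", ("->", "p", "q"), ("->", "q", "r")}

    def theorems(axioms):
        # Close the axioms under modus ponens: from p and p->q, infer q.
        thms = set(axioms)
        changed = True
        while changed:
            changed = False
            for f in list(thms):
                if isinstance(f, tuple) and f[1] in thms and f[2] not in thms:
                    thms.add(f[2])
                    changed = True
        return thms

    print(theorems(axioms))   # now includes the derived wffs "q" and "r"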

Working up the "stack", the first hint that wffs are not all equally valid 
seems to come with XML, which has "validity constraints" as well as 
"well-formedness constraints" -- does this admit the idea of valid XML 
documents as the axioms of a formalism based on XML?  I think not:  I 
think the validity constraints are more reasonably viewed as additional 
structural requirements that cannot be expressed by a context-free grammar.
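
For instance (my own toy example):  the document below is well formed, 
since it parses, but against a DTD declaring <!ELEMENT person (name)> it 
would be invalid, because it has two name children.  Python's standard 
library parser checks only well-formedness, so the validity constraint is 
shown here as an extra structural check:

    import xml.etree.ElementTree as ET

    doc = "<person><name>Alice</name><name>Bob</name></person>"

    # Well-formedness: the document parses as a tree at all.
    root = ET.fromstring(doc)      # succeeds: well formed

    # The DTD content model (name) would require exactly one name
    # child -- a structural requirement over and above the generic
    # "pointy bracket" grammar.
    valid = len(root.findall("name")) == 1
    print(valid)                   # False: well formed but not valid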

[[[BTW, I view the suggested layering as, at best, a "projection" of one 
view of some components that may be combined to achieve the kind of target 
function (I think) we're after.  Others will almost certainly have 
different views.  As a system architect, I find it very useful to try to 
tease apart the components of a complex system so that they can be 
designed in (relative) isolation.  Whether that results in a stack, a 
tree, or something else is not of primary import here.]]]

Somewhere in the RDF(S) components it seems reasonable to start 
distinguishing between RDF graphs that are "believed true" and those that 
aren't, and to start looking for some rules of deduction.  This does 
appear to be a very different kind of idea from the purely structural 
relationships of the underlying layers (though I'm still gnawing at the 
question above).
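
To pin down the kind of deduction I have in mind, here is a toy sketch (my 
own ad hoc triple encoding, not anything from the RDF specs) that closes a 
set of triples "believed true" under two RDFS-flavoured rules.  Note that 
it always terminates, since the rules introduce no new terms.

    # Toy RDFS-style inference over a set of triples "believed true".
    # Rules (informally, after RDF Schema):
    #   (x rdf:type A), (A rdfs:subClassOf B)        => (x rdf:type B)
    #   (A rdfs:subClassOf B), (B rdfs:subClassOf C) => (A rdfs:subClassOf C)
    TYPE, SUB = "rdf:type", "rdfs:subClassOf"

    graph = {
        ("ex:Fido",   TYPE, "ex:Dog"),
        ("ex:Dog",    SUB,  "ex:Mammal"),
        ("ex:Mammal", SUB,  "ex:Animal"),
    }

    def closure(graph):
        # Forward-chain both rules to a fixpoint.
        g = set(graph)
        changed = True
        while changed:
            changed = False
            for (s, p, o) in list(g):
                for (s2, p2, o2) in list(g):
                    if o != s2 or p2 != SUB or p not in (TYPE, SUB):
                        continue
                    new = (s, p, o2)   # same shape for both rules
                    if new not in g:
                        g.add(new)
                        changed = True
        return g

    for t in sorted(closure(graph)):
        print(t)   # includes ('ex:Fido', 'rdf:type', 'ex:Animal')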

...

>That is exactly what the description-logic community have been studying in 
>depth for the last decade. There are some results, some usable systems 
>have been designed and implemented, and a lot of tough theoretical 
>problems remain. Maybe y'all should do some reading; it will save a lot of 
>time in the long run.

I'm sure you're right... can you recommend anything in 
particular?  (Preferably accessible -- not assuming too much background, 
and relevant to the current topic.)  My searches on Amazon/Google have 
failed to turn up anything that looks obviously promising as introductory 
material.

I just dug out your papers "In defense of logic" and "The second naive 
physics manifesto" (I happen to have copies to hand) and these have helped 
me understand something of the significance of interpretation as part of 
semantics.  Viewpoints from other commentators, and a concise introduction 
to the logic formalisms involved (especially covering the model theory for 
FOL), might also be helpful.

And also, a working definition of "semantics" might help.

#g
--


At 11:28 PM 4/6/01 -0700, pat hayes wrote:

>>[** Thinking...]
>>
>>Maybe we need to review and clarify the architectural framework.
>>I can imagine something like this:
>>
>>Logic:           DAML+OIL, etc  (Full-strength inference, typing)
>
>(niggle: DAML+OIL isn't full-strength logic.)
>
>>Schema:          RDFS           (Limited inference, typing)
>>Abstract syntax: RDF            (Directed labeled graph)
>>                 XML Infoset    (Annotated tree)
>>                 XML Namespace  <<--------------------- URIs
>>Syntax:          XML            ("Pointy brackets")
>>                 Characters     (Unicode, UCS, others)
>>                 Octets         (Pretty universal now, not always so)
>>                 Bits
>
>I have some problems understanding what you mean here. The 
>relations between the various layers don't seem similar, so this doesn't 
>feel like a stack to me. Octets are a way to put characters into a data 
>stream; characters form lexical items (a missing layer, by the way) for 
>any language, not just XML. All languages have syntax: it doesn't stop with 
>XML. Also the relation between XML and RDF is much murkier, since RDF more 
>or less declares itself to be independent of XML (pointy brackets are just 
>one possible option for RDF 'linear' syntax; real RDF structure resides in 
>the triples. Which I think makes sense, since just about any language can 
>be XML-ized. One can write an XML rendering of KIF, which puts real 
>full-strength inference right on top of XML with no nonsense.)
>So what I see here is a tree, which branches at each stage into many 
>possibilities. Among those, RDF doesn't exactly stand out as a winning option.
>
>>Different kinds of application can sit on different levels of this 
>>"Stack".  All computer applications ultimately sit on 'bits', and most 
>>sit on 'Octets'.  Unicode/UCS is becoming the norm for applications using 
>>character-coded data (text, XML, and more).  E.g. it's standard in Java.
>
>Yes, though UCS-2 is about as much as any sane person could ever need.
>
>>Today, there are many applications that sit directly on the XML layer 
>>(+URIs):  many of the current W3C recommendations specify XML applications.
>
>I agree XML is a generally useful device. Labelled directed graphs are 
>about as universal a format as anyone is likely to invent.
>
>>XML Infoset (if I understand correctly) is an abstraction layer that 
>>might allow XML to be based on non-character representations. 
>>Applications based on this (using DOM?) should be isolated from the 
>>underlying character syntax.
>>
>>I see the RDF abstract syntax as a simplification and generalization of 
>>the underlying XML on which it is based.
>
>I think just about everything in that sentence is wrong.
>1. RDF isn't based on XML.
>2. RDF doesn't generalise XML.
>3. XML doesn't underlie RDF.
>Arguably, I guess you could say that RDF is simpler than XML, in some 
>sense that isn't very useful.
>
>>Hopefully one which lends itself tolerably well to the construction of 
>>higher semantic layers.  I think this abstract syntax is a significant 
>>step towards being able to truly exchange information between different 
>>applications (TimBL gives an example somewhere of an invoice containing 
>>information about airplane parts:  financial management data meets 
>>engineering design data).  I see current RDF applications (RSS, CC/PP, 
>>etc.) as mainly operating at this layer.
>
>The issue is not whether RDF could be used to exchange data. It obviously 
>could. So could just about any other notation. The issue is whether RDF 
>has any particular things going for it compared to all the other 
>alternatives; and if so, what.
>
>>The next layer, and probably the most difficult to judge correctly, I 
>>have called the RDF schema layer, encompassing the basic ideas that lead 
>>to primitive inferencing (rule following) and typing/classification of 
>>resources.  I don't know if this is possible, but I think it would be a 
>>useful goal if evaluations defined at this level were guaranteed to be 
>>computable;  i.e. to terminate in finite time.
>
>That is exactly what the description-logic community have been studying in 
>depth for the last decade. There are some results, some usable systems 
>have been designed and implemented, and a lot of tough theoretical 
>problems remain. Maybe y'all should do some reading; it will save a lot of 
>time in the long run.
>
>>I understand that this excludes full FOL.
>
>It depends on what you mean by 'evaluations'. Checking a FOL proof for 
>correctness is quite computable, even trivial. It can be done in linear 
>time. Generating them takes a little longer.
>
>>I think this could be a basis for a range of simple processing tasks -- 
>>possibly a majority of those performed over the Web today.
>
>I guess I was assuming we were trying to do better than that.
>
>>Then there's the full logic layer, characterized by DAML+OIL.  This would 
>>include the "universal web proof checking engine" that has been proposed.
>
>Checking proofs is all very well, but unless someone or something 
>generates them, there won't be any proofs to check. And since checking is 
>always much easier than generation, I would suggest we take complexity of 
>proof generation to be the key computational issue in thinking about where 
>we should go. Certainly that was central in designing DAML+OIL.
>
>>This layer would support a range of what might be called knowledge 
>>applications.
>
>It can if it has a clear semantics. Without that, however, confusion will 
>reign.
>
>This is not just preaching, but the voice of experience, bitterly won. 
>People have been through this territory before: database modellers, AI 
>knowledge representers, computational logicians, natural language 
>understanders, planners, and even psychologists and philosophers. One 
>thing we all know a lot about is how NOT to do it.
>
>>In setting out the above, I would hope to sketch a framework in which we 
>>can usefully discuss what capabilities should be defined where;  in 
>>particular, what are the appropriate levels of functionality to be 
>>designed into the RDF and RDF schema layers?
>
>Well, we certainly need some framework, to be sure.
>
>Pat Hayes
>
>---------------------------------------------------------------------
>IHMC                                    (850)434 8903   home
>40 South Alcaniz St.                    (850)202 4416   office
>Pensacola,  FL 32501                    (850)202 4440   fax
>phayes@ai.uwf.edu http://www.coginst.uwf.edu/~phayes

------------
Graham Klyne
GK@NineByNine.org
