- From: Pat Hayes <phayes@ihmc.us>
- Date: Thu, 30 Dec 2004 03:36:09 -0800
- To: Sandro Hawke <sandro@w3.org>
- Cc: www-rdf-logic@w3.org, "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
- Message-Id: <p06001f02bdf770107790@[192.168.1.4]>
>"Peter F. Patel-Schneider" <pfps@research.bell-labs.com> writes: >> My understanding of the problem was to write extra semantic conditions on >> RDF(S) interpretations, and thus generate an embedding/encoding of binary >> first-order logic into a semantic extension of RDF(S). You can't extend >> the syntax at the same time, however, which means that you have to encode >> formulae as triples somehow, and these triples retain their RDF(S) meaning. >> >> However, it doesn't seem to me that Sandro has the same understanding. > >No, that's exactly what I want, too. > >I would also like the whole exercise to be, in the end, simple enough >that the community interested in rule languages and RDF can be >comfortably with the results. If you want simplicity, surely a 'natural' syntax for FOL (or OWL, for that matter) is simpler than any kind of RDF embedding? If you don't like recursion, stick to clausal form and use something like TPTP. FOL syntax is not hard. Its way simpler than, say, Javacode. What makes everything complicated is this insistence on having it be simultaneously syntactically embedded in RDF and at the same time a semantic extension of RDF. >I keep hearing about projects using RDF >encodings of Horn or FOL, and they turn out to have either unclear or >inconsistent semantics. If we're going to move to a standard here, it >would help to clear up whether it is possible to do this kind of >encoding properly. WHY? Why not get the standard done first, and then argue later about how to do an RDF encoding? Or, even more radically, why bother with an RDF encoding? What's wrong with XML? Seems to me that OWL has pushed this RDF idea about as far as it can be pushed; maybe a bit further, in fact. None of the other standardization efforts seem to feel the need to be encoded into RDF. SPARQL has its own syntax, completely different in style from RDF (it doesnt even encode RDF in RDF) . The Rule proposals all use a rule-style syntax. The RDF-in- HTML embedding guys don't use RDF/XML syntax. Why do you have this RDF syntax bee in your bonnet? Dont get me wrong. Im all for RDF, and I like using it for all kinds of things that it can be naturally used for. But it does have its limitations. >I can't imagine, for instance, that the approach >taken by SWRL 0.6 would ever be approved in a W3C Recommendation, for >the reasons I outlined in the team comment [1]. The semantic web can make its own standards without the W3C imprimateur, if the W3C insists on being insane. But I think you are wrong, in fact, since the W3C's own WGs are going to be producing draft recs which don't satisfy these requirements. >If Peter can convince me that it's not possible to write the desired >semantic condition, I can help him convince others that a different >path needs to be taken for rule languages. If I can convince Peter it >*is* possible You won't be able to convince him that it is, for the very good reason that it isn't. There is a very general result due to Montague. Here's one way to state it. Suppose you have a formal system which is capable of expressing very simple arithmetic (technically, Robinson arithmetic). FOL would do, for example, but weaker logics would also do. Pretty much any language in which you can express or use recursion will also do the job. Now assume that you have some way to encode expressions of this language into terms (names) in the language itself, ie a downward reflection principle. Exactly how you do this doesn't matter: just as long as every expression e has an encoding [e]. Reification would do this job for RDF, for example, or using quoted strings, or using XML literals enclosing a chunk of RDF/XML. Now add a predicate P to this language which satisfies the condition: for any e, P([e]) implies e (so that you can get back from the encoding to 'use' the logic, as you described in an earlier email) and suppose that you can always, somehow, perform the 'encoding' inference from e, infer P([e]) (that second one doesnt have to be an axiom; it only requires that you can somehow get from e to P([e]) by some inference path or other. ) Given this, then the system is self-contradictory; you can derive a paradox from it. Put another way, any language (good enough to express basic arithmetic) can't contain an useable meta-theory of its own notion of truth, or of provability, or even of self-knowledge. One way to get the second rule to hold, by the way, is to have the axioms P([e]) and P([e implies f]) implies P([f]) ie if you can 'mirror' the modus ponens rule in the metalanguage, then you are dead. This is the form that Montague and Kaplan originally identified. It was originally thought of as being to do with self-knowledge, and was called the paradox of the knower. This is a very basic result, and it is kind of robust. People have tried all ways to wriggle past it, but there aren't any. The derivation of it is kind of elementary (doesnt use advanced set theory, etc.) . There really arent going to be any magic bullets which will slay this dragon: looking for them is like trying to invent a perpetual-motion machine. You can get past the barrier by weakening the notion of reflection, which breaks the second Montague condition. This is how the KIF metatheory wriggled past the paradoxes, by putting a syntactic tweak into the truth predicate so that it doesnt apply to some expressions. But this is a tricky operation to get right. It works in KIF partly because KIF does reification very transparently, using direct quotation, so the syntax of the coded expression can be accessed straightforwardly when stating the necessary constraint. This would break if, for example, the language used an encoding mechanism that allowed identity statements between encoded names (this is Peter's observation). So if you were to do this for RDF somehow, the trick might fail when you moved to OWL (because of owl:sameAs). >(and we figure out how onerous it is), then a standard >can use it (or not, but because the difficulty has been actually >weighed). > >Unfortunately, I can't just hand Peter the desired semantic >conditions. I don't really know how to write them in the formalism of >RDF Semantics [2]. My understanding is that Peter doesn't either and >suspects it's not possible to write them. The problem has got nothing to do with the form in which they are written. >Meanwhile, it is "obvious" how to write them informally and with >subtle errors. It SEEMS obvious, but appearances are misleading. It seemed obvious to Frege, until Russell pointed out the problem. Even Russell was kind of embarrassed by how almost childish the paradox seemed - nobody took the liar paradox seriously at the time - and was convinced that there would be a neat, simple-to-understand way to bypass it. The result of that particular naive enthusiasm was ramified set theory (abandoned as too complicated to use around 1925) and Principia Mathematica. But we know now, almost a century later, that this apparently simplicity really is an illusion. Don't be misled by how obvious it seems to be. It really would be a pity if the entire W3C SW standardization effort were to be warped by this false sense of how "obvious" all this stuff is. This is like saying that Newtonian physics is kind of obvious, so transistors shouldn't be based on quantum theory. We KNOW that the problems of consistency in self-referential languages are a black hole. We KNOW that none of the techniques that some very clever people thought might be able to handle it - reification, substitutional quantification, 3-valued logics, constructive logics, etc. - will get you past the basic, foundational, snag. We know this is something very fundamental in the foundations of formal semantics. So why doesn't the W3C just give up on this insane crusade to reduce everything to RDF? That idea was a mistake. Get used to it. It is not usually a good idea to work hard to to try to base a standard on a howler, particularly when the territory has been so thoroughly investigated. And particularly when there is in fact, it seems to me, no real problem here to be solved. It is trivial to embed RDF into full FOL, corresponding to the fact that RDF is semantically a special case of FOL. See the RDF-> SCL embedding described in http://www.ihmc.us/users/phayes/SW2SCL.html for example. which is about as obvious as it can get. Why wouldn't that be good enough? It has many advantages: for example, it can allow future extensions to the language very neatly, it means you can use Xpath to access sub-expressions, you can use DTDs to check basic syntax, and so on. Why do you want to be able to push all SWeb languages through an RDF pipeline? Its not as though it is hard to transmit, parse and use simple recursive languages, or that there are any real obstacles to doing it. People have been doing it for years. HTML with embedded Javascript does it. And if you start doing things like embedding FOL expressions in RDF literals, then you have to violate the RDF conventions to get at the logic anyway; the RDF has just become a kind of outer crust that is getting in the damn way. Since you will have to have engines that go 'outside' the RDF conventions in order to extract the encoded FOL, why not just send the strings to them directly? You could put the FOL in a file, give it a URI and use RDF to say that it is a file full of FOL, using an RDF ontology of SWeb languages, if you really want to have everything triggered by RDF. I don't think there would be much operational difference between this and having the FOL encoded syntactically inside the RDF, in fact - as Peter points out, you still have to parse the FOL expressions somehow, however they are encoded - but it would enable us to put this silly issue to rest. An analogy. Suppose that someone were to say that a construction like <img src="http://i.a.cnn.net/cnn/.element/img/1.1/ceiling/gradient.line.gif" /> in HTML was inherently faulty because the actual image was not 'inside' the HTML document, and so the HTML does not properly reflect the part/whole structure of the final viewed document, in which the image really does appear inside the document page; and this is a violation of a Basic Theory of Web Documents. On this view., HTML needs to be re-designed so that all parts of a web page are incorporated into the same file as the actual HTML text, not as external resources referred to by URIs, but actually part of the same syntax, parseable using HTML (or XHTML) conventions and also be an HTML semantic extension in the sense that the way that the embedded image-entity was processed should be the same standard method of processing as HTML uses for everything else, including text. There might be various schools of thought on how to do this. They would both encode the image bytes as text in some way, lets suppose (which was decided by a WG which met for 4 years and produced a huge report explaining the technique of writing the image file as a bitstring which is split into 16-bit chunks, interpreted as UTF-16 Unicode and then mapped back into ASCII) but one school thinks that the text itself should be simply thrown away, and the image displayed using image-display conventions; whereas the members of the W3C Team insist that this amounts to a non-standard semantic treatment, and suggest instead that images should always be represented using 'Unicode-art' where each pixel is a Unicode glyph, so as to preserve the standard HTML meaning of all embedded text uniformly: images in HTML are then, strictly speaking, just text, but rendered in a very small font. You can imagine an entire novella here, all horribly plausible: there might be special proposals by consortia of commercial foundries, proposing standardized minifonts suitable for rendering CMYK images. Now, this little fantasy seems to me almost exactly analogous to the debates over using RDF syntax for everything. The insistence on preserving the RDF standard semantics while at the same time embedding everything into its restrictive syntax is going to produce exactly the same kind of damn silly time-wasting discussions, and run the entire process into the same kind of pointless rathole, that 'true embedded HTML' would have taken HTML.; and for much the same reasons. The real issue is not whether or not it can possibly be done, but whether it is a good idea to even try. Clearly if it can be done, it will only be by virtue of an enormous amount of work, and the result will be ugly, hard to use and fragile; and we already know much better ways to do this, ways that are already in use, known to be robust and are widely understood. In case you are thinking about CWM as an proof-of-concept, it isn't. The way that CWM encodes FOL is in direct and stark violation of the W3C's own recommendations. If the idea is to move forward along those lines, then please say so now so that we can drop this entire topic, resign from the W3C WGs and find more productive ways to spend our time. > Something like "put the FOL expressions in a string >literal and assert it" Or in a reified triple and assert that, or coded as a Goedel number and assert that. It makes no difference to the Montague result. The form of the encoding is irrelevant: you could use a binary encoding of an XML character string, rewritten in musical notation and then transcribed back through base-7 arithmetic into an XSD integer. You might as well try to program an uncomputable function by picking a different programming language. The only way to avoid paradox is to weaken the connection between language and metalanguage, eg by partly breaking the reflection inferences. But then you have to be able to prove that there isn't some hidden way of sneaking past your restriction. >, as I explored in my previous e-mail [3]. > >So I've been trying to write them using a formalism in which I have a >good sense of how to do it and with which I think I can detect most if >not all subtle errors. It may also show the way in which to do it in >the desired formalism. > >I'll try to return to the substance of this thread shortly. Peter's >raised to questions I need to explore. There are lots more waiting for you when you get past those. Pat > > -- sandro > >[1] http://www.w3.org/Submission/2004/03/Comment (in the "RDF" section) >[2] http://www.w3.org/TR/rdf-mt/ >[3] http://lists.w3.org/Archives/Public/www-rdf-logic/2004Dec/0022.html -- --------------------------------------------------------------------- IHMC (850)434 8903 or (650)494 3973 home 40 South Alcaniz St. (850)202 4416 office Pensacola (850)202 4440 fax FL 32502 (850)291 0667 cell phayes@ihmc.us http://www.ihmc.us/users/phayes
Received on Thursday, 30 December 2004 11:37:04 UTC