- From: Pat Hayes <phayes@ai.uwf.edu>
- Date: Mon, 15 Oct 2001 15:02:30 -0500
- To: Dan Brickley <danbri@w3.org>
- Cc: <www-rdf-rules@w3.org>, <em@w3.org>
- Message-Id: <p05101003b7f0ab89717d@[205.160.76.193]>
>(+cc:eric miller)
>
Hi Dan

>On Fri, 12 Oct 2001, Pat Hayes wrote:
>
>> >On Fri, 12 Oct 2001, Peter F. Patel-Schneider wrote:
>> >
>> >> And just how are RDF applications supposed to determine when to do this
>> >> merging?
>> >>
>> >> peter
>> >
>> >By using all that DAML+OIL good stuff you've been slaving over, of
>> >course :)
>> >
>> >All DAML+OIL instance data is RDF,
>>
>> Actually I would say not, though it's only a matter of terminology.
>
>Yes, it's a matter of terminology. I'm happy you didn't say branding,
>though doubtless that's involved too. But as well, it goes to the heart of
>some misunderstandings about the RDF project, about what we've been trying
>to do with the RDF and SemWeb effort, and about the relationship between
>the RDF core specs and the specs that will join it to flesh out the RDF
>family of specifications.

Wow, "RDF family"? That's a new term in my lexicon. Sounds like a TV series.

>For some, "RDF" is just the triples stuff, our
>"pipsqueak of a language"; for others, it is this whole (possibly insane)
>project of rolling out an (increasingly expressive) framework for
>describing stuff in the Web.

That last point of view seems crazy to me. Or rather, if that is what "RDF" means, then everything that has been done that has been called 'RDF' is meaningless. With this grandiose understanding of 'RDF', there is no such thing as a grammar for RDF, a parser for RDF, a model theory for RDF, etc. RDF, in this view, isn't a formalism or something that could possibly be standardized; it's a kind of grand aspiration of the human spirit, or something. Whatever you are talking about, it isn't the RDF that I'm working on.

>I'm firmly in the latter camp, perhaps
>because my ambitions for RDF have long included the things we're now
>calling "Semantic Web" technology, and perhaps because I prefer the phrase
>"resource description framework" to our new slogan "Semantic Web".
>For one, it lends itself better to nouning: "an RDF file" versus "a Semantic
>Web file".

There is no such thing as a 'semantic web file'; and with your interpretation of 'RDF', there's no such thing as an RDF file either, seems to me. Or better, there's no way to tell if a file is an 'RDF file' or not, in this way of talking.

>It is easier and more informative to say "all DAML+OIL files
>are RDF files"

It's easier, but it is dangerously misleading if not immediately qualified.

>than to say "all DAML+OIL files are Semantic Web files";

That doesn't mean anything. What's wrong with saying they are DAML+OIL files? That is short, accurate, and informative.

>we
>need some umbrella terminology that goes beyond branding to say something
>about what all these components (of the description framework) have in
>common. For me, they're all RDF, and they all share the intentionally
>simplistic RDF worldview of resources, relationships, URIs etc.

Hold on. Relationships are used by virtually every formalism, graph syntax was invented by C.S.Peirce around 1880, and 'resource' just seems to be a W3C name for 'entity' or 'individual', so the only really distinctive thing about RDF is the use of urirefs. Is that what constitutes the 'worldview' you are referring to, or is there more to it?

More to the point, why must they all have 'something in common'? Nobody wants to know what HTTP and FTP have in common, because it doesn't matter. All the SW needs is ways to *translate* between different formalisms, not that they all be the same under the hood. They aren't the same, in any case, eg RDF1.0 and RDF/XML are distinct languages already, not to mention N3.

>But I can
>see that others are using the acronym differently. I guess it is for W3C
>to clear up the confusion; an update to the RDF FAQ is looming, as is the
>RDF Core primer.
>
>(anyway, here's my view...)
>
>
>> It is encoded in RDF syntax, but its meaning isn't specified by RDF.
>
>We still call it RDF.

I don't!
And this is absolutely central. If the RDF specs don't specify a meaning, then that meaning is NOT in RDF. That's what 'being in RDF' means. Now, a particular piece of RDF might in some broader sense 'have' some meaning that is invisible to an RDF processor - ie something that interprets RDF graphs according to their RDF-model-theory meanings and draws RDF-valid conclusions from them, say - but it is very important not to say that this meaning is 'in RDF', because in the only sense of meaning being 'in' a language *that is available to a mechanical process*, it isn't. At best you might say that it is RDF-encrypted, or something; it's there, but completely hidden from the RDF layer.

One problem with saying that these 'hidden' meanings are 'in RDF' is that this phrase then becomes meaningless in isolation, since *anything* can be 'in RDF' in this sense. It can also be 'in PSML', where PSML is the language defined by the following BNF:

<psml> ::= <unicode-char>|<psml>*

(Proof: serialize your favorite notation into some subset of unicode and record the serialization as a character string. QED) So this is not a useful notion.

Saying that some meaning is "in L", where L is some formal language with a formal semantics, is usually taken to mean that that meaning is accessible to an engine that knows (only) the semantic rules of L. You ought to be able to figure out the meaning from the L-expressions plus what you can learn from reading the L manual. If you need to go beyond what it says in the L manual to figure out the meaning, it's not "in L".

There is a very basic, almost philosophical, point underlying this. In a very real sense, on the SW, there IS NO CONTENT. There is only language; and for the SW that must be processable by software, there is only formal language. The "content" is what the writers and readers of the languages intend, but there is no way to send an intention along a wire.
Now, people can intend all kinds of stuff, since people are very smart and very subtle. But when, as in the SW, at least some of the readers and writers are programs, they have no chance at all of guessing at all the subtleties that a human might have intended. All they can do is use the rules they have built into them to extract as much meaning from the marks we send them as they can. If we humans encode other stuff into those marks that go beyond the rules which were used to build the software agents, they haven't got a chance of knowing about it: we might as well ask them to be telepathic.

So we have to be very careful what we say about which rules are supposed to be being used to interpret the formalisms. RDF and DAML+OIL are based on different assumptions; they are not the same, and there is no way to encode the latter in the former. (There is a way to *extend* the former to the latter, of course, but it's a real extension. DAML+OIL goes beyond RDF. In fact, RDFS goes beyond RDF, which is why the semantic conditions on an RDFS interpretation need to be stated separately in the model theory.)

>I have some very simple pieces of content, both
>instance level and schema (see example below) whose meaning isn't
>captured by DAML+OIL.

OK, let me take you up on that. How IS it captured, then? (It has to be captured *somehow*, right?)

>For eg., we may want to use RDF/DAML to talk about
>"a util:Document whose dc:title is 'foo' and whose dc:creator is the
>foaf:Person whose foaf:mbox is mailto:webmaster@example.com". DAML+OIL
>can't distinguish between the cases where (at any one point in time) there
>is at most one entity with a given personal mailbox; and the case where
>across-all-time that property can only ever have a single value. But RDF
>tools, including but not limited to those that understand DAML+OIL,
>can still do
>useful things with this kind of data, even if there are aspects of its
>meaning that are not captured in the RDF or DAML+OIL formalisms.
Oh, sure, of course. It may well be that *part* of a DAML+OIL formalization is RDF-accessible. But that doesn't make the whole thing into RDF, any more than my quoting Pascal makes my entire essay French. I would say that in this case, if I follow your example, that neither the RDF nor the DAML captures the intended meaning, since they both assume that urirefs denote like simple names. So although that time-relative 'meaning' might be in some human user's mind, it is not in fact in the RDF/DAML. If the human thinks it is, he or she is liable to be disappointed by the performance of the software.

>
>For "its meaning isn't specified by RDF" you may as well say "its meaning
>isn't specified by DAML+OIL" in many many cases. So we shrug and
>admit that yeah
>sure, all aspects of meaning are not easily formalised at this stage of
>history.

It's more fundamental than that. We can PROVE that some aspects of meaning can NEVER be captured by very simple languages like RDF. (Eg you can't express disjunction or implication in RDF. There are extensions of RDF in which you can, of course.)

>And we don't pin this on RDF, nor on DAML, it's just the way
>things are: meaning is only partially captured by the mechanisms we're
>playing with here.

I would put it differently. The formalisms say what they say, and we can study that, and those are the only 'meanings' that we actually have available. In a very real sense, there is no other 'meaning' in the formalism. What you are calling 'meanings' are a kind of aspiration; those are things that we would *like* to be able to express in a machine-accessible way. But again, you can't send a research agenda along a wire.

The trouble with your way of speaking is that it suggests that 'meaning' and 'content' are real stuff that can be somehow captured and put into a box, and the task is to get hold of more of it. I think that is seriously misleading. There is no end to 'meaning' in your sense.
Whenever some aspect or part of it is formalized, it is always possible to think of some other aspect that is missing, because we are here really talking about something like human creativity.

>Despite all this, we need to deploy meaningful
>documents in the Web ASAP, without putting everything on hold while we
>wait for a formalism that can capture all of that meaning. We need a
>framework for getting incrementally better

Wait. That "incrementally better" sounds like progress towards some kind of ultimate goal. What is that goal? To capture ALL of meaning? Forgeddaboutit. To capture the same amount of meaning that a human could get by reading the web page? If so, then:
1. you are doing AI, see above, and I would advise against setting out to get AI done in the near future;
2. why bother? Humans are cheap these days anyway, and if the software was this smart then it could read HTML;
3. surely what we want are things that are a bit less smart than humans but also a lot faster, less likely to get bored, more willing to do our bidding and not have ideas of their own, etc.; in fact, 'agents'.

>at describing resources in the
>Web; that is the 'resource description framework', RDF. The RDF approach
>to this has always been to sneak up on the problem bit by bit. The graph
>model provides a useful cartoon world view ("objects, types, properties,
>relationships; identifiers") that can be shared by more expressive parts
>of the system that get designed later.

Pity y'all didn't include connectives and quantifiers.

>DAML+OIL takes the RDF Schema world
>view of classes, properties and constraints, and it adds in a bunch of
>richness that reflects into the formalism things that RDF could
>previously carry but didn't explicitly acknowledge.

What does "could" mean? Are you saying that the RDF authors screwed up, or that they had DAML+OIL in mind all along, but just kind of forgot to mention all the picky little details?
(You know, to call the ideas of class, property and constraint "the RDF Schema world" is kind of silly. Surely nobody thinks that RDF *invented* these ideas, do they?)

>Now my point is just that DAML is in the exact same situation, there are
>meaningful constructs that can be carried through DAML without DAML
>realising the full meaning.

In a sense of 'carried through', that is correct; but it's a trivial sense, because in this sense, any content can be carried through almost any language. All the language needs is strings and it can carry through anything by embedding PSML into it.

>And I'm not talking here about the
>reference/naming/denotation aspects of
>meaning that I've talked about before, though something similar can be
>said about that. Rather, I'm talking about aspects of the meaning of our
>content (eg. temporal issue raised below) which one might imagine _are_ in
>scope for some fancier Model Theory or Axioms to engage with. Just
>in DAML 1.x we
>don't try. Does this mean that my friend-of-a-friend RDFWeb application,
>which uses the property whose URI is 'http://xmlns.com/foaf/0.1/mbox' is
>neither an RDF application nor a DAML application but something else yet
>to be named. Surely not.

If it relies on temporal changes in URI references, then yes, I would indeed say that really is using something not yet named, and not using RDF (though I concede that this is a very strict interpretation that I might be willing to relax in practice :-). What it would be really doing is something best described as MISusing RDF, ie using it in ways that are not sanctioned by its official meaning (and therefore are liable to be misunderstood by another RDF engine) but are nevertheless useful. Let me hasten to add that I wouldn't want to stop people doing things like this; on the contrary, experimenting outside the box in this way is a very good way to discover the real limitations of any formalism and to get started on the process of designing a better one.
But if, for example, your friend's application were to break because of those aspects that lie outside the current RDF formal model, and he were to sue the W3C on the grounds that RDF doesn't do what we said it would do, I would say that he had no case.

>Of course there are aspects of meaning, and
>specifically the meaning of that property, which neither RDF nor DAML
>captures. Nevertheless we can use the Web right now to successfully deploy
>and use descriptions of resources in RDF that employ the foaf:mbox
>property. Those descriptions don't cease to be RDF because the property is
>a particularly interesting one, or because there are rules that one might
>(eventually) formalise about its use which can't be written down in a
>Semantic Web language yet. That's why we called it a (description)
>Framework not a (file) Format: it's a deployment strategy for easing all
>this stuff out of research labs

Well, this stuff has been out of the research labs for quite a long time now. Ever hear of 'databases' ;-) ?

>and into mainstream Web technology. Slowly
>but surely... ;-)
>
>So when I say "all DAML+OIL instance data is RDF data", I mean
>(colloquially) that it has basically the same cartoon worldview: of objects
>identifiable by URIs, having URI-named classes and URI-named relationships
>to one another.

Oh, if that is all, then OK. But if "RDF" just refers to the use of URIs as names, then RDF1.0 isn't "RDF", since it allows anonymous resources.

>
>It is of course possible to produce such tangled representations
>(encodings of rules, queries etc for example) in RDF that the
>object/property/value worldview loses much of its utility. For that matter
>I could run something like "tar -c mailbox/* | binhex | gpg --encode >
>mail.txt", put that into a literal string in an RDF graph, then go around
>claiming that I had an RDF representation of my mailbox. I could, but I'd
>be silly rather than wrong. Sure at one level my mailbox is represented
>in, or carried through, RDF.
>It's just a rather useless representation.
>
>Similarly, there are representations-in-RDF (such as the person/mailbox
>thing below) which (a) draw on aspects of Schema/ontology meaning that are
>yet to be formalised and (b) nevertheless make perfect sense as useful
>chunks of RDF instance data, couched in the objects/properties/values
>cartoon worldview.

Maybe I haven't been following you. If they make sense as RDF data, then *with that understanding of what they mean*, they are RDF. Sure, no problem with that. But that doesn't mean that DAML+OIL *is* RDF; it just means that a piece of DAML+OIL makes some kind of sense to an RDF engine. We set it up that way. But the sense that the RDF engine gets out of it isn't the same sense that a DAML+OIL engine would get out of it.

Now in fact, DAML+OIL is rather better integrated with RDF than this suggests, in that their model theories also line up rather well, so that one can view RDF(S) (without reification) as being a sublanguage of DAML+OIL, rather than just a formalism into which DAML+OIL content is embedded in some opaque way. That means that anything that an RDF engine would do would in fact also make DAML+OIL sense (though still not the reverse). But this took a lot of work and care; it doesn't come easily or naturally, so don't expect that this is going to be the normal case. Temporal sensitivity isn't going to be that easy, for example; I think it is going to require re-doing RDF from the ground up, rather than extending it.

It really is impossible to pre-guess all the things anyone is going to want to say, and invent a single basic notation that will never need to be modified, only extended. Even adding context-sensitive datatyping to RDF will involve extending the Ntriples syntax, for example. The nearest anyone has come to such a thing is conventional FOL, which is a very stable region in the space of all expressive assertional languages. But for some purposes, FOL isn't really what you want either.
>
>> >and RDF apps that are built
>> >to know about even a subset of DAML+OIL can make good use of that when
>> >doing data merging.
>>
>> Well, they can if they are DAML-savvy, but then why don't you call
>> them DAML apps rather than RDF apps?
>>
>> >
>> >For eg., consider the property http://xmlns.com/foaf/0.1/mbox
>> >from the namespace http://xmlns.com/foaf/0.1/
>> >
>> > [[
>> > FOAF is expressed as an RDF Schema, annotated with DAML to express the
>> > fact that a foaf:mbox uniquely picks out an individual.
>> > ]]
>> >
>> >Excerpting from that schema:
>> >
>> > <rdf:Property rdf:about="http://xmlns.com/foaf/0.1/mbox"
>> > rdfs:label="Personal Mailbox"
>> > rdfs:comment="A web-identifiable Internet mailbox associated
>> >with exactly one owner.
>> > This property is a 'unique property' in the DAML+OIL sense, in that
>> > there is at most one individual that has any particular personal
>> > mailbox.">
>> >
>> > <rdfs:domain rdf:resource="http://xmlns.com/foaf/0.1/Person" />
>> > <rdfs:range
>> >rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource" />
>> > <rdf:type
>> >rdf:resource="http://www.daml.org/2001/03/daml+oil#UnambiguousProperty"/>
>> > <rdfs:isDefinedBy rdf:resource="http://xmlns.com/foaf/0.1/" />
>> > </rdf:Property>
>> >
>> >Since we say the property is of type
>> >http://www.daml.org/2001/03/daml+oil#UnambiguousProperty
>> >we can use this knowledge in RDF-based applications
>>
>> How does the RDF application know what the DAML expressions mean?
>> (Should it know about all the other extensions to RDF that haven't even
>> been invented yet?)
>>
>> >-- for example merging
>> >blank nodes where each node has a property with the exact same resource as
>> >its value. In this example, merging nodes that stand for the individual
>> >whose personal mailbox is mailto:foo@example.com, perhaps.
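
[Editorial illustration, not part of the original message.] The merging step described in the quoted text above might be sketched roughly as follows. This is a minimal sketch under stated assumptions: triples are plain Python tuples of strings, blank nodes are written with a `_:` prefix, and the function name `merge_by_unambiguous` and the `http://example.org/terms#` properties are hypothetical. It also shows the point at issue in the thread: the merge only happens when the code has been written to know what daml:UnambiguousProperty means, which plain RDF semantics does not tell it.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
UNAMBIGUOUS = "http://www.daml.org/2001/03/daml+oil#UnambiguousProperty"
FOAF_MBOX = "http://xmlns.com/foaf/0.1/mbox"
# Hypothetical properties, used only to give the two nodes something to say.
EX_NICK = "http://example.org/terms#nick"
EX_PAGE = "http://example.org/terms#page"


def merge_by_unambiguous(triples):
    """Merge subjects that share a value for an UnambiguousProperty.

    This is a DAML-aware step: the rule is hard-coded here, it is not
    something an engine could derive from the RDF model theory alone.
    """
    unambiguous = {s for (s, p, o) in triples
                   if p == RDF_TYPE and o == UNAMBIGUOUS}
    canonical = {}  # (property, value) -> first subject seen with it
    rename = {}     # dropped node -> node it was merged into

    def find(node):
        # Follow the rename chain to the surviving node.
        while node in rename:
            node = rename[node]
        return node

    for s, p, o in triples:
        if p not in unambiguous:
            continue
        s = find(s)
        first = find(canonical.setdefault((p, o), s))
        if first != s:
            # Keep a URI-named node in preference to a blank node.
            if first.startswith("_:") and not s.startswith("_:"):
                keep, drop = s, first
            else:
                keep, drop = first, s
            rename[drop] = keep
    return {(find(s), p, find(o)) for (s, p, o) in triples}


triples = [
    (FOAF_MBOX, RDF_TYPE, UNAMBIGUOUS),
    ("_:a", FOAF_MBOX, "mailto:foo@example.com"),
    ("_:a", EX_NICK, "foo"),
    ("_:b", FOAF_MBOX, "mailto:foo@example.com"),
    ("_:b", EX_PAGE, "http://example.com/foo"),
]
merged = merge_by_unambiguous(triples)
```

With the type declaration present, `_:a` and `_:b` collapse into one node; if the first triple is left out, the same function leaves the graph unchanged, which is roughly what an engine that knows only RDF would do.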
>> >
>> >Aside: I could complain here that DAML+OIL gives us no mechanism for
>> >guaranteeing
>> >that the at-most-one-ness remains static in the face of time and change,
>> >but that's probably a can of worms best opened in a separate thread.
>> >DAML+OIL's "worldview" isn't one that explicitly acknowledges time and
>> >change, and there are good reasons for this being the case. How this
>> >relates to the need to deploy DAML+OIL ontologies in the Web is something
>> >that looms rapidly, imho.
>>
>> Maybe, but you can hardly pin this on DAML; *nobody* has really
>> tackled this issue yet, AFAIK.
>
>It's not a matter of blame, it's a matter of layering. This whole thing
>we're building, the graph stuff, the simple schema stuff, the fancier
>ontology language, perhaps a rules language... what we're assembling is a
>framework for describing resources in the Web. RDF. That picture is
>only coming
>together slowly, and DAML+OIL is a key component. There will be others.

I would agree with all this except for the degree of integration implied by that word "component". Imagine someone in about 1957 saying "this programming stuff is really coming together; we have several of the layers sorted out; there's the 704 assembly code (that youngster Minsky has proven that it is a universal machine) and there's FORTRAN and IPL ... Pretty soon we will have all the components for the Machine Programming Framework, and then things will really start humming." In a sense they would have been right, but they also would clearly have been missing something important. (In fact I suspect that many folks at IBM did have something close to this attitude, deep down, which is why Bill Gates was able to con them so easily.)
>So going back to my original claim that all DAML+OIL instance data is RDF
>instance data: what I'm getting at is that to deploy this stuff for real,
>on Web sites, in browsers, palm pilots, everywhere, we need some
>stability even when the complete resource description framework is not yet
>finalised.

I don't grok this notion of 'finalised'. That sounds like finalising evolution, or something. The whole thing seems more open-ended to me. We put out tools and people start to use them, then people put out better tools, and other people use them, and so on. God alone knows what will happen, but it is going to be more like a stampede than like herding cattle.

>Maybe the RDF project never will be finalised, but always
>pushing to get things out of the lab and into the Web mainstream. I hope
>so, fwiw.
>
>
>The 'description logic meets temporal logic' work is still
>in the lab, but please let's not plan to tell the world that there are RDF
>instance files, and DAML/OIL/WebOnt instance files, and
>WebOnt-PlusTemporalLogic instance files and who knows what follows
>after.

Why not? That is exactly what the world needs to know. If we don't tell them this we would be dishonest, because this is the truth. There ARE all these formalisms, and they all have their uses and limitations.

I think you have a vision of the SW as a kind of single integrated system where content is flowing smoothly along pipes, all encoded in W3C-sanctioned RDF. I have a very different vision, more like a kind of bustling market or bazaar, where agents are busily brokering meanings, and many different languages are being spoken. I see a kind of market economy of meanings, all happening at electronic speeds. I can imagine all kinds of new economic opportunities in this virtual semantic web world.
For example, for a (small) fee per thousand bytes of unicode, I ("I" here is a program, of course, but like a well-trained truffle-hound, it gives all its earnings to its human owner) will undertake to translate anything in any of these notations... into any of these other notations.... Or, I will (for no fee, but in return for some small favor, eg that you undertake to transmit some small cookie for me to every other agent you know) undertake to find you a service which can translate your notation into some other notation. Or, I will (for a very significant fee) read anything written in one of these notations, check it for internal consistency and agreement with any US federal database, and then warrant that it has been so checked, and maybe (for a truly astonishing fee) accept any risk arising from such warrant. Or, I have a new notation which you can use (for a fee schedule that we can negotiate) and it will provide you with all these advantages...

Now, for this to work, I want to see every file out there branded with something that tells me - and I'm a piece of software, remember - *exactly* what semantic rules I am supposed to use to interpret it. That way, everyone knows whose fault it is when things go wrong. If I use the semantics it referred to, then it's someone else's fault; if I used some other one, I have only myself to blame.

(Some questions and answers.
Q. What if I don't know those rules?
A. Well, then you need to get the content translated into some rule format you do know. Find a translation service, or ask the other site if it can translate for you.
Q. What if I know some better rules?
A. Well, go ahead. Maybe you know more about that agent's rules than it knows; but you are taking a risk.
Q. What if I know that my rules are less powerful than its rules?
A. Then you are safe; but you have a responsibility to not mess up any content that you might not understand, particularly if you are going to tell anyone *else* that you derived a conclusion from that agent's sources.)

Q. What if it doesn't have a brand, but I find that I can read it and make sense of it?
A. Well you are free to do so, of course, but if something goes wrong as a result, it's not clear whose fault it was. (My guess is that this kind of question will eventually end up being decided by case law in civil tort cases, and that in any case a kind of 'reasonable practice' code of usage will develop in order to enable e-commerce to work properly.)

>If and when for example the description-logic-meets-temporal stuff
>gets more fully baked, and perhaps submitted to W3C, we'll likely
>have some way
>of annotating my RDF Schema (or Web Ontology) at
>http://xmlns.com/foaf/0.1/ to better represent the meaning of the classes
>and properties I name there. Does that mean that you'll want me to call my
>instance data files something other than "RDF files" (DAML/OIL/Webont
>files...).

Yes, most definitely. If you call them RDF files and my old, dumb, RDF inference engine can't understand them, or misinterprets them, I may be *very* unhappy with you. Your lawyers may hear from my lawyers.

>Or maybe they're not even RDF (DAML etc) files today, since the
>semantics are not fully captured by any Semantic Web schema/ontology/rule
>language that I know of. Or we could just worry about something more
>interesting than categorising the flavours of instance data, and get used
>to calling all this stuff "RDF".

Why not just call it "Semantic Web Stuff" and forget about publishing specs? That provides about the same amount of useful information to someone trying to write code.
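
[Editorial illustration, not part of the original message.] The branding idea and the questions and answers above can be read as a small decision procedure for an agent. The sketch below is only one possible reading. The `Document` type, its `semantics` field, the `plan` function, and the `EXTENDS` table are all hypothetical names introduced here; the table is a simplification of the remark that RDF(S) can be viewed as a sublanguage of DAML+OIL.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, simplified table: for each language, the weaker languages it
# extends, listed strongest first. Not taken from any specification.
EXTENDS = {
    "DAML+OIL": ["RDFS", "RDF"],
    "RDFS": ["RDF"],
}


@dataclass(frozen=True)
class Document:
    body: str
    semantics: Optional[str]  # the "brand"; None means the file is unbranded


def plan(doc: Document, known: set) -> str:
    """Decide how an agent that knows the rules in `known` should treat `doc`."""
    if doc.semantics is None:
        # No brand: the agent may try, but blame is unclear if it goes wrong.
        return "unbranded: read at your own risk"
    if doc.semantics in known:
        # The agent knows exactly the rules the file asks for.
        return f"interpret as {doc.semantics}"
    # The agent's rules are less powerful than the file's: safe, but it must
    # not disturb the parts it does not understand.
    weaker = [lang for lang in EXTENDS.get(doc.semantics, []) if lang in known]
    if weaker:
        return f"interpret as {weaker[0]} only; pass the rest on untouched"
    # The agent does not know the rules at all.
    return f"find a translator for {doc.semantics}"
```

For instance, `plan(Document("...", "DAML+OIL"), {"RDF"})` gives the "less powerful rules" answer, while a brand the agent has never heard of sends it looking for a translation service.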

Pat

PS, I just saw this wonderful quote from John Milton:

"When there is much desire to learn, there of necessity will be much arguing, much writing, many opinions; for opinion in good men is but knowledge in the making."

--
---------------------------------------------------------------------
IHMC                      (850)434 8903   home
40 South Alcaniz St.      (850)202 4416   office
Pensacola, FL 32501       (850)202 4440   fax
phayes@ai.uwf.edu
http://www.coginst.uwf.edu/~phayes
Received on Monday, 15 October 2001 16:02:53 UTC