- From: pat hayes <phayes@ai.uwf.edu>
- Date: Fri, 18 May 2001 15:46:04 -0500
- To: jos.deroo.jd@belgium.agfa.com
- Cc: www-rdf-logic@w3.org
> > >There's indeed a point here. Yesterday I was doing a testcase
> > >with 200001 concepts used in 100000 statements (no real application,
> > >just stress testing some inference engines). In that particular
> > >testcase I found that the RDF/XML file could be zipped 20 times.
> > >Using RDF/N3 this was just 4 times. So the XML file is 10 MB, the
> > >N3 file is 2 MB and the binary compressed file is 0.5 MB. Needless
> > >to say that this is having an impact on communication, storage and
> > >processing. We found the best balance with N3 [1][2][3][4].
> >
> > Your figures speak for themselves, but I'm not sure of your implication -
> > that N3 should be used in preference to RDF/XML? Wouldn't this be throwing
> > the baby out with the bathwater? Performance and efficiency lie on a
> > continuum, interoperability comes in big discrete chunks - do we really want
> > an extra N converters? When the binary XML brigade on xml-dev have come up
> > with something workable, that perhaps will be worth considering.
>
>Honestly, I don't know the answers to your questions.
>We just gathered some facts (such as sizes, speeds, etc.)
>add for an artificial testcase.
>Of course, you couldn't be more right in saying that
>   Performance and efficiency lie on a continuum,
>   interoperability comes in big discrete chunks.

Well, I am not sure this is true. Let me make a case against this widespread doctrine.

I take it that the claim is based on the idea that if N people agree to use a standard interchange format - say, XML - then interoperability is done with; but if they do not, then in the worst case N(N-1) converters need to be written. This, however, is the worst possible case. In practice, perhaps after some initial experimentation, about N converters will need to be written, because the participants will evolve a protocol of their own and write translators or converters into and out of it. Which is exactly what happens when they decide to use XML, in fact.
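
To make the counting concrete, here is a minimal sketch in Python. The function names are hypothetical, and the `bidirectional` flag is an assumption about how one counts: N(N-1) counts each direction separately, whereas "about N" treats each participant's translator into and out of the shared format as a single piece of work (2N if each direction is counted on its own).

```python
def pairwise_converters(n: int) -> int:
    """One-way converters needed when every participant translates
    directly into every other participant's format (the worst case)."""
    return n * (n - 1)


def shared_format_converters(n: int, bidirectional: bool = True) -> int:
    """Converters needed when every participant translates to and from
    one agreed format. Assumption: a two-way translator counts as one
    converter when bidirectional is True, as two otherwise."""
    return n if bidirectional else 2 * n


if __name__ == "__main__":
    # Columns: participants, pairwise, shared (two-way), shared (one-way)
    for n in (2, 5, 10, 100):
        print(
            n,
            pairwise_converters(n),
            shared_format_converters(n),
            shared_format_converters(n, bidirectional=False),
        )
```

Either way the count grows linearly in N once any common format exists, whether that format is XML or something the participants worked out among themselves.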

So the big interoperability advantage of using a 'standard' format is that it avoids that initial period of negotiation and experimentation during which the interchange format is designed. But in fact it does not even do this, unless the needs of the participants have been exactly anticipated by the designers of the format. XML itself is just a notation for encoding labelled directed acyclic graph structures as sequences of character codes with a rather low information density. If I send you labelled graphs that you cannot interpret, the fact that they are encoded in XML is not much of an advantage over having them encoded in, say, reverse Polish in ASCII.

So the community of users must somehow design the format to its own needs, as many communities are of course doing within XML. But now, take one of these: say, Rules-XML. Suppose they had chosen some other basic notational convention: suppose they were in fact doing Rules-ABC instead of Rules-XML; in what way would they be worse off? What does using XML buy one, apart from the reassuring sense that one is being up-to-date? (Of course it gets one XML-syntax-checkability and so on, but this is rather like putting legs on cars so that they can wear shoes.)

As another amusing anecdote, I gather that Mike Genesereth, the original author of KIF, is trying to oversee the design of an XML-ised version of KIF. The trouble is, there are at least three different ways to render KIF into XML, and each has its own proponents, and each group has formed its own committee. My own advice to Mike would be, to hell with XML: stick to S-expressions, and let everyone write their own converters into an XML format, and then they can write N converters between them.

Pat Hayes

---------------------------------------------------------------------
IHMC                            (850)434 8903   home
40 South Alcaniz St.            (850)202 4416   office
Pensacola, FL 32501             (850)202 4440   fax
phayes@ai.uwf.edu
http://www.coginst.uwf.edu/~phayes
Received on Friday, 18 May 2001 16:46:05 UTC