- From: Ron Daniel <rdaniel@interwoven.com>
- Date: Mon, 2 Apr 2001 17:24:48 -0700
- To: "'Aaron Swartz'" <aswartz@swartzfam.com>, "'Dave Beckett'" <dave.beckett@bristol.ac.uk>, "'RDF Interest'" <www-rdf-interest@w3.org>
- Cc: "'spec-comments'" <spec-comments@prismstandard.org>
People interested in the upshot of this thread can cut to the end to see what I'm currently putting into the PRISM spec.

----

Aaron said:

> it seems to me that having a file which claims to be RDF but means
> one thing to an RDF processor and another to a PRISM processor seems
> to be a bad thing.

Yes, that would be bad if it were really the case. It is not clear to me that it is. Correct me if I am wrong, but two RDF models will 'mean' different things if and only if they return different results for a query such as (some.doc dc:creator ?X), modulo the order of the results that X binds to. Dave Beckett's suggestion simply says that the order, which I think you claim not to care about, should be allowed to go a certain way. Kind of like adding an equivalent of SQL's ORDER BY clause to the RDF query language, and allowing 'original document order' as a value for it.

The only way the RDF would 'mean' something different would be if someone goes in and starts changing the RDF model derived from the input, adding an rdf:Seq where none was originally. That is certainly not my intent. It would be a perfectly legitimate implementation technique to use rdf:Seq to track the order of statements. But it would be improper to add an rdf:Seq to the model. The tracking of statement order needs to be held externally, as a system annotation about the model it has imported. And there are many different ways of implementing this that have nothing to do with rdf:Seq. I'll spare you the litany in the interests of space.

This is not a data model issue. It is a quality of implementation issue, which Dave's earlier message clarified for me. PRISM implementations should prefer to be implemented on top of RDF software that can reconstruct the original order, just as they should prefer to be implemented on top of XML software that knows about the xml:base attribute.

> Furthermore, I don't see why this is necessary. There is a simple
> RDF-compatible way to deal with this situation, and I'm not sure why
> you can't use it. As Roland pointed out, simply put these in an
> rdf:Seq and this will indicate that order should be maintained to any
> RDF processor. Just like this:
>
> <dc:creator>
>   <rdf:Seq>
>     <rdf:li>Contributor 1</rdf:li>
>     <rdf:li>Contributor 2</rdf:li>
>   </rdf:Seq>
> </dc:creator>
>
> Is there any reason why this can't be done?

Yes. What is 'simple' to you and me is not simple to everyone. Making an explicit Seq means that the order IS significant, and as Roland pointed out, there are many times when it is not. So you introduce options into how things are handled. By the time you deal with Seq for significant order, Bag for authorship by committee without individual attribution, and repeated elements for other cases, you have done four VERY BAD THINGS:

1) Raised the cost of training catalogers on which model to use.
2) Raised the implementation and maintenance cost of software that will analyze the record and do things with it.
3) Raised the cost of cataloging, because people are being asked to make subtle distinctions.
4) Raised the error rate in the models because:
   a) People use the wrong model (e.g. using Bag when the work was written by individuals, not a group). Since the distinctions are subtle, the error rate will be high.
   b) The predefined model is close, but not quite a match. Real life is too ambiguous to be accurately modeled with so coarse a set of tools as Bag, Seq, Alt, and their absence.

Raising costs means I can't sell this stuff to publishers. Raising the error rates means that logical inference code can't handle it as cleanly, thus slowing the benefits of the semantic web.

You have heard of the difference between accuracy and precision? 'Simply' using Seq provides precision, but not accuracy. Beware of its unintended consequences.
---------------

Based on discussions so far, here is what I plan on doing:

Current wording of the part of the spec that responds to Dave Beckett's suggestion will be changed to:

4.8.3 Further Qualifications
...
Note that although a sequence of dc:creator elements in an RDF/XML file implicitly defines a sequence (in the XML world), RDF parsers have no obligation to preserve that ordering, unlike if an explicit rdf:Seq were given. PRISM implementors are advised that there are quality of implementation issues between different RDF processors. In general, implementers MAY prefer to build on top of an RDF parser that allows the original order of the statements to be reconstructed. That will allow the original order of the authors on a piece to be reconstructed, which might or might not carry additional meaning to the viewer of a styled version of the record. Similarly, XML software that can handle the almost-standardized xml:base attribute will be preferred.
...
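
To illustrate the idea of holding statement order outside the model, here is a minimal Python sketch. It is an illustration only, not part of the PRISM spec and not the API of any real RDF toolkit: the class OrderTrackingStore, its methods, and the helper same_meaning are hypothetical names, and triples are plain string tuples rather than real RDF terms. The model itself is an unordered set, so two inputs that differ only in the order of their dc:creator statements compare equal; arrival order is kept in a separate side table and is used only when the caller asks for 'original document order'.

```python
class OrderTrackingStore:
    """Toy triple store (hypothetical): the model is an unordered set of
    triples, and arrival order is a side annotation kept outside it."""

    def __init__(self):
        self.triples = set()   # the RDF model proper: no ordering
        self._arrival = {}     # external annotation: triple -> position first seen

    def add(self, s, p, o):
        t = (s, p, o)
        if t not in self.triples:
            self.triples.add(t)
            self._arrival[t] = len(self._arrival)

    def objects(self, s, p, document_order=False):
        """Answer a query like (s p ?X). With document_order=True the
        bindings come back in the order the statements were added,
        loosely analogous to an ORDER BY clause."""
        matches = [t for t in self.triples if t[0] == s and t[1] == p]
        if document_order:
            matches.sort(key=self._arrival.__getitem__)
        return [o for (_, _, o) in matches]


def same_meaning(a, b):
    """Two models 'mean' the same thing when they hold the same set of
    triples, regardless of the order in which they arrived."""
    return a.triples == b.triples


doc = "some.doc"

first = OrderTrackingStore()
first.add(doc, "dc:creator", "Contributor 1")
first.add(doc, "dc:creator", "Contributor 2")

second = OrderTrackingStore()
second.add(doc, "dc:creator", "Contributor 2")
second.add(doc, "dc:creator", "Contributor 1")

print(same_meaning(first, second))
print(first.objects(doc, "dc:creator", document_order=True))
print(second.objects(doc, "dc:creator", document_order=True))
```

Under these assumptions the two stores compare equal as models, yet each can still give back its creators in the order its own input listed them. No rdf:Seq is ever added to the set of triples, which is the distinction drawn above between an implementation technique and a change to the model.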
Received on Monday, 2 April 2001 20:26:19 UTC