Re: Dispositions of Dave Beckett's comments from Brian McBride on 2001-04-03 (www-rdf-interest@w3.org from April 2001)

From: Brian McBride <bwm@hplb.hpl.hp.com>
Date: Tue, 03 Apr 2001 12:56:56 +0100
To: rdaniel@interwoven.com
CC: "'Aaron Swartz'" <aswartz@swartzfam.com>, "'Dave Beckett'" <dave.beckett@bristol.ac.uk>, "'RDF Interest'" <www-rdf-interest@w3.org>, "'spec-comments'" <spec-comments@prismstandard.org>
Message-ID: <3AC9BA88.558065C9@hplb.hpl.hp.com>
Hi Ron,

I think the solution you propose means that, for example, if one stored
PRISM data in an RDF database, e.g. RDFDB, that one would lose
information essential to PRISM.  I'm concerned that:

  o PRISM applications won't be able to make full use of standard
RDF/semantic web tools and components

  o Semantic web tools won't capture the full semantics of PRISM data,
so their ability to reason about it will be impaired

  o If PRISM claims to be RDF compliant whilst having a different data
model, this will cause confusion not only for PRISM developers, but also
for RDF developers.

If we could find a way of meeting PRISM's needs whilst fully
representing PRISM's semantics in standard RDF syntax, then standard RDF
tools are more likely to be useful in PRISM applications and we would
reduce the risk of adding further confusion about what the RDF datamodel
really is.

I think we'd probably all agree that the RDF model (not the augmented
PRISM model) needed to represent the order of authors needs to use a
sequence.  I'm not sure I understand your objection to using it.
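
To make the distinction concrete, here is a minimal sketch in Python. It
assumes a model is nothing more than a set of (subject, predicate, object)
triples. The function names, the "some.doc" subject and the "_:authors"
node label are illustrative only and come from no particular RDF toolkit.
It shows that repeated dc:creator statements carry no order in the model,
whereas an rdf:Seq records the order through its rdf:_1, rdf:_2, ...
membership properties.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"


def repeated_properties(doc, authors):
    """Model each author as a separate dc:creator statement."""
    return {(doc, DC + "creator", author) for author in authors}


def as_sequence(doc, authors, seq_node="_:authors"):
    """Model the authors as one rdf:Seq with members rdf:_1, rdf:_2, ..."""
    triples = {
        (doc, DC + "creator", seq_node),
        (seq_node, RDF + "type", RDF + "Seq"),
    }
    for position, author in enumerate(authors, start=1):
        triples.add((seq_node, f"{RDF}_{position}", author))
    return triples


def authors_in_order(triples, seq_node="_:authors"):
    """Recover the author order from the rdf:_n membership properties."""
    prefix = RDF + "_"
    members = []
    for subject, predicate, obj in triples:
        if subject == seq_node and predicate.startswith(prefix):
            index = predicate[len(prefix):]
            if index.isdigit():
                members.append((int(index), obj))
    return [obj for _, obj in sorted(members)]


# Repeated properties: the two orderings yield the same set of statements.
same = repeated_properties("some.doc", ["Ann", "Bob"]) == \
    repeated_properties("some.doc", ["Bob", "Ann"])

# rdf:Seq: the two orderings yield different models.
different = as_sequence("some.doc", ["Ann", "Bob"]) != \
    as_sequence("some.doc", ["Bob", "Ann"])
```

With repeated properties the order lives only in the serialization; with
the sequence it is part of the statements themselves, so any store that
keeps the triples keeps the order.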

Since the solution you are proposing regards the ordering of authors as
always significant - you could avoid burdening your cataloguers with
making any decisions by simply always requiring the sequence element. 
Presuming there is tool support for generating the RDF/XML, this burdens
the cataloguers not a whit.
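
As a rough illustration of what such tool support might look like, the
sketch below uses Python's standard xml.etree.ElementTree to wrap an
ordered list of author names in an rdf:Seq every time. The function name
creator_element is hypothetical and this is not taken from any PRISM
tool; the cataloguer only supplies the names in order.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to serialize with the conventional prefixes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def creator_element(authors):
    """Build a dc:creator element holding an rdf:Seq of the given authors."""
    creator = ET.Element(f"{{{DC_NS}}}creator")
    seq = ET.SubElement(creator, f"{{{RDF_NS}}}Seq")
    for name in authors:
        # One rdf:li per author, emitted in the order supplied.
        item = ET.SubElement(seq, f"{{{RDF_NS}}}li")
        item.text = name
    return creator


element = creator_element(["Contributor 1", "Contributor 2"])
xml_text = ET.tostring(element, encoding="unicode")
```

Because the tool always emits the sequence, the cataloguer never has to
choose between Seq, Bag and repeated elements.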

I'd really like to help find a way to enable PRISM to fully represent
its semantics in standard RDF.

Brian 






Ron Daniel wrote:
> 
> People interested in the upshot of this thread can
> cut to the end to see what I'm currently putting into the
> PRISM spec.
> 
> ----
> 
> Aaron said:
> 
> > it seems to me that having a file which claims to be RDF but
> > means one thing to an RDF processor and another to a PRISM
> > processor seems to be a bad thing.
> 
> Yes, that would be bad if it were really the case. It is not
> clear to me that it is. Correct me if I am wrong, but two
> RDF models will 'mean' different things if and only if they
> return different results for a query such as
>     (some.doc dc:creator ?X)
> -- modulo the order of the results that X binds to.
> 
> Dave Beckett's suggestion simply says that the order, which
> I think you claim not to care about, should be allowed to go a
> certain way. Kind of like adding an equivalent of SQL's ORDER BY
> clause to the RDF query language, and allowing 'original
> document order' as a value for it.
> 
> The only way the RDF would 'mean' something different would
> be if someone goes in and starts changing the RDF model
> derived from the input, adding an rdf:Seq where none was
> originally. That is certainly not my intent.
> 
> It would be a perfectly legitimate implementation technique to
> use rdf:Seq to track the order of statements. But it would
> be improper to add an rdf:Seq to the model. The tracking of
> statement order needs to be held externally, as a system
> annotation about the model it has imported. And there are
> many different ways of implementing this that have nothing
> to do with rdf:Seq. I'll spare you the litany in the interests
> of space.
> 
> This is not a data model issue. It is a quality of
> implementation issue, which Dave's earlier message
> clarified for me. PRISM implementations should prefer
> to be implemented on top of RDF software that can
> reconstruct the original order, just as they should
> prefer to be implemented on top of XML software that
> knows about the xml:base attribute.
> 
> > Furthermore, I don't see why this is necessary. There is a simple
> > RDF-compatible way to deal with this situation, and I'm not sure
> > why you can't use it. As Roland pointed out, simply put these in
> > an rdf:Seq and this will indicate that order should be maintained
> > to any RDF processor. Just like this:
> >
> > <dc:creator>
> >   <rdf:Seq>
> >      <rdf:li>Contributor 1</rdf:li>
> >      <rdf:li>Contributor 2</rdf:li>
> >   </rdf:Seq>
> > </dc:creator>
> >
> > Is there any reason why this can't be done?
> 
> Yes. What is 'simple' to you and me is not simple to everyone.
> Making an explicit Seq means that the order IS significant, and
> as Roland pointed out, there are many times when it is
> not. So you introduce options into how things are handled.
> By the time you deal with Seq for significant
> order, Bag for authorship by committee without
> individual attribution, and repeated elements for
> other cases, you have done four VERY BAD THINGS:
>   1) Raised the cost of training catalogers on which
>      model to use.
>   2) Raised the implementation and maintenance cost of
>      software that will analyze the record and do things
>      with it.
>   3) Raised the cost of cataloging, because people are being
>      asked to make subtle distinctions.
>   4) Raised the error rate in the models because:
>      a) People use the wrong model. (e.g. using Bag when
>         the work was written by individuals, not a group).
>         Since the distinctions are subtle, the error rate will
>         be high.
>      b) Provided a predefined model that was close, but
>         not quite a match. Real life is too ambiguous to
>         be accurately modeled with so coarse a set of tools
>         as Bag, Seq, Alt, and their absence.
> 
> Raising costs means I can't sell this stuff to publishers.
> Raising the error rates means that logical inference code can't
> handle it as cleanly, thus slowing the benefits of the semantic
> web.
> 
> You have heard of the difference between accuracy and
> precision? 'Simply' using Seq provides precision, but
> not accuracy. Beware of its unintended consequences.
> 
> ---------------
> Based on discussions so far, here is what I plan on
> doing:
> 
> Current wording of the part of the spec that responds to
> Dave Beckett's suggestion will be changed to:
> 
> 4.8.3 Further Qualifications
> ...
> Note that although a sequence of dc:creator elements in an
> RDF/XML file implicitly defines a sequence (in the XML world),
> RDF parsers have no obligation to preserve that ordering,
> unlike if an explicit rdf:Seq were given. PRISM implementers
> are advised that there are quality of implementation issues
> between different RDF processors. In general, implementers
> MAY prefer to build on top of an RDF parser that allows
> the original order of the statements to be reconstructed.
> That will allow the original order of the
> authors on a piece to be reconstructed, which might or
> might not carry additional meaning to the viewer of a styled version
> of the record. Similarly, XML software that can handle the
> almost-standardized xml:base attribute will be preferred.
> ...
Received on Tuesday, 3 April 2001 07:56:24 UTC