- From: Ron Daniel <rdaniel@interwoven.com>
- Date: Tue, 3 Apr 2001 07:56:29 -0700
- To: "'Brian McBride'" <bwm@hplb.hpl.hp.com>
- Cc: "'Aaron Swartz'" <aswartz@swartzfam.com>, "'Dave Beckett'" <dave.beckett@bristol.ac.uk>, "'RDF Interest'" <www-rdf-interest@w3.org>, "'spec-comments'" <spec-comments@prismstandard.org>
Hi Brian,

You said:

> I think the solution you propose mean that, for example, if one stored
> PRISM data in an RDF database, e.g. RDFDB, that one would lose
> information essential to PRISM.

Wait a sec. Let's make a distinction between 'important' and 'mission critical'. Preserving things like order of authors is important for some things, but not mission critical. PRISM's main use cases showed goals of

1) discovery of resources
2) fast determination of rights or rights owner
3) enhancement of the content
4) targeted distribution of the content

For discovery, it would be very nice to be able to display a record about the resource that listed the authors in the original order instead of alphabetical. But is that MANDATORY? Will companies fail to find things because of it? No. Would the users prefer to see the authors listed in the original order? Probably. So stating it as a quality of implementation issue seems reasonable.

> I'm concerned that:
>
> o PRISM applications won't be able to make full use of standard
> RDF/semantic web tools and components

I'm certainly planning to use existing RDF tools instead of reinventing everything. But the simple fact is that not all RDF tools are created equal. For asking questions like "gimme all the documents where X is an author" they will all return the same results. But not in the same time, using the same disc space, on the same platforms, at the same level of development effort, at the same cost, or in the same order.

All the PRISM spec currently says is that implementers should be aware of this, and they MAY prefer to use an underlying RDF engine that does offer some control over the order. But it is not a MUST or even a SHOULD.

RDF implementations will improve and offer more functionality over time. They will do so based on demand from RDF users such as PRISM.
> o Semantic web tools won't capture the full semantics of PRISM data,
> so their ability to reason about it will be impaired

FYI, this is a hot button issue for me, but I will keep it short.

1) You can't capture the 'full semantics' of anything. Nothing exists in isolation except Platonic ideals. We are building models of reality. All models are abstractions of reality which throw away lots of stuff *on purpose* in order to concentrate on what is essential for a particular problem.

2) PRISM's purpose is *explicitly* not general reasoning or description. Its purpose is to meet the 4 goals mentioned above. It has been shown that people will pay real costs to achieve those goals because the benefits can exceed the cost. By explicitly limiting the problems PRISM tackles, we can define simple answers for certain issues such as 'The Mona Lisa Problem' that bedevil others.

> o If PRISM claims to be RDF compliant whilst having a different data
> model, this will cause confusion not only for PRISM developers, but
> also for RDF developers.

Again, I reject this assertion that PRISM has a different data model. If you ask identical RDF queries, you get identical results, modulo the order of the results, which you claim is unspecified. 'Unspecified' does not mean 'randomized'. It means it is up to the implementer. There are, of course, PRISM-specific behaviors, such as how to process PRL clauses. But that is not at the RDF level.

On a more practical note, a feature like SQL's ORDER BY clause will prove to be important to many people as they try to put RDF to use in real applications. Relational databases regard order as insignificant, unless a query says otherwise. Makes sense to me.

I would accept a milder assertion, that PRISM recommends certain things that are not commonly implemented in all RDF processors. But that is customer demand.
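To make the ORDER BY point concrete, here is a minimal sketch using Python's standard sqlite3 module. The `statements` table, its `doc_pos` column, the `authors` helper, and the sample names are all invented for illustration; none of them come from PRISM or from any existing RDF tool. The idea is only that the stored data carries no ordering claim, and the query decides whether order matters.

```python
import sqlite3

# Hypothetical triple-like table; doc_pos records where each statement
# appeared in the source document. All names here are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statements "
    "(subject TEXT, predicate TEXT, object TEXT, doc_pos INTEGER)"
)
conn.executemany(
    "INSERT INTO statements VALUES (?, ?, ?, ?)",
    [
        ("doc1", "dc:creator", "Zed Young", 1),
        ("doc1", "dc:creator", "Al Adams", 2),
        ("doc1", "dc:creator", "Mo Miller", 3),
    ],
)


def authors(subject, order_by=None):
    """Return the authors of a document, optionally in a requested order."""
    sql = (
        "SELECT object FROM statements "
        "WHERE subject = ? AND predicate = 'dc:creator'"
    )
    if order_by == "document":
        sql += " ORDER BY doc_pos"
    elif order_by == "alphabetical":
        sql += " ORDER BY object"
    # With no ORDER BY, the row order is whatever the engine chooses.
    return [row[0] for row in conn.execute(sql, (subject,))]


unordered = authors("doc1")
in_doc_order = authors("doc1", "document")
alphabetical = authors("doc1", "alphabetical")
```

All three calls return the same set of authors. Only the queries that ask for an order are entitled to one, which is the sense in which the results are identical "modulo the order".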
> If we could find a way of meeting PRISM's needs whilst fully
> representing PRISM's semantics in standard RDF syntax, then standard
> RDF tools are more likely to be useful in PRISM applications and we
> would reduce the risk of adding further confusion about what the RDF
> datamodel really is.

As mentioned above, I take it as axiomatic that one cannot fully represent the semantics of anything. What one can do is represent them to a degree of accuracy such that the errors are acceptable for a given purpose. Reordering errors are tolerable given PRISM's goals, but I predict that users would show a decided preference for not arbitrarily messing around with the order of things.

> I think we'd probably all agree that the RDF model (not the augmented
> PRISM model) needed to represent the order of authors needs to use a
> sequence.

No, I would not agree with that at all.

> I'm not sure I understand your objection to using it.

Because Seq, like Bag and Alt, has a particular meaning that is not ALWAYS appropriate.

> Since the solution you are proposing regards the ordering of
> authors as always significant

No, I do not regard the order as ALWAYS significant. It is sometimes significant, sometimes not. This is a place where simple modeling breaks. You either make the model more complex to deal with it, or you do something outside the model.

Proposing "always use Seq" does not fix the problem. It explicitly states that the order is always important, so important that it has to be encoded in the model. And that is not true. For the model to be more correct, you have to allow Seq to appear sometimes, and not appear at others. At that point, why bother? The costs of the reordering errors look to be less than the costs of the added complexity.

> - you could avoid burdening your cataloguers with
> making any decisions by simply always requiring the sequence element.
I find this a surprising statement from anyone who wants general purpose 'logic' tools to operate a semantic web on top of many collections of RDF data. "Oh, it doesn't matter if the order is really significant, just always say that it is". I prefer to NOT say such things to machines. No telling where they will run off to with it.

> Presuming there is tool support for generating the RDF/XML,
> this burdens the cataloguers not a whit.

That is true. Tool-wise, it doesn't bother the catalogers. It does take a little explanation to the tool-builders, but not a lot. But it will play hell down the line when people have to decide if a Seq element was put in because the order was REALLY important or just because somebody wanted to enforce a syntactic rule.

> I'd really like to help find a way to enable PRISM to fully represent
> its semantics in standard RDF.

Put an 'ORDER BY' clause into the requirements list for the eventual RDF query language, and make sure that 'document order' is one of its allowed expressions. Then, logic engines can decide when order is important by analyzing the queries and not the underlying data.

Ron
Received on Tuesday, 3 April 2001 10:58:06 UTC