Use/misuse of RDF:Value

Dear all,

I'm copying this to both lists since the cross-over is obvious.

We in the Dublin Core architecture working group are dealing with the
problem of expressing a default or "dumb-down" value for Dublin Core
elements.

A brief bit of background, which can be filled in by referring to a
paper by me at http://www.dlib.org/dlib/january01/lagoze/01lagoze.html
and a paper by Tom Baker at
http://www.dlib.org/dlib/october00/baker/10baker.html.  At issue here is
the ability to preserve the original and still most important
application of Dublin Core as a vocabulary for simple resource discovery
descriptions.  Both Tom's and my paper describe how the hanging of
arbitrary value sub-graphs off of DC elements violates this principle and
thus interferes with the interoperability of the element set.
Therefore, we are trying to maintain the simplicity - i.e., explicitly
modeling the simple string values of dc properties, the "appropriate
literals" as Tom calls them - while acknowledging that communities may
want to hang arbitrary stuff off of dc properties.

(parenthetical note: I will not address in this email some philosophical
issues with such practice, especially the fact that it encourages use of
dc properties as a parking place for all sorts of arbitrary values and
thus implicitly discourages a modular approach whereby such other
values exist within the context of separate vocabularies).

There has been a fair amount of traffic on the dc-architecture list
discussing the RDF mechanics for doing such.  There are trial balloons
floating around:

1. In
http://www.jiscmail.ac.uk/cgi-bin/wa.exe?A2=ind0102&L=dc-architecture&O=A&P=22548
I repeated a suggestion that has been passed around that
exploits rdf:value.  A construct such as


     R1 ---------------> INTNODE  ------------> "apprt. literal"
          dc:property        |      rdf:value
                             |
                             |
                             -----------------> arbitrary subgraph

might say that the "apprt literal" is the default value of the
dc:property and could therefore be used as the "simple resource
discovery value". The arbitrary sub-graph would then be a space for
putting anything else a community might want.
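
To make the construct concrete, here is a minimal Python sketch that stores
the graph as plain (subject, predicate, object) tuples and reads off the
rdf:value literal as the simple discovery value. The choice of dc:creator,
the ex: property names, the node labels, and the helper names are all
illustrative assumptions, not part of any proposal on the list.

```python
# Hypothetical in-memory graph: a list of (subject, predicate, object) tuples.
RDF_VALUE = "rdf:value"

triples = [
    ("R1", "dc:creator", "_:intnode"),          # dc:property arc to INTNODE
    ("_:intnode", RDF_VALUE, "Carl Lagoze"),    # the "appropriate literal"
    ("_:intnode", "ex:affiliation", "_:org"),   # start of an arbitrary subgraph
    ("_:org", "ex:name", "Cornell University"),
]


def objects(graph, subject, predicate):
    """Return every object reachable from subject via predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def default_values(graph, resource, prop):
    """Follow prop to each intermediate node and collect its rdf:value literals."""
    found = []
    for node in objects(graph, resource, prop):
        found.extend(objects(graph, node, RDF_VALUE))
    return found


print(default_values(triples, "R1", "dc:creator"))  # ['Carl Lagoze']
```

A simple discovery tool would stop at the rdf:value literal; the ex: arcs
stay available to applications that understand them.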

2. In http://zoe.mathematik.uni-osnabrueck.de/dc/dumbdown.html Stefan
Kokkelink suggests a dumb-down algorithm (discovery of the default value)
that assumes the use of both rdf:value arcs and rdfs:label arcs.


Given these two alternatives, I'd like to raise some issues, points of
discussion, and questions for both the dc-architecture and rdf-interest
crowds.  I start with the axiom (hopefully non-controversial) that dc is
but one of many applications of rdf and the manner in which dc uses rdf
abstractions must be general rather than specific to dc.  That said, here
is my list of issues:

1. My reading of the rdf schema documentation says that Stefan's
suggestion to use rdfs:label for the expression of a default value is
very unconventional.  As stated in 5.2 of rdf schema, rdfs:label: "This is
used to provide a human-readable version of a resource name".  The
examples throughout the document are multi-language labels for
definitions, which seems to have nothing to do with the purpose for which
Stefan is using it.

2. My reading of rdf:value in 2.3 of Model and Syntax leads me to some
confusion with the usage as exemplified above and in
http://www.jiscmail.ac.uk/cgi-bin/wa.exe?A2=ind0102&L=dc-architecture&O=A&P=22548.
Section 2.3 uses the example of the qualification of
a property value such as saying "the price of the pencil is 75 US
cents".  I read this to say that in general the other arcs hanging off a
node to which there is attached an rdf:value arc should be
interpreted as completing the partial information in the value
expressed by the simple literal at the end of the rdf:value arc.  This
sounds less like "default value" and more like partial vs. full
information.  The implication in 2.3 is that the union of the rdf:value
arc and the other arcs provides the full information space for the
original property, that the rdf:value "value" provides partial
information, and the non-rdf:value "values" cannot stand alone as a
value but only as a qualifier for the rdf:value "value".  The difference
is subtle but I'd rather that the dc-architecture folks and the rdf
folks come to common terms for this.

The common language (instructions for a processor) might be: upon
encountering a node with an rdf:value arc and other arcs, a processor
can:

a. use the value at the end of the rdf:value arc as a partial
(simple?) value for the property that it followed to get to the
respective node.
b. combine the value at the end of the rdf:value arc with the
arbitrarily large sub-graph(s) rooted in the non-rdf:value arcs as an
expression of a more complete value.
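
As a rough illustration of rules (a) and (b), the following Python sketch
applies them to the pencil-price example. The graph is a plain list of
tuples; the ex: names, the "_:price" node label, and both function names are
assumptions made for this sketch and come from neither specification.

```python
RDF_VALUE = "rdf:value"

# Pencil-price example as (subject, predicate, object) tuples.
# The "ex:" names and the "_:price" label are illustrative assumptions.
triples = [
    ("ex:pencil", "ex:price", "_:price"),
    ("_:price", RDF_VALUE, "75"),
    ("_:price", "ex:units", "ex:USCents"),
]


def partial_value(graph, node):
    """Rule (a): the literal at the end of the node's rdf:value arc, or None."""
    for s, p, o in graph:
        if s == node and p == RDF_VALUE:
            return o
    return None


def full_value(graph, node):
    """Rule (b): the rdf:value literal plus every triple in the sub-graph(s)
    rooted in the node's non-rdf:value arcs."""
    qualifiers = []
    stack, seen = [node], {node}
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s != current:
                continue
            if s == node and p == RDF_VALUE:
                continue  # the node's own rdf:value arc is reported separately
            qualifiers.append((s, p, o))
            if o not in seen:  # keep walking; objects with no arcs simply end here
                seen.add(o)
                stack.append(o)
    return partial_value(graph, node), qualifiers


print(partial_value(triples, "_:price"))
print(full_value(triples, "_:price"))
```

Under this reading "75" alone is usable but incomplete, and the ex:units arc
only makes sense as a qualifier of it.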

Am I wrong here?  If I am, we need to come up with other common
language to describe this subtlety.

3. Of special interest to the dc-architecture folks: I'm still concerned
that hanging that arbitrary information off an intermediate node
attached to a dc property expresses that the arbitrary
information (e.g., organizational affiliation of a creator, which states
a "has a" rather than "is a" relationship) is indeed a value of the dc
property.  As I've said before, this leads me to suggest that all dc
element semantics be changed to "anything related to (e.g., the creator
of the resource)".

4. Finally, I have a schema concern.  In
http://www.jiscmail.ac.uk/cgi-bin/wa.exe?A2=ind0102&L=dc-architecture&O=A&P=22548
I suggested that, if we are going to adopt the intermediate
node modeling technique as in:


     R1 ---------------> INTNODE  ------------> "apprt. literal"
          dc:property        |      rdf:value
                             |
                             |
                             -----------------> arbitrary subgraph

then we drop the simple modeling expression of:

     R1 ---------------> INTNODE  ------------> "apprt. literal"
          dc:property              rdf:value

That is, always have the intermediate node whether there is additional
stuff hung off the dc property or not.  Stu Weibel objected to this
suggestion in
http://www.jiscmail.ac.uk/cgi-bin/wa.exe?A2=ind0102&L=dc-architecture&O=A&P=22800
where he states that:

stu>>It is not obvious to me that it is necessary to include a null
INTNODE in cases that DO NOT have subgraphs; is it not sufficient to
simply invoke the rule:

     Properties will terminate in either an "appropriate literal" or in
     another node (INTNODE); in the latter case, the "appropriate
     literal" for the property is identified by the rdf:value arc.<<stu
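
Read as processing instructions, the quoted rule could be sketched in Python
as below. The Literal wrapper, the dc:title and ex:lang names, and the
function name are assumptions introduced only to tell literals apart from
nodes in this sketch.

```python
from dataclasses import dataclass

RDF_VALUE = "rdf:value"


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper marking an object as a literal rather than a node."""
    text: str


def appropriate_literals(graph, resource, prop):
    """Apply the rule: a property ends in a literal, or in a node whose
    rdf:value arc identifies the literal."""
    found = []
    for s, p, o in graph:
        if s != resource or p != prop:
            continue
        if isinstance(o, Literal):
            found.append(o.text)  # property terminates directly in a literal
        else:
            found.extend(         # property terminates in an INTNODE
                v.text
                for s2, p2, v in graph
                if s2 == o and p2 == RDF_VALUE and isinstance(v, Literal)
            )
    return found


direct = [("R1", "dc:title", Literal("A title"))]
via_node = [
    ("R1", "dc:title", "_:n"),
    ("_:n", RDF_VALUE, Literal("A title")),
    ("_:n", "ex:lang", Literal("en")),
]

print(appropriate_literals(direct, "R1", "dc:title"))    # ['A title']
print(appropriate_literals(via_node, "R1", "dc:title"))  # ['A title']
```

Both graphs yield the same literal, so a processor can follow the rule; this
does not show whether a schema can describe it.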

However, my impression is that there is then no way to write an rdf
schema for such.  My reading is that Stu is suggesting a violation of
RDF schema which says that:

A property can have at most one range property. It is possible for it to
have no range, in which case the class of the property value is
unconstrained. 

That is, I can't write a schema that says a dc property can have a range
that is either an rdfs:Literal or an intermediate node.  The schema
document makes some noise about creating a superclass to express the
single range, but I certainly can't create a superclass for
rdfs:Literal?
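
The range problem can be shown with a small Python sketch. It assumes a
property has at most one range and it ignores subclass reasoning, which is
exactly the part left open above; "ex:ValueNode" is a hypothetical class
name for intermediate nodes.

```python
# Class names are illustrative; "ex:ValueNode" is a hypothetical INTNODE class.
LITERAL = "rdfs:Literal"
VALUE_NODE = "ex:ValueNode"


def conforms(range_class, value_class):
    """True if a value of value_class satisfies a property whose single range
    is range_class. None means no range, i.e. unconstrained. Subclass
    reasoning is deliberately left out of this sketch."""
    return range_class is None or range_class == value_class


def accepts_both_forms(range_class):
    """Does this single range admit both the literal and the INTNODE form?"""
    return all(conforms(range_class, c) for c in (LITERAL, VALUE_NODE))


for candidate in (LITERAL, VALUE_NODE, None):
    print(candidate, accepts_both_forms(candidate))
```

Only the unconstrained case admits both forms, and an unconstrained range
tells a schema reader nothing. A common superclass would change the outcome,
if one could be declared.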

If we can't express this with an rdf schema then we are left with a
rather uncomfortable situation for both the dc community and the rdf
community, both of whom want to see a common use of technologies.


I'm sorry to be so orthodox about this but I believe we either have
missed or will miss an opportunity to create conformance between RDF and
an important application of it.

Thanks,

Carl
---------------------------------------
Carl Lagoze, Digital Library Scientist 
Department of Computer Science
Cornell University
Ithaca, NY 14853 USA
Phone: +1-607-255-6046
FAX: +1-607-255-4428
email: lagoze@cs.cornell.edu
WWW: http://www.cs.cornell.edu/lagoze/lagoze.html

Received on Friday, 23 February 2001 11:17:48 UTC