PRISM and multiple element concern from Ronald Daniel on 2002-02-05 (w3c-rdfcore-wg@w3.org from February 2002)

From: Ronald Daniel <rdaniel@interwoven.com>
Date: Tue, 5 Feb 2002 09:36:30 -0800
To: RDF Core <w3c-rdfcore-wg@w3.org>
Cc: "Brian McBride (E-mail)" <bwm@hplb.hpl.hp.com>
Message-ID: <145C1D60907A4944ABAE75DE3FF6418C3B146D@xchanger3.interwoven.com>

Hi all,

Off-list, Brian asked me to give the PRISM view re. Patrick's
point about multiple versions of a property.

The following are my opinions; they have not been voted on
by the PRISM group. Opinions always subject to change, etc.

1) PRISM WILL NOT define multiple versions of its properties.

   To be completely clear about this, we will sacrifice datatyping
   precision in order to have a specification that we believe has
   a chance of broader success. I am not about to try and educate
   our community on the differences between xsd:date.val,
   xsd:date.lex, and xsd:date.map - especially since those distinctions
   are invisible to the 'date' datatype in any commercial database
   management system I am aware of.

2) PRISM CAN specify one or the other idiom to be used for its elements.
   Patrick had asked "who are they [groups like the Dublin Core or PRISM]
   to tell me or anyone which idiom I am to use?!"
   The answer for the PRISM WG is that we set the definition of the
   things in our namespaces. We don't presume to define
   things outside our namespaces, but we absolutely have the right
   to tell you what the things we define are supposed to mean. Inside
   your system you are free to do with that what you will.

3) For the elements in the PRISM-controlled namespace, they could all
   be given an rdfs:range property following one idiom or another.
   The precise choice does not appear critical to us. I would like
   the RDF Core group to recommend one as default practice. PRISM
   would then use it if and when we define RDF Schemas for our
   namespaces.

4) Of particular concern to PRISM is the case where a description
   mixes elements from multiple namespaces, because we do. However,
   it seems that the different elements can be defined according to
   different idioms without conflict (so long as they don't declare
   that they are both subPropertyOf a common ancestor. But presumably
   the ancestor was not defined with a type if people derive
   subproperties that follow different idioms. If that is the case
   then it is probably OK to postulate the existence of a datatype X
   which is a supertype of the conflicting types. But again, this is
   not a key issue for PRISM and I'm digressing.)

5) There are places where the same string will be used to 'mean'
   different things (and no, I am not going to say what I mean when I
   say the word 'mean' :-). I see two cases to consider - where there
   is and is not an intermediate node.
   a) No intermediate node: two identical strings might occur which
      mean different things. This is expected to be rare, but it can
      certainly happen that there are different dc:creators with the
      same name. For example:
          <some_article> <dc:creator> "J. Doe"
          <another_article> <dc:creator> "J. Doe"
      Given PRISM's purpose, which is discovery and multi-purposing of
      content with humans in the loop, it is OK if a simple system
      munges the two together. Obviously not ideal, but very
      understandable behavior on the part of the system and not likely
      to offend our target customers.
   b) Two distinct resources with the same label. This will occur with
      reasonable frequency when people are dealing with controlled
      vocabularies. (As an example, how many different meanings of the
      word 'set' can you think of? Tennis, theatre, glue, math, ...)
      PRISM provides the pcv:label property to associate a resource
      in a controlled vocabulary (call it a concept, one for each
      distinct meaning of 'set') with one or more names for that
      concept. There is no assumption that labels are unique or
      unambiguous. (Unique meaning only one label per meaning,
      unambiguous meaning only one meaning per label. The stuff about
      'set' is an example of ambiguity. Synonyms and language tags are
      cases of non-uniqueness - cases explicitly provided for in the
      PRISM spec.) The URI of the individual 'concepts' is the key
      identifier to use. So if we have the same label with two
      meanings, we actually have something like:
          <article1> <dc:subject> <v:set1>
          <v:set1>   <pcv:label>  "set"
          <article2> <dc:subject> <v:set2>
          <v:set2>   <pcv:label>  "set"
      and there is no concern about whether the two articles would
      conflate the two different subjects by mistake.
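
To make the contrast between the two cases concrete, here is a minimal
sketch in Python. It is not part of the PRISM spec; triples are modeled
as plain tuples, the short names stand in for URIs, and the helper
function names are made up for illustration.

```python
# Triples as (subject, predicate, object) tuples. The names are the ones
# used in the examples above; treating them as plain strings is an
# assumption made only to keep the sketch self-contained.
triples = [
    # Case a: no intermediate node, the object is a bare string.
    ("some_article", "dc:creator", "J. Doe"),
    ("another_article", "dc:creator", "J. Doe"),
    # Case b: an intermediate concept node carries the label.
    ("article1", "dc:subject", "v:set1"),
    ("v:set1", "pcv:label", "set"),
    ("article2", "dc:subject", "v:set2"),
    ("v:set2", "pcv:label", "set"),
]

def articles_with_creator(name):
    """Articles whose dc:creator is the given string (hypothetical helper)."""
    return sorted(s for s, p, o in triples if p == "dc:creator" and o == name)

def concepts_for_label(label):
    """Concept nodes that carry the given pcv:label (hypothetical helper)."""
    return sorted(s for s, p, o in triples if p == "pcv:label" and o == label)

def articles_about(concept):
    """Articles whose dc:subject is the given concept node."""
    return sorted(s for s, p, o in triples if p == "dc:subject" and o == concept)

# Case a: a simple system lumps both articles under one "J. Doe".
print(articles_with_creator("J. Doe"))

# Case b: the label is ambiguous, but each concept node stays distinct.
print(concepts_for_label("set"))
print(articles_about("v:set1"))
```

In case a the string is the only handle, so the two creators merge. In
case b the label "set" matches two concepts, yet a lookup by concept
node returns only the article that actually refers to it.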

6) A place where I don't see a satisfactory answer: PRISM allows
   both
   <article> <prism:location> "Texas"
   <article> <prism:location> <iso3166-2:us-tx>
   One is a string, the other a term in a controlled vocabulary.
   I could declare the range of prism:location to be a location, but
   that is vacuous. I don't see that I can say anything else about
   the data type of the object of the statements.

   Along those lines, here are some 'reasonable' data type declarations
   that can't be made:
    <dc:creator> <rdfs:range> <x:Person> .    # Companies can be authors too
    <dc:publisher> <rdfs:range> <x:Company> . # People can self-publish
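
The prism:location problem can be sketched the same way. This is only
an illustration in Python, not anything PRISM defines: the Literal and
Resource classes and the object_kinds function are hypothetical names
used to show that one property ends up with two kinds of object.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Literal:
    """A plain string value (illustrative stand-in for an RDF literal)."""
    value: str

@dataclass(frozen=True)
class Resource:
    """A node named by a URI (illustrative stand-in for an RDF resource)."""
    uri: str

# The two statements PRISM allows, as given in the text above.
statements = [
    ("article", "prism:location", Literal("Texas")),
    ("article", "prism:location", Resource("iso3166-2:us-tx")),
]

def object_kinds(predicate):
    """Sorted names of the object types seen for a predicate."""
    return sorted({type(o).__name__ for _, p, o in statements if p == predicate})

# Both kinds appear, so a single range declaration that names one of
# them would not cover the other statement.
print(object_kinds("prism:location"))
```

A check like this only reports the mix; it does not suggest which range
declaration, if any, would be appropriate.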


Regards,

Ron Daniel Jr.
Standards Architect
Interwoven, Inc.
Tel: 408-530-5922
Cell: 925-368-8371
Email: rdaniel@interwoven.com 
Visit www.interwoven.com
Received on Tuesday, 5 February 2002 12:37:02 UTC