discussion of metadata proposals

Sorry for hurling another long email on this festive day...

The following is an attempt to organize our thoughts on the requirements for
RIF metadata. It should help guide us to a speedy resolution. (The main
reason why I did not add metadata to the "math English" syntax was that
while thinking about it we started having doubts about the earlier
proposal.)


	--michael  



Problems with the earlier proposal

  1. The old proposal injects new syntax at the metadata level,
     which cannot be processed by BLD rules.

     For some in this group it is thus a non-starter.

     In fact, Sandro mentioned that he would like to import just the
     metadata -- presumably for processing by a ruleset -- and we agree
     that this is a good application.

  2. Inability to attach metadata to a subset of rules.

     The new proposal allows arbitrary nesting of metadata attachments at
     the level of rules and facts.

     For instance, Bob may publish a bunch of rules. One subset of those
     rules he got from Mary and one from Liz, so the subsets are annotated
     with

     "http://mary.com/rules"^^rif:iri[date->"2002-12-12"^^xsd:date,
				      author->"Mary Smith"^^xsd:string].
     and
     "http://liz.com/rules"^^rif:iri[date->"2008-10-10"^^xsd:date,
				     author->"Liz Biz"^^xsd:string].
     respectively.

     In turn, Mary's rules are partially due to Jerry and Victor.
     So, Mary's subset of rules, which looks like

     <Ruleset> <meta> </meta> <rule> </rule> ... <rule> </rule> </Ruleset>

     (the <meta> </meta> part is the first frame above), can in turn
     have two nested <Ruleset> </Ruleset> pairs -- one for Jerry's rules
     and one for Victor's (plus other rules that Mary may have authored by
     herself). And so on.

     Thus, the overall structure looks like this:

     Ruleset(
        meta?
	rule*
	Ruleset(
	   meta?
	   rule*
	   Ruleset(...)*
	   ...
	)*
	...
	rule*
	Ruleset(
	   meta?
	   rule*
	   Ruleset(...)*
	   ...
	)*
     )

  3. Two separate tags for attaching metadata instead of one.
     (This is a lesser issue.)
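
To make the nesting in item 2 concrete, here is a minimal Python sketch. It is
not part of either proposal. The names Frame, Ruleset, collect_metadata and
flatten_rules, the rule placeholders r0..r4, and the Ids used for Jerry's and
Victor's subsets are all hypothetical; only Mary's and Liz's frames come from
the example above.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Union


@dataclass
class Frame:
    """Metadata frame: an object Id plus attribute -> value slots."""
    object_id: str
    slots: Dict[str, str] = field(default_factory=dict)


@dataclass
class Ruleset:
    """Attachment point for metadata. The body may mix rules and nested Rulesets."""
    meta: Optional[Frame] = None
    body: List[Union[str, "Ruleset"]] = field(default_factory=list)


def collect_metadata(rs: Ruleset) -> Iterator[Frame]:
    """Yield the metadata frames only, outermost first, skipping all rules."""
    if rs.meta is not None:
        yield rs.meta
    for item in rs.body:
        if isinstance(item, Ruleset):
            yield from collect_metadata(item)


def flatten_rules(rs: Ruleset) -> Iterator[str]:
    """Yield the rules in document order, as if the wrappers were not there."""
    for item in rs.body:
        if isinstance(item, Ruleset):
            yield from flatten_rules(item)
        else:
            yield item


# Hypothetical Ids for the nested subsets; rule bodies are placeholders.
jerry = Ruleset(Frame("http://example.org/jerry"), ["r1"])
victor = Ruleset(Frame("http://example.org/victor"), ["r2"])
mary = Ruleset(
    Frame("http://mary.com/rules", {"date": "2002-12-12", "author": "Mary Smith"}),
    [jerry, victor, "r3"],
)
liz = Ruleset(
    Frame("http://liz.com/rules", {"date": "2008-10-10", "author": "Liz Biz"}),
    ["r4"],
)
bob = Ruleset(None, ["r0", mary, liz])  # meta? is optional
```

Here collect_metadata(bob) walks the tree and returns only the frames, which
is roughly the "import just the metadata" use case, while flatten_rules(bob)
returns the same rules one would get if the wrappers were removed.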


Responses to the arguments against the new proposal

  1. The syntax of frames in XML is more verbose than lists.

     The group has decided on using a fully-striped syntax even inside
     slots, which can make document content quite verbose. Complaining
     about one extra tag, <meta>, to connect to <Frame> metadata (which
     will be a much smaller part of the document) is inconsistent with
     the earlier decisions.

  2. The name <Ruleset> for denoting metadata attachment may be confusing.

     Well, we could perhaps change the name to <Rules>. This latter
     keyword carries less baggage.

  3. The new proposal increases the level of nesting of wrappers for
     attaching metadata.

     Not true. The nesting level of the wrappers is exactly the same
     (or smaller, in the absence of single-rule metadata).

  4. The new proposal claims that it uses fewer tags, but this is only
     for the metadata markup. We still need another tag to indicate the
     beginning and end of a ruleset.

     There is no need for another top-level tag. We can keep the same
     Ruleset (or Rules) tag at the top. And above that, there will be only
     rif:Document.

  5. If we use RIF syntax for metadata then people will be confused that
     the metadata is part of the knowledge base.

     a. This is not a serious argument. People who would be confused
        by that should not be allowed within 1000 feet of RIF. :-)

     b. The main idea of our proposal IS to make metadata into a
        knowledge base and make it processable by other knowledge bases.
        It is just that the metadata is part of a knowledge base that is
        distinct from the main rulebase (cf. Sandro's wish-list).

     c. Another advantage is that we can reuse the existing mapping of
        frames to RDF (in the appendix to the RDF compatibility document).
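
As a rough illustration of point 5c, the sketch below assumes that the mapping
sends each slot of a frame s[p->o] to one triple (s, p, o). The function name
frame_to_triples is hypothetical, and the constants are kept as plain strings
exactly as written in the frames above; the details of the actual mapping are
in the RDF compatibility document, not here.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def frame_to_triples(object_id: str, slots: Dict[str, str]) -> List[Triple]:
    """Turn one frame into (subject, predicate, object) triples, one per slot."""
    return [(object_id, attr, value) for attr, value in slots.items()]


# Mary's metadata frame from the example above.
mary_triples = frame_to_triples(
    '"http://mary.com/rules"^^rif:iri',
    {
        "date": '"2002-12-12"^^xsd:date',
        "author": '"Mary Smith"^^xsd:string',
    },
)
```

Under this assumption a frame with no slots yields no triples at all, so an
Id-only frame would need separate treatment on the RDF side.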


Explaining misconceptions

   1. The Ruleset (or Rules) scope has implications for local RIF symbols.

      The Rules/Ruleset wrappers are just attachment points for metadata.
      If anything, they are like the include statements of C, not like
      import statements (see
      http://lists.w3.org/Archives/Public/public-rif-wg/2008Apr/0006.html).

      The local/global symbols business should be left to the import (and
      future modules) mechanism.

   2. Where is the global IRI?

      It is the object Id of the frame used for the metadata. In the above
      example of Mary's rules,
      "http://mary.com/rules"^^rif:iri[date->"2002-12-12"^^xsd:date,
				       author->"Mary Smith"^^xsd:string],
      this global Id is "http://mary.com/rules"^^rif:iri.
      (We are not sure whether the right constant to use is
       "http://mary.com/rules"^^rif:iri or
       "http://mary.com/rules"^^xsd:anyURI,
       but this is beside the point.)

   3. Can metadata contain just an Id (the global IRI)?
      
      Yes:  "http://mary.com/rules"^^rif:iri[]

   4. Can there be metadata without a global Id?

      Although the old proposal allowed that, it is unclear whether this is
      really needed. Assuming it is, there are several options:

      a. Use a local symbol as the object Id in the frame:
           "someruleset123"^^rif:local[...]
      b. Use a variable:
           ?V[...]
      c. Use a Skolem constant (we do not have them, but should).
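
The Id choices discussed in items 3 and 4 can be put side by side with a small
Python sketch that just builds the frame text. The helpers const, var and
frame are hypothetical names introduced for illustration, and the author slot
stands in for the elided [...] contents.

```python
from typing import Dict, Optional


def const(lexical: str, datatype: str) -> str:
    """Write a constant as "lexical"^^datatype."""
    return f'"{lexical}"^^{datatype}'


def var(name: str) -> str:
    """Write a variable as ?name."""
    return f"?{name}"


def frame(object_id: str, slots: Optional[Dict[str, str]] = None) -> str:
    """Write a frame as Id[attr->value, ...]; no slots gives Id[]."""
    body = ", ".join(f"{a}->{v}" for a, v in (slots or {}).items())
    return f"{object_id}[{body}]"


author = {"author": const("Mary Smith", "xsd:string")}

id_only = frame(const("http://mary.com/rules", "rif:iri"))        # item 3
local_id = frame(const("someruleset123", "rif:local"), author)    # option a
variable_id = frame(var("V"), author)                             # option b
```

Option c is left out because, as noted, Skolem constants are not available.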
 

Received on Wednesday, 2 April 2008 03:37:59 UTC