Re: Spec update with vocabulary expansion from Ivan Herman on 2011-08-25 (public-rdfa-wg@w3.org from August 2011)

From: Ivan Herman <ivan@w3.org>
Date: Thu, 25 Aug 2011 14:26:44 +0200
To: Gregg Kellogg <gregg@kellogg-assoc.com>
Cc: public-rdfa-wg WG <public-rdfa-wg@w3.org>
Message-Id: <6821AB47-747E-486F-9EF3-68C6BC62EADB@w3.org>
Great!

(I am not sure it is worth regenerating Overview.html at every step. I think it is perfectly o.k. to concentrate on the -src file while we are working on the text.)

Some remarks and editing I have made on Overview-src.html:

- I am o.k. having the additional triple documented in 7.2(2). I think we discussed it and we all agreed with this. Actually... I do _not_ think that the powder predicate would be the right one; the base document is not really 'described' by that vocabulary file, only partially and only the RDFa specific parts...

I actually think that '#has-vocab' would be a better name. After all, the same base may have several vocabs, and the current name might suggest that there is only one. I have changed it in the document at 7.2 (we can change it again if needed).
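To make the "several vocabs per base" point concrete, here is a minimal sketch in Python. It assumes that the predicate lives in the RDFa namespace under the name proposed in this mail ('#has-vocab', which may still change), and the function name vocab_triples is purely illustrative, not something defined in the draft.

```python
# Assumption: the predicate sits in the RDFa namespace under the name
# proposed in this mail; the name is still open to discussion.
RDFA_NS = "http://www.w3.org/ns/rdfa#"
HAS_VOCAB = RDFA_NS + "has-vocab"


def vocab_triples(base, vocab_iris):
    """Return one (base, has-vocab, vocab) triple per distinct @vocab IRI.

    `vocab_iris` lists the @vocab values in document order; repeats are
    dropped so that each vocabulary is reported once.
    """
    distinct = list(dict.fromkeys(vocab_iris))  # order-preserving de-duplication
    return [(base, HAS_VOCAB, vocab) for vocab in distinct]


if __name__ == "__main__":
    # Hypothetical document that switches @vocab twice and then reuses the first.
    for triple in vocab_triples(
        "http://example.org/doc",
        [
            "http://example.org/vocab-one#",
            "http://example.org/vocab-two#",
            "http://example.org/vocab-one#",
        ],
    ):
        print(triple)
```

The same subject and predicate simply appear once per vocabulary, which is why a name that does not suggest a single value reads better.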

- Actually, in the original discussion, we also raised the possibility of adding the triple

<base> rdf:type rdfa:source .

It is not strictly necessary for the vocab stuff, but it makes it a bit more complete. On the other hand... I had an interesting discussion the other day with some guys in a small French company (antidot) who use RDFa internally. They said that it is sometimes necessary for them to 'index' back to the original source from the generated RDF, and the RDF currently generated from RDFa gives them no handle for that. So this additional and innocent-looking triple may help a bunch of people...

I have not edited the document for this; I leave it for discussion...

- I think the example you had there was not really good. You relied on a @prefix setting for cc; however, at least until now, we have defined this mechanism only for the case when @vocab is used. I have, therefore, reworded the example in section 10 to rely on vocab instead. I have removed your note on further examples; I think it is perfectly fine the way it is!

- You have added the issue into the text:

[[[
<p class="issue">
  Why this new way of using <code>rdfa:vocab</code> for discovering information about vocabulary terms? It could be
  enough to say e.g. "For each class and property IRI for which no descriptions exist in the vocabulary graph, that
  IRI is dereferenced." I.e. rely on existing knowledge and gain more by applying the follow-your-nose principle on
  used terms. We propose to go with this unless implementers find it to be a burden. Otherwise, it should be
  considered that every occurrence of <aref>vocab</aref> should yield an <code>rdfa:vocab</code> triple. And if
  so, why not for each IRI in <aref>@prefix</aref> as well? These are different merely of a syntactical level, and
  both serve to declare vocabularies used in the document. The upside of <code>rdfa:vocab</code> (or
  <code>powder:describedby</code>) though, is that an RDFa author can explicitly reference auxiliary mapping data.
  For instance link:s in head to mappings like those on schema.rdfs.org (which maps schema.org terms to multiple
  well-known vocabularies).
</p>
]]]

I think this raises two different issues:

   1. We could of course simply rely on a follow-your-nose principle, i.e., let the processor follow up all class and property URIs and take it from there. That would of course work, but it would put quite a burden on an implementation. What we actually do here is to provide a _minimal_ processing level that yields what we want to achieve via the usage of vocabs. Maybe this should be emphasized in the text at the beginning.
   2. Another issue is whether we should expand this to all @prefix statements, i.e., to the URIs defined there. Technically, we could. However... should we? Do we need it? The point is that a user using @prefix is an RDF-aware person. He/she will simply use the right namespaces and may not require any special RDFa processing (yes, another processor can expand through follow-your-nose, but that is not our problem). However, the point of vocab is to _hide_ all prefixed URIs because the end user does not want to see them; this mechanism makes it possible, at the end of the day, to keep that and still get to the 'right' triples via the subtyping mechanism.

My proposal would be to keep to the minimum principle, which is what is there right now...
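The difference in burden between the two approaches can be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the function name documents_to_fetch, the example IRIs, and the idea of counting "documents to retrieve" are not from the draft; they just compare how many retrievals a processor would attempt.

```python
from urllib.parse import urldefrag

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def documents_to_fetch(triples, vocab_iris, follow_your_nose=False):
    """Return the set of documents a processor would try to retrieve.

    Minimal mode: only the documents behind the declared @vocab IRIs.
    Follow-your-nose mode: the document behind every property IRI and
    every class IRI (object of rdf:type) used in the output.
    """
    if not follow_your_nose:
        return {urldefrag(vocab).url for vocab in vocab_iris}
    documents = set()
    for _subject, predicate, obj in triples:
        documents.add(urldefrag(predicate).url)
        if predicate == RDF_TYPE:
            documents.add(urldefrag(obj).url)
    return documents


if __name__ == "__main__":
    # Hypothetical output graph of a page that declares a single @vocab.
    triples = [
        ("http://example.org/doc", "http://example.org/vocab#name", "Ivan"),
        ("http://example.org/doc", RDF_TYPE, "http://xmlns.com/foaf/0.1/Person"),
    ]
    vocabs = ["http://example.org/vocab#"]
    print(sorted(documents_to_fetch(triples, vocabs)))
    print(sorted(documents_to_fetch(triples, vocabs, follow_your_nose=True)))
```

In minimal mode the number of retrievals is bounded by the number of @vocab declarations, whereas in follow-your-nose mode it grows with the number of distinct namespaces used in the page.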

- In the text I removed the reference to domain and range. I know I raised it in my mail but, keeping to the minimality principle, I would think we should leave it out.

- You ask 

[[[
Should we restrict this entailment to one iteration of expansion, or leave that (explicitly) to the discretion of the processor and/or user thereof? -nl
]]]

Restricting it to one iteration would mean that if, for whatever reason, the ontology author added

<a> rdfs:subPropertyOf <b> .
<b> rdfs:subPropertyOf <c> .

and the original source used 

<bla> <a> <hello> .

then

<bla> <c> <hello> .

would _not_ appear in the output.

I think we should simply let the entailment work, and not try to restrict it. In practice this is not a major problem.
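The example above can be sketched in Python as follows. This is a minimal illustration, not the algorithm of the draft: it covers rdfs:subPropertyOf only, represents triples as plain tuples of strings, and the function name expand and its max_rounds parameter are assumptions made up to contrast "one iteration" with "let the entailment work".

```python
SUB_PROPERTY_OF = "rdfs:subPropertyOf"


def expand(data, vocab, max_rounds=None):
    """Apply 'p subPropertyOf q' and 's p o' => 's q o' repeatedly.

    With max_rounds=None the rule runs until no new triple appears;
    with max_rounds=1 it models the 'one iteration' restriction.
    """
    sub_properties = {(s, o) for s, p, o in vocab if p == SUB_PROPERTY_OF}
    result = set(data)
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        derived = {
            (s, parent, o)
            for (s, p, o) in result
            for (child, parent) in sub_properties
            if p == child
        } - result
        if not derived:
            break  # fixpoint reached, nothing new to add
        result |= derived
        rounds += 1
    return result


if __name__ == "__main__":
    vocab = {("<a>", SUB_PROPERTY_OF, "<b>"), ("<b>", SUB_PROPERTY_OF, "<c>")}
    data = {("<bla>", "<a>", "<hello>")}
    print(("<bla>", "<c>", "<hello>") in expand(data, vocab))                # True
    print(("<bla>", "<c>", "<hello>") in expand(data, vocab, max_rounds=1))  # False
```

The loop always terminates, because the set of derivable triples is bounded by the finite number of properties in the vocabulary graph; that is one way to see why leaving the entailment unrestricted is not a major problem in practice.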

- You ask

[[[
While simply referring to RDFS like this reduces the amount of text in the RDFa spec, will it provide enough information for implementers to act on, unless they already grok RDFS semantics? Or should we define a simple, step-by-step expansion here as well? If we do, it should be considered whether also owl:equivalentProperty and owl:equivalentClass could be included in that algorithm. They occur frequently in interlinked vocabularies. -nl
]]]

We restricted the vocabulary to the bare minimum needed for us. Nothing prevents an implementation from doing more, possibly a complete OWL RL expansion, but we should keep to the minimum.

Defining the step-by-step expansion would mean re-defining what is in the RDF semantics documents. I do not think we should go down that route; the reference in the text to the explicit rules should be enough in my view.

- I have added a note saying that this entailment is the minimum needed for the purposes of RDFa, but that implementations may choose to perform RDFS or OWL entailments... I also took over some of the remarks that appeared in the explicit issues.

- Two more notes in the text:

[[[

  <p class="issue">
      A processor may also expand on the subset of inference rules to include more of the entailment regimes of
      RDFS or OWL. -nl
    </p>

      <p class="issue">
      It may also be noted that processing data using vocabulary term information is useful for other use cases,
      e.g. validation. Datatype validation can be done by checking <code>rdfs:range</code> of a
      <code>owl:DatatypeProperty</code> and producing warnings if it doesn’t match the datatype of a given value.
      Deprecation warnings could also be issued if terms with owl:deprecated true are used. -nl
    </p>
]]]

See my previous remarks. This is all true, but we should not go there, in my view, _in this document_.


We are getting there!

Ivan



On Aug 25, 2011, at 08:34 , Gregg Kellogg wrote:

> Niklas and I worked over the last several days to craft spec language to describe RDFa Vocabulary Expansion to the working copy [1]. There is a small change to section 7.5 step 2 that describes inserting rdfa:vocab triples into the doc (I know, not approved, but it was expedient). Also, another short reference at the end of section 7. The meat of the discussion is in section 10, borrowing heavily on Ivan's great beginnings.
> 
> Note that there are a number of issues added for group discussion.
> 
> Gregg
> 
> [1] http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview.html


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Thursday, 25 August 2011 12:24:15 UTC