Re: RDFa Core, fragids, 3023bis, and FYN from Henry S. Thompson on 2011-10-20 (www-tag@w3.org from October 2011)

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Thu, 20 Oct 2011 12:55:00 +0100
To: Jonathan Rees <jar@creativecommons.org>
Cc: www-tag@w3.org
Message-ID: <f5bwrbznai3.fsf@calexico.inf.ed.ac.uk>

Jonathan Rees <jar@creativecommons.org> writes:

> ACTION-509
>
> The question is what, if anything, would we like for the RDFa
> Core 1.1 specification to say about fragment identifiers.  On Thursday
> I volunteered to come up with more options, and here they are.
>
> Why should it say anything at all?  Because documents utilising RDFa
> Core will use fragment identifiers with the same semantics as in other
> RDF serializations, and this should be documented somehow.  There are
> two reasons to think this even though the spec doesn't call it out.
> The first is examples like the following:
>
>     <p about="#bbq" typeof="cal:Vevent">
>       I'm holding
>       <span property="cal:summary">
>         one last summer barbecue
>       </span>,
>       on
>       <span property="cal:dtstart" content="2015-09-16T16:00:00-05:00"
>             datatype="xsd:dateTime">
>         September 16th at 4pm
>       </span>.
>     </p>
>
> The other reason is that users will simply proceed to assume this
> semantics, and it will be supported by processors.  They won't ask
> permission and won't need encouragement.

I think there is a third reason, and that is that it is _already_
stated explicitly in the RDFa Core 1.1 spec.  If I'm right, that
argues that no action is necessary, except perhaps to make that
statement more obvious with respect to its implications _vis a vis_
fragment identifiers.

Here's my reasoning (I'll focus on the 'about' attribute as used in
the above example, but the argument goes through for all RDFa
attributes whose value is specified to allow anything which ends up with
an IRI in the resulting graph):

  1) "A conforming RDFa Processor must make available to a consuming
      application a single RDF graph containing all possible triples
      generated by using the rules in the Processing Model section." [1]

  2) "when converting RDFa to triples, any relative IRIs must be
      resolved relative to the base IRI" [2]

  3) "As processing progresses, any @about attributes will change the
      current subject. The value of @about is an IRI or a CURIE. If it
      is a relative IRI then it needs to be resolved against the
      current base value." [3]

  4) "we'll ... use [Turtle] throughout this document when we need to
      talk about the RDF that could be generated from some RDFa." [4]
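
Points 2 and 3 amount to ordinary relative-reference resolution.  As a
minimal sketch (not the RDFa processing model itself), assuming a base
IRI of http://example.org/summer and using Python's standard urljoin as
a stand-in for the resolution step a processor performs, with
resolve_about being a hypothetical helper name:

```python
from urllib.parse import urljoin, urldefrag

# Assumption: the document's base IRI (no <base> element or xml:base override).
BASE = "http://example.org/summer"


def resolve_about(value, base=BASE):
    """Resolve an @about value that is a relative IRI against the base.

    Hypothetical helper: CURIE and SafeCURIE values of @about are not handled.
    """
    return urljoin(base, value)


subject = resolve_about("#bbq")
print(subject)                       # absolute IRI used as the triple subject
print(urldefrag(subject).fragment)   # the fragment identifier on its own
```

The point of the sketch is only that the fragment identifier survives
resolution and ends up inside the IRI that appears in the graph.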

All of this gives a complete normative justification for the treatment
of the example above, and in particular for the conclusion that a
conformant RDFa processor must supply at least the following graph to
a consuming application on processing that example (assuming, with
JAR, that it is found in a representation retrieved from
http://example.org/summer):

               http://example.org/summer#bbq
                               /|\
                ______________/ | \__________________________________
               |                |                                    |
               |                |______________________              |
               |                                       |             |
http://www.w3.org/1999/02/22-rdf-syntax-ns#type        |             |
               |                                       |             |
               |        http://www.w3.org/2002/12/cal/ical#summary   |
               |               |                                     |
               |               |     http://www.w3.org/2002/12/cal/ical#dtstart 
               |               |________________                 |
               |                                |                |
http://www.w3.org/2002/12/cal/ical#Vevent       |                |
                                                |                |
                                   "one last summer barbecue"    |
                                                                 |
         "2015-09-16T16:00:00-05:00"^^http://www.w3.org/2001/XMLSchema#dateTime
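
The same graph can be written out as three subject/predicate/object
triples.  The following is a hand-built sketch, not the output of a
real RDFa processor: the prefix table and the expand helper are
assumptions for illustration, with namespace IRIs copied from the graph
above, and the literal for cal:summary is given as it appears there.

```python
from urllib.parse import urljoin

BASE = "http://example.org/summer"  # assumed retrieval IRI of the document

# Assumed prefix mappings, taken from the IRIs shown in the graph.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "cal": "http://www.w3.org/2002/12/cal/ical#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(curie):
    """Expand a prefix:local CURIE into a full IRI (hypothetical helper)."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


# about="#bbq" resolved against the base gives the common subject.
subject = urljoin(BASE, "#bbq")

triples = [
    (subject, expand("rdf:type"), expand("cal:Vevent")),
    (subject, expand("cal:summary"), '"one last summer barbecue"'),
    (subject, expand("cal:dtstart"),
     '"2015-09-16T16:00:00-05:00"^^' + expand("xsd:dateTime")),
]

for s, p, o in triples:
    print(s, p, o)
```

Every triple has the fragment-bearing IRI as its subject, which is all
that the argument here depends on.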

At this point it seems to me the RDFa Core spec. has done all that it
needs to.  For the interpretation of that graph, and in particular for
the interpretation of the IRIs which appear in it, it quite properly
defers to the RDF spec:

 "All of the triples that are defined by this specification are
  contained in the output graph by an RDFa Processor. For more
  information on graphs and other RDF concepts, see [RDF-CONCEPTS]."

> I think the question can be put as follows:
>
>   Is RDF-style fragid semantics for RDFa Core to be documented
>
>   (a) only in the RDFa Core spec,
>   (b) only in media type (or namespace?) registrations referencing RDFa
>       Core,
>   (c) neither, or
>   (d) both?

If I'm right, the answer is "(c) neither", on the grounds that RDFa
introduces nothing new with respect to the use of fragment identifiers
(that's already in RDF) or with respect to the _interpretation_ of
fragment identifiers, since no RDFa attribute creates an anchor in the
resource corresponding to its host document.

> Note that "documented" could be via a chain of normative references,
> e.g. by reference to RDF Concepts, which is what RFC 3870 does.

The way in which RFC 3870 and RDF Concepts do, or do not, successfully
define the interpretation of fragment identifiers in documents of type
application/rdf+xml is a topic for another email. . .

However, for my argument wrt FYN all that's necessary is that we can
get to RDF Concepts from any use of RDFa.  For RDFa + HTML I believe
we all think the responsibility for this lies with the HTML5 spec.

More generally, I think we can identify two classes of RDFa embedding
in a host language/media-type-governed-representation:

 1) *Witting* embedding, in which the specification for the host
    language explicitly allows RDFa attributes with their RDFa
    semantics and references the RDFa Core spec.;

 2) *Unwitting* embedding, in which no such explicit license is to be
    found in the host language spec.  I have recently argued (see [5])
    that this _must_ only be allowed if the RDFa attributes are
    namespace-qualified.

In either case the FYN chain is complete.

Supposing my argument is found to be convincing, what should the RDFa
Core WG do wrt the Note [6] which reads

  "In some of the examples below we have used IRIs with fragment
   identifiers that are local to the document containing the RDFa
   fragment identifiers shown (e.g., 'about="#me"'). This idiom, which
   is also used in RDF/XML [RDF-SYNTAX-GRAMMAR] and other RDF
   serializations, gives a simple way to 'mint' new IRIs for entities
   described by RDFa and therefore contributes considerably to the
   expressive power of RDFa. Unfortunately, this practice is not at
   present covered by the media type registrations that govern the
   meaning of fragment identifiers (see section 3.5 of the URI
   specification [RFC3986], [RFC3023], and [RFC2854]). For more
   information about fragment identifier semantics, see [WEBARCH]
   section 3.2.1."

?

I would suggest simply replacing the last two sentences (from
"Unfortunately, this practice" until the end of the note) with

   The precise meaning of IRIs which include fragment identifiers when
   they appear in RDF graphs is given in Section 7 of [RDF-CONCEPTS].

> . . .

> Answer (c) shouldn't be dismissed out of hand; here's the case: (I
> don't buy any of this, just trying to make sure it's heard)
>
>   RFC 3986 only says the interpretation "depends on" the media type;
>   that is too vague to be construable as a requirement for FYN.
>   Fragid semantics could depend in some obscure way, it could depend
>   on other things as well, and so on.
>
>   RDF Concepts seems to imply that the FYN story for RDF is different
>   from that of 3986, and that media types are not really involved
>   after all.  That is, it says that for RDF, you get an
>   application/rdf+xml representation (or you imagine doing so), then
>   look at the triples to see what they say.

These two points seem to me to be critiques of RDF Concepts, not of
option (c) for addressing FYN.  It's not the RDFa Core WG's job to fix
alleged bug(s) in RDF Concepts (to which they have, on my account,
gotten us wrt FYN).  They are allowed to depend on it just as e.g. the
RDF Syntax, Turtle and N3 specs do.

>   Among existing media types the only one that connects fragids to
>   RDF-style semantics is application/rdf+xml.  text/n3, text/turtle,
>   and other RDF media type registrations say nothing.
>   application/xhtml+xml handles RDFa via the XHTML namespace
>   declaration, but the RDFa specification cited by the nsdoc says
>   nothing about fragid semantics.  (There is a non-normative reference
>   to a GRDDL transform, but the wording is too weak to make a good
>   connection.)

Per my argument above, the media type question is settled -- text/html
is covered by HTML5's reference to RDFa (not currently there, but
assuming that at least gets sorted) and other witting or unwitting
uses are covered as described above.

ht

[1] http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview-src.html#processorconf
[2] http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview-src.html#s_curieprocessing
[3] http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview-src.html#using--about
[4] http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview-src.html#turtle
[5] http://lists.w3.org/Archives/Public/public-rdfa-wg/2011Oct/0051.html
[6] http://www.w3.org/2010/02/rdfa/sources/rdfa-core/Overview-src.html#s_Syntax_overview
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]
Received on Thursday, 20 October 2011 11:55:27 UTC