Re: Please review RDF Last Call from Brian McBride on 2003-03-20 (www-rdf-comments@w3.org from January to March 2003)

From: Brian McBride <bwm@hplb.hpl.hp.com>
Date: Thu, 20 Mar 2003 18:41:42 +0000
To: Jeremy Carroll <jjc@hplb.hpl.hp.com>, www-rdf-comments@w3.org, eric@w3.org
Message-Id: <5.1.0.14.0.20030320183310.0784ce08@localhost>
Eric,

To track this comment, I've added you as a cosubmitter of reagle 01 and 02:

   http://www.w3.org/2001/sw/RDFCore/20030123-issues/#reagle-01
   http://www.w3.org/2001/sw/RDFCore/20030123-issues/#reagle-02

Please let us know as soon as you can if this does not fully capture your 
comment.

Brian

At 21:35 10/03/2003 +0100, Jeremy Carroll wrote:


>Hi Eric,
>
>I dropped the ball with your message
>http://lists.w3.org/Archives/Public/www-rdf-comments/2003JanMar/0240.html
>
>My co-editors have pointed out my mistake ...
>
>I reply inline - but highlight that there is a potential editorial issue of
>clarifying that a DOCTYPE cannot be included with XMLLiterals.
>Please confirm that you do want that treated as a last call issue.
>
>I will copy you on further messages to Joe Reagle concerning reagle-01 and
>reagle-02; I take you as having expressed interest in these issues.
>
>Reagle:
> >> > I'm confused by this because most of the specifications are citing
> >> > Canonical XML (c14n), not Exclusive Canonicalization (exc-c14n).
>
>Carroll:
> >> The process is intended to be two-phase:
> >>
> >> The first phase takes an RDF/XML document and constructs an RDF
> >> graph.  In this phase it is not required to actually canonicalize,
> >> but it is required to retain all the information needed for
> >> exc-c14n.
> >
>Eric:
> >Since identical strings are considered the same object in the RDF
> >model, it may be worth applying exc-c14n as parseType="Literal"s are
> >imported into the graph. This would apply if one were using an API to
> >create XML-encoded nodes.
> >  graph->createLiteral("<html>...</html>", XMLLiteral)
> >If it is being parsed (as opposed to provided by an API or translated
> >from another triples language), the parseType="Literal" data should
> >already be canonicalized. (This eases the burden on such parsers as they
> >need not perform any canonicalization, though they may choose to for
> >backward compatibility, as I did for annotea.)
>
>
>I am not sure of the status intended with the above comment.
>It is not dissimilar to some text I am asking the WG to consider,
>viz:
>[[
>Note: For systems which reason about RDF graphs
>it is suggested that the canonicalization be
>performed on XML input. The internal representation
>and non-XML external representations should be
>in canonical form.
>]]
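A minimal sketch of canonicalizing on input, so that XML literals which
differ only in surface syntax end up as the same string in the store. It
assumes Python 3.8+ and uses the standard library's
xml.etree.ElementTree.canonicalize, which implements C14N 2.0 rather than
the exc-c14n discussed in this thread. LiteralStore and create_literal are
hypothetical names used only for illustration.

```python
from xml.etree.ElementTree import canonicalize


class LiteralStore:
    """Hypothetical store that interns XML literals by canonical form."""

    def __init__(self):
        self._literals = {}

    def create_literal(self, xml_text):
        # Canonicalize on the way in (C14N 2.0 here, standing in for exc-c14n).
        canonical = canonicalize(xml_text)
        # setdefault returns the already-stored string if one exists.
        return self._literals.setdefault(canonical, canonical)

    def __len__(self):
        return len(self._literals)


store = LiteralStore()
# Same element, different attribute order, quoting and empty-element syntax.
a = store.create_literal('<p b="2" a="1"/>')
b = store.create_literal("<p a='1' b='2'></p>")
```

With this approach the two calls yield one stored literal, so a graph that
treats identical strings as the same object sees a single node.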
>
> >
> >> The second phase, which many RDF applications don't actually ever do,
> >> is from the graph to its formal meaning; for these it concerns the
> >> meaning of the string delivered by the parser. This second stage is
> >> determined by the mapping defined in RDF Concepts. This second stage
> >> uses c14n on the grounds that whatever the parser delivered (which
> >> is intended as implementation dependent) is then preserved.
> >
> >I think this assumption limits the responsibility of the RDF engine to
> >those semantics which are expressed in c14n subset of XML, as opposed
> >to the string that looks like XML. If one uses an API to create a node
> >  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> >    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> >  <html>...</html>
> >and wishes to preserve the doctype, the node must be entity-encoded
> >and stored as CharData.
>
>The intent is to limit the responsibility as you indicate.
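A short sketch of the entity-encoding workaround described above, assuming
Python and the standard library's xml.sax.saxutils. The sample string is an
XHTML document with a DOCTYPE along the lines of the one quoted above; the
variable names are illustrative only.

```python
from xml.sax.saxutils import escape, unescape

doc = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html>...</html>'
)

# Entity-encode so the whole document becomes plain character data,
# which can be stored as an ordinary (non-XML) literal.
encoded = escape(doc)

# The original text, DOCTYPE included, is recovered by decoding.
decoded = unescape(encoded)
```

The stored value is then just a string that looks like XML once decoded; the
RDF engine takes no responsibility for its XML semantics.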
>
>
> >               Perhaps some of this text would serve as a
> >warning in the specification somewhere in the XML Content section [4].
>
>Do you want this comment treated as an issue? Otherwise it will
>get lost (promise!)
>
> >>
> >> The fundamental problem we are addressing is *how* to represent XML
> >> content within an RDF graph. This XML content originates from an
> >> RDF/XML document, but, that original context gets lost. Thus we face
> >> a number of problems familiar in exc-c14n: what to do about
> >> entities? what to do about visibly used namespaces? what to do with
> >> namespaces that are present but not visibly used? These issues are
> >> the pressing ones that are addressed by the Last Call docs. A
> >> further issue of making sure that two different implementations get
> >> exactly the same answer was not one that we felt it necessary to
> >> address.  I will ask the WG to reconsider whether this was correct
> >> as part of the LC process.
> >
> >I suspect that the easiest path is to use exc-c14n in the concepts
> >document per issue reagle-02 [1]. This eliminates reagle-01 [2].
>
>This proposal is now before the WG.
>
> >
> >The third issue [3] raised simply requires a clarification.
>
>This has been done.
>
> >
> >> > > This behaviour is conformant but not required.
> >> To the RDF Last Call documents.
> >
> >> Thanks for your comments, Brian should assign an issue number
> >> concerning the implementation variability, Pat should follow up on
> >> the misleading wording about the xsd namespace in semantics.
> >
> >Implementation experience:
> >
> >Annotea has to parse and reproduce plain and XML literals. These are
> >stored in the triple store along with their encoding (PLAIN or
> >XML). When serializing the product of a graph query (like properties
> >of things annotating "http://www.w3.org/": ((annotates ?a
> >http://www.w3.org/)(?p ?a ?o))), it entity-encodes PLAIN literals and
> >wraps XML encoded ones in a parseType="Literal".
> >  <r:Description r:about="foo"><p1>some data</p1></r:Description>
> >and
> >  <r:Description r:about="foo"><p1 parseType="Literal">some data</p1></r:Description>
> >do not refer to the same object as the encoding is a key in the
> >Literals table.
>
>
>I am not reading any issue that needs addressing in the
>above implementation experience.
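A rough sketch of the storage and serialization behaviour described in the
implementation experience above, written in Python. This is not Annotea's
code: literal_key and serialize_property are hypothetical helpers, and the
only behaviour taken from the message is that the encoding is part of the
literal's key, that PLAIN literals are entity-encoded, and that XML literals
are wrapped in parseType="Literal".

```python
from xml.sax.saxutils import escape

PLAIN, XML = "PLAIN", "XML"


def literal_key(text, encoding):
    # The encoding is part of the key, so equal text with different
    # encodings names two distinct literals.
    return (encoding, text)


def serialize_property(name, text, encoding):
    if encoding == XML:
        # XML literals are emitted as-is inside parseType="Literal".
        return f'<{name} parseType="Literal">{text}</{name}>'
    # Plain literals are entity-encoded.
    return f"<{name}>{escape(text)}</{name}>"
```

Under these assumptions, "some data" stored as PLAIN and "some data" stored
as XML have different keys, which matches the statement that the two
r:Description examples do not refer to the same object.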
>
>Sorry again for the delay in reply
>
>Jeremy
Received on Thursday, 20 March 2003 13:40:39 UTC