- From: Brian McBride <bwm@hplb.hpl.hp.com>
- Date: Thu, 20 Mar 2003 18:41:42 +0000
- To: Jeremy Carroll <jjc@hplb.hpl.hp.com>, www-rdf-comments@w3.org, eric@w3.org
Eric,

To track this comment, I've added you as a cosubmitter of reagle 01 and 02:

  http://www.w3.org/2001/sw/RDFCore/20030123-issues/#reagle-01
  http://www.w3.org/2001/sw/RDFCore/20030123-issues/#reagle-02

Please let us know as soon as you can if this does not fully capture your
comment.

Brian

At 21:35 10/03/2003 +0100, Jeremy Carroll wrote:
>Hi Eric,
>
>I dropped the ball with your message
>http://lists.w3.org/Archives/Public/www-rdf-comments/2003JanMar/0240.html
>
>My co-editors have pointed out my mistake ...
>
>I reply inline - but highlight that there is a potential editorial issue of
>clarifying that a DOCTYPE cannot be included with XMLLiterals.
>Please confirm that you do want that treated as a last call issue.
>
>I will copy you on further messages to Joe Reagle concerning reagle-01 and
>reagle-02; I take you as having expressed interest in these issues.
>
>Reagle:
> >> > I'm confused by this because most of the specifications are citing
> >> > Canonical XML (c14n), not Exclusive Canonicalization (exc-c14n).
>
>Carroll:
> >> The process is intended to be two-phase:
> >>
> >> The first phase takes an RDF/XML document and constructs an RDF
> >> graph. In this phase it is not required to actually canonicalize,
> >> but it is required to retain all the information needed for
> >> exc-c14n.
> >
>Eric:
> >Since identical strings are considered the same object in the RDF
> >model, it may be worth applying exc-c14n as parseType="Literal"s are
> >imported into the graph. This would apply if one were using an API to
> >create XML-encoded nodes.
> >  graph->createLiteral("<html>...</html>", XMLLiteral)
> >If it is being parsed (as opposed to provided by an API or translated
> >from another triples language), the parseType="Literal" data should
> >already be canonicalized. (This eases the burden on such parsers as they
> >need not perform any canonicalization, though they may choose to for
> >backward compatibility, as I did for annotea.)
> >
>I am not sure of the status intended with the above comment.
>It is not dissimilar to some text I am asking the WG to consider,
>viz:
>[[
>Note: For systems which reason about RDF graphs
>it is suggested that the canonicalization be
>performed on XML input. The internal representation
>and non-XML external representations should be
>in canonical form.
>]]
>
>
> >> The second phase, which many RDF applications don't actually ever do
> >> is from the graph to its formal meaning; for these it concerns the
> >> meaning of the string delivered by the parser. This second stage is
> >> determined by the mapping defined in RDF Concepts. This second stage
> >> uses c14n on the grounds that whatever the parser delivered (which
> >> is intended as implementation dependent) is then preserved.
> >
> >I think this assumption limits the responsibility of the RDF engine to
> >those semantics which are expressed in c14n subset of XML, as opposed
> >to the string that looks like XML. If one uses an API to create a node
> >  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> >    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> >  <html>...</html>
> >and wishes to preserve the doctype, the node must be entity-encoded
> >and stored as CharData.
>
>The intent is to limit the responsibility as you indicate.
>
> >
> > Perhaps some of this text would serve as a
> >warning in the specification somewhere in the XML Content section [4].
>
>Do you want this comment treated as an issue? Otherwise it will
>get lost (promise!)
>
> >>
> >> The fundamental problem we are addressing is *how* to represent XML
> >> content within an RDF graph. This XML content originates from an
> >> RDF/XML document, but, that original context gets lost. Thus we face
> >> a number of problems familiar in exc-c14n, what to do about
> >> entities?, what to do about visibly used namespaces? what to do with
> >> namespaces that are present but not visibly used?
> >> These issues are
> >> the pressing ones that are addressed by the Last Call docs. A
> >> further issue of making sure that two different implementations get
> >> exactly the same answer was not one that we felt it necessary to
> >> address. I will ask the WG to reconsider whether this was correct
> >> as part of the LC process.
> >
> >I suspect that the easiest path is to use exc-c14n in the concepts
> >document per issue reagle-02 [1]. This eliminates reagle-01 [2].
>
>This proposal is now before the WG.
>
> >
> >The third issue [3] raised simply requires a clarification.
>
>This has been done.
>
> >
> >> > > This behaviour is conformant but not required.
> >> To the RDF Last Call documents.
> >
> >> Thanks for your comments, Brian should assign an issue number
> >> concerning the implementation variability, Pat should follow up on
> >> the misleading wording about the xsd namespace in semantics.
> >
> >Implementation experience:
> >
> >Annotea has to parse and reproduce plain and XML literals. These are
> >stored in the triple store along with their encoding (PLAIN or
> >XML). When serializing the product of a graph query (like properties
> >of things annotating "http://www.w3.org/": ((annotates ?a
> >http://www.w3.org/)(?p ?a ?o))), it entity-encodes PLAIN literals and
> >wraps XML encoded ones in a parseType="Literal".
> >  <r:Description r:about="foo"><p1>some data</p1></r:Description>
> >and
> >  <r:Description r:about="foo"><p1 parseType="Literal">some data</p1></r:Description>
> >do not refer to the same object as the encoding is a key in the
> >Literals table.
> >
>I am not reading any issue that needs addressing in the
>above implementation experience.
>
>Sorry again for the delay in reply
>
>Jeremy
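
[Illustrative sketch, not part of the original message.] The implementation
experience quoted above describes a literal store in which the encoding
(PLAIN or XML) is part of the key, and a serializer that entity-encodes PLAIN
literals while wrapping XML ones in parseType="Literal". A minimal Python
sketch of that behaviour might look as follows. The names Encoding,
LiteralTable, intern and serialize_property are hypothetical and are not
taken from Annotea; the parseType attribute is written unprefixed, as in the
message, and no canonicalization is attempted.

```python
from enum import Enum
from xml.sax.saxutils import escape


class Encoding(Enum):
    """The two literal encodings named in the message."""
    PLAIN = "PLAIN"
    XML = "XML"


class LiteralTable:
    """Hypothetical literal store; the key is (encoding, text)."""

    def __init__(self):
        self._ids = {}

    def intern(self, text, encoding):
        # Because the encoding is part of the key, the same characters
        # stored as PLAIN and as XML receive different identifiers.
        key = (encoding, text)
        if key not in self._ids:
            self._ids[key] = len(self._ids)
        return self._ids[key]


def serialize_property(name, text, encoding):
    """Serialize one property element as the message describes."""
    if encoding is Encoding.XML:
        # XML literals are emitted verbatim inside parseType="Literal".
        return f'<{name} parseType="Literal">{text}</{name}>'
    # PLAIN literals are entity-encoded (&, <, >) as character data.
    return f"<{name}>{escape(text)}</{name}>"


table = LiteralTable()
plain_id = table.intern("some data", Encoding.PLAIN)
xml_id = table.intern("some data", Encoding.XML)
print(plain_id != xml_id)  # True: same characters, different literals
print(serialize_property("p1", "a < b", Encoding.PLAIN))
print(serialize_property("p1", "<b>some data</b>", Encoding.XML))
```

Keying on the encoding is what makes the two r:Description examples in the
message denote different objects, even though their character content is
identical.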
Received on Thursday, 20 March 2003 13:40:39 UTC