
Re: Comment/testcase I18N-01?

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Date: Fri, 18 Jul 2003 14:13:18 +0100
To: Brian McBride <bwm@hplb.hpl.hp.com>
Cc: rdf core <w3c-rdfcore-wg@w3.org>, Martin Duerst <duerst@w3.org>, i18n <w3c-i18n-ig@w3.org>
Message-Id: <20030718141318.36dd9480.dave.beckett@bristol.ac.uk>

On 18 Jul 2003 13:11:51 +0100
Brian McBride <bwm@hplb.hpl.hp.com> wrote:

> One of the things that has come out of our discussions with I18N has, I
> think, been that we better understand the relationship between sequences
> of characters and markup.  Whilst I18N did not specifically point out
> this flaw in the current syntax spec, I think it should be attributed to
> their input.

I'm not sure what the problem is; it's clear to me, and I don't see
where such a relationship needs further explaining.  We defer on XML
matters to where they belong: the XML specifications (XML, Namespaces in
XML, Exclusive XML Canonicalization) and, to some degree, Charmod
(although not normatively, since that's still a WD).

> Consider the test case:
> 
> <rdf:Description>
>   <eg:prop rdf:parseType="Literal"><em>&lt;br /></em></eg:prop>
> </rdf:Description>
> 
> If my reading of the syntax document [1] is correct, it states that this
> is equivalent to (with a little license in the syntax):
> 
> _:a <eg:prop> "<em><br /></em>"^^rdf:XMLLiteral .
> 
> I believe that should read
> 
> _:a <eg:prop> "<em>&lt;br /></em>"^^rdf:XMLLiteral .
> 
> or some variation on that theme, to preserve the distinction between
> markup and content.  We need to decide exactly what characters get
> escaped.
> 
> DaveB: is my reading of syntax correct?

No.

That string is entirely defined by the Exclusive XML Canonicalization
text (with comments), which is linked directly from [1].  That in turn
normatively uses Canonical XML, which says:

  "Special characters in attribute values and character content are replaced by character references"
  -- http://www.w3.org/TR/2001/REC-xml-c14n-20010315#Terminology

and in detail:
  [[Text Nodes- the string value, except all ampersands are replaced by
  &amp;, all open angle brackets (<) are replaced by &lt;, all closing
  angle brackets (>) are replaced by &gt;, and all #xD characters are
  replaced by &#xD;.]]
  2.3 Processing Model, http://www.w3.org/TR/2001/REC-xml-c14n-20010315#ProcessingModel

So the triple should be:

_:a <eg:prop> "<em>&lt;br /&gt;</em>"^^rdf:XMLLiteral .
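
As a rough illustration of the text-node rule quoted above, here is a
minimal Python sketch.  The function name c14n_escape_text is made up
for this example, and it covers only the four text-node replacements
from the Canonical XML processing model, not attribute values,
namespaces or anything else a real canonicalizer handles.

```python
def c14n_escape_text(value: str) -> str:
    """Escape the string value of a text node as quoted from Canonical XML.

    Hypothetical helper for illustration only; not a full canonicalizer.
    """
    # Replace ampersands first so the entities added below are not re-escaped.
    return (
        value.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\r", "&#xD;")
    )


# After XML parsing, the text node inside <em> has the string value "<br />"
# (the parser has already decoded &lt; to "<").
text_node = "<br />"

# Re-serialise the element content with the text node escaped.
literal = "<em>" + c14n_escape_text(text_node) + "</em>"
print(literal)
```

Both angle brackets of the text node come out escaped, while the <em>
tags, being markup rather than character content, are written as is.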

> Martin/I18N - would you endorse this comment?
> 
> Brian
> 
> [1]
> http://ilrt.org/discovery/2001/07/rdf-syntax-grammar/#parseTypeLiteralPropertyElt

Dave
Received on Friday, 18 July 2003 09:14:48 EDT
