- From: Graham Klyne <gk@ninebynine.org>
- Date: Fri, 06 Jun 2003 14:01:17 +0100
- To: Martin Duerst <duerst@w3.org>, w3c-rdfcore-wg@w3.org
- Cc: w3c-i18n-ig@w3.org, swick@w3.org
This is a complicated issue, and I may have overlooked some of the reasons
we are where we are, but I think Martin's description [1], second part
(starting with "The original RDF spec
(http://www.w3.org/TR/1999/REC-rdf-syntax-19990222) had very few variations
for literals."), is useful and well-presented background.
I think the arguments made reinforce the idea of dropping XMLLiteral as a
datatype. I argued in favour of this previously [2], in response to
Martin's earlier comments [5], which were in turn a response to the
decision [3] [4] to drop language tags and the <rdf-wrapper> mechanism
from XML literals.
So, along the line, it seems we decided:
(a) that XML C14N was a parser issue, and the abstract graph syntax is
presumed to contain only canonical XML literal data,
(b) to drop the language tag from XML literals, hence the need for
<rdf-wrapper> in the abstract syntax.
Martin argues that (1) XML literals shouldn't be gratuitously different
from plain literals, and (2) XML literals really need to have language tags.
Thus, I propose that parseType='Literal' values be treated as plain
literals, and that parseType='Literal' is simply a flag telling the parser
not to interpret the contained XML data as any form of RDF description.
The value of a literal thus described is simply the sequence of characters
(after C14N has been applied to the RDF/XML) contained within the
corresponding element, together with the in-scope XML language tag (if any).
The current approved definitions, per [6] [7], make it clear that a plain
literal denotes either a string or a pair of strings, one of which is a
language tag.
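
A minimal sketch of this reading, in Python, may make the proposal
concrete. The names PlainLiteral, denotation and literal_from_parsetype
are hypothetical, invented for illustration only. The sketch assumes the
parser has already applied C14N and hands over the canonical character
sequence together with the in-scope language tag (if any).

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class PlainLiteral:
    """Hypothetical model of a plain literal: text plus optional language tag."""
    text: str
    lang: Optional[str] = None


def denotation(lit: PlainLiteral) -> Union[str, Tuple[str, str]]:
    """A plain literal denotes a string, or a (string, language tag) pair."""
    if lit.lang is None:
        return lit.text
    return (lit.text, lit.lang)


def literal_from_parsetype(canonical_xml: str,
                           in_scope_lang: Optional[str]) -> PlainLiteral:
    """Treat parseType='Literal' content as an ordinary plain literal.

    Assumption: canonical_xml is the element content after C14N; the
    markup is kept as plain characters and never interpreted as RDF.
    An empty or absent language tag is treated as "no language".
    """
    return PlainLiteral(canonical_xml, in_scope_lang or None)


if __name__ == "__main__":
    lit = literal_from_parsetype("<em>chat</em>", "fr")
    print(denotation(lit))
```

Under this sketch an XML literal with no actual markup is
indistinguishable from the corresponding plain literal, which is the
point of Martin's argument (1).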
After that, there's one final wrinkle raised by Martin's last message,
namely the status of xsd:string datatyped literal values. My view on these
is that they are syntactically distinct values in the RDF graph, but under
a suitable datatyped interpretation will denote the same thing as a
corresponding plain literal without language tag.
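
The distinction can be sketched as follows, again in Python. The Literal
class and denote function are hypothetical names, and the sketch assumes
the lexical-to-value mapping of xsd:string is the identity on strings; it
handles only plain literals and xsd:string, nothing else.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class Literal:
    """Hypothetical graph-level literal: syntax only, no interpretation."""
    lexical: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


def denote(lit: Literal) -> Union[str, Tuple[str, str]]:
    """Denotation under an interpretation that recognises xsd:string."""
    if lit.datatype == XSD_STRING:
        # Assumed: xsd:string maps each lexical form to the same string.
        return lit.lexical
    if lit.datatype is None:
        # Plain literal: a string, or a (string, language tag) pair.
        return lit.lexical if lit.lang is None else (lit.lexical, lit.lang)
    raise NotImplementedError("only plain and xsd:string literals are sketched")


if __name__ == "__main__":
    plain = Literal("abc")
    typed = Literal("abc", datatype=XSD_STRING)
    print(plain == typed)                   # syntactic comparison in the graph
    print(denote(plain) == denote(typed))   # comparison of what they denote
```

The two literals compare unequal as graph syntax but have the same
denotation, whereas a plain literal carrying a language tag denotes a
pair and so differs from both.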
#g
--
[1] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003Jun/0023.html
[2] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0203.html
[3] Meeting minutes 9-May-2003 (item)
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0200.html
[4] Jeremy's outline proposal, option 4 approved per [3]
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0138.html
[5] Martin's comments in response to this decision:
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0200.html
[6] Meeting minutes 16-May-2003
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0199.html
[7] Jeremy's proposal for revised literal interpretation,
broadly accepted per [6]:
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0151.html
[8] Meeting minutes 28-Mar-2003 (item 13):
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0199.html
[9] C14N proposal accepted per [5]:
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0199.html
At 17:50 05/06/03 -0400, Martin Duerst wrote:
>Dear RDF WG,
>
>I have been actioned by the I18N WG (Core TF) to write to you.
>
>This is partially a Last Call comment, and partially a comment
>on your recently announced post-Last Call changes. It affects
>several of your specifications.
>
>On its recent teleconference, the I18N WG (Core TF) agreed with
>my summary of the situation in
>http://www.w3.org/mid/4.2.0.58.J.20030605145023.06c05ce0@localhost.
>
>In particular, we have looked at the current (both in the Last
>Call as well as in your later proposal) status of string and
>language handling in RDF literals (plain literals, XML literals,
>typed literals of XML Schema Datatype 'string').
>
>The core arguments for our case are contained in the above email,
>but I'll copy them here for your easy reference:
>
> >>>>>>>>
>This situation is not at all satisfactory from the viewpoint
>of I18N because:
>- We have worked hard to eliminate artificial differences between
> text strings that are essentially the same:
> - by basing XML and RDF on Unicode, and therefore eliminating
> differences in character encoding.
> - by working on normalization (NFC) to reduce or avoid accidental
> differences based on remaining encoding choices in Unicode
> It would be very bad if after all that work, we were left with
> gratuitously different ways of representing textual strings due
> to idiosyncrasies of a type system.
>
>- Language tagging is an important aspect of internationalization.
> Also, small-scale markup is important for internationalization
> (multilanguage strings, bidirectionality, ruby, glyph variants,...).
> Both are in many ways natural extensions of plain text strings
> as soon as markup is available.
>
> The current handling of XML literal strings without any actual
> markup, as well as the recent change to ignore xml:lang on XML
> literals, break this natural extension.
>
> In addition, the recent change to ignore xml:lang on XML
> literals makes language tagging more tedious in the prevalent
> case of monolingual or mostly monolingual data.
> >>>>>>>>
>
>
>We think that this is a very important issue for RDF and I18N,
>and strongly urge you to find a better solution. We think the
>proposal given by Ralph is a very good start, but we are sure
>you will have other ideas.
>
>
>With kind regards, Martin.
-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E
Received on Friday, 6 June 2003 09:15:48 UTC