- From: Graham Klyne <gk@ninebynine.org>
- Date: Fri, 06 Jun 2003 14:01:17 +0100
- To: Martin Duerst <duerst@w3.org>, w3c-rdfcore-wg@w3.org
- Cc: w3c-i18n-ig@w3.org, swick@w3.org
This is a complicated issue, and I may have overlooked some of the reasons we are where we are, but I think Martin's description [1], second part (starting with "The orgininal RDF spec (http://www.w3.org/TR/1999/REC-rdf-syntax-19990222)had very few variation for literals."), is useful and well-presented background. I think the arguments made reinforce the idea of losing XMLLiteral as a datatype, which I previously [2] argued in favour of in response to Martin's earlier comments [5], which were themselves in response to the decision [3] [4] to drop language tags and <rdf-wrapper> stuff from XML literals. So, along the line, it seems we decided: (a) that XML C14N was a parser issue, and the abstract graph syntax is presumed to contain only canonical XML literal data, (b) to drop the language tag from XML literals, hence the need for <rdf-wrapper> in the abstract syntax. Martin argues that (1) XML literals shouldn't be gratuitously different than plain literals, and (2) XML literals really need to have language tags. Thus, I propose that parsetype='Literal' values be treated as plain literals, and that the parsetype='Literal' is simply a flag to the parset to not try to interpret the contained XML data as any form of RDF description. The value of a literal thus described is simply the sequence of characters (after C14N applied to the RDF/XML) contained within the corresponding element, together with the in-scope XML language tag (if any). The current approved definitions, per [6] [7] make it clear that a plain literal denotes either a string or a pair of two strings, one of which is a language tags. After that, there's one final wrinkle raised by Martin's last message, namely the status of xsd:string datatyped literal values. My view on these is that they are syntactically distinct values in the RDF graph, but under a suitable datatyped interpretation will denote the same thing as a corresponding plain literal without language tag. #g -- [1] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003Jun/0023.html [2] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0203.html [3] Meeting minutes 9-May-2003 (item) http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0200.html [4] Jeremy's outline proposal, option 4 approved per [3] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0138.html [5] Martin's comments in response to this decision: http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0200.html [6] Meeting minutes 16-May-2003 http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0199.html [7] Jeremy's proposal for revised literal interpretation, broadly accepted per [6]: http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0151.html [8] Meeting minutes 28-Mar-2003 (item 13): http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0199.html [9] C14N proposal accepted per [5]: http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003May/0199.html At 17:50 05/06/03 -0400, Martin Duerst wrote: >Dear RDF WG, > >I have been actioned by the I18N WG (Core TF) to write to you. > >This is partially a Last Call comment, and partially a comment >on your recently announced post-Last Call changes. It affects >several of your specifications. > >On its recent teleconference, the I18N WG (Core TF) agreed with >my summary of the situation in >http://www.w3.org/mid/4.2.0.58.J.20030605145023.06c05ce0@localhost. > >In particular, we have looked at the current (both in the Last >Call as well as in your later proposal) status of string and >language handling in RDF literals (plain literals, XML literals, >typed literals of XML Schema Datatype 'string'). > >The core arguments for our case are contained in the above email, >but I'll copy them here for your easy reference: > > >>>>>>>> >This situation is not at all satisfactory from the viewpoint >of I18N because: >- We have worked hard to eliminate artificial differences between > text strings that are essentially the same: > - by basing XML and RDF on Unicode, and therefore eliminating > differences in character encoding. > - by working on normalization (NFC) to reduce or avoid accidental > differences based on remaining encoding choices in Unicode > It would be very bad if after all that work, we were left with > gratuitously different ways of representing textual strings due > to idiosyncrasies of a type system. > >- Language tagging is an important aspect of internationalization. > Also, small-scale markup is important for internationalization > (multilanguage strings, bidirectionality, ruby, glyph variants,...). > Both are in many ways natural extensions of plain text strings > as soon as markup is available. > > The current handling of XML literal strings without any actual > markup, as well as the recent change to ignore xml:lang on XML > literals, break this natural extension. > > In addition, the recent change to ignore xml:lang on XML > literals makes language tagging more tedious in the prevalent > case of monolingual or mostly monolingual data. > >>>>>>>> > > >We think that this is a very important issue for RDF and I18N, >and strongly urge you to find a better solution. We think the >proposal given by Ralph is a very good start, but we are sure >you will have other ideas. > > >With kind regards, Martin. ------------------- Graham Klyne <GK@NineByNine.org> PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E
Received on Friday, 6 June 2003 09:15:48 UTC