Re: 2001-09-07#5 Literals from Jeremy Carroll on 2001-09-26 (w3c-rdfcore-wg@w3.org from September 2001)

From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
Date: Wed, 26 Sep 2001 22:40:23 +0100
To: w3c-rdfcore-wg@w3.org
Message-ID: <3BB24B47.220EE2F7@hplb.hpl.hp.com>
DaveB
> Whoa! #2 - this is far too much to take in.
> It is too early to approve it.

OK.



> After a quick scan:
>
>  * What Issues do all these points address?  Please re-order by
>    issue.  If they address none, why are they there?

I won't do the re-ordering immediately, although that's a good idea.

I will try and enumerate the issues.

From the charter:
> clarifications and improvements to the specification of 
> RDF's abstract model and XML syntax.

and

> the RDF Core WG must provide an account of the relationships between 
> the basic components of RDF (Model, Syntax, Schema) and the larger 
> XML family of recommendations.

From the issue list:
  rdfms-literal-is-xml-structure
  rdfms-xmllang
  rdfms-xml-literal-namespaces
  

I interpret the "clarifications and improvements" part to include
looking at semi-finished parts of M&S and clarifying them, particularly
in the light of charmod which is a useful document, which the M&S
writers appear to have been expecting and wanting to conform to.

Literal equality is a key issue overtly labelled as unfinished in M&S.
M&S fudges equality on both xml literals (where it is explicity silent)
and Unicode strings (where it refers to an unwritten document from I18N
WG - charmod)
Since M&S was written there have been important advances in XML
technology like infoset and canonicalization, and Unicode normalization.

All the normalization stuff is there to support literal equality in
accordance with charmod, in a way that is appropriate for a world-wide
standard.

The stuff to do with URI pairing is there to leave a door open if that
is a good way to go for XML Schema datatypes 



>  * It seems to require changing existing software in unclear ways.

Yes, it seems so, but no, no change is necessary for most software.
Areas where change may appear to be necessary:
  xml:lang
     - no longer optional for RDF/XML processors
     - xml:lang comparison is case insensitive
       It always was, just not highlighted as such.
     - RFC3066 support; yes that is a change which does break some 
       software.
       However that change has been made by XML 1.0 second edition.

  normalization form C
http://www.unicode.org/unicode/reports/tr15/
     " Implementations of Unicode which restrict themselves to a 
       repertoire containing no combining marks (such as those that 
       declare themselves to be implementations at Level 1 as defined 
       in ISO/IEC 10646-1) are already typically using Normalization 
       Form C "

I think all RDF implementations hence are already using NFC ;-)

  early normalization
       This does put a burden on document writers who wish to use a 
       "repertoire containing combining marks".
       Since the current specs mean that such authors can't reliably 
       use RDF, this burden is an improvement.
       Implementators of anything other than systems that create
       RDF documents in "repertoires containing combining marks"
       are not burdened. In particular RDF/XML processors are only 
       restricted by MUST NOT-ing further normalization, which
       actutally prohibits implementators from working harder!

   XML Canonicalization of literals
       This is deliberately not a MUST.


>
>  * No implementations (I want more than one)
       At some level there is only XML Canonicalization to implement.
       (Well, OK, hardly 'only').
       
   
>
>  * Hypothetical future work (stuff like "RDF Core WG
>    may in future ...  One example is ").  Delete these.


Hmmm ...

I am trying hard to leave the xml schema datatypes issue as unresolved.
Paras [2] and [3], which you appear to be referring to are to make that
explicit.

This document is intended for the WG rather than outside consumption;
although I would hope that large chunks of it could be appropriately
copied verbatim into working drafts; I would agree that paras [2] and
[3] are not such text.


>
>  * Repeats things from other documents e.g.  "the usual XML
>    processing rules" then includes some rules - should cite them

I would value feedback as to where I can do that, I take it you refer to
the rather laborious treatment of the range of freedom that an
implementation has in representing an XML Literal. I had a lot of
difficulty with finding documents to cite, because the XML specs
actually specify what you can do, whereas M&S didn't and hence anything
goes.

With early uniform normalization I have tried to articulate what charmod
means in an RDF context.
I don't really think it would suffice to simply say that we follow early
uniform normalization as given by charmod, without thinking it through,
and articulating that process.


--------------

I also remind everyone that xml:lang is a bit of an afterthought,
because the first WG did not consider internationalization issues until
they rushed it under pressure from the I18N WG. I have tried to give a
coherent account of how charmod applies to RDF Literals; I suspect it is
better to do it now, when we can take our leisure (despite my earlier
suggestions!).


Thanks for your feedback, if you can give it a more thorough attack I
would appreciate it.

Jeremy
Received on Wednesday, 26 September 2001 17:35:59 UTC