Re: Charmod review

>>>Jeremy Carroll said:
> I have now reviewed charmod.

I have also reviewed it, looking at the RDF impact and Jeremy's comments.

> I attach a summary of the things we would need to consider to conform with
> charmod.
> I do not propose we address these in our specs, but postpone this issue to
> RDF 2.

I wouldn't use the term RDF 2; instead say: not in this working group
under the present charter.

> Here is my proposed comment on charmod.
> Given that they use some web based system for recording issues the actual
> form of any issues we raise may differ, but I suggest we discuss this text.
> 
> Up to and including the first paragraph of the body of the message looks
> like an e-mail, with the last two paragraphs each adding one issue to their
> issue list.
> 
> =========
> 
> The RDF Core WG has feedback concerning the following sections
> of charmod:
> 
> 1. Introduction
> 2. Conformance
> 3.4 Strings
> 3.5 Reference Processing Model
> 4. Early Uniform Normalization
> 6. String Identity Matching
> 8. Character Encoding in URI References
> 9. Referencing the Unicode Standard
> A.2 Other References
> C. Composing Characters
> D. Resources for Normalization
> 
> 
> {{ the other sections are not relevant to RDF }}
> 
> 
> Dave, please review section 9.
> 
> http://www.w3.org/TR/charmod#sec-RefUnicode

The syntax WD doesn't refer to Unicode.  The test cases WD ntriples
section needs updating to use the new wording.

> For sections 1, 2, 3.4, 4, 6, 9, C, D,
> RDF Core fully endorses the last call working draft.

That's rather sweeping.  What does it say about the other sections of
3 and the other sections that you don't mention?

In detail, most things are related to
ntriples http://www.w3.org/TR/rdf-testcases/#ntriples
or syntax http://www.w3.org/TR/rdf-syntax-grammar/

ntriples 
  charmod 1.2 -  Have to change to use U+hhhh format

  charmod 3.4 - the charmod Character String term is already used;
    should we remove this, along with all references to charmod?

  charmod 3.5 - the right range of Unicode characters is used

  charmod 3.6.1 - ntriples mandates US-ASCII, which charmod forbids;
    this breaks the MUST in the 2nd para.

    The obvious fix would be to mandate utf-8, which would require an
    extra processing stage for ntriples before doing the processing
    of the resulting unicode characters.  The \u and \U escapes might
    remain?

  charmod 3.7 - mostly conforms here.  But the "escaped chars SHOULD
    be acceptable wherever unescaped chars are" isn't yet met, since
    escapes aren't allowed in names such as blank node ids, or in
    comments.  The last sentence notes this is a SHOULD, so it isn't
    actually REQUIRED.

  charmod 4.2.1 - I have no idea which normalizing phrase we use here 
    - not include-normalized, given ntriples has no inclusion.

  charmod 4.3.2 - should U+0338 also be excluded from the characters of
    an extended ntriples identifier class (if it went to a UTF-8 encoding)?
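
To make the 1.2 and 3.6.1 points concrete, here is a minimal sketch of
how a Unicode string is squeezed into US-ASCII ntriples today.  It
assumes the escape set in the current test cases WD (\\, \", \n, \r,
\t, \uHHHH and \UHHHHHHHH); the function names ntriples_escape and
u_plus are hypothetical, made up for illustration only.

```python
def ntriples_escape(text: str) -> str:
    """Escape a Unicode string so that the result is pure US-ASCII."""
    simple = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in simple:
            out.append(simple[ch])          # short named escapes
        elif 0x20 <= cp <= 0x7E:
            out.append(ch)                  # printable US-ASCII passes through
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)      # 4 hex digits
        else:
            out.append("\\U%08X" % cp)      # 8 hex digits, beyond the BMP
    return "".join(out)


def u_plus(ch: str) -> str:
    """Format one character in the U+hhhh notation that charmod asks for."""
    return "U+%04X" % ord(ch)


escaped = ntriples_escape("caf\u00e9")
escaped.encode("ascii")   # would raise if anything non-ASCII survived
```

If ntriples moved to UTF-8, the pass-through branch would simply widen
and the escapes could stay as an optional alternative form.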



syntax
  charmod 2 conformance - need we add a conformance statement?  even
     if not conforming to charmod?
  charmod 2:
    [I] Where this specification contains a procedural
        description, it MUST be understood as a way to specify the
        desired external behavior. Implementations MAY use other ways
        of achieving the same results, as long as observable behavior
        is not affected.
   may need emphasising although we already have:
       [[This document illustrates one way to create the N-Triples
         from the XML - any other method that results in the same
         N-Triples (RDF graph) may be used.]]
       -- http://www.w3.org/TR/rdf-syntax-grammar/#section-Introduction

  charmod 3.4 - the Character String term may need to be used in the
    future graph model section

  charmod 3.5 - reference processing model
    we conform to this since we are based on XML

  charmod 4.2.1 - I have no idea which normalizing phrase we use here

  charmod 4.4 -
    "[S] Specifications MUST document any security issues related to
    normalization."
    -- is this stuff we take from the internet draft or do we need to
    demonstrate / point to example of encoding URIs that look the
    same but are different?

  charmod 8 - char encoding in uri references
     several things here to conform to
     do we really need to cite the IRI internet draft?
     I agree with Jeremy that this is too early.

  charmod 8 - fragment identifier (last para)
    since we define the fragment identifier for RDF, this says we
    need to address the chars outside US-ASCII and how they are
    encoded in it.

    I agree with Jeremy - our specification is in accordance with
    (sufficient for?) handling IRIs but does not explicitly support
    them

  charmod 9 -
    xml 1.1 proposes changes to cite unicode terms for legal name
    chars.  I expect we aren't going to depend on 1.1 (also a WD), so
    does this need spelling out for, say, RDF IDs?
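
On the 4.4 question above, this is the sort of example I have in mind:
a minimal sketch of two URI references that display the same but
differ codepoint by codepoint.  The example.org URIs are made up; the
only library call is Python's standard unicodedata.normalize.

```python
import unicodedata

# Same rendering, different codepoints.
composed = "http://example.org/caf\u00e9"       # e-acute as U+00E9
decomposed = "http://example.org/cafe\u0301"    # 'e' + U+0301 combining acute

# Plain string comparison treats them as different identifiers.
same_as_strings = composed == decomposed

# After normalizing to NFC the two strings agree.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

print(same_as_strings, same_after_nfc)
```

Whether we point to something like this or just reference other text
is the open question.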


General comments

charmod 4.3 is just baffling in the levels of inclusion, entities,
  escaping and normalizing.

charmod 4.3.1 sentences are very complex, some mentioning 3 types of
  normalizing and two parenthetical clauses in one sentence.

charmod 4.3.2 does not explain what '-' means.  Is it 'No', or 'N' to
  contrast with 'Y'?

charmod 4.4 - I'd hate to have to explain this.  Some of these
  statements seem web architectural and will need a lot of buy-in

charmod 8 - charmod MUST NOT depend on the IRI I-D, and even if it
  were an RFC, it would need a lot more buy-in before the MUST could
  be used.  Charmod #2 maybe.

> For the section 3.5 we note that the language is somewhat offputting for us
> as specification developers given that our specification explicitly does
> not have a processing model. We have no particular suggestions about this,
> nor would we object if the I18N WG chose not to address this issue.

So this section should be labelled 'IF the spec has a processing model...'


> Our main concern is with section 8 (and the IRI reference in A.2),
> particularly the requirement that specs "SHOULD use Internationalized
> Resource Identifiers (IRI) [I-D IRI]".
> The IRI draft is only a draft, the reference to it is not normative, and
> the strength of this SHOULD dependency appears excessive ("not optional").

+1

> In particular, the IRI draft does not
> adequately address IRI equality (not merely functional equivalence in
> retrieval).

I haven't reviewed this but, if it doesn't at this draft stage, we
definitely shouldn't use it, since equality is all we require.
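
To illustrate why equality matters to us, a minimal sketch, assuming
comparison is plain character-by-character string matching and that
non-ASCII characters are converted as charmod 8 describes (UTF-8, then
%HH).  The URI is made up; quote() is from Python's standard
urllib.parse.

```python
from urllib.parse import quote

# A reference with a non-ASCII character in the fragment identifier.
iri = "http://example.org/doc#caf\u00e9"

# Encode as UTF-8, then %HH-escape the non-ASCII bytes;
# ':', '/' and '#' are left alone.
uri = quote(iri, safe=":/#")

# Both name the same thing for retrieval purposes, but as strings
# they are not equal.
equal_as_strings = iri == uri

print(uri, equal_as_strings)
```

A spec that only needs equality has to say which of the two forms gets
compared.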

> Moreover, the bidi section presents a learning curve which developers are
> unlikely to want to climb before IRI has consensus around it. We have found
> the text in XLink section 5.4 and XML Erratum 26 adequately clear for some
> of the IRI questions, particularly those ...

Not sure what bidi impact you mean?  Is that the "logical order" requirement?

> ...
> that are most pressing for RDF and believe that charmod should merely:
> - reiterate such text;
> - reiterate the early uniform normalization model for the iris when
> regarded as unicode strings

Those seem sensible

> -  note that section 8 will be superseded by IRI-draft when it becomes a
> recommendation.

I do not recommend putting such future-guessing statements in docs.
If IRI becomes an RFC, charmod should make a new version (WD --> REC)

> I am independently raising some very minor editorial issues.
> 
> Jeremy

Jeremy - can you take my comments, merge and edit them to give some
consistent summary?

Dave

Received on Friday, 17 May 2002 07:41:47 UTC