LEIRIs (from John Cowan)

[Forwarding to the list for John Cowan who is receiving email,
but not yet able to send.  pbg]

Here are my [John Cowan's] comments on LEIRIs.

The current Internet-Draft says only this about LEIRIs:

7.1. LEIRI processing


   This section defines Legacy Extended IRIs (LEIRIs).  The syntax of
   Legacy Extended IRIs is the same as that for IRIs, except that the
   ucschar production is replaced by the leiri-ucschar production:

     leiri-ucschar  = " " / "<" / ">" / '"' / "{" / "}" / "|"
                      / "\" / "^" / "`" / %x0-1F / %x7F-D7FF
                      / %xE000-FFFD / %x10000-10FFFF

   Among other extensions, processors based on this specification also
   did not enforce the restriction on bidirectional formatting
   characters in Section 4.1, and the iprivate production becomes
   redundant.

   To convert a string allowed as a LEIRI to an IRI, each character
   allowed in leiri-ucschar but not in ucschar must be percent-encoded
   using Section 3.3.
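
The conversion rule quoted above can be sketched in Python roughly as
follows.  This is only an illustration, not text from the draft: it
assumes that ucschar is unchanged from RFC 3987 and that Section 3.3
percent-encoding means encoding each octet of the character's UTF-8
form.  The names leiri_to_iri, in_ranges, UCSCHAR and LEIRI_UCSCHAR
are made up for this example.

```python
# ucschar ranges, assumed to be the same as in RFC 3987.
UCSCHAR = (
    [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
    # Planes 1 through 13: %x10000-1FFFD ... %xD0000-DFFFD
    + [(p << 16, (p << 16) | 0xFFFD) for p in range(1, 14)]
    + [(0xE1000, 0xEFFFD)]
)

# leiri-ucschar as quoted from the Internet-Draft above.
LEIRI_UCSCHAR = (
    [(ord(c), ord(c)) for c in ' <>"{}|\\^`']
    + [(0x0, 0x1F), (0x7F, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF)]
)


def in_ranges(cp, ranges):
    """Return True if code point cp falls inside any (low, high) range."""
    return any(low <= cp <= high for low, high in ranges)


def leiri_to_iri(leiri):
    """Percent-encode every character in leiri-ucschar but not in ucschar."""
    out = []
    for ch in leiri:
        cp = ord(ch)
        if in_ranges(cp, LEIRI_UCSCHAR) and not in_ranges(cp, UCSCHAR):
            # Assumption: encode the UTF-8 octets, one %XX triplet per octet.
            out.append("".join("%%%02X" % b for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)
```

For example, leiri_to_iri("http://example.org/a b") gives
"http://example.org/a%20b", while a character such as U+00E9, which
is already in ucschar, is left alone.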

This is technically consistent with the W3C LEIRI Note.  The draft
does not define a specific LEIRI production, but it tells you what
LEIRIs look like and, by implication, how to validate and parse them.

My only comment, which applies equally to the Note, is that the
characters %x0-%x8, %xB-%xC, %xE-%x1F, and %x7F are permitted to
appear literally in LEIRIs, but not in XML documents, either literally
or as NCRs.  I don't know if it's worth excluding them.  If we decide
to, the leiri-ucschar production would look like this:

     leiri-ucschar  = " " / "<" / ">" / '"' / "{" / "}" / "|"
                      / "\" / "^" / "`" / %x9 / %xA / %xD / %x80-D7FF
                      / %xE000-FFFD / %x10000-10FFFF
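
To see exactly which code points the restricted production removes,
the two productions can be compared mechanically.  The Python sketch
below is only an illustration; the names CURRENT, PROPOSED, allowed
and dropped_code_points are made up for this example, and the ranges
are transcribed from the two productions quoted in this message.

```python
# Alternatives shared by both versions of leiri-ucschar.
COMMON = (
    [(ord(c), ord(c)) for c in ' <>"{}|\\^`']
    + [(0xE000, 0xFFFD), (0x10000, 0x10FFFF)]
)

# Production in the current Internet-Draft.
CURRENT = COMMON + [(0x0, 0x1F), (0x7F, 0xD7FF)]

# Restricted production suggested in this message.
PROPOSED = COMMON + [(0x9, 0x9), (0xA, 0xA), (0xD, 0xD), (0x80, 0xD7FF)]


def allowed(cp, ranges):
    """Return True if code point cp falls inside any (low, high) range."""
    return any(low <= cp <= high for low, high in ranges)


def dropped_code_points():
    """Code points matched by CURRENT but no longer matched by PROPOSED."""
    # The two versions share every alternative from %xE000 upward,
    # so scanning %x0-D7FF is enough to find all differences.
    return [
        cp
        for cp in range(0x0, 0xD800)
        if allowed(cp, CURRENT) and not allowed(cp, PROPOSED)
    ]


if __name__ == "__main__":
    print(" ".join("%%x%X" % cp for cp in dropped_code_points()))
```

Running the script prints the dropped code points in the same %x
notation used by the productions.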

Received on Thursday, 1 July 2010 20:16:44 UTC