Re: [Editorial] xml:id LC - 4 Processing xml:id Attributes from Norman Walsh on 2005-01-11 (public-xml-id@w3.org from January 2005)

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Tue, 11 Jan 2005 14:47:01 -0500
To: Karl Dubost <karl@w3.org>
Cc: public-xml-id@w3.org
Message-id: <87hdln7npm.fsf@nwalsh.com>

/ Karl Dubost <karl@w3.org> was heard to say:
| Hi,
|
| http://www.w3.org/TR/2004/WD-xml-id-20041109/#processing
|
| [[[
|
|  1. The attributeʼs value is normalized according to the rules
| for attribute-value normalization on attributes of type ID. For xml:id
| processors operating on an infoset or some other output from an XML
| parser, the value will already be normalized, but unless the parser
| normalized it as a value of type ID it will still be necessary for the
| processor to trim leading and trailing space (#x20) characters and
| replace sequences of space characters by a single space.
| ]]]
|
| I encourage Daniel Veillard to translate this sentence into French to
| see the difficulties of understanding it.
|
| Please make clear, short and simple statements. The sentence is too
| long and we lose the meaning of it in the middle.

I've changed the offending paragraph to:

  The attribute's value is normalized according to the rules for
  attribute-value normalization on attributes of type ID. For more
  details, see Appendix E, Attribute Value Normalization on IDs. The infoset
  [normalized value] property is updated with the normalized value.

And added a new informative appendix spelling out the situation in
more detail:

  Parsers are required to normalize all attribute values.
  Normalization expands character references, expands entity
  references, and cleans up line end characters. Attributes of type ID
  are subject to additional normalization rules: removing leading and
  trailing whitespace and replacing sequences of spaces with a single
  space.

  The xml:id processor has to ensure that both kinds of normalization
  are performed on all attributes named xml:id. In particular, the
  parser may not have performed the additional normalization required
  for attributes of type ID because the attribute may not be declared
  or may not be declared as an ID.

  Consider the following document:

  <!DOCTYPE doc [
  <!ATTLIST doc xml:id ID #IMPLIED>
  ]>
  <doc xml:id="  one
  ">
  <para xml:id="  two
  "></para>
  </doc>

  The initial value of xml:id on doc will be “one” because
  the parser knew that it was an ID. The initial value on para will be
  “ two ”. Because the parser didn't know it was an ID, it
  will not have performed the additional normalizations required.

  After xml:id processing, the value of the xml:id attributes on doc
  and para will be “one” and “two”, respectively.
  These properly normalized values will be stored in the [normalized
  value] property in the infoset. Performing xml:id processing changes
  the infoset if there are incompletely normalized xml:id attributes.
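
For illustration only, the additional ID normalization might be sketched
as below. This is a minimal sketch, not text from the draft: the function
name normalize_id_value is hypothetical, and it assumes the value has
already been through ordinary attribute-value normalization, so the only
whitespace left in it is the space character (#x20).

```python
import re


def normalize_id_value(value: str) -> str:
    """Apply the extra normalization used for attributes of type ID.

    Assumption: `value` has already had ordinary attribute-value
    normalization applied, so line ends and tabs written literally in
    the document have already become spaces (#x20).
    """
    # Replace each run of space characters with a single space,
    # then trim leading and trailing spaces.
    return re.sub(r" +", " ", value).strip(" ")


# Hypothetical initial values, as a parser might report them for the
# doc and para elements in the example document.
for initial in ("one", " two "):
    print(repr(initial), "->", repr(normalize_id_value(initial)))
```

Only #x20 is touched here; any other whitespace that survives into the
value (for example, via a character reference) is left alone in this
sketch.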

Please let me know if this resolution is satisfactory.

-- 
Norman.Walsh@Sun.COM / XML Standards Architect / Sun Microsystems, Inc.

Received on Tuesday, 11 January 2005 19:47:39 UTC