
[ttml2] Specify attribute term delimiter as post-normalized space

From: Nigel Megitt via GitHub <sysbot+gh@w3.org>
Date: Thu, 29 Sep 2016 15:27:04 +0000
To: public-tt@w3.org
Message-ID: <issues.opened-180081771-1475162823-sysbot+gh@w3.org>
nigelmegitt has just created a new issue for 
https://github.com/w3c/ttml2:

== Specify attribute term delimiter as post-normalized space ==
See also #185 and #170 for background: the current use of `<lwsp>` 
permits arbitrary white space between terms, even though XML attribute 
normalization would remove leading and trailing white space and 
replace intermediate strings of white space with a single `#x20` 
character. My proposal for this was to replace `<lwsp>` with `<nsp>` 
where:

`<nsp>: #x20 after applying the normalization rules in [1]`

[1] https://www.w3.org/TR/REC-xml/#AVNormalize

Traversing the links from 
https://w3c.github.io/ttml2/spec/ttml2.html#reduced-infoset-attribute 
through the term definition and the reference into 
https://www.w3.org/TR/2004/REC-xml-infoset-20040204/#infoitem.attribute 
shows that we already specify attribute values in terms of normalized 
values in the reduced infoset. The variety that `<lwsp>` permits is 
therefore rather difficult to achieve: anything other than a single 
`#x20` character would have to be escaped as a character reference 
(for example `&#x9;`). It _is_ possible to escape those characters, 
although I do not know why that would be useful.
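
As a rough illustration of the distinction between literal and escaped 
white space, the following Python sketch parses a made-up element (`p`, 
with hypothetical attributes `literal` and `escaped`) using the standard 
library's `xml.etree.ElementTree`. It assumes no DTD is in play, so both 
attributes are treated as CDATA: each literal tab or newline is replaced 
by `#x20`, while the same characters written as character references 
pass through normalization unchanged.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names, for illustration only.
# "literal" holds a real tab and a real newline between the terms;
# "escaped" holds the same characters as character references.
doc = '<p literal="a\t\nb" escaped="a&#x9;&#xA;b"/>'
elem = ET.fromstring(doc)

literal = elem.get("literal")
escaped = elem.get("escaped")

# Literal white space: each character becomes one #x20.
print(repr(literal))  # 'a  b'

# Escaped white space: the tab and newline reach the application.
print(repr(escaped))  # 'a\t\nb'
```

Note that the two spaces in the first value are not collapsed here; 
collapsing and trimming apply only when the attribute is declared with 
a non-CDATA type.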

Some (non-mutually-exclusive) proposals to allow for simpler 
implementations:

* Add an informative note that the processing of XML normalized 
attribute values may limit the type of character that could appear in 
linear white space.
* Add feature designators to indicate that processors handle/do not 
handle escaped whitespace characters that pass through the 
normalization process, and that documents contain/do not contain such 
escaped whitespace characters.
* Add an additional requirement to de-escape escaped whitespace 
characters prior to the XML attribute value normalization process so 
that the resulting information set never has leading or trailing 
whitespace and always has exactly one `#x20` character between terms.
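
The third proposal could be sketched as follows, assuming the 
processor already has the attribute value as a string in which any 
escaped white space has been de-escaped. The function name 
`split_terms` and the sample values are hypothetical and not taken 
from the TTML2 specification; this is only one possible way to arrive 
at terms separated by exactly one `#x20`.

```python
import re
from typing import List

# The four XML white space characters: #x20, #x9, #xA, #xD.
XML_WS = re.compile(r"[\x20\x09\x0A\x0D]+")


def split_terms(value: str) -> List[str]:
    """Collapse runs of XML white space to one #x20, trim, then split.

    Hypothetical helper: it treats de-escaped white space exactly like
    literal white space, so the result never has empty terms.
    """
    normalized = XML_WS.sub(" ", value).strip(" ")
    return normalized.split(" ") if normalized else []


# Example: leading, trailing and mixed intermediate white space.
terms = split_terms("\t10px \n 20px ")
print(terms)  # ['10px', '20px']
```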


Please view or discuss this issue at 
https://github.com/w3c/ttml2/issues/191 using your GitHub account
Received on Thursday, 29 September 2016 15:27:14 UTC
