- From: Anne van Kesteren <annevk@annevk.nl>
- Date: Fri, 29 Mar 2013 11:07:11 +0000
- To: Jirka Kosek <jirka@kosek.cz>
- Cc: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, www-international@w3.org
On Fri, Mar 29, 2013 at 10:52 AM, Jirka Kosek <jirka@kosek.cz> wrote:
> "The storage size is expressed in bytes and is provided along with the
> character set encoding and the line break type which will be used when
> the content is stored."

This does not really tell you if line breaks are normalized.

> In XML content all line breaks are normalized to LF so only LF are
> considered as line breaks (but in source XML file you can use other
> representations of line breaks recognized by version of XML you use). If
> you think that this still needs to be explicitly written in spec, we can
> add note along those lines.

XML 1.0 does not normalize U+0085. Neither does any other sane format. Is it converted here? What if it is combined with other line break characters? Also, XML 1.0 still allows insertion of CR via a character reference such as &#xD;, which by the time you get to the DOM level will be a U+000D (which is presumably what you'd store). As for insane formats, XML 1.1 has U+2028; should that be normalized for the purposes of storing?

--
http://annevankesteren.nl/
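The cases raised above can be checked with a minimal sketch, assuming Python's standard xml.etree.ElementTree module (an expat-based XML 1.0 parser). The names `doc`, `text` and `stored` are illustrative only, and UTF-8 is an assumed storage encoding, not something the spec text prescribes.

```python
import xml.etree.ElementTree as ET

# One element mixing the line-break forms discussed in this thread:
#   literal CR LF, literal CR, a CR character reference,
#   U+0085 (NEL) and U+2028 (LINE SEPARATOR), both encoded as UTF-8.
doc = b"<t>a\r\nb\rc&#xD;d\xc2\x85e\xe2\x80\xa8f</t>"

text = ET.fromstring(doc).text

# XML 1.0 end-of-line handling turns literal CR LF and lone CR into LF.
# The character reference is resolved after that step, so it survives
# as U+000D. U+0085 and U+2028 are ordinary characters to an XML 1.0
# parser and are passed through untouched.
print(ascii(text))

# Size in bytes if the parsed text were stored as UTF-8 (assumed encoding).
stored = text.encode("utf-8")
print(len(stored))
```

A storage-size check that only counts LF as a line break would treat the remaining U+000D, U+0085 and U+2028 as plain content here, which is the ambiguity the questions above point at.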
Received on Friday, 29 March 2013 11:07:40 UTC