Re: I18N-ISSUE-247: Clarify interpretation of line breaks when calculating storage size [ITS-20]

From: Jirka Kosek <jirka@kosek.cz>
Date: Fri, 29 Mar 2013 14:15:34 +0100
Message-ID: <515593F6.1090804@kosek.cz>
To: Anne van Kesteren <annevk@annevk.nl>
CC: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, www-international@w3.org

On 29.3.2013 13:55, Anne van Kesteren wrote:
> On Fri, Mar 29, 2013 at 12:42 PM, Jirka Kosek <jirka@kosek.cz> wrote:
>> I probably wasn't understanding what you mean by  &#x0D; in your
>> original message. Of course plain  &#x0D will be considered as a line
>> break.
> 
> No, &#x0D; ends up as U+000D in the DOM, not as U+000A:

Indeed, you are right. I shouldn't be trying to think about encodings so
soon after a heavy lunch. :-)

> data:text/xml,<html
> xmlns="http://www.w3.org/1999/xhtml"><i>&%23x0D;</i><script>alert(encodeURI(document.querySelector("i").textContent))</script></html>

So in this case &#x0D; (which ends up as U+000D in the parsed document)
will not be considered a line break under my proposal.

> A raw U+000D does become U+000A:
> 
> data:text/xml,<html
> xmlns="http://www.w3.org/1999/xhtml"><i>%0D</i><script>alert(encodeURI(document.querySelector("i").textContent))</script></html>

This will be considered a line break, which I think is consistent with
what I have proposed:

"For purposes of storage size calculations, an ITS processor MUST behave
as if line ends were normalized according to
http://www.w3.org/TR/REC-xml/#sec-line-ends (or
http://www.w3.org/TR/xml11/#sec-line-ends if XML 1.1 is used), and only
the LINE FEED (U+000A) character is then considered a line break."

So now I'm confused about what is unclear here. An XML document with ITS
markup is parsed as an XML document, and whatever is U+000A in the parsed
document is considered a line break for purposes of storage size
calculations.
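
To make the distinction concrete, here is a minimal sketch in Python. It
assumes the expat-based parser behind xml.etree.ElementTree applies the
XML 1.0 line-end normalization, and the helper name count_line_breaks is
made up for this illustration (it is not part of ITS or of any library).
It only counts U+000A characters in the parsed text content; it is not a
storage size implementation.

```python
import xml.etree.ElementTree as ET


def count_line_breaks(xml_source: str) -> int:
    """Count U+000A characters in the text content of the parsed document.

    Hypothetical helper: the parser normalizes raw line ends first, so only
    LINE FEED characters that survive parsing are counted as line breaks.
    """
    root = ET.fromstring(xml_source)
    text = "".join(root.itertext())
    return text.count("\n")


# A raw U+000D in the source is normalized to U+000A by the parser.
raw_cr = "<i>a\rb</i>"

# A raw U+000D U+000A pair is normalized to a single U+000A.
raw_crlf = "<i>a\r\nb</i>"

# A character reference is expanded after normalization, so it stays U+000D.
char_ref = "<i>a&#x0D;b</i>"

for label, source in [("raw CR", raw_cr),
                      ("raw CRLF", raw_crlf),
                      ("&#x0D; reference", char_ref)]:
    print(label, "->", count_line_breaks(source), "line break(s)")
```

Under that assumption, the first two inputs each contain one line break
and the third contains none, which matches the two data: URL tests
quoted above.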

				Jirka

-- 
------------------------------------------------------------------
  Jirka Kosek      e-mail: jirka@kosek.cz      http://xmlguru.cz
------------------------------------------------------------------
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 rep.
------------------------------------------------------------------
    Bringing you XML Prague conference    http://xmlprague.cz
------------------------------------------------------------------


Received on Friday, 29 March 2013 13:15:59 UTC