html5/spec Overview.html,1.5331,1.5332

From: Ian Hickson via cvs-syncmail <cvsmail@w3.org>
Date: Thu, 06 Oct 2011 23:37:59 +0000
To: public-html-commits@w3.org
Message-Id: <E1RBxVn-0005YS-BB@lionel-hutz.w3.org>
Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv21329

Modified Files:
	Overview.html 
Log Message:
Define 'Unicode code point'. (whatwg r6650)

Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.5331
retrieving revision 1.5332
diff -u -d -r1.5331 -r1.5332
--- Overview.html	6 Oct 2011 23:33:20 -0000	1.5331
+++ Overview.html	6 Oct 2011 23:37:54 -0000	1.5332
@@ -2716,14 +2716,16 @@
   specification: a 16 bit unsigned integer, the smallest atomic
   component of a <code>DOMString</code>. (This is a narrower
   definition than the one used in Unicode.) <a href="#refsWEBIDL">[WEBIDL]</a><p>The term <dfn id="unicode-character">Unicode character</dfn> is used to mean a <i title="">Unicode scalar value</i> (i.e. any Unicode code point that
-  is not a surrogate code point). <a href="#refsUNICODE">[UNICODE]</a><p>The term <dfn id="character">character</dfn>, when not qualified as
-  <em>Unicode</em> character, means a <a href="#unicode-character">Unicode character</a>
-  where possible, or a surrogate code point when not: when an
-  algorithm that processes strings is defined in terms of characters,
-  a pair of <a href="#code-unit" title="code unit">code units</a> consisting of a
-  high surrogate followed by a low surrogate must be treated as a
-  single character, but isolated surrogates must each be treated as a
-  single character also.<p>The <dfn id="code-point-length">code-point length</dfn> of a string is the number of
+  is not a surrogate code point). <a href="#refsUNICODE">[UNICODE]</a><p>The term <dfn id="unicode-code-point">Unicode code point</dfn> means a <a href="#unicode-character">Unicode
+  character</a> where possible, and an isolated surrogate code
+  point when not. When a conformance requirement is defined in terms
+  of characters or Unicode code points, a pair of <a href="#code-unit" title="code
+  unit">code units</a> consisting of a high surrogate followed by a
+  low surrogate must be treated as the single code point represented
+  by the surrogate pair, but isolated surrogates must each be treated
+  as the single code point with the value of the surrogate.<p>In this specification, the term <dfn id="character">character</dfn>, when not
+  qualified as <em>Unicode</em> character, is synonymous with the term
+  <a href="#unicode-code-point">Unicode code point</a>.<p>The <dfn id="code-point-length">code-point length</dfn> of a string is the number of
   <a href="#code-unit" title="code unit">code units</a> in that string.<p class="note">This complexity results from the historical decision
   to define the DOM API in terms of 16 bit (UTF-16) <a href="#code-unit" title="code
   unit">code units</a>, rather than in terms of <a href="#unicode-character" title="Unicode character">Unicode characters</a>.<h3 id="conformance-requirements"><span class="secno">2.2 </span>Conformance requirements</h3><p>All diagrams, examples, and notes in this specification are
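The rule added in revision 1.5332 can be illustrated with a minimal sketch. This is not code from the specification: the function name `code_points` and the choice to model a string as a list of 16 bit integers are assumptions made only for this illustration. It pairs a high surrogate with the low surrogate that follows it, and passes an isolated surrogate through as a code point with the surrogate's own value.

```python
from typing import List

# Surrogate ranges as defined by Unicode.
HIGH_START, HIGH_END = 0xD800, 0xDBFF
LOW_START, LOW_END = 0xDC00, 0xDFFF


def code_points(units: List[int]) -> List[int]:
    """Group 16 bit code units into code points (hypothetical helper).

    A high surrogate followed by a low surrogate becomes the single
    code point the pair represents; any other surrogate is isolated
    and becomes a code point with the surrogate's own value.
    """
    result: List[int] = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = HIGH_START <= unit <= HIGH_END
        has_low = i + 1 < len(units) and LOW_START <= units[i + 1] <= LOW_END
        if is_high and has_low:
            # Combine the pair into one supplementary-plane code point.
            high_bits = (unit - HIGH_START) << 10
            low_bits = units[i + 1] - LOW_START
            result.append(0x10000 + high_bits + low_bits)
            i += 2
        else:
            # BMP code unit or isolated surrogate: value is kept as-is.
            result.append(unit)
            i += 1
    return result
```

For example, `code_points([0x0041, 0xD83D, 0xDE00, 0xD800])` returns `[0x41, 0x1F600, 0xD800]`: four code units, but three code points, the last being an isolated high surrogate.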
Received on Thursday, 6 October 2011 23:38:01 GMT
