- From: <bugzilla@jessica.w3.org>
- Date: Fri, 05 Aug 2011 02:09:02 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=13676

KangHao Lu <kennyluck@w3.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kennyluck@w3.org

--- Comment #1 from KangHao Lu <kennyluck@w3.org> 2011-08-05 02:09:01 UTC ---
Assuming the intention of the current text is to count

"\ud840\udc87"  // as code-point length = 1
"\ud840+\udc87" // as code-point length = 3

(which isn't very clear as far as I can tell), I would suggest that the spec include a sentence like

"Unpaired surrogates count as one code point each." (wording from [1])

Alternatively, it might be clearer to replace the sentence

# The code-point length of a string is the number of Unicode code points in that string.

with

| The code-point length of a string is the number of Unicode characters after the string is converted to a sequence of Unicode characters[2].

This would then work for both a string of Unicode characters (theory) and a DOMString (reality), until the internal representation of the value of an input element[3] is made clear.

Having said that, I am not convinced that defining @maxlength this way is the best approach. I tried to analyze other possibilities[4] but wasn't confident enough to file a bug (my preference is to count 16-bit code units).

[1] http://download.oracle.com/javase/1,5.0/docs/api/java/lang/String.html (definition of String.codePointCount)
[2] http://dev.w3.org/2006/webapi/WebIDL/#dfn-obtain-unicode
[3] http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#concept-fe-value
[4] http://lists.w3.org/Archives/Public/www-international/2011AprJun/0105
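
The two counts assumed in the comment can be checked with a small sketch. The Python below is illustrative only and is not taken from the spec or from any browser implementation: `code_point_length` is a hypothetical helper that walks a list of 16-bit code units (standing in for a DOMString), counts a high surrogate immediately followed by a low surrogate as one code point, and counts every other unit, including an unpaired surrogate, as one code point on its own.

```python
def code_point_length(code_units):
    """Count code points in a sequence of 16-bit code units.

    Assumption (following the suggested wording): a well-formed
    surrogate pair is one code point, and an unpaired surrogate
    counts as one code point each.
    """
    count = 0
    i = 0
    n = len(code_units)
    while i < n:
        unit = code_units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low_next = i + 1 < n and 0xDC00 <= code_units[i + 1] <= 0xDFFF
        # Consume two units for a valid pair, otherwise just one.
        i += 2 if (is_high and has_low_next) else 1
        count += 1
    return count


# "\ud840\udc87": one surrogate pair
paired = [0xD840, 0xDC87]
# "\ud840+\udc87": high surrogate, "+", low surrogate (both unpaired)
unpaired = [0xD840, 0x002B, 0xDC87]

print(code_point_length(paired))    # 1
print(code_point_length(unpaired))  # 3
print(len(paired), len(unpaired))   # 2 3 when counting 16-bit code units instead
```

The last line shows the alternative the comment prefers: counting 16-bit code units needs no special handling of surrogates at all, at the cost of counting a supplementary character as two.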
Received on Friday, 5 August 2011 02:09:03 UTC