- From: Jirka Kosek <jirka@kosek.cz>
- Date: Fri, 27 Jun 2014 10:00:38 +0200
- To: Paul Grosso <paul@paulgrosso.name>, public-xml-core-wg@w3.org
- Message-ID: <53AD24A6.3080402@kosek.cz>
On 26.6.2014 17:44, Paul Grosso wrote:
> Has anybody--or is anybody willing to--review this document?

I have read the document, although I haven't checked all the boring decoding/encoding algorithms written in the HTML5 spec style.

From the XML point of view I see one possible problem. The Encoding spec says:

"In particular, this specification defines the encodings, their algorithms to go from bytes to code points and back, and their canonical names and identifying labels. ... Historically encodings and their specifications (if any) were kept track of by the IANA Character Sets registry. This specification renders that registry obsolete."

The XML specification is not explicit about how encoding/decoding works. For the UTF-* encodings it can be found in the referenced Unicode standard; for other encodings (like ISO-8859-*, windows-125*) it is largely undefined.

If we decide in the future to reference the Encoding spec in order to fix this, there is a problem: the Encoding spec defines both decoding and encoding, and for encoding there is special support for HTML in step 5 of the encoding process:

http://www.w3.org/TR/encoding/#concept-encoding-process

This step guarantees that if a character is not available in the output encoding, it is replaced by a numeric character reference. If we think we might use the Encoding spec in the future, we can ask for a similar feature for XML as well.

The second issue I have found is that the "us-ascii" encoding is just an alias for "windows-1252". This is the correct mapping for decoding, but for encoding it is wrong: a "us-ascii" encoder should halt as soon as a character with code point 128 or higher is present. In windows-1252 there are 128 such characters. This problem was already marked as WONTFIX in Bugzilla; I have reopened the issue (https://www.w3.org/Bugs/Public/show_bug.cgi?id=23646).
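To make the difference concrete, here is a minimal Python sketch. It uses Python's built-in codecs, not the algorithms from the Encoding spec, so treat it only as an analogy: the `xmlcharrefreplace` error handler stands in for the numeric-character-reference fallback, and Python's strict `ascii` codec shows the halting behaviour I would expect from a "us-ascii" encoder. The sample string is made up for illustration.

```python
# Illustration only: Python's built-in codecs, not the Encoding spec's algorithms.

# Hypothetical sample text; U+00E9 and U+20AC are both outside US-ASCII.
text = "caf\u00e9 \u20ac5"

# Fallback to numeric character references, analogous to the HTML-specific
# step of the encoding process: unencodable characters become &#NNN;.
ncr = text.encode("ascii", errors="xmlcharrefreplace")

# A strict us-ascii encoder halts at the first code point >= 128.
try:
    text.encode("ascii")
    ascii_halted = False
except UnicodeEncodeError:
    ascii_halted = True

# windows-1252 (cp1252 in Python) encodes the same text without an error,
# which is what happens if "us-ascii" is treated as an alias for it.
cp1252 = text.encode("cp1252")

print(ncr)
print(ascii_halted)
print(cp1252)
```

With the character-reference fallback the output stays pure ASCII and remains recoverable by an XML parser, whereas the windows-1252 encoder silently emits bytes 0xE9 and 0x80 under a "us-ascii" label.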
Jirka

-- 
------------------------------------------------------------------
  Jirka Kosek      e-mail: jirka@kosek.cz      http://xmlguru.cz
------------------------------------------------------------------
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 rep.
------------------------------------------------------------------
    Bringing you XML Prague conference    http://xmlprague.cz
------------------------------------------------------------------
Received on Friday, 27 June 2014 08:01:17 UTC