- From: John Lumley <john@saxonica.com>
- Date: Mon, 05 Aug 2013 14:30:03 +0100
- To: EXPath ML <public-expath@w3.org>
- Message-ID: <51FFA8DB.3080100@saxonica.com>
There is an outstanding issue about handling errors when decoding strings, which needs addressing. Such errors can occur under the following circumstances:

1. The encoding is known but specified incorrectly (e.g. UTF-8 is given when UTF-16 was used to encode).
2. The length to decode wasn't 'complete', i.e. the data ends partway through a multi-octet character.
3. There was a phasing error at the start, i.e. the start point was not at a code-point boundary.

We must assume, of course, that the decoding error can be detected. The question then is what should be done, and whether any form of recovery should be supported.

The simplest option is to throw an error (which try/catch can field), but do we want to try to say what the error is? In some cases the 'replacement character' can be substituted; this is especially true with self-synchronising encodings such as UTF-8. But even then, do we want to signal the error, and if so, where does the 'decoded with replacement character' string get returned? (In XSLT 3.0 we could build a reporting structure bound to the $err:value variable... XSLT 2.0 of course doesn't have a try.)

Others in this community will have far more experience of this issue than I, so I'd welcome your thoughts. Decoding error management does need to be defined for this function.

--
*John Lumley* MA PhD CEng FIEE
john@saxonica.com
on behalf of Saxonica Ltd
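
For illustration only, the three cases and the two candidate behaviours (fail, or substitute the replacement character and report) can be sketched outside XPath. The Python below is a minimal sketch and not a proposal for the function's signature: `DecodeResult` and `decode_with_report` are hypothetical names, and the sketch assumes that reporting only the first failure is enough.

```python
from typing import NamedTuple, Optional


class DecodeResult(NamedTuple):
    """Hypothetical reporting structure: recovered text plus the first error."""
    text: str                # decoded string, U+FFFD where octets were undecodable
    error: Optional[str]     # decoder's reason for the first failure, or None
    offset: Optional[int]    # octet offset of the first failure, or None


def decode_with_report(octets: bytes, encoding: str = "utf-8") -> DecodeResult:
    """Decode strictly; on failure, decode again with replacement and report."""
    try:
        return DecodeResult(octets.decode(encoding, errors="strict"), None, None)
    except UnicodeDecodeError as exc:
        # Second pass substitutes U+FFFD for each undecodable sequence.
        recovered = octets.decode(encoding, errors="replace")
        return DecodeResult(recovered, exc.reason, exc.start)


# "h", e-acute, "llo": the e-acute occupies two octets in UTF-8.
data = "h\u00e9llo".encode("utf-8")

clean = decode_with_report(data)
# Case 1: octets were encoded as UTF-16 but decoded as UTF-8.
wrong_encoding = decode_with_report("h\u00e9llo".encode("utf-16-le"))
# Case 2: the slice ends partway through the two-octet character.
truncated = decode_with_report(data[:2])
# Case 3: the slice starts on a continuation octet.
misaligned = decode_with_report(data[2:])
```

Returning the recovered string together with the error, rather than raising, is one possible answer to where the replacement-character result goes. Note that a wrongly specified encoding is not always detectable this way: pure ASCII text encoded as UTF-16 decodes as UTF-8 without any error, producing interleaved NUL characters.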
Received on Monday, 5 August 2013 13:30:25 UTC