- From: John Lumley <john@saxonica.com>
- Date: Mon, 05 Aug 2013 14:30:03 +0100
- To: EXPath ML <public-expath@w3.org>
- Message-ID: <51FFA8DB.3080100@saxonica.com>
There is an outstanding issue about handling decoding errors when
decoding strings which will need some addressing. Such errors can occur
under the following circumstances:
1. The encoding is specified incorrectly (e.g. UTF-8 is given when
UTF-16 was used to encode)
2. The length to decode wasn't 'complete', i.e. the data ends part-way
through a multi-octet character
3. There was a phasing error at the start, i.e. the start point was not
at a code-point boundary.
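
As a rough illustration of the three circumstances above, here is a minimal
Python sketch. Python is used only for demonstration; the helper name
decode_strict and the sample data are hypothetical and not part of any
proposed function signature.

```python
def decode_strict(data: bytes, encoding: str):
    """Return (text, None) on success, or (None, error) on a decoding error."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


euro = "\u20ac".encode("utf-8")  # EURO SIGN is three octets in UTF-8: E2 82 AC

cases = {
    # 1. Encoded as UTF-16 (with a byte-order mark) but decoded as UTF-8.
    "1. wrong encoding": "caf\u00e9".encode("utf-16"),
    # 2. The final multi-octet character is cut short.
    "2. truncated end": b"price " + euro[:2],
    # 3. Decoding starts in the middle of a multi-octet character.
    "3. phasing error at start": euro[1:] + b" price",
}

for label, data in cases.items():
    text, err = decode_strict(data, "utf-8")
    if err is None:
        print(f"{label}: decoded {text!r}")
    else:
        print(f"{label}: error at octet {err.start}: {err.reason}")
```

All three inputs are rejected by a strict UTF-8 decoder. Not every encoding
is this strict, though: an ISO-8859-1 decoder, for instance, accepts any
octet sequence, so a wrong-encoding mistake of that kind would go undetected.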
We must of course assume that the decoding error can be detected. The
question then is what should be done, and whether any form of recovery
should be supported.
The simplest option is of course to throw an error (which try/catch can
field) - but do we want to try to tell what the error is? In some cases
the 'replacement character' can be substituted - this is especially true
with self-synchronising encodings such as UTF-8. But even then, do we want
to signal the error, and if so, where does the 'decoded with replacement
character' string get returned? (In XSLT 3.0 we could build a reporting
structure that was bound to the $err:value variable... XSLT 2.0 of course
doesn't have a try.)
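
To make the recovery option concrete, the following Python sketch returns
both the string decoded with replacement characters and a list of the
problems found, which is roughly the kind of reporting structure that could
accompany an error. It is only an assumption about how such a result might
look: the name decode_with_report is hypothetical, and the approach relies
on UTF-8 being self-synchronising, so it should not be applied as-is to
other encodings.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_with_report(data: bytes):
    """Decode UTF-8, substituting U+FFFD for each malformed sequence.

    Returns (text, problems), where problems is a list of
    (octet_offset, reason) pairs, one per substitution.
    """
    parts, problems, pos = [], [], 0
    while pos < len(data):
        chunk = data[pos:]
        try:
            parts.append(chunk.decode("utf-8"))
            break
        except UnicodeDecodeError as exc:
            # Keep the well-formed prefix, then substitute for the bad octets.
            parts.append(chunk[:exc.start].decode("utf-8"))
            parts.append(REPLACEMENT)
            problems.append((pos + exc.start, exc.reason))
            pos += exc.end  # resume after the malformed sequence
    return "".join(parts), problems


# Two stray continuation octets in the middle, and a truncated character
# at the end.
sample = b"ab\x82\xaccd\xe2\x82"
text, problems = decode_with_report(sample)
print(repr(text))
for offset, reason in problems:
    print(f"octet {offset}: {reason}")
```

A caller wanting the simple behaviour can raise an error whenever the
problem list is non-empty; a caller wanting recovery can use the text and
inspect the list. Whether both should be available is the open question.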
Others in this community will have far more experience of this issue
than I, so I'd welcome your thoughts. Decoding error management does
need to be defined for this function.
--
*John Lumley* MA PhD CEng FIEE
john@saxonica.com
on behalf of Saxonica Ltd
Received on Monday, 5 August 2013 13:30:25 UTC