Re: Embedded Content

Indeed, UTF-16.  I'm behind the times :)  s/UTF-8/UTF-16/ in my concern 4.

Please correct me if I'm wrong, but you can't embed UTF-16 inside a
serialization that's otherwise UTF-8 or another encoding.
So my concern 3 is that even if characterEncoding is specified, it's almost
certainly *wrong*, as whichever system created the serialization that the
client is seeing at that point must have encoded the content somehow.
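
To make that concrete, here is a rough sketch in Python. It assumes a JSON
serialization written out as UTF-8, and the "chars" and "characterEncoding"
keys are just illustrative property names for the example, not a claim about
what any real serializer emits. It only shows that the embedded text ends up
in the document's encoding, whatever the property says:

```python
import json

# Hypothetical embedded-content body. The key names are assumptions
# made for this sketch only.
body = {"chars": "日本語", "characterEncoding": "UTF-16"}

# The serializer picks ONE encoding for the whole document (UTF-8 here).
serialized = json.dumps(body, ensure_ascii=False).encode("utf-8")

text = body["chars"]

# The embedded text is present as UTF-8 bytes ...
contains_utf8 = text.encode("utf-8") in serialized
# ... and not as UTF-16 bytes, despite the declared characterEncoding.
contains_utf16 = text.encode("utf-16-le") in serialized

# A client that decodes the document as UTF-8 recovers the text intact.
round_tripped = json.loads(serialized.decode("utf-8"))["chars"]

print(contains_utf8, contains_utf16, round_tripped == text)
```

So a client that trusted the declared value and tried to treat the embedded
string as UTF-16 would be working from a statement about some earlier copy
of the content, not about the bytes it actually received.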

What happens when there are multiple characterEncodings, one for UTF-8, one
for UTF-16 and one for some archaic Windows code page?

Rob


On Tue, Oct 14, 2014 at 9:58 AM, Doug Schepers <schepers@w3.org> wrote:

> Hi, Rob–
>
> Small comment...
>
> On 10/14/14 12:08 PM, Robert Sanderson wrote:
>
>>
>> And my concerns with characterEncoding:
>>
>> 4.  The encoding should be UTF-8 regardless.
>>
>
> UTF-8 is not always sufficient. For some Asian scripts, you sometimes need
> UTF-16.
>
> I'm not sure where that fits into your conceptual model, but we need to
> take it into account.
>
> Regards-
> -Doug
>



-- 
Rob Sanderson
Technology Collaboration Facilitator
Digital Library Systems and Services
Stanford, CA 94305

Received on Tuesday, 14 October 2014 17:03:15 UTC