RE: Z39.50 character encoding

Let's assume we are only talking about the InternationalString PDU and its
character set, i.e. not anything within the scope of records in the response
to a PresentRequest. What do the rest of you think of simply embedding
an XML document in the value, e.g.:

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
Finally the discussion on charactersets is over as the solution to the
problem is handled by ordinary means in scope of XML

This way, any character set supported by XML can be used, and the question
of how to handle character sets is settled: it becomes just a matter of the
standard XML encoding declaration.
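
As a rough illustration of what a receiver would do with such a value, here
is a minimal Python sketch. It assumes the InternationalString arrives as raw
bytes, and the root element name "text" and the function name
decode_xml_value are purely hypothetical; nothing here is defined by Z39.50.
The point is only that the XML parser reads the encoding from the declaration
by itself.

```python
import xml.etree.ElementTree as ET


def decode_xml_value(raw: bytes) -> str:
    """Return the text of the root element of an embedded XML document.

    The parser takes the encoding from the XML declaration, so the caller
    does not need to know the character set in advance.
    """
    root = ET.fromstring(raw)
    return root.text or ""


# Hypothetical payload: the element name "text" is an assumption.
latin1_value = (
    '<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>'
    "<text>smørrebrød</text>"
).encode("iso-8859-1")

# Same content, declared and encoded as UTF-8 instead.
utf8_value = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    "<text>smørrebrød</text>"
).encode("utf-8")

print(decode_xml_value(latin1_value))
print(decode_xml_value(utf8_value))
```

Both calls yield the same string even though the bytes on the wire differ.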

Best regards,

Henrik Dahl

-----Original Message-----
From: [] On behalf of
Sent: Friday, March 01, 2002 2:49 PM
Subject: RE: Z39.50 character encoding

UTF-8 is the default character set for XML when none is declared.  It is
possible to specify a different character set in the XML declaration.
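
A small Python sketch of what that default means in practice, assuming the
value is handed to a standard XML parser as bytes (the element name "text"
and the helper name parses_ok are made up for the example): without a
declaration the bytes are read as UTF-8, so ISO-8859-1 bytes are rejected
unless the declaration says so.

```python
import xml.etree.ElementTree as ET


def parses_ok(raw: bytes) -> bool:
    """Report whether the bytes form a well-formed XML document."""
    try:
        ET.fromstring(raw)
        return True
    except ET.ParseError:
        return False


body = "<text>smørrebrød</text>"

# No declaration, UTF-8 bytes: accepted, since UTF-8 is assumed.
undeclared_utf8 = body.encode("utf-8")

# No declaration, ISO-8859-1 bytes: 0xF8 is not valid UTF-8, so it fails.
undeclared_latin1 = body.encode("iso-8859-1")

# Same ISO-8859-1 bytes, but with the encoding declared: accepted.
declared_latin1 = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>' + body
).encode("iso-8859-1")

for name, raw in [
    ("undeclared UTF-8", undeclared_utf8),
    ("undeclared ISO-8859-1", undeclared_latin1),
    ("declared ISO-8859-1", declared_latin1),
]:
    print(name, parses_ok(raw))
```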


> -----Original Message-----
> From: Alan Kent []
> Sent: Thursday, February 28, 2002 7:35 PM
> To:
> Subject: Re: Z39.50 character encoding
> On Thu, Feb 28, 2002 at 09:13:20AM -0500, Johan Zeeman wrote:
> > DC by itself is not a record syntax; it is a list of data elements.  To be a
> > record syntax, the data elements need to be encoded using some scheme.  The
> > one I know about is XML.  And XML explicitly uses UTF-8.
> >
> > j.
> Just to clarify, do you mean the XML record syntax in Z39.50 explicitly
> uses UTF-8? XML itself certainly *does not* explicitly use UTF-8.
> That is simply what is common. People do use other encodings with
> XML (UTF-16, for example, is completely valid and in use: for text in
> Chinese or similar scripts, UTF-16 encoded files are much smaller than
> the same files encoded in UTF-8).
>
> I was just curious (without re-reading the XML record syntax) whether
> it was a Z39.50 decree that the XML record syntax mandates
> UTF-8 encoding.
> Thanks,
> Alan
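
The size point about UTF-16 versus UTF-8 is easy to check. The following
Python sketch uses sample strings and a helper name (encoded_sizes) chosen
only for illustration; the sizes are for the text alone, without a byte order
mark or any XML markup.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of the text in UTF-8 and UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" avoids counting the 2-byte byte order mark.
        "utf-16": len(text.encode("utf-16-le")),
    }


# Six Chinese characters: 3 bytes each in UTF-8, 2 bytes each in UTF-16.
chinese = "中文字符编码"

# Eighteen ASCII characters: 1 byte each in UTF-8, 2 bytes each in UTF-16.
english = "character encoding"

print(encoded_sizes(chinese))  # {'utf-8': 18, 'utf-16': 12}
print(encoded_sizes(english))  # {'utf-8': 18, 'utf-16': 36}
```

So for this kind of text UTF-16 is about a third smaller, while for mostly
ASCII text it is twice as large, which is one reason neither encoding is an
obvious universal choice.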

Received on Friday, 1 March 2002 10:09:41 UTC