- From: Johan Zeeman <joe.zeeman@tlcdelivers.com>
- Date: Thu, 28 Feb 2002 09:13:20 -0500
- To: <www-zig@w3.org>
----- Original Message -----
From: "Liv Aasa Holm" <Liv.A.Holm@jbi.hio.no>
To: <www-zig@w3.org>
Sent: Thursday, February 28, 2002 1:59 AM
Subject: RE: Z39.50 character encoding

> Most MARC formats do NOT specify a character set. Does DC? I have at least
> not seen it, but perhaps it is implicit?

Which makes "most" MARCs even more broken than MARC21, with which at least
you know what the character set is. Otherwise you have to intuit the
character set, and computers have not generally been praised for their
intuitive powers.

UNIMARC certainly specifies character sets in considerable detail
(basically, unless the record specifies something else using ISO 2022
mechanisms, it's ASCII [ISO 646 IRV, actually]). In fact, I suspect that the
assertion that "most" MARC formats do not specify a character set is
incorrect. They may specify one by reference to some other MARC, but they
will specify a character set. It may also be that ISO 2709 specifies a
default character set--I don't have the standard to hand to check.

DC by itself is not a record syntax; it is a list of data elements. To be a
record syntax, the data elements need to be encoded using some scheme. The
one I know about is XML. And XML explicitly uses UTF-8.

j.

>
> Liv
>
> ===== Original Message from Ray Denenberg <rden@loc.gov> at 27.02.02 19:12
> >Mike Taylor wrote:
> >
> >> Some kinds of object (e.g. USMARC) specify a character set, and
> >> others (GRS-1) do not. Those which do, we must respect.
> >
> >True, some do and some don't.
> >
> >Two questions we need to answer before we go much further (and I think we
> >need help from the experts on these):
> >
> >(1) Is it clear exactly which do and which don't?
> >(2) For those which "do", is it always the case that these will be
> >transferred according to the native character encoding or is it likely that
> >clients will want records in utf-8, even in the case where the format
> >specifies a native encoding?
> >
> >And I think (1) is the more important question. We can address (2) later.
> >
> >In other words for any given format, is it always implicitly known to both
> >parties (client and server) whether or not the format comes with a native
> >encoding. If so then our problem is simplified. But if not, then I'm afraid
> >Mike's philosophy "Those which do, we must respect" isn't going to work in
> >practice.
> >
> >--Ray
> ===== Comments by Liv.A.Holm@jbi.hio.no (Liv Aasa Holm) at 28.02.02 07:58
>
> *******************************************************
> Liv A. Holm
> associate professor
> Oslo University college
> faculty of journalism, library and information science
> tel. +47-22-45-27-77
> fax.:+47-22-45-26-05
> *******************************************************
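
As an illustration of the remark that with MARC21 "at least you know what the character set is", here is a minimal sketch of reading the declared encoding from a record rather than intuiting it. It assumes the MARC 21 convention that leader position 09 holds the character coding scheme (blank for MARC-8, "a" for UCS/Unicode). The function name and the sample leaders are hypothetical, and other MARC formats define this position differently, so this is not a general ISO 2709 rule.

```python
from typing import Optional

LEADER_LENGTH = 24  # an ISO 2709 record starts with a 24-byte leader


def marc21_declared_encoding(record: bytes) -> Optional[str]:
    """Return the coding scheme a MARC 21 record declares in Leader/09.

    Hypothetical helper: returns None when the byte is not one of the
    two values MARC 21 defines, i.e. when the caller would have to guess.
    """
    if len(record) < LEADER_LENGTH:
        raise ValueError("record is shorter than the 24-byte leader")
    flag = record[9:10]  # Leader/09, zero-based position 9
    if flag == b" ":
        return "MARC-8"
    if flag == b"a":
        return "UCS/Unicode"
    return None


# Two made-up leaders that differ only in position 09.
unicode_leader = b"00714cam a2200205 a 4500"
marc8_leader = b"00714cam  2200205 a 4500"

print(marc21_declared_encoding(unicode_leader))
print(marc21_declared_encoding(marc8_leader))
```

A format with no such in-record declaration gives a client nothing comparable to inspect, which is the situation Ray's question (1) is about.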
Received on Thursday, 28 February 2002 09:14:33 UTC