- From: LeVan,Ralph <levan@oclc.org>
- Date: Thu, 7 Mar 2002 13:42:09 -0500
- To: www-zig@w3.org
The difference is that those other strings aren't important. They are icing,
not cake. They can be dumbed down to 7-bit ASCII with no loss of
functionality.

Ralph

> -----Original Message-----
> From: Pieter Van Lierop [mailto:pvanlierop@geac.fr]
> Sent: Thursday, March 07, 2002 11:41 AM
> To: 'LeVan,Ralph'; www-zig@w3.org
> Subject: RE: character encoding assumptions and approaches
>
>
> I agree with you that these are the most important, but why make a
> difference with other strings?
> Examples:
> ImplementationName in the Init = "Bibliothèque Française"
> DatabaseName = "Périodiques" (this is French for Serials)
> How should I interpret that string when the option bit is on?
> Is this utf-8 or not? If not, why not? And what is it then?
>
> Pieter
>
> > -----Original Message-----
> > From: LeVan,Ralph [mailto:levan@oclc.org]
> > Sent: Thursday, March 7, 2002 17:28
> > To: www-zig@w3.org
> > Subject: RE: character encoding assumptions and approaches
> >
> >
> > It should apply to the general Term in AttributesPlusTerm and the
> > characterString Term in AttributesPlusTerm.
> >
> > The general Term is a special case and needs to be recognized as such
> > in the description. General is an OctetString and could contain any
> > random binary data. We must agree that when the utf-8 bit is on,
> > general will only be used for character data. If that isn't
> > acceptable, then we're stuck with just characterString.
> >
> > The same use of general and characterString Terms applies in
> > AttributesPlusTerm in the Scan request and TermInfo in the Scan
> > response. It should also apply to the displayTerm and alternativeTerm
> > in TermInfo.
> >
> > I'm open to other suggestions, but I believe this is sufficient.
> >
> > Ralph
> >
> > > -----Original Message-----
> > > From: Ray Denenberg [mailto:rden@loc.gov]
> > > Sent: Thursday, March 07, 2002 11:16 AM
> > > To: www-zig@w3.org
> > > Subject: Re: character encoding assumptions and approaches
> > >
> > >
> > > "LeVan,Ralph" wrote:
> > >
> > > > Let's change the question slightly. Why should the application
> > > > know what kind of data it is returning? Why should it behave
> > > > differently for one kind of data than another? Did you know that
> > > > there is text embedded in JPEG files?
> > >
> > > Actually no, my format experts here tell me that jpeg represents
> > > text as bits, but they might be mistaken. In any case, certainly we
> > > wouldn't expect conversion to utf-8 in mixed-content or print-format
> > > (e.g. pdf, postscript) files.
> > >
> > > > I think you assume too much knowledge about MARC records and
> > > > should treat them like any other record format.
> > >
> > > If we define a utf-8 option bit, what do you think it should apply
> > > to then?
> > >
> > > --Ray
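
To make the two positions in this thread concrete, here is a minimal Python
sketch. It is an illustration under stated assumptions, not part of any
Z39.50 toolkit: the function names decode_general_term and to_7bit_ascii and
the utf8_negotiated flag are hypothetical. The first function treats a
general Term (an OctetString) as UTF-8 character data only when the proposed
option bit is on. The second shows one possible way of "dumbing down" a
display string such as ImplementationName to 7-bit ASCII.

```python
import unicodedata


def decode_general_term(octets: bytes, utf8_negotiated: bool):
    """Interpret a 'general' Term, which is carried as an octet string.

    Hypothetical helper: utf8_negotiated stands in for the proposed
    utf-8 option bit discussed in this thread.
    """
    if not utf8_negotiated:
        # Without the option bit the octets are opaque; return them as-is.
        return octets
    # With the option bit on, the proposal is that general carries only
    # character data, so strict UTF-8 decoding is applied. Invalid input
    # raises UnicodeDecodeError instead of being silently accepted.
    return octets.decode("utf-8")


def to_7bit_ascii(text: str) -> str:
    """Reduce a display string to 7-bit ASCII by dropping accents.

    NFKD splits an accented letter into a base letter plus combining
    marks; encoding with errors="ignore" then discards the marks (and
    any other non-ASCII character).
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(to_7bit_ascii("Bibliothèque Française"))
    print(decode_general_term("Périodiques".encode("utf-8"), True))
```

Strict decoding is a deliberate choice in this sketch: if a peer sets the
option bit but still sends binary data in general, the mismatch surfaces as
an error rather than as corrupted text.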
Received on Thursday, 7 March 2002 13:42:49 UTC