- From: Johnston, Patrick - Hoboken <pjohnston@wiley.com>
- Date: Fri, 4 Dec 2015 18:38:29 +0000
- To: Sebastian Heath <sebastian.heath@gmail.com>, W3C Scholarly HTML CG <public-scholarlyhtml@w3.org>
- Message-ID: <463F4610-7948-4193-8DE5-5654C94D20D0@wiley.com>
I am not a big fan of SHOULDs, but I agree we should consider that UTF-8 perhaps doesn’t cover the breadth of scholarly research, in particular in the case of ancient or fictional languages (though Klingon is apparently unofficially supported). Rather than making it a SHOULD, I would say MUST unless a UTF-8 encoding is not openly available.

An ancillary issue is that even though there are UTF-8 encodings for a lot of ancient languages, browsers don’t do much of a job of supporting them: http://www.fileformat.info/info/unicode/block/egyptian_hieroglyphs/utf8test.htm, so I assume that some consideration of polyfills is needed. (What is surprising, considering the geek chic factor, is that Klingon doesn’t get much traction either: http://www.wazu.jp/gallery/Test_Klingon.html.)

p

From: Sebastian Heath <sebastian.heath@gmail.com>
Date: Friday, December 4, 2015 at 11:53 AM
To: W3C Scholarly HTML CG <public-scholarlyhtml@w3.org>
Subject: Re: Support for XHTML5
Resent-From: <public-scholarlyhtml@w3.org>
Resent-Date: Friday, December 4, 2015 at 11:53 AM

My only further comment on the encoding issue is that I work in and with scholarly communities that have had trouble getting their glyphs into the unicode standard. Those are long stories with legit concerns on both sides; with the true obscurity of long "dead" alphabets being a factor, of course.

Meaning, "SH SHOULD be UTF-8 and this document assumes that is the case in its examples and discussion" is a more welcoming approach than MUST.

-Sebastian

On Fri, Dec 4, 2015 at 11:47 AM, Silvio Peroni <silvio.peroni@unibo.it> wrote:

Hi Ivan,

But I seem to be the only one worrying about that, so I don't mind backing away from it if it means we can make progress on the rest.

No, you are not the only one worrying about that.

I think it is perfectly fine to require that an SH would be in Unicode, and probably UTF-8 is the right way to go due to its widespread use.

Yes, please! I don’t really care about HTML syntax vs. XHTML syntax compared with the encoding issue… The use of a mandatory encoding like UTF-8 is a very good requirement for having the minimum amount of troubles when processing SH documents – a.k.a., handling different encodings is a real nightmare. Brrr…

Have a nice day :-)

S.

----------------------------------------------------------------------------
Silvio Peroni, Ph.D.
Department of Computer Science and Engineering
University of Bologna, Bologna (Italy)
Tel: +39 051 2094871
E-mail: silvio.peroni@unibo.it
Web: http://www.essepuntato.it
Twitter: essepuntato
Received on Friday, 4 December 2015 18:39:05 UTC