- From: Jeff Schiller <codedread@gmail.com>
- Date: Wed, 27 Aug 2008 12:49:42 -0500
- To: "Robert J Burns" <rob@robburns.com>
- Cc: "HTML WG" <public-html@w3.org>
Thanks Robert. Can you share more thoughts and/or address my other
question concerning XHTML5 adopting all HTML entities?

Regards,
Jeff

On 8/27/08, Robert J Burns <rob@robburns.com> wrote:
> Hi Jeff,
>
> On Aug 27, 2008, at 4:14 PM, Jeff Schiller wrote:
> > Hi Robert,
> >
> > On Wed, Aug 27, 2008 at 2:04 AM, Robert J Burns <rob@robburns.com> wrote:
> > > > I'd appreciate some insight. Yes, I can continue to hack on WordPress
> > > > and get it to emit "&#160;" instead of "&nbsp;" and then go through my
> > > > database and replace all instances for the last several years, but...
> > >
> > > Can't you have WordPress emit U+00A0, or are you using a charset
> > > encoding other than a UTF encoding?
> >
> > Again, maybe I don't understand what you're suggesting.
> >
> > I'm using UTF-8. I can go through the WordPress source and change all
> > their PHP files that use &raquo; and &laquo; to their
> > equivalent numeric references but there are over 100 instances of
> > this.
> >
> > I can create a ticket and submit a 100-line patch to the WP project,
> > but I'm worried that getting this accepted by the WordPress
> > powers-that-be will be challenging, especially considering my last few
> > patches that languished for months (and those patches prevented Yellow
> > Screens of Death - the XHTML equivalent of a 'segfault'). What are
> > the chances of a 100-line patch that has no observable user benefit
> > (since declaring these entities is a quick 3-line fix that can be done
> > by the theme creator)?
> >
> > So if that patch doesn't get accepted (or it takes a long chunk of
> > time), then next time I upgrade to the new version of WP (happens
> > every 6 months or so), I have to remember to manually search/replace
> > those three entities.
>
> Well, this isn't really the list to discuss WordPress development issues.
> However, this is a problem that should be solved by WordPress by emitting
> Unicode characters rather than named or numbered character entity
> references. The reason to use character entity references is to facilitate
> documents in non-UTF encodings (or perhaps where the author is concerned the
> document will be converted or round-tripped through non-UTF encodings). For
> pure UTF charset documents, it's advisable to simply use the literal
> characters (and not references to them). Some like the source readability of
> named character references, but that readability depends solely on the
> reader's familiarity with the characters. If I'm a reader of a Cyrillic
> script-based language, I'm not going to find reading the source easier if
> all of the characters are replaced with named references to the characters.
>
> In terms of your present problem, I don't know enough about WordPress. If
> it cannot be fixed through configuration tweaks, it still is something that
> is better handled in the long-term by WordPress through literal characters
> rather than references.
>
> Take care,
> Rob
Received on Wednesday, 27 August 2008 17:50:24 UTC