
Re: Validating XHTML5 with XML entities

From: Jeff Schiller <codedread@gmail.com>
Date: Wed, 27 Aug 2008 08:14:51 -0500
Message-ID: <da131fde0808270614y132c3de9j5807de1bb78f5e59@mail.gmail.com>
To: "Robert J Burns" <rob@robburns.com>
Cc: "HTML WG" <public-html@w3.org>

Hi Robert,

On Wed, Aug 27, 2008 at 2:04 AM, Robert J Burns <rob@robburns.com> wrote:
>>
>> Doing this seems to prohibit me from being able to generate valid
>> XHTML5 (at least the experimental portion of the W3C validator that
>> was recently added - html5.validator.nu does not complain).  Or maybe
>> XHTML5 is XML entities (among other things)?  Or maybe the DOCTYPE
>> definition could be expanded to optionally allow XML entities?
>
> Ideally, HTML5 will require XHTML5 UAs to support HTML character references
> without any entity declarations. Several in the WG expressed their
> opposition to such a norm, but have shown no research nor provided any
> arguments to support their assertions. I think they may just misunderstand
> the XML recommendation.

Which HTML character references are you talking about?  All 253 or whatever?

Since XML only defines 5 predefined entities, it seems that doing what
you're suggesting would turn XHTML5 into 'not XML'.  Even if all the
UAs patch to handle this (code is already there for HTML obviously),
wouldn't this be a big problem for other consumers that expect XHTML
to be XML?  I might be one of the people who misunderstand the XML
recommendation.
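
To make the concern concrete, here is a minimal sketch of how a generic
XML consumer reacts. It assumes Python's standard xml.etree.ElementTree
(an expat-based, non-validating parser) as a stand-in for "other
consumers"; the helper name parses_as_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup: str) -> bool:
    """Return True if a generic XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# The five predefined XML entities need no declaration.
predefined = "<p>&lt; &gt; &amp; &apos; &quot;</p>"
# &nbsp; is an HTML entity; with no declaration it is undefined in XML.
html_named = "<p>a&nbsp;b</p>"
# A numeric character reference never needs a declaration.
numeric = "<p>a&#160;b</p>"

for sample in (predefined, html_named, numeric):
    print(sample, "->", parses_as_xml(sample))
```

Under these assumptions only the middle sample is rejected, which is
the behavior any plain XML toolchain would be expected to show.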

>
>> I'd appreciate some insight.  Yes, I can continue to hack on WordPress
>> and get it to emit "&#160;" instead of "&nbsp;" and then go through my
>> database and replace all instances for the last several years, but...
>
> Can't you have WordPress emit U+00A0, or are you using a charset encoding
> other than a UTF encoding?
>

Again, maybe I don't understand what you're suggesting.

I'm using UTF-8.  I can go through the WordPress source and change all
their PHP files that use &nbsp; &raquo; and &laquo; to their
equivalent numeric references but there are over 100 instances of
this.
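
For illustration only, the replacement itself could be sketched as
below. This is Python rather than a patch against the WordPress PHP
source, and the names NAMED_TO_NUMERIC and to_numeric_refs are
hypothetical; it only covers the three entities mentioned above.

```python
import re

# Only the three entities named above; the code points are
# U+00A0 (160), U+00BB (187) and U+00AB (171).
NAMED_TO_NUMERIC = {
    "nbsp": "&#160;",
    "raquo": "&#187;",
    "laquo": "&#171;",
}

# Matches exactly &nbsp; &raquo; or &laquo; and nothing else.
_PATTERN = re.compile(r"&(" + "|".join(NAMED_TO_NUMERIC) + r");")


def to_numeric_refs(text: str) -> str:
    """Replace the known named entities with numeric references."""
    return _PATTERN.sub(lambda m: NAMED_TO_NUMERIC[m.group(1)], text)


print(to_numeric_refs("a&nbsp;b &laquo;c&raquo; &amp;"))
```

Other references such as &amp; are left untouched, since they are
already valid XML.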

I can create a ticket and submit a 100-line patch to the WP project,
but I'm worried that getting this accepted by the WordPress
powers-that-be will be challenging, especially considering my last few
patches that languished for months (and those patches prevented Yellow
Screens of Death - the XHTML equivalent of a 'segfault').  What are
the chances that a 100-line patch with no observable user benefit
gets accepted (since declaring these entities is a quick 3-line fix
that can be done by the theme creator)?
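
As a sketch of what such a 3-line fix might look like (an assumption:
the exact declarations used in the theme are not shown in this
message), the three entities can be declared in the DOCTYPE's internal
subset. The Python below only checks that a generic XML parser then
resolves them; the sample page is made up for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page: the three ENTITY lines are the assumed "3-line fix".
page = """<!DOCTYPE html [
  <!ENTITY nbsp  "&#160;">
  <!ENTITY raquo "&#187;">
  <!ENTITY laquo "&#171;">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <body><p>&laquo;quoted&raquo;&nbsp;text</p></body>
</html>"""

root = ET.fromstring(page)
paragraph = root.find(f".//{{{XHTML_NS}}}p")
# The parser has expanded the declared entities into real characters.
text = paragraph.text
print(repr(text))
```

With the declarations in place the document is well-formed XML, but it
no longer uses the bare DOCTYPE, which is what the validator question
earlier in this thread is about.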

So if that patch doesn't get accepted (or it takes a long chunk of
time), then next time I upgrade to the new version of WP (happens
every 6 months or so), I have to remember to manually search/replace
those three entities.

Regards,
Jeff
Received on Wednesday, 27 August 2008 13:15:32 GMT
