Re: Best way to markup standards compliant symbols from Christophe Strobbe on 2009-03-20 (w3c-wai-ig@w3.org from January to March 2009)

From: Christophe Strobbe <christophe.strobbe@esat.kuleuven.be>
Date: Fri, 20 Mar 2009 20:38:00 +0100
To: w3c-wai-ig@w3.org
Message-Id: <6.2.5.6.2.20090320200310.03b70ae0@esat.kuleuven.be>
Hi,

At 19:19 20/03/2009, Harry Loots wrote:
> > I disagree. You only need entity references for characters that have
> > a special meaning in markup, as David Dorward pointed out.
>
>Here's another use for entities:
>To display the Euro symbol if you do not have a Euro symbol on your keyboard
>(most of us don't; and the same applies to the GB Pound Sterling symbol).

That's an input issue but not a reason to say that web pages need to 
contain these characters as character references.
My keyboard doesn't have keys for Chinese characters but I've been 
happily writing Chinese without numeric entity references for some 
time now. All you need is an input method. If something isn't on your 
keyboard, you need another method but the output of that method 
needn't be an entity reference. My input method generates Chinese 
characters directly, not character references, and that works just fine.


>There are only four character entities that exist within the seven-bit ASCII
>range - the HTML reserved characters, < (&lt;), > (&gt;), & (&amp;), "
>(&quot;). The remainder, unless you specify 'charset' will at best be a
>hit-and-miss affair. And most people's pages do not include the Content-Type,
>charset attributes.

Yep, authors need to be educated about this.
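
To illustrate why a missing charset is "hit-and-miss", here is a minimal Python sketch (not taken from any tool mentioned in this thread). The helper name `charset_from_content_type` and the sample header values are assumptions made for the example; it uses only the standard library. It shows that the same byte is read as a different character depending on which encoding the receiver assumes.

```python
from email.message import Message


def charset_from_content_type(value, default=None):
    """Return the charset parameter of a Content-Type header value.

    Hypothetical helper for this example; returns `default` when the
    header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset(default)


# A header that declares the encoding, and one that does not.
declared = charset_from_content_type("text/html; charset=UTF-8")
undeclared = charset_from_content_type("text/html")

# The pound sign saved as ISO-8859-1 is the single byte 0xA3.
pound_bytes = "\u00a3".encode("iso-8859-1")

# Without a declaration the receiver has to guess, and the guesses disagree.
as_latin1 = pound_bytes.decode("iso-8859-1")
as_cyrillic = pound_bytes.decode("iso-8859-5")

print(declared, undeclared)
print(as_latin1 == as_cyrillic)
```

When the header declares the charset, the receiver no longer needs to guess, whether or not the page uses character references.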



> > If it were true that every character outside the seven-bit ASCII
> > range had to be encoded as a character reference, then millions of
> > web pages in writing systems other than Latin would be encoded
> > incorrectly.
>
>This is entirely possible for one person or another. If my default browser
>setting is ISO-8859 and the page was saved as UTF-8 or Windows-1252 or
>whatever (without the charset being specified in the code), then it is likely
>that I will have strange characters appear in my browser. The answer is to
>specify the character set; then also do your audience a favour and convert
>non-ASCII characters, including the four mentioned above, to entities (HTML-Kit
>and other programs will do this on your behalf). That way you can be certain
>that the end user sees the quote and the pound symbol where intended.

The most common browsers (and some less common ones) can auto-detect 
encodings. If the document encoding and the charset in the HTTP 
headers match up, do you still encounter any problems? (I've played 
around with encodings etc too much to remember what any browser does 
by default.)
I've noticed that when non-ASCII characters are encoded as character 
references, those characters won't be misrepresented when, for 
example, UTF-8 content is interpreted as Windows-1252. But if 
encoding all those characters as character references were a 
guideline, then all content in, for example, Chinese would need to be 
encoded, which is not what authors of Chinese web content do. That's 
why I found your advice out of sync with current practice.
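
The effect described above can be reproduced with a small Python sketch. It assumes Python 3 and the standard `html` module; the variable names are made up for the illustration. The Euro sign is written once as a literal character and once as the numeric character reference `&#8364;`, and both are then deliberately decoded with the wrong encoding.

```python
import html

euro = "\u20ac"  # the Euro sign

# Literal character: its UTF-8 bytes, misread as Windows-1252, become mojibake.
utf8_bytes = euro.encode("utf-8")
misread_literal = utf8_bytes.decode("cp1252")

# Numeric character reference: pure ASCII, so the wrong decoding is harmless.
ncr_bytes = "&#8364;".encode("utf-8")
misread_ncr = ncr_bytes.decode("cp1252")
recovered = html.unescape(misread_ncr)

print(repr(misread_literal))  # three unrelated characters instead of one
print(recovered == euro)      # the reference still resolves to the Euro sign
```

The reference survives only because it consists of ASCII characters, which UTF-8 and Windows-1252 encode identically; the cost is that every non-ASCII character in the page would have to be written this way.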

By the way, the I18N FAQ on Using Character Entities and NCRs states:
"It is almost always preferable to use an encoding that allows you to 
represent the characters in their normal form, rather than using 
character entity references or NCRs."
<http://www.w3.org/International/questions/qa-escapes#not>
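
As a rough sketch of what that recommendation looks like in practice (assuming Python 3 and its standard `html` module; the sample string is invented), only the characters with a special meaning in markup are escaped, and everything else is left in its normal form and serialised as UTF-8:

```python
import html

text = 'Price: 5 \u20ac < 6 \u00a3 & "free" shipping'

# html.escape touches only &, <, > and (by default) the quote characters;
# the Euro and pound signs stay as ordinary characters.
escaped = html.escape(text)

# The page is then saved in an encoding that can represent them directly.
page_bytes = escaped.encode("utf-8")

# Reading the bytes back with the matching encoding restores the text.
round_trip = html.unescape(page_bytes.decode("utf-8"))

print(escaped)
print(round_trip == text)
```

This only works reliably when the encoding used to save the page matches the one declared to the browser.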


(I used to be a fan of HTML-Kit but stopped using it because I wanted 
Unicode support; I might reconsider it now.)

> > >It is not irrelevant to accessibility as lack of inter-operability may
> > >lead to inaccessible pages.
> >
> > It would be more precise to say that it is a problem that affects
> > every type of user; it's not specific to people with disabilities.
>
>When did accessibility become the sole property of people with disabilities?
>If i am unable to view the page, for whatever reason, then the page is by
>definition inaccessible to me as a user.

On a list like this one, the primary meaning of accessibility is 
"accessibility for people with disabilities", at least in my mind. 
Hence the mismatch.

Best regards,

Christophe Strobbe



>Warm regards
>Harry

-- 
Christophe Strobbe
K.U.Leuven - Dept. of Electrical Engineering - SCD
Research Group on Document Architectures
Kasteelpark Arenberg 10 bus 2442
B-3001 Leuven-Heverlee
BELGIUM
tel: +32 16 32 85 51
http://www.docarch.be/
---
Please don't invite me to LinkedIn, Facebook, Quechup or other 
"social networks". You may have agreed to their "privacy policy", but 
I haven't.


Received on Friday, 20 March 2009 19:38:57 UTC