Re: Entity for apostrophe?

Holger Wahlen (wahlen@ph-cip.Uni-Koeln.DE)
Mon, 28 Jul 1997 18:09:44 +0200


Message-Id: <199707281609.AA12625@jupiter.ph-cip.Uni-Koeln.DE>
To: www-html@w3.org

In response to my thoughts about an entity for apostrophes,
Walter Ian Kaye <walter@natural-innovations.com> wrote:

| Is there a problem you are trying to solve? If so, what is it?

Let's take a short example:
	He said: 'It's 5' long.'
It contains "'" (&#39;) four times, but in each case this is
used to represent a different character: left and right
single quotation mark, apostrophe, and symbol for "feet". In
good typography, there are three or four different glyphs
used for these (depending on whether r.s.q.m. and apostrophe
share one). Nevertheless, while the 4.0 draft defines
	- "&lsquo;" and "&rsquo;", so that authors can tell the
browser where a "'" in the source is just a workaround for a
single quotation mark,
	- as well as "&prime;" to achieve the same for the foot
symbol,
there is no such thing for an apostrophe. That wouldn't be a
problem if browsers used the `curly' rendering by default
(which would probably have other disadvantages in turn), but
they don't; unsurprisingly, it's displayed as character 39,
the `straight' one, in the respective font. So the problem
is: how can I tell the browser, "this is an apostrophe, so if
you have it, take the curly glyph"? My idea is to define an
entity for apostrophes (just as is already done for other
punctuation marks), so that it can be used for that
purpose:
	He said: &lsquo;It&apo;s 5&prime; long.&rsquo;
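
A minimal sketch in Python of what such an entity would mean
to a decoder. Note the assumptions: "&apo;" is only the name
proposed in this message and is not part of any entity set,
mapping it to U+2019 (the code point of the right single
quotation mark) is just one possible choice, and `decode` is
a hypothetical helper.

```python
import html

# Assumption: the proposed "&apo;" shares U+2019 with the right single
# quotation mark. This entity is hypothetical, not a defined one.
APOSTROPHE = "\u2019"

def decode(source: str) -> str:
    """Expand the hypothetical &apo; first, then the defined entities."""
    return html.unescape(source.replace("&apo;", APOSTROPHE))

text = decode("He said: &lsquo;It&apo;s 5&prime; long.&rsquo;")

# Show which character each mark in the decoded text really is.
for ch in text:
    if ord(ch) > 127:
        print(f"U+{ord(ch):04X}")
# prints U+2018, U+2019, U+2032, U+2019 (one per line)
```

With the plain "'" in the source, all four positions would
decode to the same character 39, and the distinction would be
lost.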


| That is purely a matter of presentation; HTML is about 
| structure.

Well, strictly speaking, it's also a matter of presentation
how questions are styled - still I have to write
	"&iquest;Si?"
instead of a construction like
	"<QUESTION LANG=es>Si</QUESTION>".
It's another matter of presentation how long dashes are and
whether they are surrounded by spaces - still the writer has
to type
	"ja&nbsp;&ndash; nein"
in German or
	"yes&mdash;no"
in English himself, instead of being able to use something
like
	"<BODY LANG=de> ... ja <PAUSE> nein".
No, when it comes to punctuation, I wouldn't say that HTML in
its present form is only about structure. Perhaps it should
develop in that direction (if that's possible at all), okay -
but that's another question.

However, I don't even agree that the apostrophe thing is only
something presentational. I want a good way to tell the
browser which characters my text really consists of (as
opposed to those I'm restricted to when storing it in a
file), and I see this as information about the document text
itself rather than only about how it should be displayed.
Granted, this also has an effect on presentation, but that's
merely implicit: I can't tell the browser that my text
contains a certain word without also telling it what
characters the word consists of (and thus what glyphs should
be used to present it), can I?

Maybe another example helps to make clear what I'm aiming at.
"&auml;" in a file is my way to tell the browser, "it's not
actually contained `in person' in this file (because I can't
or don't want to use it), but the character at this position
here is an `a' umlaut". Now, it's common in German to write
"ae" on systems where &auml; isn't available; doing so in an
HTML file, though, means that nobody gets to see &auml;, not
even people with accent-capable systems. On the other hand,
if I use "&auml;", some systems can display the correct
character, while the others can still fall back to "ae" -
bingo!

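The fallback idea can be sketched in Python, under stated
assumptions: `FALLBACKS` and `render` are hypothetical names,
the replacement table is illustrative only, and "can the
target charset encode it" stands in for "can the system
display it".

```python
import html

# Illustrative table: what to show when the real character is unavailable.
FALLBACKS = {
    "\u00e4": "ae",  # a umlaut
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / apostrophe
    "\u2032": "'",   # prime (feet)
}

def render(text: str, charset: str) -> str:
    """Keep each character the charset can encode, else substitute."""
    out = []
    for ch in text:
        try:
            ch.encode(charset)
            out.append(ch)
        except UnicodeEncodeError:
            out.append(FALLBACKS.get(ch, "?"))
    return "".join(out)

word = html.unescape("M&auml;rz")   # the entity names the real character
print(render(word, "latin-1"))      # prints "März"
print(render(word, "ascii"))        # prints "Maerz"
```

The point is that the substitution happens at the reader's
end: writing "ae" in the source would make that choice for
everyone, while the entity leaves it to each system.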

|  > advantage, this also wouldn't cause problems for syntax
|  > checks that might compare the numbers of occurrences of
|  > "&lsquo;" and "&rsquo;" (as long as the entities are used
|  > `correctly', that is, of course).
| 
| And your use of "`" is correct? To me (and Adobe), that is a
| grave accent which you are [mis]using as a left single quote.
| That makes me shudder like fingernails on a chalkboard do. 
| Ick.

I'm not too happy about it either, but seeing that I can't
use left, right, or "low-9" single quote marks in e-mail
without problems, what's so bad about trying to get as close
as possible to how they look within the range of 7-bit
characters? Is that any worse than your typing "--" as a
replacement for a dash? Or than my using "'" both for
apostrophe and right s.q.m., which you haven't complained
about? Or than using it for these two and for left s.q.m. as
well, which would be the only other solution I can think of?
Or, finally, than using the same character, &quot;, both for
left and right double quotation marks?

Wondering,
	Holger
____  |__|   / Holger   //       mailto:wahlen@ph-cip.uni-koeln.de  ____
      |  |/|/  Wahlen  //  http://www.ph-cip.uni-koeln.de/~wahlen/