- From: Smylers <Smylers@stripey.com>
- Date: Mon, 21 Apr 2008 17:20:31 +0100
- To: whatwg@lists.whatwg.org, public-html@w3.org
Jens Meiert writes:

> > The point of <abbr> is to expand the acronym, not to just mark up
> > what is an acronym or abbreviation.
>
> Doesn't this claim that the general information that some text is an
> abbreviation (w/o an expanded form) is basically useless?

Well, it's very close to being useless, in that if browsers don't do
anything with some mark-up, there's no point in having it (and indeed no
incentive for authors to provide it).

The point of annotating an abbreviation with its expansion is not to
mark up the abbreviation _per se_; it's to provide browsers with what
the expansion is, so that they can display it.

Sure, all instances of just using abbreviations _could_ be marked up.
Equally we could mark up verbs, proper nouns, words that score over 30
in Scrabble, palindromes, words that can be written upside-down on
calculators, words defined in the Oxford English Dictionary ...

There's almost no limit to how text could be marked up to have _some_
use in a particular niche.  But that isn't what HTML 5 is going to cater
for.

> And is "<abbr>ISS</abbr>" not more useful since less ambiguous than
> "ISS" (same abbreviation) and "ISS" (German imperative for "to eat" in
> capitals)

Yes, that is potentially ambiguous.  But it's the same in books,
newspapers, and so on, where it turns out not to be much of a problem.

Human beings tend to be pretty good at working things out from context.
For example, in an article which has previously mentioned the
International Space Station (and possibly also put "ISS" in brackets
after it), readers are going to recognize further uses of "ISS".

Parts of speech also provide a clue ("iss" being an imperative only
makes sense in certain places in a sentence), as does its being in
all-caps -- yes, any word _can_ be written in upper-case, but it's
unusual to find one in the middle of a sentence; humans are used to it
being an indicator of an abbreviation.

Further, distinguishing abbreviations from upper-case words is far from
the only ambiguity in writing:

* Words are quite capable of being ambiguous on their own, without any
  abbreviations in the vicinity.  For example "entrance" can be the
  place where one enters a building, or the action of putting somebody
  in a trance.

* The same abbreviation is often used for different terms (though often
  in quite distinct fields).  Marking something up as being an
  abbreviation without giving the expansion wouldn't be any use here.

Why should HTML 5 bother to solve the very narrow case of disambiguating
words from abbreviations, but not solve it more generally to include the
other cases?

> and be it just for AT,

(See, you just used "AT" there!  That _could_ be the English word "at"
written in capitals.  It _could_ be a reference to automatic
transmission.  But readers of this list successfully work out what you
were referring to; in practice it isn't ambiguous.)

What in practice would you expect AT to do with this knowledge?
Remember that most abbreviations that aren't being tagged with
expansions won't be marked up, so AT is going to have to deal sensibly
with that case anyway.

> pronunciation

Human languages already have many quirks of pronunciation.  Speaking
browsers have to cope with these heuristically, without help from the
mark-up indicating how to pronounce, say, "entrance".  (As does speaking
software that reads out, say, e-mails or word processor documents --
text which doesn't have any underlying mark-up.)

Also note that an ordinary word such as 'iss' likely shouldn't be in
capitals in the HTML source anyway.  If the capitals are wanted for
emphasis then it should be written <em>iss</em>, with CSS being used to
remove the italics and upper-case the text.

Are mis-pronounced abbreviations really a significant proportion of
mis-pronounced words by speaking browsers?

> and a scent of semantics?

And what would the point of such a scent be?
Why would it be more useful than the scent provided by tagging all verbs
with <verb>?

Smylers
Received on Monday, 21 April 2008 16:21:00 UTC