[whatwg] sic element

On Tue, 2 Aug 2011 09:04, Henri Sivonen wrote:
> On Fri, 2011-07-29 at 22:39 +0000, Ian Hickson wrote:
>>> Presentational markup may convey useful information, for example that a
>>> quotation from printed matter contains an underlined word.
>>
>> HTML is the wrong language for this kind of thing.
>
> I disagree. From time to time, people want to take printed matter and
> publish it on the Web. In practice, the formats available are PDF and
> HTML. HTML works more nicely in browsers and for practical purposes
> works generally better when the person taking printed matter to the Web
> decides that the exact line breaks and the exact font aren't of
> importance. They may still consider it of importance to preserve bold,
> italic and underline and maybe even delegate that preservation to OCR
> software that has no clue about semantics. (Yes, bold, italic and
> underline are qualitatively different from line breaks and the exact
> font even if you could broadly categorize them all as presentational
> matters.)
>
> I think it's not useful for the Web for you to decree that HTML is the
> wrong language for this kind of thing. There's really no opportunity to
> launch a new format precisely for that use case. Furthermore, in
> practice, HTML already works fine for this kind of thing. The technical
> solution is there already. You just decree it "wrong" as a matter of
> principle. When introducing new Web formats is prohibitively hard and
> expensive, I think it doesn't make sense to take the position that
> something that already works is "the wrong language".
>
So you're arguing that a subset of HTML should be favored over
presentational markup languages for marking up digital retypes of
printed matter, with <b>, <i>, <u>, <font>, <small> and <big> being
redefined to their HTML 3 typographical meanings, and perhaps
<blockquote> standardized to mean indent.

If you simply retype print without any interpretation of the typography
used, a valid speech rendering would, for example, cue bold text with
"bold" and "unbold" marks to convey the meaning: this text was bold. The
current definition of <b> does not exactly hint at such renderings.

If all you want is to suggest the original typographic rendering, then
(save for Excerpt/Blockquote, Nofill/Pre and Lang/@lang) CSS does the
job better and is vastly more powerful.
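
As a rough sketch of that point (the class names and the sample
sentence below are hypothetical, made up for illustration and not taken
from any spec), the typography of a retyped excerpt could be recorded
with CSS alone:

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Retyped excerpt</title>
<style>
  /* Hypothetical class names: each one records how the printed source
     looked, without claiming anything about what the author meant. */
  .print-bold      { font-weight: bold; }
  .print-italic    { font-style: italic; }
  .print-underline { text-decoration: underline; }
  .print-small     { font-size: smaller; }
</style>
</head>
<body>
<!-- blockquote and lang stay in the markup; the rest is styling -->
<blockquote lang="en">
  <p>The notice was marked <span class="print-underline">urgent</span>
  and signed in <span class="print-bold">heavy type</span>.</p>
</blockquote>
</body>
</html>
```

Here the markup says only "this is a quoted excerpt in English"; what
the print looked like lives entirely in the style sheet.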
> I think the reason why Jukka and others seem to be confused about your
> goals is that your goals here are literally incredible from the point of
> view of other people. Even though you've told me f2f what you believe
> and I want to trust that you are sincere in your belief, I still have a
> really hard time believing that you believe what you say you believe
> about the definitions of <b>, <i> and <u>. When after discussing this
> with you f2f, I still find your position incredible, I think it's not at
> all strange if other people when reading the spec text interpret your
> goals inaccurately because your goals don't seem like plausible goals to
> them.
>
> If the word "presentational" carries too much negative baggage, I
> suggest defining <b>, <i> and <u> as typographic elements on visual
> media (and distinctive elements on other media) and adjusting the
> rhetoric that HTML is a semantic markup language to HTML being a mildly
> semantic markup language that also has common phrase-level typographic
> features.
>
The problem is that three facts are not equivalent: that something was
written underlined, that it was spoken with stress, and that style
guides recommend underlining it in print to convey its semantics. All
three might be conveyed in print by underlining the text, but the
semantics differ, and thus each needs an element of its own, much as
authors must use <ol>, <ul> and <blockquote> to convey their defined
meanings, even though some UAs might render all of them the same way.

Received on Tuesday, 2 August 2011 06:10:01 UTC