Sentence element (Was: [XHTML2] Unicode line and paragraph separators)

Philip wrote on Monday, April 7, 2003 at 12:10:12 PM:

> Thus I would support M. Cline's argument that <sentence> ...
> </sentence> is arguably as important as <p> ... </p>, even though
> HTML has ignored the concept ever since its inception.

(The following paragraph is a bit hard to read, but that's sort of the
point.)

<p><s>The problem is partly practical in nature.</s> <s>Even with a
minimum element name (in this case "s"), a sentence element inflates
document size significantly (as opposed to insignificantly), and more
importantly, vastly increases the complexity of writing a simple
paragraph.</s> <s>The difficulty of editing by hand is at least
doubled, and if you think I'm exaggerating you should give it a
try.</s> <s>The difficulty in (correctly) marking up every sentence in
even a short paragraph, not to mention the difficulty in reading, is
enough to rule out a required sentence element.</s> <s>Should we force
authors to use special software just to read and write a supposedly
simple language well?</s> <s>I hope not.</s> <s>What about an
<em>optional</em> sentence element?</s> <s>There's also a conflict
with (X)HTML's stated purpose, which is to be a <em>simple</em>
language.</s> <s>A sentence element has no place in a simple language,
because it complicates things greatly without adding proportionate
value.</s> <s>If you absolutely require a sentence element, I humbly
suggest XHTML is not the language for you.</s></p>

Even if XML didn't require braindead end tags, and even without
considering the disproportionately small value a sentence element
brings, the above would be far too unwieldy and complex for XHTML. The
usual method of ending sentences in normal text (which isn't marked
up) is all a human requires to determine the end of a sentence. For
machines the problem is still not perfectly solvable, but that doesn't
matter. The only benefit so far put forth is the ability to simulate
the oft-ridiculed practice of separating sentences with two spaces
instead of one. That's just not a compelling enough reason to add a
sentence element. I can think of a few fun CSS tricks, but nothing
more than trivial. To my knowledge, not a single printed book on my
shelf treats sentences any differently from how an HTML document
without sentence markup would present them. Even if there is a small
amount of utility to be gained, the need for it is absolutely not
common, so it shouldn't be in XHTML.
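
As a rough illustration of why the machine side is not perfectly
solvable, here is a minimal Python sketch. The helper name naive_split
and its rule (break after ".", "!" or "?" when whitespace follows) are
assumptions made for this example, not part of any proposal in this
thread. The rule wrongly breaks after the abbreviation "M.":

```python
import re


def naive_split(text):
    """Split after '.', '!' or '?' when whitespace follows.

    Deliberately naive: it knows nothing about abbreviations,
    initials, or quoted speech.
    """
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


sample = "I would support M. Cline's argument. I hope not."
sentences = naive_split(sample)

for sentence in sentences:
    print(sentence)
# Prints three pieces, because "M." is mistaken for a sentence end:
#   I would support M.
#   Cline's argument.
#   I hope not.
```

A human reader has no trouble seeing two sentences there, which is the
point: ordinary punctuation already serves people well enough.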

If you were really determined, you could mark up sentences with span
elements. It would be valid, since there is no more appropriate
element, although with a class name the markup would be truly mammoth.
I can't recall seeing a document marked up in such a way more than
once. While I don't think such a practice would be exactly common,
since it's even more unwieldy than a minimum-length element name, I
wouldn't expect it to be unheard of.
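
To put a rough number on "mammoth", the sketch below wraps the same two
sentences both ways and counts the added characters. The class name
"sentence" and the helper name wrap are hypothetical, chosen only for
this example:

```python
from html import escape


def wrap(sentences, open_tag, close_tag):
    """Wrap each sentence in the given tags inside a single <p>."""
    inner = " ".join(
        f"{open_tag}{escape(s, quote=False)}{close_tag}" for s in sentences
    )
    return f"<p>{inner}</p>"


sentences = ["The problem is partly practical in nature.", "I hope not."]

plain = wrap(sentences, "", "")
short = wrap(sentences, "<s>", "</s>")
spanned = wrap(sentences, '<span class="sentence">', "</span>")

print(len(short) - len(plain))    # 14: 7 extra characters per sentence
print(len(spanned) - len(plain))  # 60: 30 extra characters per sentence
```

For short sentences such as "I hope not." the span markup is nearly
three times as long as the sentence itself.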

-- 
John Lewis

Received on Wednesday, 9 April 2003 01:41:28 UTC