- From: T.V Raman <raman@google.com>
- Date: Wed, 12 Dec 2007 08:16:06 -0800
- To: davidc@nag.co.uk
- Cc: public-html@w3.org
As someone who came to HTML markup from the world of LaTeX, I strongly
agree with David. We authored the XForms spec in xmlspec, and not having
the bizarre HTML restriction of li elements not containing paragraphs
made life a lot easier.

David Carlisle writes:
>
> Ian
>
> > However, this is far from a resolved issue. What would be especially
> > useful is a study of use cases -- occurrences where people mix inlines and
> > blocks, and why unmixed alternatives don't really address the needs of the
> > author. (Your blog would be a great start to look for such cases.)
>
> HTML has always stood out amongst marked up document formats in having a
> very restricted content model for paragraphs that doesn't allow block
> level markup. I always viewed div as "p with a fixed content model"
> (which isn't really the intention of div, but a very plausible way of
> using it.)
>
> DocBook, TEI, the W3C's xmlspec markup all allow block level markup in
> paragraphs, as does (La)TeX.
>
> Consider the following two paragraphs:
>
>   The subject of this paragraph is the equation
>      E=mc^2
>   where c denotes the speed of light.
>
>
>   I have a list of three things
>     1, the first thing,
>     2, the second thing and
>     3, the third thing.
>   This list is not very interesting.
>
>
> In the first case the paragraph consists of a single sentence: the
> "where..." is not a new paragraph; it doesn't want to be marked up as
> <p class="no-indentation">..
> It is just the end of the sentence and the end of the paragraph, so
> should be in the same block as the start of the sentence:
> <p>The subject...
>
>
>
> The second case with a list is similar, although there at least the text
> following the list is a different sentence.
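
As a rough illustration of the list case above, here is a minimal Python
sketch of what a converter ends up doing when the target p cannot contain
a list. The Part model, the function name paragraph_to_html and the class
name "continuation" are all made up for this example; none of them comes
from xmlspec or its XSLT.

```python
from html import escape
from typing import List, Union

# Hypothetical source model: one logical paragraph is a sequence of
# inline text runs (str) and nested lists (a list of item strings).
Part = Union[str, List[str]]


def paragraph_to_html(parts: List[Part]) -> str:
    """Render one logical paragraph for a target where p cannot hold a list."""
    blocks = []
    first_text = True
    for part in parts:
        if isinstance(part, str):
            # Text that follows a nested block has to open a new, spurious
            # <p>. The class name is an assumption: a hook for CSS that
            # suppresses the usual paragraph styling.
            attr = "" if first_text else ' class="continuation"'
            blocks.append(f"<p{attr}>{escape(part)}</p>")
            first_text = False
        else:
            # The list is emitted as a sibling block, not inside the <p>.
            items = "".join(f"<li>{escape(item)}</li>" for item in part)
            blocks.append(f"<ol>{items}</ol>")
    return "\n".join(blocks)


example = [
    "I have a list of three things",
    ["the first thing,", "the second thing and", "the third thing."],
    "This list is not very interesting.",
]
print(paragraph_to_html(example))
```

One source paragraph comes out as three sibling blocks, and the fact that
they belonged together survives only in the class attribute.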
>
>
> Apart from forcing the users to mark up the text in a way that is at
> variance with the intended meaning (you can't have a sentence that spans
> two paragraphs, even if that sentence contains a quotation that itself
> has block structure) this restricted content model causes many problems
> when mapping from other markup languages to HTML (if HTML p is used to
> model paragraphs).
>
>
> See for example a recent comment on coming from the W3C's xmlspec markup.
>
> http://lists.w3.org/Archives/Public/spec-prod/2007OctDec/0002.html
>
>   I made a couple of other fixes to the standard XSLT but the one that
>   probably needs doing the most that I haven't done is that a list inside
>   a paragraph in xmlspec generates invalid XHTML.
>
>
> A lightning survey of other document types:
>
> DocBook paragraphs
>
> http://www.docbook.org/tdg/en/html/para.html
>
>   A Para is a paragraph. Paragraphs in DocBook may contain almost all
>   inlines and most block elements.
>
>
> TEI paragraphs
>
> http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-p.html
>
>   [ a bit inscrutable but note that the content model contains (at least)
>   lists as well as inline text]
>
>
> XHTML2 paragraphs
>
> http://www.w3.org/TR/xhtml2/mod-structural.html#sec_8.6.
>
>   In comparison with earlier versions of HTML, where a paragraph could
>   only contain inline text, XHTML2's paragraphs represent the
>   conceptual idea of a paragraph, and so may contain lists,
>   blockquotes, pre's and tables as well as inline text.
>
>
> LaTeX paragraphs
>   [couldn't find a good URI to cite, but trust me the LaTeX system
>   goes to some lengths to support nested block structures such as
>   displayed mathematics and lists within a paragraph]
>
>
> All of the above document types are commonly used for authoring and
> converted to HTML for display.
> The proposal to restrict the content
> model for div but not follow XHTML2 in opening up the content model for
> p makes that conversion significantly harder, and makes the resulting
> HTML significantly less structurally useful as you need to introduce
> spurious paragraphs together with extra CSS styling to suppress any
> typographic display that would normally be associated with a paragraph.
>
>
>
> David
> [Ian, apologies, initial mail to you cc'ed list had mistyped address]
>
> ________________________________________________________________________
> The Numerical Algorithms Group Ltd is a company registered in England
> and Wales with company number 1249803. The registered office is:
> Wilkinson House, Jordan Hill Road, Oxford OX2 8DR, United Kingdom.
>
> This e-mail has been scanned for all viruses by Star. The service is
> powered by MessageLabs.
> ________________________________________________________________________

--
Best Regards,
--raman

Title:  Research Scientist
Email:  raman@google.com
WWW:    http://emacspeak.sf.net/raman/
Google: tv+raman
GTalk:  raman@google.com, tv.raman.tv@gmail.com
PGP:    http://emacspeak.sf.net/raman/raman-almaden.asc
Received on Wednesday, 12 December 2007 16:17:21 UTC